Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] New architecture: wasm #1552

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

[RFC] New architecture: wasm #1552

wants to merge 1 commit into from

Conversation

alban
Copy link
Contributor

@alban alban commented Aug 26, 2024

cilium/ebpf currently does not compile to wasm with tinygo. This patch makes it compile.

I am working on a website where users can submit ebpf binaries (either in ELF format or as an Inspektor Gadget export) and the website will parse the ebpf binary client-side with wasm. The wasm code is written in Go, using cilium/ebpf and is compiled with tinygo.

My wasm code uses ebpf.LoadCollectionSpecFromReader() to display information about the ebpf binary. But it will not call ebpf.NewCollection() because the wasm/javascript environment in the browser cannot interact with the Linux kernel.


The website PoC looks like this:

image

Since cilium/ebpf is running in the browser, I can use Chrome DevTools to profile the performance:

Screenshot-Chrome-DevTools-cilium-ebpf

internal/io.go Outdated Show resolved Hide resolved
cilium/ebpf currently does not compile to wasm with tinygo. This patch
makes it compile.

I am working on a website where users can submit ebpf binaries (either
in ELF format or as an Inspektor Gadget export) and the website will
parse the ebpf binary client-side with wasm. The wasm code is written in
Go, using cilium/ebpf and is compiled with tinygo.

My wasm code uses ebpf.LoadCollectionSpecFromReader() to display
information about the ebpf binary. But it will not call
ebpf.NewCollection() because the wasm/javascript environment in the
browser cannot interact with the Linux kernel.

Signed-off-by: Alban Crequy <[email protected]>
@alban alban marked this pull request as ready for review August 26, 2024 16:27
@alban alban requested review from dylandreimerink and a team as code owners August 26, 2024 16:27
@alban
Copy link
Contributor Author

alban commented Aug 26, 2024

I cleaned up the code and marked the PR as ready.

Wasm is little endian and has a page size of 64 KiB according to its spec.

Copy link
Collaborator

@ti-mo ti-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Alban, very cool project! Note that we don't test wasm builds so this can regress at any time in the future. Feel free to add wasm builds to CI, and maybe create an issue to get (part of) the test suite to run on wasm.

For ebpf-on-windows, we're looking at the opposite: there are no CollectionSpecs on windows, so no ELF loader needed. (They ship bpf in signed winPE instead) Tests that load an ELF and put it into the kernel need to be skipped there. We'll need some generic mechanism to skip tests like we do for kernel versions etc.


func Getpagesize() int {
// A WebAssembly page has a constant size of 65,536 bytes, i.e., 64KiB
return 64 * 1024
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know much about tinygo, but some Googling reveals that it implements a subset of the Go "os" package, and it looks like it implements os.Getpagesize() as well for the arches it targets. Why is it hardcoded here? This is the crux of the PR but isn't mentioned anywhere.

In any case, I'd like this to go into the respective stdlib, either in Google's implementation or tinygo if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the following error if I try to use os.Getpagesize:

$ tinygo build -o wasm.wasm main.go
# github.com/cilium/ebpf/internal
../../cilium/ebpf/internal/page_wasm.go:8:12: undefined: os.Getpagesize

According to https://pkg.go.dev/github.com/tinygo-org/tinygo/src/os#Getpagesize, it was added in tinygo 0.24.0 and I use 0.30.0, but it still does not work.

Issue: tinygo-org/tinygo#1285

I agree it should ideally be fixed in tinygo or Go... I could try to make a PR there. But I would like to have a workaround in cilium/ebpf meanwhile. Btw, syscall.Getpagesize() works, presumably thanks to tinygo-org/tinygo#2856; could that be an option to use that in cilium/ebpf as a workaround?

In tinygo-org/tinygo#1285, they suggest to return -1... That would not work for cilium/ebpf when used in unix.Mmap but we wouldn't call those methods from wasm context. And for internal.NewBufferedSectionReader(n=-1), it seems the implementation would use a buffer of 16 bytes when given a negative number, so that would work (but might be slow).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could try to make a PR there. But I would like to have a workaround in cilium/ebpf meanwhile.

If we merge a workaround, there's no guarantee you will get around to making that upstream contribution. 😉 I really don't want to hardcode non-bpf-related platform constants, that's definitely the runtime/stdlib's responsibility.

syscall.Getpagesize() works

Underlyingly, os.Getpagesize() simply calls syscall.Getpagesize() (https://cs.opensource.google/go/go/+/refs/tags/go1.23.0:src/os/types.go;l=13), so sounds like a trivial fix in tinygo? Might be a good exercise to understand the underlying tech a bit better.

I prefer to make only the absolute minimum changes to the lib to force us to address issues in the underlying layers. This is good for the health of the platform and potentially enables more software to run correctly out-of-the-box.

if runtime.GOOS == "js" {
// In Javascript environments, BPF objects are not going to be
// loaded to a kernel, so we can use dots without testing.
dotAllowed = true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this logic in objNameAllowsDot instead? Also, instead of hardcoding a check on runtime.GOOS, we should let the syscall fail and conclude that it means a dot can be used.

Copy link
Collaborator

@ti-mo ti-mo Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some reading on this last week, apparently ENOSYS is returned by much of the tinygo syscall package for wasm. This would be a good sentinel to conclude that a syscall can't be made and we need to fall back to a sane default. We can make things on the Windows side behave similarly, so high-level logic only needs to check for ENOSYS.

cc @lmb

@lmb
Copy link
Collaborator

lmb commented Oct 17, 2024

I blown away that this works, it's a very cool idea!

Note that we don't test wasm builds so this can regress at any time in the future. Feel free to add wasm builds to CI, and maybe create an issue to get (part of) the test suite to run on wasm.

Before going down this route: adding wasm to CI kind of implies that we will start maintaining a wasm "port". I personally don't have the capacity to do that. I really like this kind of work, but adding wasm support feels like a very steep price to pay to (essentially) get access to the elf reader.

Maybe a more maintainable solution would be to split the elf reader into a separate, cross-platform package? This is in line with the design goals we have for it. It would also mean moving a lot of code around unfortunately, and might get stuck in circular dependency hell.

@ti-mo
Copy link
Collaborator

ti-mo commented Oct 24, 2024

Before going down this route: adding wasm to CI kind of implies that we will start maintaining a wasm "port". I personally don't have the capacity to do that. I really like this kind of work, but adding wasm support feels like a very steep price to pay to (essentially) get access to the elf reader.

If it largely works as-is with a few lines of changes, is there a real downside? If we get some tests to run on tinygo (you just added the infra for it! 😉), that could be a good way of keeping the elf reader hermetic. (e.g. no touching /proc during ELF->CollectionSpec parsing, see recent example: 7f04cae, this would've broken Alban's use case as well)

This pales in comparison to, let's say, adding ebpf-for-windows support. 😉

Maybe a more maintainable solution would be to split the elf reader into a separate, cross-platform package? This is in line with the design goals we have for it.

Not sure what you mean in practice, but we could potentially move some ELF reader logic into platform-specific internal packages. If you mean moving CollectionSpec and friends into a subpackage, that ship has long sailed I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants