Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] - wrappers for type-safe ebpf.Variable #1543

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

smagnani96
Copy link
Contributor

@smagnani96 smagnani96 commented Aug 8, 2024

Status

Proposal

Supposing to have an ebpf program that defines the following two global variables:

__u64 pkt_count			= 0;
const __u32 ro_variable	= 1;
char var_msg[]			= "Hello World!";

The steps performed during the Go code generation are as follows:

  1. An alias with the variable name is generated in the Go file:
type PktCountType 	= uint64
type RoVariable 	= uint32
type VarMsgType 	= [16]int8
  1. A new type definition (non-alias, need new methods) for the variables is performed:
type PktCount	ebpf.Variable
type RoVariable	ebpf.Variable
type VarMsg		ebpf.Variable
  1. The .Get methods for all types are generated:
func (v *PktCount) Get() (PktCountType, error) {
	var ret PktCountType
	return ret, (*ebpf.Variable)(v).Get(&ret)
}

func (v *RoVariable) Get() (RoVariableType, error) {
	var ret RoVariableType
	return ret, (*ebpf.Variable)(v).Get(&ret)
}

func (v *VarMsg) Get() (VarMsgType, error) {
	var ret VarMsgType
	return ret, (*ebpf.Variable)(v).Get(&ret)
}
  1. The .Set methods for the non read-only variables are generated:
func (v *PktCount) Set(val PktCountType) error {
	return (*ebpf.Variable)(v).Set(val)
}

func (v *VarMsg) set(val VarMsgType) error {
	return (*ebpf.Variable)(v).Set(ret)
}
  1. In case the type alias is supported by the atomic package, and then
    also supported by ebpf.Variable (int32, uint32, int64, uint64), then
    an additional AtomicRef method is created for the non read-only
    variables to get an reference to the underlying data.
func (v *PktCount) AtomicRef() (*ebpf.Uint64, error) {
	ret, _ := (*ebpf.Variable)(v).AtomicUint64()
	return ret
}

From the user perspective, ignoring the error catching, the following
operations can be performed:

...
objs := bpfObjects{}
err := loadBpfObjects(&objs, nil)
...

// ref kept and can be used multiple times
aPktCount := objs.PktCount.AtomicRef()
vPktCount := aPktCount.Load()
aPktCount.Store(rand.Uint64())
...

vRoVariable, _ := objs.RoVariable.Get()
...

varMsg, _ := objs.VarMsg.Get()
varMsg[0] = 'B'
err := objs.VarMsg.Set(varMsg)

Suggestions/Comments are very welcome.

@smagnani96 smagnani96 force-pushed the pr/relocated_vars_userspace branch from bbe1aba to 799fef7 Compare August 8, 2024 15:57
@github-actions github-actions bot added the breaking-change Changes exported API label Aug 8, 2024
@smagnani96 smagnani96 force-pushed the pr/relocated_vars_userspace branch from 799fef7 to d978138 Compare August 8, 2024 18:05
@github-actions github-actions bot removed the breaking-change Changes exported API label Aug 8, 2024
@smagnani96 smagnani96 force-pushed the pr/relocated_vars_userspace branch 2 times, most recently from fa76e79 to 71e3029 Compare August 14, 2024 10:18
@smagnani96 smagnani96 force-pushed the pr/relocated_vars_userspace branch from 71e3029 to 5137cbe Compare September 4, 2024 11:06
@smagnani96 smagnani96 force-pushed the pr/relocated_vars_userspace branch 3 times, most recently from 2cf61df to 9cf6f9f Compare September 5, 2024 12:51
@smagnani96 smagnani96 changed the title support for interacting with global/static vars add bpf2go support and wrappers for ebpf.Variable generation Sep 5, 2024
@smagnani96
Copy link
Contributor Author

smagnani96 commented Sep 5, 2024

This PR has been rebased on https://github.com/ti-mo/ebpf/tree/tb/ebpf-variables.
After discussing, @ti-mo proceeds with the Map.Memory() api and the definition of ebpf.Variable, while me (this PR) should provide the needed support for bpf2go.

  1. 59808a1 is required but should be introduced in ti-mo updates (open for discussion).
  2. f6a94c5 is the commit under review.

@smagnani96 smagnani96 force-pushed the pr/relocated_vars_userspace branch from 9cf6f9f to 0f69c00 Compare September 5, 2024 13:05
@smagnani96 smagnani96 force-pushed the pr/relocated_vars_userspace branch from 0f69c00 to cf717a0 Compare September 6, 2024 20:54
@smagnani96 smagnani96 force-pushed the pr/relocated_vars_userspace branch 2 times, most recently from 8486b44 to f6a94c5 Compare September 6, 2024 21:31
This commit introduces support and wrappers around ebpf.Variable.
An ebpf.Variable does not define per-se wrappers method to automatically
infer the underlying data type and handle typed data in Go. In fact, it
relies on parameters passed to the `Variable.Get` and `Variable.Set` APIs.
This commit allows `bpf2go` to emit `VariableSpec` and `Variable` in the
generated code. In addition, a new set of wrapper methods are created
to provide support for typed method around variables. To do so, this
commit enables emitting `Dataspec`, which so far was not emitted. However,
it modifies the current implementation so that for each variable in `Dataspec`
the needed support is provided.
Supposing to have an ebpf program that defines the following two global variables:

```c
__u64 pkt_count			= 0;
const __u32 ro_variable	= 1;
char var_msg[]			= "Hello World!";
```

The steps performed during the Go code generation are as follows:

1. An alias with the variable name is generated in the Go file:

```go
type PktCountType 	= uint64
type RoVariable 	= uint32
type VarMsgType 	= [16]int8
```

2. A new type definition (non-alias, need new methods) for the variables is performed:

```go
type PktCount	ebpf.Variable
type RoVariable	ebpf.Variable
type VarMsg		ebpf.Variable
```

3. The `.Get` methods for all types are generated:

```go
func (v *PktCount) Get() (PktCountType, error) {
	var ret PktCountType
	return ret, (*ebpf.Variable)(v).Get(&ret)
}

func (v *RoVariable) Get() (RoVariableType, error) {
	var ret RoVariableType
	return ret, (*ebpf.Variable)(v).Get(&ret)
}

func (v *VarMsg) Get() (VarMsgType, error) {
	var ret VarMsgType
	return ret, (*ebpf.Variable)(v).Get(&ret)
}
```

4. The `.Set` methods for the non read-only variables are generated:

```go
func (v *PktCount) Set(val PktCountType) error {
	return (*ebpf.Variable)(v).Set(val)
}

func (v *VarMsg) set(val VarMsgType) error {
	return (*ebpf.Variable)(v).Set(ret)
}
```

4. In case the type alias is supported by the atomic package, and then
    also supported by `ebpf.Variable` (int32, uint32, int64, uint64), then
    an additional `AtomicRef` method is created for the non read-only
	variables to get an reference to the underlying data.

```go
func (v *PktCount) AtomicRef() (*ebpf.Uint64, error) {
	ret, _ := (*ebpf.Variable)(v).AtomicUint64()
	return ret
}
```

From the user perspective, ignoring the error catching, the following
operations can be performed:

```go
...
objs := bpfObjects{}
err := loadBpfObjects(&objs, nil)
...

// ref kept and can be used multiple times
aPktCount := objs.PktCount.AtomicRef()
err := aPktCount.Load()
aPktCount.Store(rand.Uint64())
...

vRoVariable, _ := objs.RoVariable.Get()
...

varMsg, _ := objs.VarMsg.Get()
varMsg[0] = 'B'
err := objs.VarMsg.Set(varMsg)
```

Signed-off-by: Simone Magnani <[email protected]>
@smagnani96 smagnani96 force-pushed the pr/relocated_vars_userspace branch from f6a94c5 to fb547c3 Compare September 6, 2024 21:53
@mejedi
Copy link
Contributor

mejedi commented Sep 26, 2024

It probably makes sense to land basic bpf2go variable support first:

  • VariableSpecs included in generated xxxSpecs; do we need to segregate them similar to today's xxxProgramSpecs/xxxMapSpecs?
  • Variables included in generated xxxObjects.

I see huge value in a type-safe get/set API you are proposing. It would be nice to have not only Variables covered, but probably VariableSpecs, and Maps, and MapSpecs as well.

Is it possible to devise an approach that works with Variables initially, but can be extended for other object kinds later? Can we generate less code by using generics?

@smagnani96
Copy link
Contributor Author

It probably makes sense to land basic bpf2go variable support first:

  • VariableSpecs included in generated xxxSpecs; do we need to segregate them similar to today's xxxProgramSpecs/xxxMapSpecs?
  • Variables included in generated xxxObjects.

I see huge value in a type-safe get/set API you are proposing. It would be nice to have not only Variables covered, but probably VariableSpecs, and Maps, and MapSpecs as well.

Is it possible to devise an approach that works with Variables initially, but can be extended for other object kinds later? Can we generate less code by using generics?

Do you suggest to change this PR to provide basic VariableSpecs support in bpf2go or should I open a new one?

Concerning the type-safe API, I can keep exploring solutions. I tried with generics with no luck, but maybe there's still a way...AFAIU CollectionSpec would also need to be a generic holding maps and variables each of different generic types.

@mejedi
Copy link
Contributor

mejedi commented Sep 30, 2024

@smagnani96

Do you suggest to change this PR to provide basic VariableSpecs support in bpf2go or should I open a new one?

It is probably worthwhile to keep this one around as a RFC and to showcase the end goal. One way to handle it is to structure commits such that basic support is introduced first, followed by type-safe APIs. Once @ti-mo changes are merged, I'd suggest doing a separate PR to introduce basic variable support in bpf2go. Keep rebasing this one as needed.

Concerning the type-safe API, I can keep exploring solutions. I tried with generics with no luck, but maybe there's still a way...

A clever use of reflect.Value.ConvertibleTo by the way:)

It looks like bpf2go won't have to generate Set and Get methods if we could rely on generic TypedVariable in ebpf package, e.g.:

// in ebpf package
type TypedVariable[T any] Variable
func (v *TypedVariable[T]) Set(val T) error { return (*Variable)(v).Set(val) }
...

AFAIU CollectionSpec would also need to be a generic holding maps and variables each of different generic types.

What I had in mind was xxxSpecs bpf2go generates. It is similar to xxxObjects, but for the specs. VariableSpecs should be included in generated xxxSpecs. They are useful to initialise constants, and it would be nice to do it in a type-safe manner as well.

@smagnani96
Copy link
Contributor Author

It is probably worthwhile to keep this one around as a RFC and to showcase the end goal. One way to handle it is to structure commits such that basic support is introduced first, followed by type-safe APIs. Once @ti-mo changes are merged, I'd suggest doing a separate PR to introduce basic variable support in bpf2go. Keep rebasing this one as needed.

Perfect, I'll keep this one open just as reference and will track #1572. Once merged, I'll push basic support for VariableSpecs and Variables.

It looks like bpf2go won't have to generate Set and Get methods if we could rely on generic TypedVariable in ebpf package, e.g.:

// in ebpf package
type TypedVariable[T any] Variable
func (v *TypedVariable[T]) Set(val T) error { return (*Variable)(v).Set(val) }
...

A wrapper would certainly be a good choice, so we can statically define and test methods rather than generating them at runtime. An approach could be as follows, and then in bpf2go we just output the type accordingly.

type ROVariable[T any] struct {
	Variable
}

type RWStruct[T any] struct {
	ROVariable[T]
}

type AtomicVariable[T int32 | int64 | uint64 | uint32] struct {
	RWVariable[T]
}

func (a *ROVariable[T]) Get() T {
	...
}

func (a *RWVariable[T]) Set(val T) {
	...
}

func (a *AtomicVariable[T]) AtomicReference() atomic.T { //somehow
	...
}

Would you consider such a wrapper also for VariableSpecs (and in case it can be applied also to MapSpecs) as well?

@ti-mo
Copy link
Collaborator

ti-mo commented Oct 3, 2024

@smagnani96 I've pushed up a few changes to #1572 today. Note that all 'atomic' methods are gone. It wasn't a great design to begin with and @mejedi suggested mmapping over the Go heap so we can hand out arbitrary pointers instead. The pointer stuff is not part of the initial proposal, this work is currently in a separate branch here: https://github.com/ti-mo/ebpf/commits/tb/atomic-memory.

tl;dr:

// VariablePointer returns a pointer to a value of type T that has its
// underlying memory mapped to the memory representing the Variable. Accessing a
// constant Variable is not supported.
//
// T must be a fixed-size type according to [binary.Size]. Types containing Go
// pointers are not valid.
//
// When accessing structs, embedding [structs.HostLayout] may help ensure the
// layout of the Go struct matches the one in the BPF C program.
func VariablePointer[T any](v *Variable) (*T, error)

This works for atomics, structs, scalars, etc. Currently no types containing pointers are accepted.

Note that currently, the plan is to not populate Collection.Variables in kernels <5.5, which would be a problem if the 'all objects' struct generated by bpf2go includes Variables by default. We can decide to keep these in a separate object, but that would mean the caller needs to load the whole Collection and then Assign() the variables in a separate step, which is not ideal. Curious to hear your opinions on this.

The alternative is to emit Variables, but disable Set, Get and VariablePointer (make them return errors), which is also far from ideal. If we go down this route, we'd need to introduce a compatibility layer that implements Get() using a map lookup, but then we need to carry a *Map in Variable. This makes the implementation quite a bit more complex.

@mejedi
Copy link
Contributor

mejedi commented Oct 4, 2024

@ti-mo

Note that currently, the plan is to not populate Collection.Variables in kernels <5.5, which would be a problem if the 'all objects' struct generated by bpf2go includes Variables by default. We can decide to keep these in a separate object, but that would mean the caller needs to load the whole Collection and then Assign() the variables in a separate step, which is not ideal. Curious to hear your opinions on this.

Globals landed in 5.2. Call me biased, but my preference would be to inconvenience (a presumably small number of) users requiring at least 5.2 that can't bump the minimum version to 5.5. As of today, bpf2go emits xxxObjects made of xxxPograms and xxxMaps. Should xxxVariables join the party, they can declare and use their own xxxMyObjects embedding just xxxPrograms and xxxMaps.

@ti-mo
Copy link
Collaborator

ti-mo commented Oct 4, 2024

@mejedi That sounds reasonable, thanks for sharing your perspective. The downside is this is kind of a breaking change; rebuilding the same code base with a newer version of b2g and ebpf-go will no longer work as-is on some systems.

@mejedi
Copy link
Contributor

mejedi commented Oct 4, 2024

@smagnani96
Not sure about inheritance, will make the job of CollectionSpec.LoadAndAssign harder. With the latest changes from @ti-mo every rw Variable is capable of giving out the pointer, so there are fewer variations left:

  • ROVariableGet
  • VariableGet, Set, Addr

(Not sure how we should name these generics, TypedVariable or VariableT or something else?)

The overlap is just the Get method, we can afford to implement it twice for 2 distinct types.

type ROVariableT[T any] Variable
type VariableT[T any] Variable

In return, CollectionSpec.LoadAndAssign can use reflect.Value.ConvertibleTo and avoid further complication.

Note: .Addr is how reflect names a method to get a pointer to the Value. If you ask me, .AtomicReference is way too long.

Couple further questions to consider:

  1. Should Get return T or (T, error)? Similarly for Set.

    It will be more convenient if user didn't have to handle errors. Errors happen because T fails to marshal but in bpf2go we are generating these types and we know that they will marshal cleanly.

  2. A subset of types has a matching type in sync/atomic, such as atomic.Uin64, etc. Should .Addr return a pointer to type from sync/atomic if available?

var objects struct { foo *VariableT[uint64, atomic.Uint64] `ebpf:"foo"` }
objects.foo.Addr().Load()

v.s.

var objects struct { foo *VariableT[uint64] `ebpf:"foo"` }
atomic.LoadUint64(objects.foo.Addr())

First option guarantees that value is accessed atomically. It also involves less typing.

Second option is more consistent. Get, Set and Addr types are consistent. Memory is accessed the same whether it is a standalone value or a compound type:

type compound { bar uint64 }
var objects struct { foo *VariableT[compound] `ebpf:"foo"` }
atomic.LoadUint64(objects.foo.Addr().bar)

Would you consider such a wrapper also for VariableSpecs (and in case it can be applied also to MapSpecs) as well?

Should be easier since we have just one flavour of VariableSpec.

@ti-mo
Copy link
Collaborator

ti-mo commented Oct 24, 2024

I've updated #1572 to emit a Variable to the Collection for every VariableSpec in the CollectionSpec. Get(), Set() and VariablePointer() will return ErrNotSupported on kernels without BPF_F_MMAPABLE. This ensures LoadAndAssign will continue to work as usual after upgrades, but users opting in to interactions with Variable can only expect that to work on newer kernels. I think that's more user-friendly than failing LoadAndAssign.

@ti-mo
Copy link
Collaborator

ti-mo commented Nov 7, 2024

@smagnani96 FYI #1572 has been merged, and #1607 was opened to keep track of the off-heap Go pointers, though keep in mind this might take a while to be resolved upstream. Then it's at least another 6 months of Go releases until ebpf-go can start depending on the behaviour.

I suggest we shelve the atomics use case in bpf2go for now, or we special-case atomic.(U)int* in Variable.Set() and .Get(). Only loads and stores would be supported, but those may cover the majority of use cases. If really needed, we can consider implementing Variable.Add(), but that gets weird when used on structs. Let's start with the basics, we can expand as we go.

@smagnani96
Copy link
Contributor Author

smagnani96 commented Nov 13, 2024

Apologies for the absence.
Thanks both @ti-mo and @mejedi for landing the new API.
I like that the ErrNotSupported solution for older kernels, no changes to the usual workflow.

bpf2go basic support

I just opened #1610.
It introduces simple support to emit Variables and VariableSpecs, no type-safe methods (will continue the discussion here).
There seems to be a lot of changes, but they're mainly due to the third commit (run make), where I regenerated the targets for the further testing and update of the tcx example.
Let me know if anything else is missing, or if we prefer to split the PR.

Type-safe API -- New Types

type ROVariableT[T any] Variable
type VariableT[T any] Variable

Better than inheritance, but we would still need to re-implement all the other methods, such as Variable.String(), Variable.Size() etc.. This is because these are new types, therefore do no share the method set of Variable. No big deal, we could simply:

func (v *ROVariableT[T]) String() string { return (*Variable)(unsafe.Pointer(v).String() }

To all those methods who do not need particular changes. This would also allow us to remove methods such as Variable.ReadOnly() to types like ROVariableT[T], which is pretty self explanatory.

Otherwise, a way to avoid reimplementing things would be to move all these "shared" methods to a base struct that will be inherited by both Variable and the generics:

type baseVariable struct{size, mm, ...}
type Variable struct {
  baseVariable
}
type ROVariableT[T any] Variable
type VariableT[T any] Variable

At this point, ROVariableT does not have the Variable methods (set and get), but will inherit all the baseVariable ones. Same can be applied for xxxSpec and Map. The only drawback here is that all the current creations of these object (also in Cilium and all other projects using this library) would require a slight change to:

ms := &VariableSpec {
   baseVariableSpec:  baseVariableSpec{
       // all the previous VariableSpec fields moved now to baseVariableSpec
  }
} 
// or we define constructors to hide the baseVariableSpec
ms := NewVariableSpec(....)

That said, in Assign/LoadAndAssign we would just need to change:

func (cs *CollectionSpec) Assign(to interface{}) error {
	getValue := func(typ reflect.Type, name string) (interface{}, error) {
		switch typ {
                ......
		case reflect.TypeOf((*VariableSpec)(nil)):
                        return handleVariableSpec()
		default:
			if typ.ConvertibleTo(reflect.TypeOf((*VariableSpec)(nil))) {
				return handleVariableSpec()
			}
                        // Same for MapSpec
			return nil, fmt.Errorf("unsupported type %s", typ)
		}
	}
	return assignValues(to, getValue)
}

and:

func assignValues(to interface{},
        .....
	for _, field := range fields {
		......
		if !field.value.CanSet() {
			return fmt.Errorf("field %s: can't set value", field.Name)
		}
		
		t := reflect.ValueOf(value)
		if field.Type == t.Type() {
			field.value.Set(reflect.ValueOf(value))
			assigned[e] = field.Name
			continue
		}

		if t.Type().ConvertibleTo(field.Type) {
			field.value.Set(t.Convert(field.Type))
			assigned[e] = field.Name
			continue
		}
		return fmt.Errorf("Unable to assign/convert/blabla")

	}

	return nil
}

Couple further questions to consider:

  1. Should Get return T or (T, error)? Similarly for Set.

I would say, if it was something purely on bpf2go we could skip the error check and return just T to user. But since I think we don't want bpf2go random code but rather an API for the whole library (e.g. ROVariableT[T]), it makes sense to return the error. We can't trust a user to use `ROVariableT[int64] with an underlying int32 variable.

Type-safe API -- Inheritance/Embedding

Plain embedding would not require further changes when declaring untyped VariableSpec, MapSpec, Map, Variable, but it won't be possible to unset the inherited class methods:

type ROVariableT[T any] struct {
	Variable
}
func (v *ROVariableT[T]) Get() (T, error) {
	var ret T
	return ret, v.Variable.Get(&ret)
}

ROVariable[T].Set(...) would still exists, no good. In Collection, we could just check:

if _, ok := typ.Elem().FieldByName("Variable"); ok {
	assignedVars[name] = true
	return loader.loadVariable(name)
}

and in assignValues():

func assignValues(to interface{},
	..........
	for _, field := range fields {
		........
		if _, ok := field.Type.Elem().FieldByName(t.Type().Elem().Name()); !ok {
			x := reflect.New(field.Type.Elem())
			x.Elem().FieldByName(t.Type().Elem().Name()).Set(t.Elem())
			field.value.Set(x)
			assigned[e] = field.Name
		}
		return fmt.Errorf("unable to set/cast/assign")
	}
	return nil
}

MapMemory and Variable Atomic Ref

  1. A subset of types has a matching type in sync/atomic, such as atomic.Uin64, etc. Should .Addr return a pointer to type from sync/atomic if available?
var objects struct { foo *VariableT[uint64, atomic.Uint64] `ebpf:"foo"` }
objects.foo.Addr().Load()

v.s.

var objects struct { foo *VariableT[uint64] `ebpf:"foo"` }
atomic.LoadUint64(objects.foo.Addr())

I wouldn't go for the 1st option. In that case, I would just write our support for all the atomic methods (or a subset):

type VariableT[T] Variable

func(v *Variable[int32]) Load() T {}
func(v *Variable[int32]) Store() T {}
func(v *Variable[int32]) Swap() T {}
func(v *Variable[int32]) Add() T {}
func(v *Variable[int32]) CompareAndSwap() T {}

But this would require 5 (methods) * 4 (types, int32 int64 uint32 uint64) * 2 (ROVariableT and VariableT) = 40 additional methods.

However, I really like your 2nd option, With Timo's .Addr() method. We wouldn't have to write additional code. and is up to the user to use the correct atomic.XXX method -- but the type T would be available, so is user's fault in case of error.

@smagnani96 smagnani96 changed the title add bpf2go support and wrappers for ebpf.Variable generation [RFC] - wrappers for type-safe ebpf.Variable Dec 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants