Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve decoding performance #92

Closed
vincentbernat opened this issue Aug 22, 2022 · 5 comments
Closed

Improve decoding performance #92

vincentbernat opened this issue Aug 22, 2022 · 5 comments

Comments

@vincentbernat
Copy link

Hey!

I am looking at improving performance on an app using heavily geoip lookup and I am wondering if there was something between using reflect and a custom deserializer (as introduced in #74). reflect can be slow and it should be possible to use something like DecodeFieldAsString(offset uintptr, fields ...string) string.

I didn't try to use the deserializer interface as it is not really documented. An alternative would be an example on how to use it. The associated tests are strictly technical, so it is hard to understand how difficult this is.

@oschwald
Copy link
Owner

For my use cases, decoding to a minimal struct that only has the fields I am interested in has been fast enough. Have you tried this already? It still involves reflection, but less of it and most of the time ends up being spent in doing the lookup in the binary tree.

As far as the deserializer interface goes, there is an example of how it is used in mmdbwriter. This sounds like it is likely a bit heavier than what you want here.

Similarly, there is this recent PR. I haven't had a chance to really look at it yet. From glancing at it, it seems basically equivalent to the deserializer; I am not completely sure what it adds and I am a bit reluctant to expand the public API that much.

As far as your particular suggestion goes, it sounds like you want something similar to the data lookup functions in libmaxminddb. I am not opposed to that necessarily, although I am reluctant to add a method per data type. Perhaps something could be done with generics, although it is hard to see how that would work without any reflection.

@vincentbernat
Copy link
Author

vincentbernat commented Aug 22, 2022 via email

@damz
Copy link

damz commented Aug 23, 2022

I am looking at improving performance on an app using heavily geoip lookup and I am wondering if there was something between using reflect and a custom deserializer (as introduced in #74). reflect can be slow and it should be possible to use something like DecodeFieldAsString(offset uintptr, fields ...string) string.

@vincentbernat My PR #91 gives you pretty much that.

We are using this to decode data stored in a maxmind database to protobuf-defined structs.

From a protobuf message like this:

message ASN {
    uint32 number = 1;
    string name = 2;
}

Our code generator generates:

type ASN struct {
	// [...]

	Number uint32 `protobuf:"varint,1,opt,name=number,proto3" json:"number,omitempty"`
	Name   string `protobuf:"bytes,2,opt,name=name,proto3" json:"name,omitempty"`
}

func (m *ASN) DecodeMaxMind(d *maxmind.Decoder) error {
	return d.DecodeMap(func(key string, valueDecoder *maxmind.Decoder) error {
		switch key {
		case "number":
			{
				var err error
				m.Number, err = valueDecoder.DecodeUInt32()
				if err != nil {
					return err
				}
			}
		case "name":
			{
				var err error
				m.Name, err = valueDecoder.DecodeString()
				if err != nil {
					return err
				}
			}
		}
		return nil
	})
}

The implementation supports all the underlying types in a reasonable way (i.e. scalars decode to their corresponding Go types, maps and structs you can iterate on), and is fully tested.

@vincentbernat
Copy link
Author

@damz Oh, understood! Yes, this interface seems simpler than the other one.

OTOH, after providing my own structures with far less fields, the performance using reflect is good enough for me as maxminddb is now a very minor contributor in the pprof profile.

@oschwald
Copy link
Owner

oschwald commented Jan 1, 2025

Given the above discussion, I am going to close this issue. v2 also includes a DecodePath that may be useful if you just need a single value.

@oschwald oschwald closed this as completed Jan 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants