pulley: Implement 16x8 arithmetics #9904

Merged: 4 commits into bytecodealliance:main, Dec 28, 2024

eagr
Contributor

@eagr eagr commented Dec 27, 2024

Helps #9783

@eagr eagr requested review from a team as code owners December 27, 2024 04:31
@eagr eagr requested review from cfallin and dicej and removed request for a team December 27, 2024 04:31
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator pulley Issues related to the Pulley interpreter labels Dec 27, 2024

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "cranelift", "pulley"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: pulley

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Member

@alexcrichton alexcrichton left a comment

Thanks for this!

let b = self.state[operands.src2].get_u16x8();
for (a, b) in a.iter_mut().zip(&b) {
    // rounding average
    *a = (*a & *b) + ((*a ^ *b) >> 1) + ((*a ^ *b) & 1);
Member

Could this perhaps be written as *a = (u32::from(a) + u32::from(b) + 1) / 2? Basically the more classical version of an average which explicitly takes advantage of wider-precision arithmetic to avoid overflow issues.

Contributor Author

instantly convinced :))
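
For readers comparing the two formulations in this thread, here is a minimal standalone sketch, not the code that was merged. The function names `avg_round_bitwise`, `avg_round_widened`, and `avgr_u16x8` are hypothetical, and the widened version adds the explicit dereferences and the cast back to `u16` that the one-line suggestion leaves implicit. Both compute the rounding average (a + b + 1) / 2 per lane without overflowing.

```rust
/// Rounding average using bit tricks, staying entirely in u16
/// (the form shown in the reviewed diff).
/// Relies on the identity a + b = 2 * (a & b) + (a ^ b).
fn avg_round_bitwise(a: u16, b: u16) -> u16 {
    (a & b) + ((a ^ b) >> 1) + ((a ^ b) & 1)
}

/// Rounding average using wider arithmetic (the form suggested in review).
/// The sum of two u16 values plus 1 always fits in a u32, and the halved
/// result always fits back in a u16, so the final cast never truncates.
fn avg_round_widened(a: u16, b: u16) -> u16 {
    ((u32::from(a) + u32::from(b) + 1) / 2) as u16
}

/// Hypothetical lane-wise helper over eight u16 lanes, standing in for the
/// interpreter's vector register accessors, which are not reproduced here.
fn avgr_u16x8(mut a: [u16; 8], b: [u16; 8]) -> [u16; 8] {
    for (x, y) in a.iter_mut().zip(&b) {
        *x = avg_round_widened(*x, *y);
    }
    a
}
```

The two scalar functions agree for every pair of inputs. The widened form is arguably easier to verify by inspection, which is the point made in the review comment.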

@alexcrichton alexcrichton added this pull request to the merge queue Dec 28, 2024
Merged via the queue into bytecodealliance:main with commit 01a43ed Dec 28, 2024
37 checks passed