WebIDL Binder does not support the full range of values for unsigned integers #22134

Open
isaac-mason opened this issue Jun 23, 2024 · 6 comments

@isaac-mason

isaac-mason commented Jun 23, 2024

Is it possible to opt-in to representing an unsigned int using HEAPU32 with the webidl binder?

When using the WebIDL Binder to bind an attribute or function that uses unsigned int, the maximum unsigned int value 0xffffffff is returned as -1 in JavaScript.

It appears this is because unsigned integers are treated as signed integers.

This behaviour can be seen in the webidl binder tests:

return unsigned long -1

Looking at some past commits, it appears this behaviour is intentional for performance reasons: f1c42f4

> Looking now, it worked fine except for one HEAPU32 which should be HEAP32 (for performance; there is never a point to using the unsigned 32 bit heap unless you really really must).

I am using the WebIDL Binder to create bindings for a library which uses 0xffffffff as a constant with a special meaning. Right now, to work around this issue, I check for the value -1 in JavaScript. This is a fine workaround for my case, but it feels like a hack, and it would stop working if I needed to check for other values in the upper unsupported range.
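
For illustration, here is a minimal sketch of the workaround described above, generalized so that it is not tied to -1. It assumes the bound getter hands back the signed 32-bit view of the value; `INVALID_ID`, `asSigned32`, and `returned` are hypothetical names, not part of the WebIDL Binder or the library being bound.

```javascript
// Hypothetical sentinel used by the bound C++ library (name is an assumption).
const INVALID_ID = 0xffffffff;

// Reinterpret an unsigned 32-bit constant as the signed 32-bit number
// that a signed read of the same bits would produce.
function asSigned32(u) {
  return u | 0;
}

// Stand-in for a value coming back from a bound `unsigned int` getter.
const returned = -1;

console.log(returned === asSigned32(INVALID_ID)); // true
console.log(asSigned32(0x80000001)); // -2147483647
```

Converting the constant rather than hard-coding -1 keeps the comparison readable, and the same helper covers any other constant at or above 0x80000000.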


Version of emscripten/emsdk:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.61 (67fa4c16496b157a7fc3377afd69ee0445e8a6e3)
clang version 19.0.0git (https:/github.com/llvm/llvm-project 7cfffe74eeb68fbb3fb9706ac7071f8caeeb6520)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /Users/isaacmason/Development/emsdk/upstream/bin

@sbc100
Collaborator

sbc100 commented Jun 23, 2024

I'm not aware of any performance issues associated with HEAPU32. @kripken perhaps you can elaborate?

@kripken
Member

kripken commented Jun 25, 2024

Historically some JS engines optimize JS Numbers when they are "small integers", which are typically 32-bit signed integers. Using HEAP32 ensures we fall into that range, and the unsigned heap might cause the optimization to fail and return to modeling that variable as a Number (double). I am actually not sure how significant this is these days - perhaps optimizations have improved?

In any case, @isaac-mason , in general when you care about the difference between signed and unsigned values then you need to cast on the boundary. Using (X) >>> 0 will turn a value into an unsigned 32-bit integer.
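
As a rough sketch of the cast on the boundary mentioned above, the snippet below wraps the `>>> 0` idiom in a small helper. `toUnsigned32` is a hypothetical name chosen for this example, and the inputs stand in for values returned by bound functions.

```javascript
// Reinterpret a signed 32-bit result as an unsigned 32-bit integer.
// The unsigned right shift by zero leaves the bits untouched but makes
// JavaScript read them as an unsigned value.
function toUnsigned32(x) {
  return x >>> 0;
}

console.log(toUnsigned32(-1)); // 4294967295
console.log(toUnsigned32(-1) === 0xffffffff); // true
console.log(toUnsigned32(42)); // 42
```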

@sbc100
Collaborator

sbc100 commented Jun 25, 2024

> Historically some JS engines optimize JS Numbers when they are "small integers", which are typically 32-bit signed integers. Using HEAP32 ensures we fall into that range, and the unsigned heap might cause the optimization to fail and return to modeling that variable as a Number (double). I am actually not sure how significant this is these days - perhaps optimizations have improved?
>
> In any case, @isaac-mason , in general when you care about the difference between signed and unsigned values then you need to cast on the boundary. Using (X) >>> 0 will turn a value into an unsigned 32-bit integer.

Will reading values larger than 2^31 from HEAPU32 still require >>> 0 in order to get an unsigned value?

@sbc100
Collaborator

sbc100 commented Jun 25, 2024

> > Historically some JS engines optimize JS Numbers when they are "small integers", which are typically 32-bit signed integers. Using HEAP32 ensures we fall into that range, and the unsigned heap might cause the optimization to fail and return to modeling that variable as a Number (double). I am actually not sure how significant this is these days - perhaps optimizations have improved?
> >
> > In any case, @isaac-mason , in general when you care about the difference between signed and unsigned values then you need to cast on the boundary. Using (X) >>> 0 will turn a value into an unsigned 32-bit integer.
>
> Will reading values larger than 2^31 from HEAPU32 still require >>> 0 in order to get an unsigned value?

I confirmed you don't need the >>> 0 if you read from HEAPU32
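
A small sketch of that check, using plain typed arrays rather than a real Emscripten module. The `HEAP32` and `HEAPU32` names here are local stand-ins that view one shared `ArrayBuffer`, which is an assumption made to keep the example self-contained.

```javascript
// Two views over the same four bytes, mimicking signed and unsigned heap views.
const buffer = new ArrayBuffer(4);
const HEAP32 = new Int32Array(buffer);   // signed 32-bit view
const HEAPU32 = new Uint32Array(buffer); // unsigned 32-bit view

// Store the maximum unsigned 32-bit value.
HEAPU32[0] = 0xffffffff;

console.log(HEAP32[0]);  // -1
console.log(HEAPU32[0]); // 4294967295

// The signed read needs >>> 0 to match; the unsigned read does not.
console.log((HEAP32[0] >>> 0) === HEAPU32[0]); // true
```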

@isaac-mason
Author

isaac-mason commented Jun 26, 2024

Thanks @kripken @sbc100 for the information!

I misunderstood the -1 return as some behaviour where the upper range was being lost. Thanks for clearing this up for me 🙂

I'm happy to contribute a change to the webidl binder docs to explain this behaviour.


> I am actually not sure how significant this is these days - perhaps optimizations have improved?

Are there suitable existing benchmarks in the emscripten repo that we could try answering this with?

@kripken
Member

kripken commented Jun 26, 2024

Hmm, it's hard to measure this as it depends on a bunch of heuristics JS engines have. I don't think we have any good benchmarks for it. In general "use 31-bit integers" was a JS best practice for performance back in the day, but even then it was not entirely reliable.

Doc improvements would be welcome!
