[Bug]: Does NPU or GPU support accumulator data type INT32 in convolution? #28193
Comments
Hi @wenxuanxie, both NPU and GPU hardware generally support INT32 for accumulation in convolution operations. However, they often optimize performance using lower precision types (e.g., INT8) for the weights and activations, while ensuring that INT32 is used for accumulation to avoid overflow. You correctly calculated the expected output for the convolution of two tensors filled with 127 as `127 * 127 * 11111 = 179209319`. However, the actual behavior in OpenVINO and hardware configurations can vary slightly due to rounding, internal optimizations, or hardware limitations. Can you share the error when configuring with NPU?
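To make the overflow argument above concrete, here is a minimal NumPy sketch (illustration only, not OpenVINO plugin code): a single INT8 × INT8 product can reach 127 × 127 = 16129, which already exceeds the INT8 and INT16 ranges, and summing 11111 such products needs roughly 28 bits, so an exact result requires at least an INT32 accumulator.

```python
import numpy as np

# Two INT8 tensors of 11111 elements, all 127, mirroring the issue's test case.
a = np.full(11111, 127, dtype=np.int8)
b = np.full(11111, 127, dtype=np.int8)

# Widen to INT32 before the multiply-accumulate: this is the exact
# integer result the reporter expects from the convolution.
acc32 = int(np.sum(a.astype(np.int32) * b.astype(np.int32)))
print(acc32)  # 179209319

# A single product already exceeds the INT16 maximum (32767),
# so per-element products must be held in at least 32 bits.
print(127 * 127)  # 16129
```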
Hi @Aznie-Intel, yes, that's exactly the question: OpenVINO compiles the graph and floating-point values are introduced somewhere in between. I wonder if there is a way to achieve INT8 * INT8 => INT32 without any FP16 or FP32. The error for the configuration
The error you're seeing happens because the NPU plugin does not directly support INT32 tensors for convolution operations. Can you try modifying your code as below: `device, dtype = ('NPU', np.int8)`
Yes, I did it correctly. The error log above is for the configuration
Apologies for the oversight on my part. I will forward this issue to the relevant team for further review and will keep you updated with any progress.
OpenVINO Version: 2024.6.0
Operating System: Windows
Device used for inference: NPU
Framework: None
Model used: Custom (1 conv + 1 convert, see below)
Issue description
I want to perform integer-based convolution and have designed a simple test case for this purpose. The network consists of a single convolution layer, where both the input tensor and the weight tensor have the shape `[1, 11111, 1, 1]` and are filled with the value `127`. The output is then converted to `INT32`. The expected output of this operation is `127 * 127 * 11111 = 179209319`. However, I have tested three different configurations as shown below, and none of them produced the expected output. Note that `('NPU', np.int8)` appears to be an invalid configuration and raises errors.

Step-by-step reproduction
1. Install OpenVINO: `pip install openvino==2024.6.0`.
2. Save the test script as `test_case.py`.
3. Run `python test_case.py`.

The output is `65504`, `179199248`, or `179209312`, depending on the configuration.

Relevant log output
No response
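As an aside, two of the three observed outputs are consistent with floating-point intermediates entering the accumulation. The following NumPy check is speculative (it does not show what the plugins actually execute): 65504 is the largest finite FP16 value, so a saturating FP16 accumulator can never report more, and 179209312 is the nearest FP32 value to the exact result 179209319. The third value, 179199248, plausibly stems from some partial lower-precision accumulation, which this sketch does not reproduce.

```python
import numpy as np

exact = 127 * 127 * 11111
print(exact)  # 179209319

# 179209312: the exact INT32 result rounded to the nearest float32.
# Near 1.8e8 the float32 spacing is 16, so the low bits are lost.
print(int(np.float32(exact)))  # 179209312

# 65504: the largest finite float16 value; a saturating FP16
# accumulator clamps at this ceiling.
print(int(np.finfo(np.float16).max))  # 65504
```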