[Bug]: Does NPU or GPU support accumulator data type INT32 in convolution? #28193

Open
wenxuanxie opened this issue Dec 24, 2024 · 5 comments
Labels: bug (Something isn't working), PSE, support_request

@wenxuanxie

OpenVINO Version: 2024.6.0
Operating System: Windows System
Device used for inference: NPU
Framework: None
Model used: Custom (1 conv + 1 convert, see below)

Issue description

I want to perform integer-based convolution and have designed a simple test case for this purpose. The network consists of a single convolution layer, where both the input tensor and the weight tensor have shape [1, 11111, 1, 1] and are filled with the value 127; the output is then converted to INT32. The expected output of this operation is 127 * 127 * 11111 = 179209319. However, I have tested three different configurations, shown below, and none of them produced the expected value.

Note that ('NPU', np.int8) appears to be an invalid configuration and raises an error (see the log in the comments below).

import openvino as ov
from openvino.runtime import opset15
import numpy as np

# configurations
device, dtype = ('NPU', np.int32)    # output: 65504
# device, dtype = ('GPU', np.int32)  # output: 179199248
# device, dtype = ('GPU', np.int8)   # output: 179209312

shape = (1, 11111, 1, 1)
np_array = np.full(shape, 127, dtype=dtype)

input = opset15.parameter(shape, dtype)
weight = opset15.constant(np_array)
conv = opset15.convolution(input, weight, (1, 1), (0, 0), (0, 0), (1, 1))
conv = opset15.convert(conv, np.int32)

model = ov.Model(conv, [input], "conv_model")
compiled_model = ov.Core().compile_model(model, device)

output_layer = compiled_model.output(0)
output_data = compiled_model(np_array)[output_layer]
print(output_data.item())

Step-by-step reproduction

  1. Install the OpenVINO package, e.g. pip install openvino==2024.6.0.
  2. Save the code snippet above as a local file, e.g. test_case.py.
  3. Run python test_case.py.
  4. A single integer is printed to the command line: 65504, 179199248, or 179209312, depending on the configuration.

Relevant log output

No response

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
@Aznie-Intel commented Dec 26, 2024

Hi @wenxuanxie. Both NPU and GPU hardware generally support INT32 accumulation in convolution operations. However, they often optimize performance by using lower-precision types (e.g., INT8) for the weights and activations, while ensuring that INT32 is used for accumulation to avoid overflow.
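A minimal sketch of that pattern, assuming the standard FakeQuantize route (FP32 tensors annotated as 8-bit quantized so the plugin can lower the convolution to INT8 compute with a wider accumulator). The symmetric [-127, 127] range and 255 levels below are one possible choice, and this is an illustration, not a verified NPU workaround:

import numpy as np
import openvino as ov
from openvino.runtime import opset15

shape = (1, 11111, 1, 1)

# Quantization range: symmetric INT8, i.e. 255 levels over [-127, 127].
lo = opset15.constant(np.array(-127.0, dtype=np.float32))
hi = opset15.constant(np.array(127.0, dtype=np.float32))

# FP32 input and weights, each annotated as quantized via FakeQuantize;
# the plugin may then execute the convolution in INT8 with a wider
# (e.g., INT32) accumulator.
data = opset15.parameter(shape, np.float32)
q_data = opset15.fake_quantize(data, lo, hi, lo, hi, 255)

weight = opset15.constant(np.full(shape, 127.0, dtype=np.float32))
q_weight = opset15.fake_quantize(weight, lo, hi, lo, hi, 255)

conv = opset15.convolution(q_data, q_weight, (1, 1), (0, 0), (0, 0), (1, 1))
model = ov.Model(conv, [data], "quantized_conv")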

You correctly calculated the expected output for the convolution of two tensors filled with 127 as 127 * 127 * 11111 = 179209319. However, the actual behavior can vary slightly across OpenVINO versions and hardware configurations due to rounding, internal optimizations, or hardware limitations. Can you share the error you get when configuring with NPU?

@wenxuanxie (Author)

Hi @Aznie-Intel, yes, that is exactly the question: OpenVINO compiles the graph, and floating-point values are introduced somewhere in between. I wonder if there is a way to achieve INT8 * INT8 => INT32 without any FP16 or FP32 involved.
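In fact, the observed numbers line up with floating-point behavior exactly; a quick NumPy check (my own back-of-the-envelope arithmetic, nothing OpenVINO-specific):

import numpy as np

expected = 127 * 127 * 11111            # 179209319

# 65504 is the largest finite float16 value, which matches the NPU
# output exactly, suggesting the accumulator saturates in FP16.
print(np.finfo(np.float16).max)         # 65504.0

# float32 cannot represent 179209319 exactly; it rounds to the nearest
# representable integer, matching the ('GPU', np.int8) output.
print(int(np.float32(expected)))        # 179209312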

The error for the configuration ('NPU', np.int8) is as follows.

[ERROR] 16:01:58.794 [vpux-compiler] Got Diagnostic at loc(fused<{name = "Convolution_4", type = "Convolution"}>["Convolution_4"]) : 'IE.Convolution' op operand #0 must be ranked tensor of 16-bit float or 32-bit float or QuantizedType or 32-bit signed integer values, but got 'tensor<1x11111x1x1xsi8>'
loc(fused<{name = "Convolution_4", type = "Convolution"}>["Convolution_4"]): error: 'IE.Convolution' op operand #0 must be ranked tensor of 16-bit float or 32-bit float or QuantizedType or 32-bit signed integer values, but got 'tensor<1x11111x1x1xsi8>'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\Users\wenxuanxie\AppData\Local\miniconda3\envs\debug\Lib\multiprocessing\spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\wenxuanxie\AppData\Local\miniconda3\envs\debug\Lib\multiprocessing\spawn.py", line 131, in _main
    prepare(preparation_data)
  File "c:\Users\wenxuanxie\AppData\Local\miniconda3\envs\debug\Lib\multiprocessing\spawn.py", line 246, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "c:\Users\wenxuanxie\AppData\Local\miniconda3\envs\debug\Lib\multiprocessing\spawn.py", line 297, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 287, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Work\Code\OpenVINO-Test\test_case.py", line 18, in <module>
    compiled_model = ov.Core().compile_model(model, device)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\wenxuanxie\AppData\Local\miniconda3\envs\debug\Lib\site-packages\openvino\runtime\ie_api.py", line 543, in compile_model
    super().compile_model(model, device_name, {} if config is None else config),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Exception from src\inference\src\cpp\core.cpp:107:
Exception from src\inference\src\dev\plugin.cpp:53:
Exception from src\plugins\intel_npu\src\plugin\src\plugin.cpp:717:
Exception from src\plugins\intel_npu\src\compiler_adapter\src\ze_graph_ext_wrappers.cpp:389:
L0 pfnCreate2 result: ZE_RESULT_ERROR_INVALID_ARGUMENT, code 0x78000004 - generic error code for invalid arguments . Failed to create a valid MLIR module for the IR model

@Aznie-Intel

The error you're seeing happens because the NPU plugin does not directly support INT32 tensors for convolution operations. Can you try modifying your code as below:

device, dtype = ('NPU', np.int8)

@wenxuanxie (Author)

Yes, I did it correctly. The error log above is for the configuration device, dtype = ('NPU', np.int8).

@Aznie-Intel

Apologies for the oversight on my part. I will forward this issue to the relevant team for further review and will keep you updated with any progress.
