[Bug]: Model inference on Windows LNL NPU for openai/clip-vit-large-patch14 is not working #28171

Open
azhuvath opened this issue Dec 20, 2024 · 3 comments
Labels: bug (Something isn't working), category: NPU (OpenVINO NPU plugin)

Comments


azhuvath commented Dec 20, 2024

OpenVINO Version

2024.6

Operating System

Windows

Device used for inference

NPU

Framework

None

Model used

openai/clip-vit-large-patch14

Issue description

Model inference on Windows LNL NPU for openai/clip-vit-large-patch14 is not working. The error observed is as follows.

[ERROR] 05:26:28.301 [vpux-compiler] Got Diagnostic at loc(fused<{name = "__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution", type = "Convolution"}>["__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution"]) : Channels count of input tensor shape and filter shape must be the same: -9223372036854775808 != 3

loc(fused<{name = "__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution", type = "Convolution"}>["__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution"]): error: Channels count of input tensor shape and filter shape must be the same: -9223372036854775808 != 3
LLVM ERROR: Failed to infer result type(s).

Step-by-step reproduction

Create Environment

python -m venv npu_env
./npu_env/Scripts/activate
python -m pip install --upgrade pip
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install pillow scikit-learn requests transformers openvino
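
Before running the reproducer, it can help to confirm that OpenVINO actually sees the NPU (a quick check, assuming the NPU driver is installed; 'NPU' should appear in the printed list):

python -c "import openvino as ov; print(ov.Core().available_devices)"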

Code to execute (originally run with device set to 'CPU'; changed to 'NPU' below):

import requests
import numpy as np
import openvino as ov
from scipy.special import softmax
from PIL import Image
from pathlib import Path
from transformers import CLIPProcessor, CLIPModel

# Load the PyTorch CLIP model and its processor
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

# Download a sample image
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

classes = ["a photo of a cat", "a photo of a dog"]
inputs = processor(text=classes, images=image, return_tensors="pt", padding=True)

# Reference run with the PyTorch model
outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # image-text similarity score
probs = logits_per_image.softmax(dim=1)  # softmax over labels
predicted_idx = probs.argmax().item()
print(classes[predicted_idx])

# Convert to OpenVINO IR and save as FP32
ov_model_path = "clip-vit-large-patch14-fp32.xml"
fp32_model_path = Path(ov_model_path)
model.config.torchscript = True

ov_model = ov.convert_model(model, example_input=dict(inputs))
ov.save_model(ov_model, fp32_model_path, compress_to_fp16=False)

# Compile the IR for the NPU (this is where the failure occurs) and run inference
device = 'NPU'
core = ov.Core()
compiled_model = core.compile_model(ov_model_path, device)
inputs = dict(inputs)
outputs = compiled_model(inputs)[0]
probs = softmax(outputs, axis=1)
[predicted_idx] = np.argmax(probs, axis=1)
print(classes[predicted_idx])
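
A workaround worth trying (a minimal sketch, not a confirmed fix: the NPU plugin generally requires fully static shapes, so pinning the converted model to fixed input shapes before compiling may avoid the dynamic-dimension error; the input names and the [2, 7] token shape are assumptions based on the two short captions and single 224x224 image in the reproducer):

import openvino as ov

core = ov.Core()
model = core.read_model("clip-vit-large-patch14-fp32.xml")
# Pin every dynamic dimension to the sizes the reproducer actually uses.
# Input names are assumed; inspect model.inputs for the real names.
model.reshape({
    "input_ids": [2, 7],
    "attention_mask": [2, 7],
    "pixel_values": [1, 3, 224, 224],
})
compiled_model = core.compile_model(model, "NPU")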

Relevant log output

[ERROR] 05:26:28.301 [vpux-compiler] Got Diagnostic at loc(fused<{name = "__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution", type = "Convolution"}>["__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution"]) : Channels count of input tensor shape and filter shape must be the same: -9223372036854775808 != 3
loc(fused<{name = "__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution", type = "Convolution"}>["__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution"]): error: Channels count of input tensor shape and filter shape must be the same: -9223372036854775808 != 3
LLVM ERROR: Failed to infer result type(s).
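
For reference, -9223372036854775808 is INT64_MIN, which appears to be how the compiler prints a dynamic dimension here. Whether the saved IR really has dynamic input shapes can be checked with a short diagnostic (a sketch reusing the XML path from the reproducer):

import openvino as ov

core = ov.Core()
m = core.read_model("clip-vit-large-patch14-fp32.xml")
for inp in m.inputs:
    # Dynamic dimensions print as '?' or a range; a fully static shape has neither.
    print(inp.any_name, inp.get_partial_shape())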

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
@azhuvath added the bug (Something isn't working) and support_request labels on Dec 20, 2024
@ilya-lavrenov added the category: NPU (OpenVINO NPU plugin) label on Dec 20, 2024
mlyashko commented

There is a new version of the Linux NPU driver available; please use this driver: https://github.com/intel/linux-npu-driver/releases/tag/v1.10.1

azhuvath (Author) commented Dec 21, 2024

> There is a new version of the Linux NPU driver available; please use this driver: https://github.com/intel/linux-npu-driver/releases/tag/v1.10.1

This is a Windows system, and I have the latest drivers. I might have made a mistake while filing the bug by choosing Ubuntu; please read it as Windows, as mentioned in the description.

avitial (Contributor) commented Dec 23, 2024

@azhuvath we tested on an MTL system and see some issues with this model as well; we have captured this as a possible bug. We will share more details as we have them.

Exception from src/plugins/intel_npu/src/compiler_adapter/src/ze_graph_ext_wrappers.cpp:389:
L0 pfnCreate2 result: ZE_RESULT_ERROR_UNKNOWN, code 0x7ffffffe - an action is required to complete the desired operation . Check 'min_val == max_val' failed at src/core/src/partial_shape.cpp:129:
get_shape() must be called on a static shape

Ref. 159900
