[Performance]: Why are the inference results in Python different from those in C++? #28188

hujhcv · 2024-12-24T08:36:43Z

OpenVINO Version

No response

Operating System

Windows System

Device used for inference

CPU

OpenVINO installation

PyPi

Programming Language

C++

Hardware Architecture

x86 (64 bits)

Model used

mobilenet v2

Model quantization

No

Target Platform

No response

Performance issue description

In Python, I ran inference on one image with MobileNet v2. PyTorch and OpenVINO gave the same result: Samoyed: 83.0%.


import torch
from PIL import Image
from torchvision import transforms
from torchvision.models import mobilenet_v2, MobileNet_V2_Weights

img = Image.open("E:/dog.jpg")

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])


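# pretrained=True is deprecated; it loads the IMAGENET1K_V1 weights, requested explicitly here
weights = MobileNet_V2_Weights.IMAGENET1K_V1
model = mobilenet_v2(weights=weights)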
model.eval()
batch = preprocess(img).unsqueeze(0)


prediction = model(batch).squeeze(0).softmax(0)
class_id = prediction.argmax().item()
score = prediction[class_id].item()
category_name = weights.meta["categories"][class_id]
print(f"{category_name}: {100 * score:.1f}% (with PyTorch)")

# OpenVINO model preparation and inference with the same post-processing

import openvino as ov
compiled_model = ov.compile_model(ov.convert_model(model, example_input=batch))

prediction = torch.tensor(compiled_model(batch)[0]).squeeze(0).softmax(0)
class_id = prediction.argmax().item()
score = prediction[class_id].item()
category_name = weights.meta["categories"][class_id]
print(f"{category_name}: {100 * score:.1f}% (with OpenVINO)")

I exported the PyTorch MobileNet model to an OpenVINO IR file using the following Python code.


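import torch
import torchvision
import openvino as ov

# Same weights that the deprecated pretrained=True loads, requested explicitly
weights = torchvision.models.MobileNet_V2_Weights.IMAGENET1K_V1
model = torchvision.models.mobilenet_v2(weights=weights)
model.eval()
ov_model = ov.convert_model(model, example_input=torch.rand(1, 3, 224, 224))
output_filename = "mobilenet_v2.xml"
# Note: save_model compresses weights to FP16 by default (compress_to_fp16=True)
ov.save_model(ov_model, output_filename)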

Then, using the following C++ code for inference, the result was: Samoyed: 68.7778%.


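#include <cmath>
#include <iostream>
#include <memory>
#include <vector>

#include <opencv2/opencv.hpp>
#include <openvino/openvino.hpp>

int main(int argc, char* argv[])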
{
	try
	{

		ov::Core core; // OpenVINO core object
		std::shared_ptr<ov::Model> model = core.read_model("E:/mobilenet_v2.xml");

		// If the model has dynamic shapes, reshape it to the specified input shape
		if (model->is_dynamic())
		{
			model->reshape({ 1, 3, static_cast<long int>(224), static_cast<long int>(224) });
		}

		ov::preprocess::PrePostProcessor ppp = ov::preprocess::PrePostProcessor(model);
		ppp.input().tensor()
			.set_element_type(ov::element::u8)
			.set_layout("NHWC")
			.set_color_format(ov::preprocess::ColorFormat::BGR);
		ppp.input().preprocess()
			.convert_element_type(ov::element::f32)
			.convert_color(ov::preprocess::ColorFormat::RGB)
			.scale({ 255.0f, 255.0f, 255.0f })
			.mean({ 0.485f, 0.456f, 0.406f })
			.scale({ 0.229f, 0.224f, 0.225f });

		ppp.input().model().set_layout("NCHW");
		ppp.output().tensor().set_element_type(ov::element::f32);

		model = ppp.build(); // Build the preprocessed model

		// Compile the model for inference
		ov::CompiledModel compiled_model = core.compile_model(model, "CPU");
		ov::InferRequest inference_request = compiled_model.create_infer_request(); // Create inference request

		// Get input shape from the model
		const std::vector<ov::Output<ov::Node>> inputs = model->inputs();
		const ov::Shape input_shape = inputs[0].get_shape();
		cv::Size model_input_shape = cv::Size(input_shape[2], input_shape[1]);

		// Get output shape from the model
		const std::vector<ov::Output<ov::Node>> outputs = model->outputs();
		const ov::Shape output_shape = outputs[0].get_shape();
		int classesNum = output_shape[1];


		cv::Mat img = cv::imread("E:/dog.jpg");

		cv::Mat resizedImage;
		cv::resize(img, resizedImage, cv::Size(256, 256));
		int centerX = resizedImage.cols / 2;
		int centerY = resizedImage.rows / 2;
		int cropSize = 224;
		int startX = centerX - cropSize / 2;
		int startY = centerY - cropSize / 2;
		cv::Mat croppedImage;
		resizedImage(cv::Rect(startX, startY, cropSize, cropSize)).copyTo(croppedImage);

		uint8_t* input_data = croppedImage.data; // Pointer to the cropped u8 BGR pixels (HWC order, matching the tensor layout set above)
		const ov::Tensor input_tensor = ov::Tensor(compiled_model.input().get_element_type(), compiled_model.input().get_shape(), input_data); // Create input tensor
		inference_request.set_input_tensor(input_tensor); // Set input tensor for inference

		inference_request.infer();

		const ov::Tensor output_tensor = inference_request.get_output_tensor();
		const float* tensorData = output_tensor.data<const float>();

		// Softmax over the logits; subtract the largest logit for numerical stability
		std::vector<double> softmaxData(classesNum);
		{
			float max_val = tensorData[0];
			for (int i = 1; i < classesNum; i++)
			{
				if (max_val < tensorData[i])
				{
					max_val = tensorData[i];
				}
			}

			double sum_exp = 0.0;
			for (int i = 0; i < classesNum; ++i)
			{
				softmaxData[i] = std::exp(tensorData[i] - max_val);
				sum_exp += softmaxData[i];
			}

			for (int i = 0; i < classesNum; ++i)
			{
				softmaxData[i] /= sum_exp;
			}
		}


		double maxScore = 0;
		int maxScoreIndex = 0;
		for (int i = 0; i < classesNum; i++)
		{
			if (maxScore < softmaxData[i])
			{
				maxScore = softmaxData[i];
				maxScoreIndex = i;
			}
		}

		std::cout << "Max: " << maxScore*100 << "%  Index: " << maxScoreIndex << std::endl;


		return 0;

	}
	catch (const std::exception& ex)
	{
		std::cerr << ex.what() << std::endl;
		return 1;
	}
}

The difference in results is quite large. Is there a problem with my C++ code?
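
A quick check on the post-processing, offered as a sketch rather than a diagnosis: softmax is unchanged by subtracting the same constant from every logit, so the shift used for numerical stability in the C++ loop should not by itself move the reported probability. The logits below are made up for illustration and are not model output.

```python
import math

def softmax(logits, shift):
    """Softmax computed after subtracting `shift` from every logit."""
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, -1.0, 0.5, 3.5]  # hypothetical values, not taken from the model

p_shift_zero = softmax(logits, 0.0)
p_shift_max = softmax(logits, max(logits))

# Largest element-wise difference between the two results
max_diff = max(abs(a - b) for a, b in zip(p_shift_zero, p_shift_max))
print(max_diff)
```

If the two variants agree here, the remaining candidates are the input tensor and the model file, not the softmax loop.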

Step-by-step reproduction

No response

Issue submission checklist

I'm reporting a performance issue. It's not a question.
I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
There is reproducer code and related data files such as images, videos, models, etc.

adminaccount001 · 2024-12-26T06:03:25Z

you should use PIL resize instead of OpenCV
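
Beyond the interpolation kernel, the two resize calls may also differ in geometry. torchvision documents that `transforms.Resize(256)` with an int matches the shorter edge to 256 and keeps the aspect ratio, while `cv::resize(img, resizedImage, cv::Size(256, 256))` forces both sides to 256. The sketch below compares the resulting sizes and crop windows. The 640x480 source size is hypothetical, and the truncation and rounding rules are assumptions about torchvision's behaviour.

```python
def resize_shorter_side(width, height, size):
    """Return (width, height) after scaling the shorter side to `size`
    while keeping the aspect ratio (assumed Resize(int) behaviour)."""
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size

def center_crop_box(width, height, crop):
    """Return (left, top, right, bottom) of a centered crop x crop window."""
    left = int(round((width - crop) / 2.0))
    top = int(round((height - crop) / 2.0))
    return left, top, left + crop, top + crop

src_w, src_h = 640, 480  # hypothetical image size, not the size of dog.jpg

# Aspect-preserving path (Python pipeline)
aspect_size = resize_shorter_side(src_w, src_h, 256)
aspect_box = center_crop_box(*aspect_size, 224)

# Fixed 256x256 path (C++ pipeline)
stretched_size = (256, 256)
stretched_box = center_crop_box(*stretched_size, 224)

print("aspect-preserving:", aspect_size, aspect_box)
print("stretched:        ", stretched_size, stretched_box)
```

For a square image the two paths produce the same geometry, so this only matters when the source is not square.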

hujhcv · 2024-12-27T01:07:35Z

> you should use PIL resize instead of OpenCV

I also suspect that the difference in the input data to the model might be caused by different scaling interpolation algorithms. However, I used a plain gray image (where every pixel has a value of 198) for inference. In this case, regardless of the scaling interpolation algorithm used, the final input data to the model should be the same. Unfortunately, the results from Python and C++ still show discrepancies.
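
For the constant-gray test, the value each channel should have after preprocessing can be worked out by hand, which gives a reference to compare against whatever each pipeline actually feeds the model. This is a minimal sketch that assumes both PIL and OpenCV decode the file to exactly 198 in every channel (a lossless format such as PNG makes that more likely than JPEG). The mean and std values are the ones used in the code above.

```python
MEAN = (0.485, 0.456, 0.406)  # per-channel mean, RGB order
STD = (0.229, 0.224, 0.225)   # per-channel std, RGB order

def normalize_pixel(value_u8):
    """Scale a u8 value to [0, 1], then apply per-channel mean/std normalization."""
    scaled = value_u8 / 255.0
    return tuple((scaled - m) / s for m, s in zip(MEAN, STD))

expected = normalize_pixel(198)
for name, value in zip("RGB", expected):
    print(f"{name}: {value:.4f}")
```

If the tensor dumped from either pipeline does not match these three values everywhere, the difference is in preprocessing. If both match, the difference is more likely in the model file or the runtime.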

hujhcv added the performance and support_request labels on Dec 24, 2024

ilya-lavrenov assigned mlukasze on Dec 24, 2024

ilya-lavrenov added the category: Python API label on Dec 24, 2024
