Gemini 2.0 Flash spatial reasoning severely degraded with image size 1024 #383
Labels
component:other
Issues unrelated to examples/quickstarts
status:triaged
Issue/PR triaged to the corresponding sub-team
type:bug
Something isn't working
Description of the bug:
I was trying to reproduce the results I get from the spatial reasoning applet in AI Studio using the spatial reasoning notebook. However, for the same prompt and the same image, the applet produces far better results with 2.0 Flash.
Upon further investigation, it seems that the notebook resizes the image to 1024, while the applet resizes it to 640. Using images of size 1024 significantly degrades the quality of the generated boxes; for example, many of the bounding boxes are misplaced or detected incorrectly. Changing the notebook to cap the width at 640 fixes the issue.
Any intuition on why an image size of 1024 degrades the spatial reasoning?
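For anyone who wants to try the workaround, here is a minimal sketch of capping the width at 640 before sending the image. It is not the notebook's actual code: the helper names `fit_to_max_width` and `resize_for_model` are made up for illustration, and `img` is assumed to be a Pillow `Image` loaded elsewhere.

```python
def fit_to_max_width(width: int, height: int, max_width: int = 640) -> tuple[int, int]:
    """Return (width, height) scaled down so width <= max_width, keeping aspect ratio."""
    if width <= max_width:
        # Already small enough: never upscale.
        return width, height
    scale = max_width / width
    # Guard against a zero height for extremely wide images.
    return max_width, max(1, round(height * scale))


def resize_for_model(img, max_width: int = 640):
    """Resize a Pillow image (assumed type) so it is at most max_width pixels wide.

    Uses Pillow's default resampling filter; the filter the applet uses is unknown.
    """
    return img.resize(fit_to_max_width(img.width, img.height, max_width))


if __name__ == "__main__":
    # A 1024x768 input is scaled to 640 wide with the same aspect ratio.
    print(fit_to_max_width(1024, 768))
```

Swapping `max_width` between 640 and 1024 on the same prompt and image should make the difference easy to compare side by side.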
Actual vs expected behavior:
No response
Any other information you'd like to share?
No response