How to get text coordinates (bbox) from phi-3 vision #123
Comments
@ChenRocks thoughts on the above feature?
To achieve this, you can use the ONNX Runtime with the Phi-3 vision model. Here’s a general approach:
Here is a simplified example of how you might set this up in Python:
import onnxruntime as ort
import numpy as np
from PIL import Image

# Load the model
session = ort.InferenceSession("path_to_phi3_model.onnx")

# Preprocess the image
image = Image.open("path_to_image.jpg")
input_data = np.array(image).astype(np.float32)

# Run the model
outputs = session.run(None, {"input": input_data})

# Extract bounding boxes from the output
bounding_boxes = outputs[0]  # Assuming the first output contains the bounding boxes
for box in bounding_boxes:
    x, y, width, height = box
    print(f"Bounding box: x={x}, y={y}, width={width}, height={height}")

Source Code Examples & ONNX Models:
Thank you for getting back to me. Have you tested this on your side? It's not working on my side.
Thanks @ladanisavan for your inquiry. Unfortunately, BBox support is currently not available in Phi-3.x-vision. We appreciate this feedback and will discuss this feature request for future versions. In the meantime, I personally recommend Florence-2.
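For anyone who wants to try the Florence-2 route, here is a rough sketch of what that might look like. It is not an official recipe and makes several assumptions: the `microsoft/Florence-2-base` checkpoint loaded through Hugging Face `transformers` with `trust_remote_code=True`, the `<OCR_WITH_REGION>` task prompt, and parsed output shaped as `quad_boxes` (eight numbers per text region) plus `labels`. The helper names `quad_to_xywh`, `text_boxes`, and `run_florence2_ocr` are made up for illustration.

```python
from typing import Dict, List, Tuple

TASK = "<OCR_WITH_REGION>"  # assumed Florence-2 task prompt for OCR with regions


def quad_to_xywh(quad: List[float]) -> Tuple[float, float, float, float]:
    """Convert a quadrilateral [x1, y1, ..., x4, y4] to an axis-aligned (x, y, width, height)."""
    xs, ys = quad[0::2], quad[1::2]
    x_min, y_min = min(xs), min(ys)
    return (x_min, y_min, max(xs) - x_min, max(ys) - y_min)


def text_boxes(parsed: Dict) -> List[Tuple[str, Tuple[float, float, float, float]]]:
    """Pair each recognized text label with its bounding box.

    Assumes `parsed` looks like {TASK: {"quad_boxes": [...], "labels": [...]}}.
    """
    result = parsed[TASK]
    return [(label, quad_to_xywh(quad))
            for label, quad in zip(result["labels"], result["quad_boxes"])]


def run_florence2_ocr(image_path: str, model_id: str = "microsoft/Florence-2-base") -> Dict:
    """Run OCR with regions on one image (downloads the model on first use)."""
    # Heavy imports are kept local so the helpers above work without them.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=TASK, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        do_sample=False,
        num_beams=3,
    )
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    # The processor shipped with the checkpoint parses the raw text into boxes and labels.
    return processor.post_process_generation(
        generated_text, task=TASK, image_size=(image.width, image.height)
    )
```

Usage would be something like `text_boxes(run_florence2_ocr("path_to_image.jpg"))`, which should give a list of `(text, (x, y, width, height))` pairs if the output has the assumed shape. Converting the quadrilaterals to axis-aligned boxes loses rotation information, so keep the raw `quad_boxes` if the text in your images is skewed.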
Hello,
First, thank you for the incredible work you have shared with the phi community. I am wondering if there is a way to obtain the text coordinates (bounding boxes) from the phi-3 vision generated output for an input image? This feature would be immensely beneficial for various applications that rely on precise text positioning.
Thank you for considering this request.