Convert from arrow into Zero Copy but Copy On Write for Dora like memory #5
Merged
Looks awesome! Thanks Enzo :)
Hennzau added a commit that referenced this pull request on Sep 9, 2024:
# Objective

This PR adds a new datatype for **BBox** to ensure that everything works well and that it's easy to add a new datatype.

# Datatypes & Usage

- [x] Image

```Rust
use crate::image::Image;

// Convert an RGB8 image to BGR8 and check that the channels are swapped.
let flat_image = (0..27).collect::<Vec<u8>>();
let image = Image::new_rgb8(flat_image, 3, 3, Some("camera.test")).unwrap();

let final_image = image.into_bgr8().unwrap();
let final_image_data = final_image.data.as_u8().unwrap();

let expected_image = vec![
    2, 1, 0, 5, 4, 3, 8, 7, 6, 11, 10, 9, 14, 13, 12, 17, 16, 15, 20, 19, 18, 23, 22, 21, 26, 25, 24,
];

assert_eq!(&expected_image, final_image_data);

// Round-trip through Arrow and check that the buffer address never changes.
let flat_image = vec![0; 27];
let original_buffer_address = flat_image.as_ptr();

let bgr8_image = Image::new_bgr8(flat_image, 3, 3, None).unwrap();
let image_buffer_address = bgr8_image.as_ptr();

let arrow_image = bgr8_image.into_arrow().unwrap();
let new_image = Image::from_arrow(arrow_image).unwrap();
let final_image_buffer = new_image.as_ptr();

assert_eq!(original_buffer_address, image_buffer_address);
assert_eq!(image_buffer_address, final_image_buffer);
```

- [x] BBox

```Rust
use crate::bbox::BBox;

// Convert an XYXY box to XYWH and check the resulting coordinates.
let flat_bbox = vec![1.0, 1.0, 2.0, 2.0];
let confidence = vec![0.98];
let label = vec!["cat".to_string()];
let bbox = BBox::new_xyxy(flat_bbox, confidence, label).unwrap();

let final_bbox = bbox.into_xywh().unwrap();
let final_bbox_data = final_bbox.data;

let expected_bbox = vec![1.0, 1.0, 1.0, 1.0];

assert_eq!(expected_bbox, final_bbox_data);

// Round-trip through Arrow and check that the buffer address never changes.
let flat_bbox = vec![1.0, 1.0, 2.0, 2.0];
let original_buffer_address = flat_bbox.as_ptr();

let confidence = vec![0.98];
let label = vec!["cat".to_string()];
let xyxy_bbox = BBox::new_xyxy(flat_bbox, confidence, label).unwrap();
let bbox_buffer_address = xyxy_bbox.data.as_ptr();

let arrow_bbox = xyxy_bbox.into_arrow().unwrap();
let new_bbox = BBox::from_arrow(arrow_bbox).unwrap();
let final_bbox_buffer = new_bbox.data.as_ptr();

assert_eq!(original_buffer_address, bbox_buffer_address);
assert_eq!(bbox_buffer_address, final_bbox_buffer);
```

# Quick Fixes

- I also improved readability and consistency with Rust formatting, following the guidelines mentioned in [this comment](#1 (comment)).
- Fix Arrow Array extraction from Union inside #3
- Fix Dora compatibility (by viewing objects when it's not possible to own them) inside #5
- I improved the structure of the library, with separated packages and with some `features` as one might want to use `fastformat` only with ndarray/arrow. #6
Objective
This PR adds support for viewing an object as Rust's native types from `arrow::array::ArrayData` without consuming it. The initial issue arose with Dora because, in my first approach (#3), I extracted the buffer for each field in the `UnionArray` to take ownership inside a `Vec`. However, this is not always possible, particularly with Dora, because when the `UnionArray` passes through the shared memory server, all fields are allocated in the same buffer, making it impossible to take ownership (of sub-buffers).

Solution
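Before the fix itself, the ownership problem can be reproduced in miniature with plain slices. This is only a sketch: `split_fields` and the 12-byte buffer are hypothetical stand-ins for a `UnionArray` whose fields live in one shared allocation, not code from `fastformat` or Dora.

```rust
// Hypothetical stand-in: one contiguous allocation holding two "fields",
// similar in spirit to a UnionArray whose fields share a single buffer.
fn split_fields(shared: &[u8], data_len: usize) -> (&[u8], &[u8]) {
    // Borrowing sub-ranges is free: both slices point into `shared`.
    shared.split_at(data_len)
}

fn main() {
    let shared: Vec<u8> = (0..12).collect();
    let (pixels, meta) = split_fields(&shared, 9);

    // The borrowed field starts at the same address as the shared buffer.
    assert_eq!(pixels.as_ptr(), shared.as_ptr());
    assert_eq!(meta, &[9, 10, 11]);

    // Owning a field as its own Vec requires a new allocation and a copy.
    let owned_pixels: Vec<u8> = pixels.to_vec();
    assert_ne!(owned_pixels.as_ptr(), shared.as_ptr());
}
```

Borrowing a sub-range costs nothing, while owning it forces a copy, which is the trade-off this PR addresses.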
The solution was to use `std::borrow::Cow` to represent array fields in `fastformat::Image` (or `BBox`).

Now, you can have both an `Owned` and a `Borrowed` `BBox`, with access to the exact same methods. However, if those methods need to mutate data, they will clone it first.

User Usage
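From the user's side, the owned/borrowed behavior described above could look roughly like the following. It is a minimal sketch using only `std::borrow::Cow`; the `Boxes` type and its methods (`owned`, `borrowed`, `first`, `shift_x`, `is_borrowed`) are hypothetical and much simpler than the real `fastformat::BBox`.

```rust
use std::borrow::Cow;

// Hypothetical, simplified stand-in for a type like `fastformat::BBox`;
// the real struct has more fields and a different API.
struct Boxes<'a> {
    data: Cow<'a, [f32]>,
}

impl<'a> Boxes<'a> {
    fn owned(data: Vec<f32>) -> Self {
        Boxes { data: Cow::Owned(data) }
    }

    fn borrowed(data: &'a [f32]) -> Self {
        Boxes { data: Cow::Borrowed(data) }
    }

    // Read-only access never copies, whichever variant is held.
    fn first(&self) -> Option<f32> {
        self.data.first().copied()
    }

    // Mutation goes through `to_mut`, which clones only if the data is borrowed.
    fn shift_x(&mut self, dx: f32) {
        for x in self.data.to_mut().iter_mut().step_by(2) {
            *x += dx;
        }
    }

    fn is_borrowed(&self) -> bool {
        matches!(self.data, Cow::Borrowed(_))
    }
}

fn main() {
    let shared = vec![1.0_f32, 1.0, 2.0, 2.0];

    let mut view = Boxes::borrowed(&shared);
    assert!(view.is_borrowed());
    view.shift_x(0.5); // the first mutation clones the borrowed data
    assert!(!view.is_borrowed());
    assert_eq!(shared, vec![1.0, 1.0, 2.0, 2.0]); // the source is untouched
    assert_eq!(view.data.to_vec(), vec![1.5_f32, 1.0, 2.5, 2.0]);

    let mut owned = Boxes::owned(vec![0.0, 0.0, 1.0, 1.0]);
    owned.shift_x(1.0); // already owned: mutated in place, no clone
    assert_eq!(owned.first(), Some(1.0));
}
```

The same methods work on both variants, and the copy happens only at the first mutation of a borrowed value.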
To implement this, I created a new `struct`: `FastFormatArrowRawData`, which can be built from `ArrayData`, and then either consumed into an `Owned` object or viewed as a `Borrowed` object.

Developer Usage
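As a rough picture of the shape such functions can take, here is a sketch built only on the standard library. `RawData` and `Payload` are hypothetical stand-ins: the real `FastFormatArrowRawData` is built from `ArrayData`, whereas this version just wraps a `Vec<u8>`, so the signatures are assumptions rather than the library's API.

```rust
use std::borrow::Cow;

// Hypothetical stand-in for `FastFormatArrowRawData`: it only holds one byte
// buffer here, whereas the real struct is built from `ArrayData`.
struct RawData {
    bytes: Vec<u8>,
}

// Hypothetical FastFormat-style type whose field is either owned or borrowed.
struct Payload<'a> {
    data: Cow<'a, [u8]>,
}

impl<'a> Payload<'a> {
    // Analogue of `raw_data`: consumes the input and wraps its buffer.
    fn raw_data(input: Vec<u8>) -> RawData {
        RawData { bytes: input }
    }

    // Analogue of `from_raw_data`: consumes the raw data and moves the
    // buffer into an owned object without copying it.
    fn from_raw_data(raw: RawData) -> Payload<'static> {
        Payload { data: Cow::Owned(raw.bytes) }
    }

    // Analogue of `view_from_raw_data`: borrows the raw data and returns
    // a borrowed object that points at the same buffer.
    fn view_from_raw_data(raw: &'a RawData) -> Payload<'a> {
        Payload { data: Cow::Borrowed(raw.bytes.as_slice()) }
    }
}

fn main() {
    let input = vec![7_u8; 16];
    let addr = input.as_ptr();

    let raw = Payload::raw_data(input);

    {
        // Viewing leaves `raw` usable afterwards.
        let view = Payload::view_from_raw_data(&raw);
        assert_eq!(view.data.as_ptr(), addr);
    }

    // Consuming moves the buffer; the address is still the original one.
    let owned = Payload::from_raw_data(raw);
    assert_eq!(owned.data.as_ptr(), addr);
}
```

In this sketch neither path copies the buffer: the view borrows it and the owned conversion moves it.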
Now it's super easy to create a `FastFormat` type and make it compatible with Arrow. You should create a few functions for your new type:

- A `raw_data` function that will consume the `ArrayData` and return a `FastFormatArrowRawData`.
- A `from_raw_data` function that will consume the `FastFormatArrowRawData` and return your type.
- A `view_from_raw_data` function that will borrow the `FastFormatArrowRawData` to return a `Borrowed` object.

Benchmarks
I benchmarked these functions with Dora. I compared passing raw data from a `Vec<u8>` of different sizes to passing a `fastformat::Image` type, including the entire pipeline (creating the Image, converting it to Arrow, and converting it back to an Image).

(Benchmarked on a laptop with 32 GB of RAM and a Ryzen 7 4800H.)
Raw Vec
For this benchmark, I sent 1000 raw `Vec<u8>` of different sizes:

FastFormat
For this benchmark, I sent 1000 Image objects of different sizes:
Conclusion
As you can see, there is no notable difference (which is expected, as we don’t copy any large data). See dora-benchmark.