Commit

sammy/to-arrow-docs
samster25 authored and sagiahrac committed Oct 7, 2024
1 parent a509010 commit bdc7b43
Showing 2 changed files with 10 additions and 1 deletion.
5 changes: 4 additions & 1 deletion src/daft-core/src/array/ops/as_arrow.rs
@@ -15,7 +15,10 @@ use crate::{
 pub trait AsArrow {
     type Output;

-    // Retrieve the underlying concrete Arrow2 array.
+    /// Retrieve the underlying internal Arrow2 array.
+    /// This does not correct for logical types and yields only the physical type of the array.
+    /// For example, a TimestampArray will yield an Arrow Int64Array rather than an Arrow Timestamp array.
+    /// To get a corrected Arrow type, see `.to_arrow()`.
     fn as_arrow(&self) -> &Self::Output;
 }
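
The new doc comment on `as_arrow()` can be illustrated with a small, self-contained sketch. Everything here is a hypothetical stand-in (`Int64Array`, `TimestampArray`, and the `unit` field are toy types, not Daft's or arrow2's real definitions); only the shape of the `AsArrow` trait is taken from the diff above. It shows that the borrowed physical view carries the integers but none of the logical type information.

```rust
// Hypothetical stand-ins for illustration; not Daft's or arrow2's real types.

/// Physical storage: a plain vector of 64-bit integers.
#[derive(Debug, PartialEq)]
struct Int64Array(Vec<i64>);

/// Logical timestamp array: Int64 storage plus a time unit kept alongside it.
struct TimestampArray {
    physical: Int64Array,
    unit: &'static str,
}

/// Same shape as the trait in the diff: an associated output type and a borrow.
trait AsArrow {
    type Output;

    /// Borrow the physical array without any logical-type correction.
    fn as_arrow(&self) -> &Self::Output;
}

impl AsArrow for TimestampArray {
    type Output = Int64Array;

    fn as_arrow(&self) -> &Int64Array {
        &self.physical
    }
}

fn main() {
    let ts = TimestampArray {
        physical: Int64Array(vec![0, 1_000]),
        unit: "ms",
    };
    // The borrowed view is just integers; the unit is not part of it.
    println!("{:?} (unit kept separately: {})", ts.as_arrow(), ts.unit);
}
```

Because the method returns a reference, no data is copied, which is presumably why the physical view is the cheap default and the corrected export is a separate call.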

6 changes: 6 additions & 0 deletions src/daft-core/src/series/mod.rs
@@ -38,6 +38,12 @@ impl PartialEq for Series {
 }

 impl Series {
+    /// Exports this Series into an Arrow array that is corrected for the Arrow type system.
+    /// For example, Daft's TimestampArray is a logical type that is backed by a physical Int64Array.
+    /// If we were to call `.as_arrow()` or `.physical` on the TimestampArray, we would get an Int64Array holding the raw values in the array's time unit.
+    /// However, if we want to export our Timestamp array to another Arrow system, such as arrow2 kernels, Python, or DuckDB,
+    /// we should convert it back to the canonical Arrow dtype of Timestamp rather than Int64.
+    /// To get the internal physical type without conversion, see `as_arrow()`.
     pub fn to_arrow(&self) -> Box<dyn arrow2::array::Array> {
         self.inner.to_arrow()
     }
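
To make the physical versus logical distinction in the `to_arrow()` comment concrete, here is a minimal sketch under stated assumptions. The `DataType`, `TimeUnit`, `Exported`, and `TimestampArray` types and the `export_physical` helper are all hypothetical and only mimic the idea; they are not Daft's or arrow2's actual API. The point is that both exports carry the same integers, and only the dtype tag seen by the consumer differs.

```rust
// Hypothetical stand-ins for illustration; not Daft's or arrow2's real API.

#[derive(Debug, Clone, Copy, PartialEq)]
enum TimeUnit {
    Milliseconds,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int64,
    Timestamp(TimeUnit),
}

/// What a downstream consumer receives: values plus the dtype they are tagged with.
#[derive(Debug, PartialEq)]
struct Exported {
    dtype: DataType,
    values: Vec<i64>,
}

/// Logical timestamp array backed by physical i64 storage.
struct TimestampArray {
    physical: Vec<i64>,
    unit: TimeUnit,
}

impl TimestampArray {
    /// Export the physical storage as-is: the consumer sees plain Int64.
    fn export_physical(&self) -> Exported {
        Exported { dtype: DataType::Int64, values: self.physical.clone() }
    }

    /// Export with the logical dtype restored: the consumer sees Timestamp.
    fn to_arrow(&self) -> Exported {
        Exported { dtype: DataType::Timestamp(self.unit), values: self.physical.clone() }
    }
}

fn main() {
    let ts = TimestampArray {
        physical: vec![0, 86_400_000],
        unit: TimeUnit::Milliseconds,
    };
    let raw = ts.export_physical();
    let logical = ts.to_arrow();
    // Same buffer contents, different dtype tag.
    assert_eq!(raw.values, logical.values);
    println!("physical: {:?}, logical: {:?}", raw.dtype, logical.dtype);
}
```

A consumer that only received the `Int64` tag would have no way to know the numbers are milliseconds since the epoch, which is the situation the doc comment warns about.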
