Integration test for arrow record batch <-> postgres row round trip conversion #38

@Sevenannn (Contributor) commented Aug 9, 2024

What

An integration test that validates that the arrow record batch to Postgres row round trip conversion supports the following arrow types:

Null, all Int types, all Float types, Time32, Time64, Timestamp (with/without TZ), Date32, Date64, Duration, Interval, Binary/LargeBinary/FixedSizeBinary, Utf8/LargeUtf8, List/FixedSizeList/LargeList, Struct, Decimal128/Decimal256

Why

  • This test ensures that the datafusion table provider fully supports the arrow types listed above in both directions: arrow -> sql row conversion and sql row -> arrow recordbatch conversion
  • This test provides a generalizable framework that can be reused for testing arrow to sqlite / duckdb / mysql round trip conversion.
  • The integration test can replace the unit tests for row -> recordbatch type conversion, for example test_chrono_naive_time_to_time64nanosecond, which only partially tests the parsing logic because a Postgres Row struct cannot be constructed directly from the tokio-postgres crate
  • This test exercises several modules within the datafusion table provider, and is therefore structured as an integration test, following the Rust test organization philosophy

How

  • Run a dockerized Postgres as the test Postgres instance
  • Generate RecordBatches of different arrow types
  • Test the arrow RecordBatch to Postgres Row conversion by creating an insert statement with InsertBuilder from the generated arrow record batch
  • Test the Postgres Row to arrow RecordBatch conversion by querying Postgres through the datafusion .sql method.
  • Add a Makefile target for this integration test
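
The shape of the round trip check described above can be sketched in plain Rust. This is only an illustration, not the code in this PR: `Value`, `Batch`, `RoundTripBackend`, `InMemoryBackend`, `sample_batch` and `check_round_trip` are all hypothetical names, the simplified `Batch` stands in for an arrow RecordBatch, and the in-memory backend stands in for the dockerized Postgres instance so that the sketch needs only the standard library.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a single arrow value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Int64(i64),
    Utf8(String),
    Binary(Vec<u8>),
}

/// Hypothetical stand-in for an arrow RecordBatch: named columns of values.
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    columns: Vec<(String, Vec<Value>)>,
}

/// Hypothetical abstraction over the database under test. A Postgres,
/// sqlite, duckdb or mysql implementation would each provide these two steps.
trait RoundTripBackend {
    /// Write the batch into `table` (the arrow -> sql row direction).
    fn insert(&mut self, table: &str, batch: &Batch) -> Result<(), String>;
    /// Read the whole table back as a batch (the sql row -> arrow direction).
    fn query_all(&self, table: &str) -> Result<Batch, String>;
}

/// In-memory backend used here only so the sketch runs without a database.
#[derive(Default)]
struct InMemoryBackend {
    tables: HashMap<String, Batch>,
}

impl RoundTripBackend for InMemoryBackend {
    fn insert(&mut self, table: &str, batch: &Batch) -> Result<(), String> {
        self.tables.insert(table.to_string(), batch.clone());
        Ok(())
    }

    fn query_all(&self, table: &str) -> Result<Batch, String> {
        self.tables
            .get(table)
            .cloned()
            .ok_or_else(|| format!("table {} not found", table))
    }
}

/// Build a small batch that mixes several value kinds, including a null.
fn sample_batch() -> Batch {
    Batch {
        columns: vec![
            ("id".to_string(), vec![Value::Int64(1), Value::Int64(2)]),
            (
                "name".to_string(),
                vec![Value::Utf8("a".to_string()), Value::Null],
            ),
            (
                "payload".to_string(),
                vec![Value::Binary(vec![0, 1]), Value::Binary(vec![])],
            ),
        ],
    }
}

/// Insert the batch, read it back, and report whether it survived unchanged.
fn check_round_trip<B: RoundTripBackend>(
    backend: &mut B,
    table: &str,
    batch: &Batch,
) -> Result<(), String> {
    backend.insert(table, batch)?;
    let returned = backend.query_all(table)?;
    if &returned == batch {
        Ok(())
    } else {
        Err(format!("round trip mismatch: {:?} != {:?}", batch, returned))
    }
}
```

Keeping the insert and query steps behind a trait is one way the same check could be pointed at another database, which is the reuse mentioned under "Why"; the actual test in this PR may be organized differently.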

@Sevenannn Sevenannn marked this pull request as ready for review August 9, 2024 22:02
@phillipleblanc phillipleblanc changed the title Integrartion test for arrow record batch <-> postgres row round trip conversion Integration test for arrow record batch <-> postgres row round trip conversion Aug 10, 2024
@phillipleblanc phillipleblanc merged commit 140b67c into datafusion-contrib:main Aug 14, 2024
3 checks passed