cli: Add option to export to DuckDB #94
Comments
Hello @mwip 👋🏻 If I were to add such functionality, I'd add new functions internally next to the existing ones:
And new ones:
For this use case, the easiest way to do this would be to copy the code at:
quackosm/quackosm/pbf_file_reader.py Line 431 in 5d19b3c
quackosm/quackosm/pbf_file_reader.py Line 533 in 5d19b3c
and just replace the GeoParquet to GeoDataFrame transform with the […]
Direct streaming of data to DuckDB could be achieved, but it would require more sophisticated changes in the process (mainly because of the removal of duplicated rows when multiple PBF files are parsed at once). Things to keep in mind:
For the CLI changes, I have a few ideas:
quackosm monaco-latest.osm.pbf --duckdb
Finished operation in 0:00:00
files/monaco-latest_nofilter_noclip_compact.duckdb::osm_features
quackosm monaco-latest.osm.pbf --output monaco.duckdb
Finished operation in 0:00:00
monaco.duckdb::osm_features
quackosm monaco-latest.osm.pbf --output monaco.duckdb --output-table-name osm
Finished operation in 0:00:00
monaco.duckdb::osm
Some additional CLI errors should probably be thrown if there are illogical combinations of parameters (such as the output file name with the […]).
You are welcome to try and implement it. I'll try to give you more hints and help with docs (examples) and tests if needed.
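To make the point about illogical parameter combinations concrete, here is a minimal sketch of such a check in Python. The option names (`--duckdb`, `--output`, `--output-table-name`) and the default table name `osm_features` are taken from the proposals above. The function name `resolve_duckdb_target` and the two rules it enforces are assumptions made for illustration only; they are not part of the QuackOSM CLI.

```python
from pathlib import Path
from typing import Optional, Tuple

# Default table name as used in the proposed CLI output above.
DEFAULT_TABLE_NAME = "osm_features"


def resolve_duckdb_target(
    output: Optional[str],
    duckdb_flag: bool,
    table_name: Optional[str],
) -> Tuple[bool, Optional[str]]:
    """Hypothetical helper: decide whether to write DuckDB output.

    Returns (use_duckdb, table_name) and raises ValueError for
    combinations that are assumed here to be illogical.
    """
    is_duckdb_path = output is not None and Path(output).suffix == ".duckdb"

    # Assumed rule 1: --duckdb together with a non-DuckDB output file name.
    if duckdb_flag and output is not None and not is_duckdb_path:
        raise ValueError("--duckdb was given, but --output does not end with .duckdb")

    use_duckdb = duckdb_flag or is_duckdb_path

    # Assumed rule 2: a table name without any DuckDB output.
    if table_name is not None and not use_duckdb:
        raise ValueError("--output-table-name requires DuckDB output")

    if not use_duckdb:
        return False, None
    return True, table_name or DEFAULT_TABLE_NAME
```

With these assumed rules, `resolve_duckdb_target("monaco.duckdb", False, "osm")` selects DuckDB output with the table `osm`, while passing `--output-table-name` alongside a `.parquet` output raises an error. Which combinations should really be rejected is a design decision for the actual implementation.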
Thanks for considering this and for providing the relevant starting points. I am a little busy, but I plan on working on this eventually and providing a PR. If anyone beats me to this, that is fine with me. 😄
I was finally able to start working on this and hope I'm able to provide a PR soon.
I love your project enabling next-gen OSM analyses with the power of DuckDB. Wouldn't it be cool to enable specifying a DuckDB file (and optionally a table) to store the data in? I know DuckDB has a native OSM reader, but you abstracted it to the convenience of points, lines and polygons.
What I think would be a cool possible outcome is, e.g.
quackosm monaco-latest.osm.pbf --duckdb monaco_osm.duckdb 'osm'
instead of what's currently possible:
If you think this is an interesting idea, I'd love to contribute. Maybe we could discuss possible entry points for where to start.
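As one possible entry point, the sketch below copies an already generated Parquet result into a table of a DuckDB file, using the `duckdb` Python package and DuckDB's `read_parquet` function. The helper names (`quote_identifier`, `build_import_sql`, `export_parquet_to_duckdb`) are hypothetical and not part of QuackOSM. The sketch also assumes the geometry column is simply stored as it is read from the Parquet file; converting it to a spatial type is not covered.

```python
from pathlib import Path


def quote_identifier(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_import_sql(parquet_path: Path, table_name: str = "osm_features") -> str:
    """Build the statement that copies a Parquet file into a table."""
    # Escape single quotes so the path is a valid SQL string literal.
    escaped_path = Path(parquet_path).as_posix().replace("'", "''")
    return (
        f"CREATE OR REPLACE TABLE {quote_identifier(table_name)} AS "
        f"SELECT * FROM read_parquet('{escaped_path}')"
    )


def export_parquet_to_duckdb(
    parquet_path: Path,
    duckdb_path: Path,
    table_name: str = "osm_features",
) -> str:
    """Hypothetical helper: load a Parquet file into a DuckDB database file."""
    # Imported lazily so the SQL builder above works without duckdb installed.
    import duckdb

    con = duckdb.connect(str(duckdb_path))
    try:
        con.execute(build_import_sql(parquet_path, table_name))
    finally:
        con.close()
    # Mirrors the "file::table" result format suggested in this thread.
    return f"{duckdb_path}::{table_name}"
```

Going through the intermediate Parquet file keeps the existing deduplication logic untouched, at the cost of writing the data twice. Streaming directly into DuckDB would avoid that but needs deeper changes.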