
Command line interface based RDF processing toolkit to run sequences of SPARQL statements ad-hoc on RDF datasets, streams of bindings and streams of named graphs with support for processing JSON, CSV and XML using function extensions

SmartDataAnalytics/RdfProcessingToolkit

RDF Processing Toolkit (RPT)

RPT makes RDF/SPARQL workflows on the command line easy. The RDF Processing Toolkit (RPT) integrates several of our tools into a single CLI frontend. It features commands for running SPARQL statements on triple-based and quad-based data, both streaming and static. SPARQL extensions for working with CSV, JSON and XML are included, as is an RML toolkit for converting RML to SPARQL (or TARQL). RPT embeds several SPARQL engines, including Jena’s ARQ and TDB, as well as one of our own for SPARQL-based batch processing using Apache Spark.

News

Example Use Cases

  • Lodservatory implements SPARQL endpoint monitoring. It uses these tools in a script that is called from a GitHub Action.
  • Linked Sparql Queries provides tools to RDFize SPARQL query logs and run benchmarks on the resulting RDF. The triples related to a query represent an instance of a sophisticated domain model and are grouped in a named graph. Depending on the input size, one can end up with millions of named graphs describing queries, amounting to billions of triples. With ngs one can easily extract complete samples of the queries' models without leaving a related triple behind.
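
The idea of sampling whole named graphs, so that no related triple is left behind, can be illustrated with a small sketch. This is not how ngs is implemented and it does not use RPT at all. It is a minimal Python illustration that assumes quads arrive as plain tuples and that all quads of one named graph are consecutive in the stream. The names `quads`, `head_graphs` and the `ex:` identifiers are hypothetical.

```python
from itertools import groupby, islice

# Hypothetical input: (subject, predicate, object, graph) tuples.
# Assumption: quads of the same named graph arrive consecutively.
quads = [
    ("ex:q1", "ex:type", "ex:Query", "ex:g1"),
    ("ex:q1", "ex:text", "SELECT ...", "ex:g1"),
    ("ex:q2", "ex:type", "ex:Query", "ex:g2"),
    ("ex:q2", "ex:text", "ASK ...", "ex:g2"),
    ("ex:q3", "ex:type", "ex:Query", "ex:g3"),
]


def head_graphs(quad_stream, n):
    """Yield every quad of the first n named graphs, never splitting a graph."""
    # Group consecutive quads by their graph component (index 3).
    by_graph = groupby(quad_stream, key=lambda quad: quad[3])
    # Take the first n groups and emit each group in full.
    for _graph_name, graph_quads in islice(by_graph, n):
        yield from graph_quads


sample = list(head_graphs(quads, 2))
for quad in sample:
    print(quad)
```

Because the unit of selection is the graph rather than the individual quad, the sample contains either all of a query's triples or none of them.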

License

The source code of this repo is published under the Apache License Version 2.0. Dependencies may be licensed under different terms. When in doubt please refer to the licenses of the dependencies declared in the pom.xml files. The dependency tree can be viewed with Maven using mvn dependency:tree.

Acknowledgements

History

  • 2023-05-19 New quality-of-life features: the cpcat command and the canned queries tree.rq and gtree.rq.
  • 2023-04-04 Release v1.9.5! RPT now ships with sansa (Apache Spark based tooling) and rmltk (RML Toolkit) features. A proper GitHub release will follow once Apache Jena 4.8.0 is out as some code depends on its latest SNAPSHOT changes.
  • 2023-03-28 Started updating the documentation to reflect the latest changes (ongoing).
