Bedrock is a high-level text pre-processing API written in Python that runs on spaCy as its backend. It lets you quickly perform the text-processing groundwork: it does the menial work, so you don't have to.
Use this library if you find the following highlights useful:
- Fast prototyping
- Switching between different backends
- Working in batches rather than writing loops
- Support for DataFrame and CAS xmi inputs/outputs
Install bedrock in a jiffy:
pip install bedrock
bedrock download de
Now you can run:
from bedrock.pipeline import Pipeline
Pipeline(language='de').parse_text("Hello world").get_docs()
Congrats! 🎉
Currently, bedrock supports spaCy as its backend engine.
It supports the following languages, with their corresponding download arguments:
- English ('en' or 'english')
- German ('de', 'german' or 'deutsch')
- French ('fr' or 'french')
Package installation
pip install bedrock
Install support for all languages:
bedrock download all
Install support only for English:
bedrock download en
Install support for German:
bedrock download de
Install support for French:
bedrock download fr
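If you script the setup, the download arguments listed above can be normalized before calling the CLI. The following is a minimal sketch only: `LANGUAGE_ALIASES`, `normalize_language` and `download_command` are hypothetical helper names, not part of bedrock, and the alias table is copied from the list of supported languages in this README. The sketch only builds the command; it does not run it.

```python
import shlex
from typing import List

# Aliases copied from the supported-languages list in this README.
# The table and the helper names are illustrative, not part of bedrock.
LANGUAGE_ALIASES = {
    "en": "en", "english": "en",
    "de": "de", "german": "de", "deutsch": "de",
    "fr": "fr", "french": "fr",
}


def normalize_language(name: str) -> str:
    """Map a language name or alias to its short download argument."""
    key = name.strip().lower()
    if key == "all":  # 'bedrock download all' installs every language
        return "all"
    if key not in LANGUAGE_ALIASES:
        raise ValueError(f"Unsupported language: {name!r}")
    return LANGUAGE_ALIASES[key]


def download_command(name: str) -> List[str]:
    """Build (but do not execute) the matching 'bedrock download' command."""
    return ["bedrock", "download", normalize_language(name)]


if __name__ == "__main__":
    # Print the command that would be run for a given alias.
    print(shlex.join(download_command("Deutsch")))
```

The resulting list can be passed to `subprocess.run(...)` if you want to trigger the download from Python; the sketch leaves that step out so it has no side effects.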
Import modules from the package in your code:
from bedrock.pipeline import Pipeline # Processing texts
from bedrock.annotator.annotator import Annotator # Annotator interface
from bedrock.annotator.dictionary_annotator import DictionaryAnnotator # Prebuilt dictionary annotator
from bedrock.annotator.regex_annotator import RegexAnnotator # Prebuilt regex annotator