This AWS Lambda function is part of Mapping Prejudice's Deed Machine application. This component looks for racial and other terms to flag as potential racial covenants. For each term in covenant_flags, check OCR JSON file received from previous step for existance of term. This version uses the Python 'regex' module as opposed to the built-in 're' module to allow for fuzzy searching to a given tolerance. Terms, fuzziness tolerance settings, and mandatory suffixes (usually spaces) are loaded from a CSV. Some of the terms are actually exceptions rather than covenant hits, and once they reach the Django stage, will be used to mark this page as exempt from consideration as being considered as a racial covenant. Common examples of exceptions include birth certificates and military discharges, which often contain racial information but are not going to contain a racial covenant. This is the fourth Lambda in the Deed Machine initial processing Step Function.
The Deed Machine is a multi-language set of tools that use OCR and crowdsourced transcription to identify racially restrictive covenant language, then map the results.
The Lambda components of the Deed Machine are built using Amazon's Serverless Application Model (SAM) and the AWS SAM CLI tool.
- License
- Component documentation
- Documentation home
- Downloadable Racial covenants data
- Mapping Prejudice main site
- Pipenv (Can use other virtual environments, but will require fiddling on your part)
- AWS SAM CLI
- Docker
- Python 3
To build the application:
pipenv install
pipenv shell
sam build
To rebuild and deploy the application:
sam build && sam deploy
To run tests:
pytest