This repository contains the source files of Talend Data Quality libraries.
Project | Description |
---|---|
dataquality-common | Abstractions of data analysis, and low-level utilities such as East Asian text pattern recognition |
dataquality-email | Email validation library |
dataquality-libraries | Parent pom aggregating other library projects, devops tools |
dataquality-record-linkage | Record matching algorithms, blocking key calculation, and T-Swoosh |
dataquality-sampling | Reservoir sampling, data masking, data duplication |
dataquality-semantic-model | Definition of semantic category related objects |
dataquality-semantic | API for semantic category analysis |
dataquality-standardization | Standardization library based on Apache Lucene |
dataquality-statistics | API for data analysis and statistics (requires JDK 1.8) |
dataquality-wordnet | Content validation API based on WordNet dictionary |
Talend Open Studio for Data Quality can be downloaded from the Talend website.
- All projects are Maven-based.
- The parent pom builds all the libraries.
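
As a minimal sketch of a full build, assuming the parent pom lives in a `dataquality-libraries` directory at the repository root (the directory name is inferred from the table above and may differ in your checkout), and that Maven and a suitable JDK are installed:

```shell
# Assumption: run from the repository root; the parent pom is assumed
# to be located in the dataquality-libraries directory.
cd dataquality-libraries

# Standard Maven invocation: build every module aggregated by the
# parent pom and install the artifacts into the local repository.
mvn clean install
```

`mvn clean install` is a generic Maven command, not something specific to this repository; the project may require additional profiles or settings that are not described here.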
Copyright (c) 2006-2016 Talend
Licensed under the Apache License, Version 2.0