If you use the code, please cite the following paper:
@inproceedings{kalo2019iswc,
title={Knowledge Graph Consolidation by Unifying Synonymous Relationships},
author={Kalo, Jan-Christoph and Ehler, Philipp and Balke, Wolf-Tilo},
booktitle="Proceedings of the 18th International Semantic Web Conference on The Semantic Web",
series = {ISWC '19},
pages={},
year={2019}
}
This repository contains all tools and scripts used for:
- Creating input files from .nt files
- Training knowledge embeddings using TensorFlow and OpenKE
- Detecting synonymous relations using the embeddings
- Evaluating detected synonyms and creating plots
To run the software, the following is required:
- Linux
- at least 32 GB of RAM
- a CUDA-enabled GPU with at least 11 GB of memory (the software also runs on a CPU, but training is extremely slow)
- OpenKE (included in thirdParty directory)
- Python2
- Python3
- Java 8
- PyPI packages
  - matplotlib
  - numpy
  - pandas
  - pyfpgrowth
  - pyspark
  - pyspark-utils
  - tensorflow | tensorflow-gpu
$ cd ./thirdParty/OpenKE
$ ./make.sh
$ cd ./
$ python2 -m pip install -r requirements_py2.txt
$ python3 -m pip install -r requirements_py3.txt
See requirements_py3.txt to select whether to install tensorflow or tensorflow-gpu.
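If you install tensorflow-gpu, a quick sanity check that the GPU build is actually picked up (a suggestion, not part of the original toolchain; it assumes the TensorFlow 1.x API used by the bundled OpenKE code):
$ python3 -c "import tensorflow as tf; print(tf.test.is_gpu_available())"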
This Python module (embedding.py) wraps an embedding into a single class.
Example usage:
from thirdParty.OpenKE import models
from embedding import Embedding
emb = Embedding(benchmark, embedding, models.MODEL, embedding_dimensions=100)
This class provides methods like lookup_{entity, relation} to look up an entity or relation by its ID. Furthermore, it provides access to the embedding parameters.
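For illustration, a minimal sketch of how the class might be used. The paths and model follow the FB15K/TransH training example below; treating the first two arguments as directory paths and the exact lookup signatures are assumptions based on the description above:
from thirdParty.OpenKE import models
from embedding import Embedding

# Paths and model type taken from the FB15K/TransH example below (adjust as needed).
emb = Embedding("benchmarks/FB15K/", "embeddings/FB15K_transh/", models.TransH,
                embedding_dimensions=100)

# Look up an entity or a relation by its ID (assumed signatures).
entity = emb.lookup_entity(0)
relation = emb.lookup_relation(0)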
This script (train_embedding.py) starts training for a given benchmark and a specified model type.
Example usage:
$ python3 -m train_embedding --epoch-count 1000 --batch-count 10 transh benchmarks/FB15K/ embeddings/FB15K_transh/
This will load the dataset located at benchmarks/FB15K/ into OpenKE and start training with the TransH model type. The resulting embedding is saved in this directory: embeddings/FB15K_transh/
For more options, see:
$ python3 -m train_embedding -h
This script (synonym_analysis.py) starts the detection of synonymous relations.
Example usage:
$ python3 -m synonym_analysis -g transh benchmarks/FB15K/ embeddings/FB15K_transh/ experiments/FB15K_transh/
This will load the TransH knowledge embedding located in embeddings/FB15K_transh/ and its corresponding benchmark in benchmarks/FB15K/. Every output of the detection will be stored in experiments/FB15K_transh/. Additionally, with the -g option, ground-truth data of synonymous relations in the specified benchmark will be loaded (if available) and precision-recall diagrams will be plotted.
For more options, see:
$ python3 -m synonym_analysis -h
This script (baseline.py) contains the code for our baseline method for detecting synonymous relationships.
Example usage:
$ python2 -m baseline benchmarks/FB15K_2000_50/train2id.txt experiments/FB15K_2000_50_baseline/synonyms_minSup_0.02.txt 0.02
The first parameter is the input triples file, the second parameter is the output file, and the third parameter specifies the minimum support value.
This script (baseline_evaluation.py) calculates the precision-recall values for a given baseline.py output and gold standard.
Example usage:
$ python3 -m baseline_evaluation experiments/FB15K_2000_50_baseline/synonyms_minSup_0.02.txt benchmarks/FB15K_2000_50/synonyms_id.txt
The output is placed in the directory where the baseline.py output is located (in the example above: experiments/FB15K_2000_50_baseline/).
For more options, see:
$ python3 -m baseline_evaluation -h
This script (plot_evaluation.py) plots precision-recall diagrams summarizing precision and recall for every embedding model trained on a benchmark.
Example usage:
$ python3 -m plot_evaluation -e experiments/FB15K
This will look for all experiment directories of FB15K (i.e. experiments/FB15K_transe/, experiments/FB15K_transh/, ...) and create the respective precision-recall plots as experiments/FB15K_l1.pdf and experiments/FB15K_cos.pdf.
This script (evaluate_dbpedia.py) performs the evaluation of our manually evaluated DBpedia dataset (including precision@k diagrams) for a given (manually crafted) gold standard and a baseline.py output. It internally contains the relevant classification files for each model and similarity function we used.
Example usage:
$ python3 -m evaluate_dbpedia experiments/dbpedia-201610N-1k-filtered_combined_approx_500_correct.txt experiments/dbpedia-201610N-1k-filtered_baseline/synonyms_minSup_0.001_uris.txt
For more options, see:
$ python3 -m evaluate_dbpedia -h
This script (id2uri.py) takes a file with ID pairs as input and produces the corresponding file with URI pairs by looking the IDs up in a given relation2id.txt file.
Example usage:
$ python3 -m id2uri experiments/dbpedia-201610N-1k-filtered_baseline/synonyms_minSup_0.001.txt benchmarks/dbpedia-201610N-1k-filtered/relation2id.txt
The URI pairs file is saved in the same directory as the ID pairs file.
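As an illustration of the lookup id2uri.py performs, a minimal Python sketch (not the actual script), assuming the OpenKE benchmark format for relation2id.txt, i.e. a count on the first line followed by one whitespace-separated "relation id" pair per line:
# Build an id -> URI map from relation2id.txt (sketch; format assumption as stated above).
id_to_uri = {}
with open("benchmarks/dbpedia-201610N-1k-filtered/relation2id.txt") as f:
    next(f)  # skip the count line
    for line in f:
        parts = line.split()
        if len(parts) == 2:
            uri, rel_id = parts
            id_to_uri[int(rel_id)] = uri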
For more options, see:
$ python3 -m id2uri -h
The select_gpu Bash script selects the GPU(s) to use for training.
The train_*.sh scripts are the training scripts for our experiments; they train the embeddings for our benchmarks.
The analyse_*.sh scripts perform the synonym detection for our experiments with our method on each embedding.
The baseline_*.sh scripts perform the synonym detection for our experiments with our baseline method and also plot all results.
In this subsection, we describe how to reproduce the results described in the evaluation section of the paper. First, we describe the creation of synthetic synonyms, followed by the training on the respective datasets. Afterwards, synonym detection is performed, the results are evaluated, and plots are created in .pdf format.
Our Freebase, DBpedia, and Wikidata samples are available at: https://doi.org/10.6084/m9.figshare.8490134
The manually evaluated baseline is available at: https://doi.org/10.6084/m9.figshare.8188394
To create synthetic synonyms, it is necessary to change into the benchmarks directory.
$ cd ./benchmarks/
Here, we can access our existing benchmarks and the synonym_inject.py script.
The script always saves the created synonym pairs in a text file inside the new benchmark folder, which we can use as ground truth.
The following command creates our copy of the original FB15K benchmark with synthetic synonyms: for every relation occurring at least 2000 times, a synonym is injected that randomly replaces the relation's occurrences for 50% of the subjects.
We call this benchmark FB15K_2000_50.
$ python3 -m synonym_inject --percentage-per-relation 0.5 --min-relation-occurence 2000 --func_inject_synonym inject_synonym_2 FB15K
We also created a copy of our Wikidata sample with the same parameters.
We call this benchmark wikidata-20181221TN-1k_2000_50.
$ python3 -m synonym_inject --percentage-per-relation 0.5 --min-relation-occurence 2000 --func_inject_synonym inject_synonym_2 wikidata-20181221TN-1k
For the training, we have to change the working directory to the root of the repository (if not already done) because we need the train_embedding.py script.
$ cd ./
For all embeddings, we mainly tweaked the following parameters:
- epoch-count
- batch-count
- learning-rate
Apart from that, we always used 100 dimensions and generally stuck to the default values for all other parameters (see python3 -m train_embedding -h).
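As an illustration of these options (a sketch, not the exact configuration we used; --epoch-count and --batch-count appear in the training example above, while the name of the learning-rate flag and the concrete values should be checked with python3 -m train_embedding -h and in the Bash scripts below):
$ python3 -m train_embedding --epoch-count 1000 --batch-count 10 --learning-rate 0.001 transh benchmarks/FB15K/ embeddings/FB15K_transh/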
For more information, see the corresponding Bash scripts.
$ ./train_FB15K.sh
$ ./train_wikidata-20181221TN-1k.sh
$ ./train_dbpedia-201610N-1k-filtered.sh
Again, we need to change to the root of the repository, if not already done. Additionally, it is a good idea to prevent tensorflow-gpu from loading the embeddings into GPU VRAM, because we don't want to train anything.
$ cd ./
$ . ./select_gpu -2
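If select_gpu is not available in your shell, the same effect can presumably be achieved by hiding all GPUs from CUDA (this assumes select_gpu works by exporting CUDA_VISIBLE_DEVICES, which is the standard mechanism for this):
$ export CUDA_VISIBLE_DEVICES=""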
For the synonym_analysis.py script, note that we have to specify the correct number of embedding dimensions.
For more information, see the corresponding Bash scripts.
$ ./analyse_FB15K.sh
$ ./analyse_wikidata-20181221TN-1k.sh
For the DBpedia experiment below, we excluded the --ground-truth-available option because the synonym pairs are unknown. Thus, we had to manually evaluate the classified synonym pairs.
$ ./analyse_dbpedia-201610N-1k-filtered.sh
Again, we need to change to the root of the repository, if not already done.
$ cd ./
The following scripts will perform the baseline evaluation and plot all results calculated up to this point. Note that the evaluation of our method is already performed by the analysis scripts in the previous section. Because of the slightly different evaluation approach in our DBpedia experiment, its overall evaluation is performed together with the baseline.py output in this section.
For more information, see the corresponding Bash scripts.
$ ./baseline_FB15K.sh
$ ./baseline_wikidata-20181221TN-1k.sh
$ ./baseline_dbpedia-201610N-1k-filtered.sh