Skip to content

RJMillerLab/table-union-search-benchmark

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 

Repository files navigation

Introduction

This benchmark provides a ground truth for evaluating table union search algorithms. All tables are synthesized from Open Data tables. For more details, please refer to:

F. Nargesian, E. Zhu, K. Pu, R. J. Miller, Table Union Search on Open Data

Resources

This suite consists of two benchmarks of size ~1,300 and ~5,000 tables. These benchmarks are available in sqlite databases:

Small:

Large:

The "base.sqlite" databases contain the base tables used for generating the benchmark and "benchmark.sqlite" databases contain unionable tables. For each benchmark, we provided a ground truth database in "groundtruth.sqlite". These databases contain three tables. Table "att_groundtruth" provides all mappings between unionable attributes. Table "alignment_groundtruth" provides the size of the max-alignment (c) between two tables. Finally, table "recall_groundtruth" provides the number of tables that are unionable with a benchmark table.

We also included a script in the benchmark that converts benchmark tables to csv files.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages