gkaradzhov/FactcheckingRANLP

Fully Automated Fact Checking Using External Sources

Paper abstract:

Given the constantly growing proliferation of false claims online in recent years, there has been also a growing research interest in automatically distinguishing false rumors from factually-true claims. Here, we propose a general-purpose framework for fully-automatic fact checking using external sources, tapping the potential of the entire Web as a knowledge source to confirm or reject a claim. Our framework uses a deep neural network with LSTM text encoding to combine semantic kernels with task-specific embeddings that encode a claim together with pieces of potentially-relevant text fragments from the Web, taking the source reliability into account. The evaluation results show good performance on two different tasks and datasets:

  • rumor detection and
  • fact checking of the answers to a question in community question answering forums.

Authors:

Georgi Karadzhov, Preslav Nakov, Lluís Màrquez, Alberto Barrón-Cedeño, Ivan Koychev

Please cite the following paper if you use the resources below:

@InProceedings{RANLP2017:factchecking,
  author    = {Georgi Karadzhov and Preslav Nakov and Llu\'{i}s M\`{a}rquez and Alberto Barr\'on-Cede\~no and Ivan Koychev},
  title     = {Fully Automated Fact Checking Using External Sources},
  booktitle = {Proceedings of the 2017 International Conference on Recent Advances in Natural Language Processing},
  month     = {September},
  year      = {2017},
  address   = {Varna, Bulgaria},
  series    = {RANLP~'17}
}

Resources

Code:

A version of the code is available in this repo. A cleaner (refactored) version will be available soon(-ish).

Table: Rumour detection resources

| Name | Short description | Link |
|---|---|---|
| Claims | Claims from snopes.com, each labeled as Rumour or Non-rumour. | Download |
| Data splits | Exact splits used for training and evaluating the fact-checking system. | Train-Download, Test-Download, Development-Download |
| Website credibility | Manually annotated list of websites. Possible labels: reputed-source, forum-type, others. | Download |
| Web data | Each claim augmented with automatically collected web data. Also includes all calculated similarities and averaged sentence vectors. | Download |
| Best web resources | Only the web data that has the highest similarity to the original claim. This is used to train the task-specific embeddings. | Download |
| Task-specific embeddings | Combined representation of a claim and the supporting web data. | Download |
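
The "Web data" and "Best web resources" entries above rely on averaged sentence vectors and claim-to-snippet similarities. The following is a minimal sketch of that idea, not the code from this repo: it assumes cosine similarity over averaged word vectors, and the names `average_vector`, `cosine`, `best_snippet`, and the toy embedding table are hypothetical and used only for illustration.

```python
import math


def average_vector(tokens, embeddings):
    """Average the vectors of all tokens that have an embedding; None if none do."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_snippet(claim, snippets, embeddings):
    """Return (snippet, score) for the snippet most similar to the claim.

    Returns (None, None) when the claim or every snippet lacks known tokens.
    """
    claim_vec = average_vector(claim.lower().split(), embeddings)
    if claim_vec is None:
        return None, None
    best, best_score = None, None
    for snippet in snippets:
        vec = average_vector(snippet.lower().split(), embeddings)
        if vec is None:
            continue  # no known tokens, so nothing to compare
        score = cosine(claim_vec, vec)
        if best_score is None or score > best_score:
            best, best_score = snippet, score
    return best, best_score


# Toy 3-dimensional embeddings, invented for this example only.
TOY_EMBEDDINGS = {
    "moon": [1.0, 0.0, 0.0],
    "cheese": [0.0, 1.0, 0.0],
    "rock": [0.8, 0.0, 0.2],
    "dairy": [0.0, 0.7, 0.7],
}

claim = "the moon is made of cheese"
snippets = ["the moon is made of rock", "cheese is a dairy product"]
top, score = best_snippet(claim, snippets, TOY_EMBEDDINGS)
print(top, round(score, 3))
```

In practice the toy table would be replaced by pretrained word embeddings, and tokenization would be more careful than whitespace splitting; both are left out to keep the sketch short.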

Table: Fact checking in cQA resources

| Name | Short description | Link |
|---|---|---|
| QAs | Questions and comments from the QatarLiving forum. | Download |
| QAs-concatenated | Questions and comments from the QatarLiving forum, concatenated to represent a single entity. This data is used in the system. | Download |
| Data splits | Exact splits used for training and evaluating the fact-checking system. | Train-Download, Test-Download, Development-Download |
| Website credibility | Manually annotated list of websites. Possible labels: reputed-source, forum-type, others. For the cQA dataset we also annotated whether the website is Qatar-related, as this is relevant to the credibility of an answer. | Download |
| Web data | Each QA pair augmented with automatically collected web data. Also includes all calculated similarities and averaged sentence vectors. | Download |
| Best web resources | Only the web data that has the highest similarity to the original QA pair. This is used to train the task-specific embeddings. | Download |
| Task-specific embeddings | Combined representation of a QA pair and the supporting web data. | Download |
