Intents Labelling project

This package serves as the basis for the paper "ORCAS-I: Queries Annotated with Intent using Weak Supervision".

Link to the paper: arXiv

DOI of the paper: https://doi.org/10.1145/3477495.3531737

DOI of the dataset: DOI

Installation

Create conda environment:

$ conda create --name intents_labelling python==3.8.12

Activate the environment:

$ source activate intents_labelling

Use pip to install requirements:

(intents_labelling) $ pip install -r requirements.txt

Install the intents_labelling package in development mode:

(intents_labelling) $ pip install -e .

Install the spaCy language model:

(intents_labelling) $ python -m spacy download en_core_web_lg
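Before running the pipeline, it can be useful to confirm that the environment is complete. The following is a minimal sketch, not part of the package: the helper name `missing_packages` is hypothetical, and it assumes the spaCy model is importable under the package name `en_core_web_lg`, as is usual after `spacy download`.

```python
import importlib.util


def missing_packages(names):
    """Return the names that cannot be found as importable top-level packages."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names; adjust them to match requirements.txt.
    required = ["spacy", "en_core_web_lg", "intents_labelling"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```
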

A list of movie titles can be found here.

Put all data files in the data/input/ directory.

Usage

Create a training set by sampling the ORCAS dataset and filtering out any examples that appear in the test set:

(intents_labelling) $ python intents_labelling/create_train_file.py
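The internals of this script are not described here, so the following is only a sketch of the general idea under stated assumptions: queries are plain strings, test-set membership is decided by exact query text, and the function name `sample_training_queries`, the sample size, and the seed are all hypothetical.

```python
import random


def sample_training_queries(all_queries, test_queries, sample_size, seed=0):
    """Draw a random sample of queries that excludes every test-set query."""
    held_out = set(test_queries)
    # Remove test-set queries first so none can leak into the training sample.
    candidates = [q for q in all_queries if q not in held_out]
    rng = random.Random(seed)  # a fixed seed keeps the sample reproducible
    return rng.sample(candidates, min(sample_size, len(candidates)))


if __name__ == "__main__":
    queries = ["facebook login", "how to tie a tie", "download firefox", "weather"]
    test_set = ["weather"]
    print(sample_training_queries(queries, test_set, sample_size=2))
```

Filtering before sampling, rather than after, guarantees that the returned sample has the requested size whenever enough candidates remain.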

Create Snorkel annotations:

(intents_labelling) $ python intents_labelling/main.py
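For readers new to weak supervision, the idea behind this step can be illustrated in plain Python: several heuristic labeling functions each vote for a label or abstain, and the votes are combined into one label per query. The sketch below is not the package's code and does not use the Snorkel API. The label names, the keyword rules, and the majority-vote combiner are illustrative assumptions (Snorkel itself typically fits a label model over the votes instead of taking a simple majority).

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when its rule does not apply


def lf_download(query):
    # Hypothetical rule: the word "download" suggests a transactional intent.
    return "transactional" if "download" in query.split() else ABSTAIN


def lf_question_word(query):
    # Hypothetical rule: a leading question word suggests an informational intent.
    words = query.split()
    return "informational" if words and words[0] in {"how", "what", "why"} else ABSTAIN


def lf_login(query):
    # Hypothetical rule: the word "login" suggests a navigational intent.
    return "navigational" if "login" in query.split() else ABSTAIN


LABELING_FUNCTIONS = [lf_download, lf_question_word, lf_login]


def weak_label(query):
    """Combine votes by simple majority; abstain on no votes or on a tie."""
    votes = [lf(query) for lf in LABELING_FUNCTIONS]
    counts = Counter(v for v in votes if v is not ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting rules with equal support
    return ranked[0][0]


if __name__ == "__main__":
    for q in ["facebook login", "how to download firefox", "pizza"]:
        print(q, "->", weak_label(q))
```

Abstaining on ties keeps conflicting heuristics from producing an arbitrary label, at the cost of leaving some queries unlabeled.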

Train the BERT model:

(intents_labelling) $ python intents_labelling/models/train_bert_classifier.py