Free Text Tagger

This repository contains the scripts needed to extract contextual information from free text.

Given a free text, the script extracts information in four categories: activities, emotions, interactions and places. Each category has its own dictionary, which lists its sub-categories.

The input text is parsed and then matched to the sub-categories by handwritten rules that take syntactic information into account (lemmas, parts of speech, dependency structure, ...).
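
To make the matching step concrete, here is a minimal sketch of a lemma and part-of-speech rule applied to tokens that have already been parsed. It is an illustration only, not the code in tagger.py: the `Token` type, the `ACTIVITIES` dictionary, the `ALLOWED_POS` set and the `match_tokens` function are all hypothetical names, and the tokens are annotated by hand so the sketch runs without a parser. It also reports the sub-category name where the real output uses a numeric id.

```python
from typing import NamedTuple


class Token(NamedTuple):
    text: str   # surface form as it appears in the sentence
    lemma: str  # dictionary form of the word
    pos: str    # coarse part-of-speech tag


# Hypothetical dictionary for one category: sub-category -> lemmas that signal it.
ACTIVITIES = {
    "leisure": {"play", "game"},
    "work": {"work", "meeting"},
}

# Assumed rule: only verbs and nouns are allowed to trigger a match.
ALLOWED_POS = {"VERB", "NOUN"}


def match_tokens(tokens, dictionary):
    """Return (sub_category, start, end) for every single token that matches."""
    matches = []
    for i, tok in enumerate(tokens):
        if tok.pos not in ALLOWED_POS:
            continue
        for sub_category, lemmas in dictionary.items():
            if tok.lemma in lemmas:
                # Half-open span: token i is included, token i + 1 is not.
                matches.append((sub_category, i, i + 1))
    return matches


# Hand-annotated parse of "We're playing games" (assumed tags and lemmas).
tokens = [
    Token("We", "we", "PRON"),
    Token("'re", "be", "AUX"),
    Token("playing", "play", "VERB"),
    Token("games", "game", "NOUN"),
]

print(match_tokens(tokens, ACTIVITIES))
# [('leisure', 2, 3), ('leisure', 3, 4)]
```

Matching on lemmas rather than surface forms is what lets "playing" and "games" hit the entries "play" and "game". This sketch only covers single-token rules; multi-token matches and rules over dependency structure would need additional patterns.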

Requirements

Requires Python 3.x and the following Python modules:
- spacy
- re (part of the Python standard library, no installation needed)

Input

Text (string)

-- choose how to pass string to the main script --

Output

For each category, the script returns a list of matches. Each match contains:

- a numeric id for the matched sub-category
- the token index in the sentence where the match starts
- the token index in the sentence where the match ends

For example, "We're playing games" returns this output:

[(5133706519360878345, 2, 3), (5133706519360878345, 2, 4), (5133706519360878345, 3, 4)]

- 5133706519360878345 is the id for the sub-category 'leisure'
- 2,3 is the span for 'playing'
- 2,4 is the span for 'playing games'
- 3,4 is the span for 'games'

Note that in each span the first number is included and the second one is NOT included.
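
Because the first index is included and the second is not, a span can be used directly as a Python slice over the token list. The sketch below shows how the example output maps back to text. The tokenization of "We're playing games" into four tokens is an assumption inferred from the spans above, and `span_text` and `longest_matches` are hypothetical helpers, not part of this repository.

```python
# Assumed tokenization, consistent with 'playing' being token 2 and 'games' token 3.
tokens = ["We", "'re", "playing", "games"]

# The example output from the section above.
matches = [
    (5133706519360878345, 2, 3),
    (5133706519360878345, 2, 4),
    (5133706519360878345, 3, 4),
]


def span_text(tokens, start, end):
    """Join the tokens covered by the half-open span [start, end)."""
    return " ".join(tokens[start:end])


def longest_matches(matches):
    """Drop any match that lies inside a strictly longer match with the same id."""
    kept = []
    for m_id, start, end in matches:
        contained = any(
            o_id == m_id
            and o_start <= start
            and end <= o_end
            and (o_end - o_start) > (end - start)
            for o_id, o_start, o_end in matches
        )
        if not contained:
            kept.append((m_id, start, end))
    return kept


for m_id, start, end in matches:
    print(m_id, (start, end), span_text(tokens, start, end))

print(longest_matches(matches))
# [(5133706519360878345, 2, 4)]
```

Filtering to the longest match is optional. It can be useful when each matched phrase should be counted once instead of once per overlapping span.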
