nlvr_tau_nlp_final_proj

This repository contains the code, directories and data used for our final project for the NLP and advanced machine learning courses at Tel Aviv University, spring semester 2017.

Repository structure:

This is a short description of the files and directories. Note that not all of them are listed here. A more detailed description can be found in the documentation and in the project assignment paper.

seq2seqModel:

Directory containing all files and data for the implementation of the model.

.py files:

seq2seq.py : the only runnable file in the directory. It contains the TensorFlow implementation of the model architecture, and the functions for training and evaluating the model.
beam_search.py : our implementation of epsilon-greedy randomized beam search.
beam_boosting.py : functions used for boosting the baseline performance of the beam search.
partial_program.py : contains the class PartialProgram, which is used to wrap the programs in the beam.
hyper_params : constants and boolean properties of the model. These can be changed between runs.
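
To illustrate the idea behind epsilon-greedy randomized beam search, here is a minimal, self-contained sketch. It is not the code in beam_search.py: the function name epsilon_greedy_beam_step, the expand callback and the toy scoring are all assumptions made for illustration. The sketch assumes that at each step, every beam slot is filled with the best remaining candidate with probability 1 - epsilon, and with a uniformly random remaining candidate otherwise.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

# A beam entry: (token sequence, cumulative log-probability).
Entry = Tuple[Tuple[str, ...], float]


def epsilon_greedy_beam_step(
    beam: Sequence[Entry],
    expand: Callable[[Tuple[str, ...]], List[Tuple[str, float]]],
    beam_size: int,
    epsilon: float,
    rng: random.Random,
) -> List[Entry]:
    """One decoding step of a (hypothetical) epsilon-greedy beam search."""
    # Expand every partial sequence by every candidate next token.
    candidates: List[Entry] = []
    for tokens, score in beam:
        for token, log_prob in expand(tokens):
            candidates.append((tokens + (token,), score + log_prob))

    # Best candidates first.
    candidates.sort(key=lambda c: c[1], reverse=True)

    new_beam: List[Entry] = []
    while candidates and len(new_beam) < beam_size:
        if rng.random() < epsilon:
            # Explore: pick any remaining candidate uniformly at random.
            idx = rng.randrange(len(candidates))
        else:
            # Exploit: pick the best remaining candidate.
            idx = 0
        new_beam.append(candidates.pop(idx))
    return new_beam


def toy_expand(tokens: Tuple[str, ...]) -> List[Tuple[str, float]]:
    # Toy next-token distribution, independent of the prefix (assumption).
    return [("a", math.log(0.6)), ("b", math.log(0.3)), ("c", math.log(0.1))]


start: List[Entry] = [((), 0.0)]
greedy = epsilon_greedy_beam_step(start, toy_expand, 2, 0.0, random.Random(0))
randomized = epsilon_greedy_beam_step(start, toy_expand, 2, 1.0, random.Random(0))
```

With epsilon set to 0 the step reduces to ordinary beam search. Larger values trade some beam quality for exploration, which can help when the only training signal is whether a program's denotation is correct.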

data directories:

learnedWeightsPreTrain : weights learned from running the pre-training using generated sentences and annotations of certain common patterns.
learnedWeightsWeaklySupervised : weights learned using the weakly supervised model (learning from denotations). The weights currently in the directory are those achieving the best results so far on the dev and test data sets.
running_logs : directory for saving logs with the results of training or testing the model. It currently contains the per-sentence results of running our best model on the dev and test sets.
word2vec : word embeddings used by the model and the code used for creating them.

data:

Contains most of the data needed for the project, including the original data set and other data used or generated by us.

nlvr-data : the original CNLVR data set
logical forms : data for using the logical forms in the model
parsed sentences : contains patterns of sentences with their annotations, as well as the dataset for pre-train that was generated based on them.
sentence processing : data needed for (or acquired through) pre-processing of the sentences.

.py files in root dir:

data_manager.py : loads the needed data, processes it and returns it as an object that is convenient to work with.
sentence_processing.py : used by the data manager to preprocess the sentences in the data, in order to reduce noise (e.g. noise caused by spelling errors).
logical_forms.py : the code for the functions that are run when executing a logical form on a structured representation of an image.
structured_rep.py : classes representing the structured representation of an image in the data set.
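
As a rough illustration of what executing a logical form on a structured representation means, here is a minimal sketch. It does not reproduce the classes in structured_rep.py or the functions in logical_forms.py: the Item class and the helpers filter_color, count and exists_box are hypothetical names, and the sketch assumes an image is represented as a list of boxes, each holding items with a shape and a color.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for one object in an image."""
    shape: str
    color: str


Box = List[Item]


def filter_color(items: Box, color: str) -> Box:
    # Keep only the items of the given color.
    return [item for item in items if item.color == color]


def count(items: Box) -> int:
    return len(items)


def exists_box(boxes: List[Box], predicate: Callable[[Box], bool]) -> bool:
    # True if at least one box satisfies the predicate.
    return any(predicate(box) for box in boxes)


def logical_form(boxes: List[Box]) -> bool:
    # Logical form for "there is a box with exactly two yellow items".
    return exists_box(boxes, lambda box: count(filter_color(box, "yellow")) == 2)


image = [
    [Item("circle", "yellow"), Item("square", "yellow"), Item("triangle", "blue")],
    [Item("circle", "black")],
    [],
]
result = logical_form(image)  # the denotation of the sentence on this image
```

The boolean returned by executing the logical form is the denotation that the weakly supervised model learns from.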
