Fast Feature Selection for Learning to Rank

This is the Python 2.7 code used in "Fast feature selection for Learning to Rank" (2016) by A. Gigli, C. Lucchese, R. Perego, F. Nardini.

You can clone the repository and run the code in bash with the following commands:

0_prepare_data.sh <YOUR_DIRECTORY> <FEATURES_NUMBER>
1_run_fast_feature_selection.sh <YOUR_DIRECTORY>
2_run_competitors.sh <YOUR_DIRECTORY>
3_effectiveness_performance.sh <YOUR_DIRECTORY> <FEATURES_FILE>

where

<YOUR_DIRECTORY> is the path of the directory where you have cloned this repository.
<FEATURES_NUMBER> is the number of features in the dataset.
<FEATURES_FILE> is the name of the file containing the selected features. In the paper we compare different Feature Selection Algorithms (FSAs): GAS, NGAS, XGAS and HCAS. To make the performance measurement more flexible, 1_run_fast_feature_selection.sh <YOUR_DIRECTORY> generates 4 files in the folder /output: gas_selection_test.txt, ngas_selection_test.txt, xgas_selection_test.txt and hcas_selection_test.txt. Each file holds the selections obtained from one FSA and contains 7 feature selections, one for each feature subset size, i.e. 5%, 10%, 20%, 30%, 40%, 50% and 75%.

The demo code is ready to be tested on sample data extracted from the istella LETOR dataset. The dataset contains 220 features in SVM format, so please set <FEATURES_NUMBER> to 220 when launching 0_prepare_data.sh <YOUR_DIRECTORY> <FEATURES_NUMBER>.
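As a rough illustration of what the seven subset sizes mean for the 220-feature sample data, the sketch below converts the percentages listed above into feature counts. It assumes each percentage is taken over the total number of features and rounded down; the README does not state the rounding rule, and `subset_sizes` is a hypothetical helper, not a function from this repository.

```python
# Subset percentages listed above for each selection file.
SUBSET_PERCENTAGES = [5, 10, 20, 30, 40, 50, 75]


def subset_sizes(features_number, percentages=SUBSET_PERCENTAGES):
    """Return the number of features kept at each percentage.

    Hypothetical helper. Assumption: sizes are rounded down with integer
    division; the repository scripts may round differently.
    """
    return [features_number * p // 100 for p in percentages]


# For the 220-feature sample data every size is a whole number,
# so the rounding assumption does not matter here.
sizes = subset_sizes(220)  # [11, 22, 44, 66, 88, 110, 165]
```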

HINT: If you simply want to run LambdaMART on the whole feature set, you can create a one-row file containing the sequence of integers from 1 to <FEATURES_NUMBER> separated by white spaces, and then run the following from your local /test_feature_selection_performance directory:

python test_performance_lambdamart.py <DATA_DIRECTORY> <FEATURES_FILE>

where

<DATA_DIRECTORY> is the directory where you have saved the train, validation and test data in the appropriate format.
<FEATURES_FILE> is the one-row file containing the sequence of integers.
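The one-row file described in the hint can be generated with a few lines of Python. This is a minimal sketch: the helper names and the file name `all_features.txt` are hypothetical, and it assumes a single space as the separator and a trailing newline, which the hint does not specify.

```python
def feature_sequence(features_number):
    """Return the integers 1..features_number as one space-separated row."""
    return " ".join(str(i) for i in range(1, features_number + 1))


def write_all_features_file(path, features_number):
    """Write the one-row features file to `path` (hypothetical helper).

    Assumption: a trailing newline after the row is acceptable to
    test_performance_lambdamart.py.
    """
    with open(path, "w") as handle:
        handle.write(feature_sequence(features_number) + "\n")
```

For the sample data, calling `write_all_features_file("all_features.txt", 220)` would produce a file that can be passed as <FEATURES_FILE>.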
