Personalized News Recommendation

This repository contains PyTorch implementations of several personalized news recommendation methods, created for my MSc thesis in Artificial Intelligence at the University of Amsterdam.

Many news recommendation models share the same general architecture, built from three components: a news encoder, a user encoder and a click predictor. This repository is organized around that view: each component is available as a separate module, and the modules are combined into a model through configuration.
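
As a rough illustration of this three-part layout, the sketch below wires a news encoder, a user encoder and a click predictor together. It uses plain Python lists instead of PyTorch tensors to stay dependency-free, and every name in it (EMBEDDINGS, encode_news, encode_user, predict_click, rank) is a hypothetical stand-in rather than one of this repository's modules. Mean pooling and the dot product are simplifying assumptions, not what the listed models do.

```python
from typing import Dict, List

Vector = List[float]

# Toy word embeddings (assumption for illustration; a real model learns these).
EMBEDDINGS: Dict[str, Vector] = {
    "football": [1.0, 0.0],
    "match": [0.8, 0.2],
    "election": [0.0, 1.0],
    "vote": [0.2, 0.8],
}


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def encode_news(title: str) -> Vector:
    """News encoder: map a title to a vector (here: mean of word embeddings)."""
    return mean([EMBEDDINGS[word] for word in title.split()])


def encode_user(clicked_titles: List[str]) -> Vector:
    """User encoder: summarize the click history (here: mean of news vectors)."""
    return mean([encode_news(title) for title in clicked_titles])


def predict_click(user: Vector, candidate: Vector) -> float:
    """Click predictor: score a candidate for a user (here: dot product)."""
    return sum(u * c for u, c in zip(user, candidate))


def rank(history: List[str], candidates: List[str]) -> List[str]:
    """Order candidate titles by predicted click score, highest first."""
    user = encode_user(history)
    return sorted(
        candidates,
        key=lambda title: predict_click(user, encode_news(title)),
        reverse=True,
    )
```

Because the three functions only communicate through vectors, any one of them can be replaced without touching the other two, which is the property the configuration-based setup relies on.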

This repository relies heavily on Hydra for configuration, so it is recommended to familiarize yourself with it.

Experiments

The setup/configuration for the experiments in my thesis can be found in the conf/experiments directory. The results are shown in the notebooks in the notebooks directory.

Available models

Name	Paper	Notes
NAML	Neural News Recommendation with Attentive Multi-View Learning
NRMS	Neural News Recommendation with Multi-Head Self-Attention
TANR	Neural News Recommendation with Topic-Aware News Representation
HieRec	HieRec: Hierarchical User Interest Modeling for Personalized News Recommendation	Only user encoder + click predictor
MINER	MINER: Multi-Interest Matching Network for News Recommendation	BERT news encoder performs poorly

Getting started

After cloning, install dependencies using Poetry:

poetry install

By default, a sampled subset of the MIND-large dataset is used, because the original dataset does not include test labels. You can either use the full MIND-large dataset and disable evaluation on the test split (by adding data=mind_large eval_splits=[dev] to your commands), or create the sampled subset with:

poetry run python src/sample_data.py

Training a model (e.g. NRMS):

poetry run python src/train_recommender.py +model=nrms

After training, files containing checkpoints and metrics can be found in the outputs/ directory.

Custom model configurations

It is possible to combine model components to create a custom model, either by adding a new entry to the conf/model/ directory or through command line arguments. For example, to train the NRMS model with the TANR user encoder:

poetry run python src/train_recommender.py +model=nrms model/user_encoder=additive_attention
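
The idea behind such an override can be sketched as a lookup from a configured name to a component. The snippet below is a simplified, hypothetical illustration only: in this repository Hydra does the composition, and the registry, the encoder functions and the apply_override helper are made-up names that do not exist in the codebase.

```python
from typing import Callable, Dict, List

Vector = List[float]
UserEncoder = Callable[[List[Vector]], Vector]


def mean_pooling(history: List[Vector]) -> Vector:
    """Average all clicked-news vectors."""
    return [sum(dim) / len(history) for dim in zip(*history)]


def last_click(history: List[Vector]) -> Vector:
    """Use only the most recently clicked news vector."""
    return history[-1]


# Hypothetical registry mapping config names to user encoders.
USER_ENCODERS: Dict[str, UserEncoder] = {
    "mean_pooling": mean_pooling,
    "last_click": last_click,
}


def apply_override(config: Dict[str, str], override: str) -> Dict[str, str]:
    """Return a copy of config with one 'key=value' override applied."""
    key, value = override.split("=", 1)
    return {**config, key: value}


def build_user_encoder(config: Dict[str, str]) -> UserEncoder:
    """Pick the user encoder named in the config, failing loudly if unknown."""
    name = config["user_encoder"]
    if name not in USER_ENCODERS:
        raise ValueError(f"Unknown user encoder {name!r}; choose from {sorted(USER_ENCODERS)}")
    return USER_ENCODERS[name]
```

The base configuration stays untouched and only the overridden key changes, so a preset plus a single override is enough to describe a new model variant.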

There are some override presets for Hierarchical User Interest Modeling (from HieRec) and Multi-Interest User Modeling (from MINER). For example, to use NRMS with multi-interest user modeling:

poetry run python src/train_recommender.py +model=nrms +model_overrides=multi_interest

Note: when mixing components, you are responsible for ensuring that the features each component needs are selected. This can be done either by setting the features field in the config or through CLI arguments, for example by adding features="[title, abstract, category, subcategory]" to your commands.
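
One way to catch a missing feature before training starts is a small consistency check like the one below. This is a sketch under assumptions: the component names and their required features in REQUIRED_FEATURES are invented for illustration, and the repository does not necessarily provide such a check.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical requirements; the real components and their inputs may differ.
REQUIRED_FEATURES: Dict[str, Set[str]] = {
    "title_only_news_encoder": {"title"},
    "multi_view_news_encoder": {"title", "abstract", "category", "subcategory"},
}


def missing_features(components: Iterable[str], selected: Iterable[str]) -> Dict[str, List[str]]:
    """Map each component to the required features absent from the selection."""
    selected_set = set(selected)
    missing: Dict[str, List[str]] = {}
    for component in components:
        gap = REQUIRED_FEATURES[component] - selected_set
        if gap:
            missing[component] = sorted(gap)
    return missing
```

An empty result means every chosen component has the inputs it needs; anything else names the component and the features still to be added to the features field.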

Acknowledgements

Credit goes to the authors of all the papers listed above.
Microsoft News Dataset (MIND), see https://msnews.github.io/.
NAML, NRMS and TANR are adapted from the implementation by yusanshi, see https://github.com/yusanshi/news-recommendation.
