Swap graphs: Discovering the role of neural network components at scale

This library is an implementation of input swap graphs, described in this post. It is a tool for uncovering the role of neural network components using causal interventions.
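To make the idea concrete, here is a minimal sketch of a swap graph on a toy NumPy model. It does not use this library's API. The toy model, the names (`component_outputs`, `forward`, `swap_graph`), and the choice of KL divergence as the metric are all assumptions made for illustration. For one component, the activation it produces on input j is patched into the forward pass on input i, and the resulting change in the output distribution becomes the weight of the edge (i, j).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: two "components" each read the input and write
# to a shared residual vector; a linear readout produces output logits.
D_IN, D_RES, N_OUT = 4, 8, 3
W_components = rng.normal(size=(2, D_IN, D_RES))
W_out = rng.normal(size=(D_RES, N_OUT))


def component_outputs(x):
    """Return the activation of every component on input x, shape (2, D_RES)."""
    return np.tanh(np.einsum("i,cir->cr", x, W_components))


def forward(x, patch=None):
    """Run the toy model; `patch` is an optional (component index, activation) pair."""
    acts = component_outputs(x)
    if patch is not None:
        idx, value = patch
        acts[idx] = value  # causal intervention: overwrite one component
    logits = acts.sum(axis=0) @ W_out
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


def kl(p, q):
    """KL divergence between two discrete distributions (assumed metric)."""
    return float(np.sum(p * (np.log(p) - np.log(q))))


def swap_graph(inputs, comp):
    """weights[i, j] = output change on input i when `comp` sees input j instead."""
    n = len(inputs)
    cached = [component_outputs(x)[comp] for x in inputs]
    clean = [forward(x) for x in inputs]
    weights = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            patched = forward(inputs[i], patch=(comp, cached[j]))
            weights[i, j] = kl(clean[i], patched)
    return weights


inputs = rng.normal(size=(5, D_IN))
weights = swap_graph(inputs, comp=0)
print(np.round(weights, 3))
```

In this sketch, a small weight on edge (i, j) suggests that the component treats inputs i and j interchangeably, while a large weight suggests it distinguishes them. The diagonal is zero because swapping an input with itself changes nothing.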

The library is built on top of TransformerLens. The code base is in a very early stage and under active development. Feel free to contact me at [email protected] if you have questions about the code or want to make serious use of it.

I'd recommend starting with this colab demo. For a more advanced example that uses swap graphs to craft validation experiments, you can explore the tests directory.

You can also check out the nanoQA demo, which shows how to use swap graphs to investigate how GPT-2 small answers questions in context.

Install

pip install git+https://github.com/aVariengien/swap-graphs.git

Scripts

We also provide scripts to handle swap graphs at scale.

compute_sgraphs.py is used to compute a swap graph for every component at a given position (often the last position in a sequence).
plot_semantic_maps.py uses the files created by compute_sgraphs.py to create the semantic maps visualisation.
sgraph_causal_scrubbing.py runs causal scrubbing experiments where all components up to layer L are scrubbed.
targetted_rewrite.py (only for the IOI dataset) runs targeted rewrite experiments for the senders and extended name mover heads.

