Welcome to my analysis of the character interaction networks of Star Wars Episodes IV and VII!
A blog post and a Wired magazine blog article provide details and interpretation of the project. Here I'm documenting my work.
The interaction networks for Episodes IV and VII were obtained from Evelina Gabasova's GitHub account. They were generated by a program that linked characters who appeared in the same scene. Only speaking characters (plus R2-D2 and BB-8) were included.
The interaction networks I used are in the files:
data/starwars-episode-4-interactions.json
data/starwars-episode-7-interactions.json
You can find these in edgelist format (which is how I worked with them) at
swmapping_wired/interactions_4.edges
swmapping_wired/interactions_7.edges
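If you want to see how the JSON files relate to the edge lists, here is a minimal sketch of one way to do the conversion. It is not the script I used, and it rests on assumptions: that each JSON file has a "nodes" list of objects with a "name" field and a "links" list whose "source" and "target" fields are indices into "nodes", and that an edge list is simply one pair of names per line. The function names, the tab separator, and the underscore substitution for spaces are hypothetical choices made for illustration.

```python
import json


def interactions_to_edges(data):
    """Turn a parsed interactions dict into a list of (name_a, name_b) pairs.

    Assumption: data["nodes"] is a list of {"name": ...} objects and
    data["links"] is a list of {"source": i, "target": j} objects, where
    i and j are indices into data["nodes"].
    """
    # Replace spaces so every name is a single token in the edge list
    # (an illustrative choice, not necessarily what the real files do).
    names = [node["name"].replace(" ", "_") for node in data["nodes"]]
    return [(names[link["source"]], names[link["target"]])
            for link in data["links"]]


def write_edgelist(edges, path):
    """Write one tab-separated edge per line (assumed edge-list layout)."""
    with open(path, "w") as out:
        for a, b in edges:
            out.write(f"{a}\t{b}\n")


if __name__ == "__main__":
    # Tiny made-up sample in the assumed layout; not taken from the real data.
    sample = json.loads(
        '{"nodes": [{"name": "CHAR A"}, {"name": "CHAR B"}, {"name": "CHAR C"}],'
        ' "links": [{"source": 0, "target": 1}, {"source": 0, "target": 2}]}'
    )
    for a, b in interactions_to_edges(sample):
        print(a, b)
```

To convert a real file, you would load it with json.load and pass the result to interactions_to_edges, then hand the edges to write_edgelist.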
I tried a lot of different approaches to aligning the networks. The results
that feature in my final analysis (in my blog and the Wired blog articles) can
be found in swmapping_wired/character_mappings.txt
In this file, the first column is Episode 4, the second column is Episode 7.
Note that not all characters appear in the character mapping. A character without a mapping is simply one for which the algorithm didn't find a good match in the other network.
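To check which characters were left out of the mapping, something like the sketch below could work. It assumes the mapping file has two whitespace-separated columns per line (Episode 4 character, then Episode 7 character) and that character names contain no spaces; parse_mapping and unmapped are hypothetical helper names, and the sample lines are placeholders rather than actual results.

```python
def parse_mapping(lines):
    """Build a dict from Episode 4 character to Episode 7 character.

    Assumption: each useful line holds exactly two whitespace-separated
    names. Lines that do not fit that shape (blank lines, etc.) are skipped.
    """
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        ep4_name, ep7_name = parts
        mapping[ep4_name] = ep7_name
    return mapping


def unmapped(characters, mapping):
    """Return, sorted, the Episode 4 characters that have no mapping."""
    return sorted(set(characters) - set(mapping))


if __name__ == "__main__":
    # Placeholder data only; these are not real alignment results.
    sample_lines = ["ep4_char_1 ep7_char_1", "", "ep4_char_2 ep7_char_2"]
    sample_mapping = parse_mapping(sample_lines)
    all_ep4 = ["ep4_char_1", "ep4_char_2", "ep4_char_3"]
    print(unmapped(all_ep4, sample_mapping))
```

With the real files, you would pass an open handle on swmapping_wired/character_mappings.txt to parse_mapping and take the character list from the first column of the Episode 4 edge list.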
I used the C-GRAAL network alignment method. There are many alignment methods out there, all with pros and cons. C-GRAAL felt very well suited to this task because it was guided entirely by the presence of edges. Many other methods assume that the networks contain specific kinds of things (e.g., proteins, genes) and so they make assumptions that simply don't make sense on a character interaction network.
To run the analysis, you'll need a couple tools/libraries:
Once you have these, you can run
xp run net_alignment.fx cgraal_align_nosim
The results will be generated in net_alignment_data.
I had a lot of fun doing this analysis and would love to hear thoughts, critiques, and ideas. Please email me!