Generating network embeddings from Hi-C contacts

How to use

python hic_embeddings.py --infile connections.npy --outfile labelled_network_graph.html --name sample_name --k 2 --dimensions 2 --p 0.1 --q 0.5 --save_embeddings T

Input and output files

  • --infile: A symmetric numpy matrix with dimensions $n \times n$, where $n$ is the number of scaffolds in the assembly. Non-zero entries $a_{ij}$ and $a_{ji}$ indicate a connection between scaffolds $i$ and $j$. Non-zero entries may be either actual counts or 1; by default the script ignores their values and uses only the presence of a connection. The script hic_links.py in this repository may be used to generate a suitable matrix from a pair file generated by a scaffolding pipeline.

  • --outfile: Path to the HTML file in which the labelled network graph is saved.

  • --save_embeddings: Whether to save a .npy array containing the embedding vectors (default = F, set to T to enable).

  • --name: Sample name, used to name various output files generated by the script.
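As an illustration of the --infile format described above, the following is a minimal sketch of building a small symmetric 0/1 matrix by hand and saving it as a .npy file. The helper names `build_contact_matrix` and `is_valid_contact_matrix` and the four-scaffold example are hypothetical and are not part of this repository; for real data, hic_links.py is the intended route.

```python
import numpy as np


def build_contact_matrix(n, pairs):
    """Return a symmetric n x n 0/1 matrix from (i, j) scaffold index pairs.

    Hypothetical helper for illustration only.
    """
    a = np.zeros((n, n), dtype=np.int64)
    for i, j in pairs:
        a[i, j] = 1
        a[j, i] = 1  # mirror the entry so the matrix stays symmetric
    return a


def is_valid_contact_matrix(a):
    """Check that the array is two-dimensional, square and symmetric."""
    return a.ndim == 2 and a.shape[0] == a.shape[1] and np.array_equal(a, a.T)


# Made-up example: 4 scaffolds with contacts 0-1, 1-2 and 2-3.
matrix = build_contact_matrix(4, [(0, 1), (1, 2), (2, 3)])

if __name__ == "__main__":
    assert is_valid_contact_matrix(matrix)
    np.save("connections.npy", matrix)  # file name taken from the usage example
```

The saved file can then be passed to the script with --infile connections.npy.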

Parameters

  • --k: The number of clusters to use to label the embeddings (default = 2)

  • --dimensions: The number of dimensions for the embedding vector.

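  • --p: Return parameter $p$ for node2vec; lower values make a random walk more likely to step back to the node it just came from (default = 1)

  • --q: In-out parameter $q$ for node2vec; values below 1 favour walks that move away from the previous node, and values above 1 keep walks close to it (default = 1)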
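To give a feel for how $p$ and $q$ shape the random walks, here is a minimal sketch of the unnormalised second-order transition weights used by node2vec on an unweighted graph. It is not taken from hic_embeddings.py; the function name `node2vec_weights` and the toy graph are assumptions made for illustration, and the values of p and q are those from the usage example.

```python
def node2vec_weights(graph, prev, curr, p=1.0, q=1.0):
    """Unnormalised weights for the next step of a node2vec walk.

    graph maps each node to the set of its neighbours (unweighted).
    The walk has just moved from prev to curr.
    """
    weights = {}
    for nxt in graph[curr]:
        if nxt == prev:
            weights[nxt] = 1.0 / p  # step back to the previous node
        elif nxt in graph[prev]:
            weights[nxt] = 1.0      # stay at distance 1 from the previous node
        else:
            weights[nxt] = 1.0 / q  # move further away from the previous node
    return weights


# Toy graph (made up): edges 0-1, 0-2, 1-2 and 1-3.
graph = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}

# Walk arrived at node 1 from node 0, with p and q as in the usage example.
weights = node2vec_weights(graph, prev=0, curr=1, p=0.1, q=0.5)
```

Dividing each weight by their sum gives the probability of each next step. With p = 0.1 the walk is strongly biased towards returning to node 0, while q = 0.5 makes the outward step to node 3 twice as likely as the step to node 2.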