This code was developed to help with network analyses for various investigations that took place at Der SPIEGEL and Paper Trail Media.
The project uses Poetry to manage Python dependencies. To install Poetry, see the Poetry documentation for all options. To use the quick installer provided by Poetry, run:
curl -sSL https://install.python-poetry.org | python3 -
- Clone the repo
git clone https://github.com/critocrito/graphctl.git
- Install the python dependencies
poetry install
- Make sure the command runs
poetry run graphctl --help
The graphctl command takes a CSV file as input and writes the results of its computations to CSV files. The input CSV needs a source field and a target field, which describe the nodes and their connections. This is an example of a network.csv:
source,target
nodeA,nodeB
nodeB,nodeC
nodeA,nodeC
nodeC,nodeD
...
Every command takes a -g/--graph option, which selects either a directed or an undirected graph. It defaults to an undirected graph.
poetry run graphctl -g directed <computation>
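To make the input format and the effect of the graph option concrete, here is a minimal sketch in plain Python of how such an edge list could be read into an adjacency mapping. It is not graphctl's implementation: the helper name load_edges and its directed parameter are hypothetical, and the example network is inlined so the snippet runs on its own.

```python
import csv
import io
from collections import defaultdict

# The example network from above, inlined so the sketch is self-contained.
NETWORK_CSV = """source,target
nodeA,nodeB
nodeB,nodeC
nodeA,nodeC
nodeC,nodeD
"""


def load_edges(handle, directed=False):
    """Read source/target rows into an adjacency mapping (hypothetical helper)."""
    adjacency = defaultdict(set)
    for row in csv.DictReader(handle):
        source, target = row["source"], row["target"]
        adjacency[source].add(target)
        # Touch the target so it exists as a node even without outgoing edges.
        adjacency[target]
        if not directed:
            # An undirected edge can be followed from both ends.
            adjacency[target].add(source)
    return dict(adjacency)


undirected = load_edges(io.StringIO(NETWORK_CSV))
directed = load_edges(io.StringIO(NETWORK_CSV), directed=True)

print(sorted(undirected["nodeC"]))  # ['nodeA', 'nodeB', 'nodeD']
print(sorted(directed["nodeC"]))    # ['nodeD']
```

In the directed reading, nodeC only points to nodeD, while in the undirected reading it is a neighbor of all three other nodes. This is why the choice of graph type changes the results of the computations below.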
Here is a list of all possible outputs that can be generated from the above network.
Compute all insights about a network in one go. This includes most of the computations below.
poetry run graphctl all network.csv out_dir
- Basic
Compute a set of basic topological attributes to give a quick overview of the network.
poetry run graphctl topology basic network.csv topology.csv
- Degree Centrality
Degree centrality assigns an importance score based simply on the number of links held by each node. In this analysis, the higher the degree centrality of a node, the more edges are connected to it and thus the more neighbor nodes (communication partners) it has. More precisely, the degree centrality of a node is the fraction of the other nodes it is connected to. In other words, it is the share of the network that the node is connected to, that is, has communicated with.
poetry run graphctl centrality degree network.csv degree-centrality.csv
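As an illustration of the definition above, the following sketch computes degree centrality by hand for the undirected example network. It is a plain-Python illustration, not the code graphctl runs; the function name and the adjacency mapping are assumptions made for this example.

```python
# Undirected example network from this README as an adjacency mapping.
GRAPH = {
    "nodeA": {"nodeB", "nodeC"},
    "nodeB": {"nodeA", "nodeC"},
    "nodeC": {"nodeA", "nodeB", "nodeD"},
    "nodeD": {"nodeC"},
}


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    others = len(graph) - 1
    if others <= 0:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / others for node, neighbors in graph.items()}


scores = degree_centrality(GRAPH)
for node, score in sorted(scores.items()):
    print(f"{node}: {score:.2f}")
# nodeA: 0.67
# nodeB: 0.67
# nodeC: 1.00
# nodeD: 0.33
```

nodeC is connected to every other node and therefore reaches the maximum score of 1.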
- Betweenness Centrality
Betweenness centrality measures the number of times a node lies on the shortest path between other nodes, meaning it acts as a bridge. In detail, the betweenness centrality of a node v is the fraction of all shortest paths between any two nodes (other than v) that pass through v. This measure is associated with a user’s ability to influence others. A user with a high betweenness centrality acts as a bridge between many users who are not friends and can thus influence them by conveying information (e.g. by posting something or sharing a post), or even connect them via the user’s circle (which would afterwards reduce the user’s betweenness centrality).
poetry run graphctl centrality betweenness network.csv betweenness-centrality.csv
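The shortest-path counting described above can be sketched with Brandes' algorithm. This is one common way to compute betweenness centrality, chosen here for illustration; whether graphctl uses it internally is not stated in this README. The sketch assumes an unweighted, undirected graph, and the function name is hypothetical.

```python
from collections import deque

# Undirected example network from this README as an adjacency mapping.
GRAPH = {
    "nodeA": {"nodeB", "nodeC"},
    "nodeB": {"nodeA", "nodeC"},
    "nodeC": {"nodeA", "nodeB", "nodeD"},
    "nodeD": {"nodeC"},
}


def betweenness_centrality(graph):
    """Brandes' algorithm for unweighted, undirected graphs, normalized to [0, 1]."""
    scores = {node: 0.0 for node in graph}
    for start in graph:
        # Breadth-first search that counts the shortest paths from `start`.
        order = []
        predecessors = {node: [] for node in graph}
        path_count = {node: 0 for node in graph}
        path_count[start] = 1
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in graph[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[node] + 1:
                    path_count[neighbor] += path_count[node]
                    predecessors[neighbor].append(node)
        # Walk back from the farthest nodes and accumulate path shares.
        dependency = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != start:
                scores[node] += dependency[node]
    n = len(graph)
    # Number of node pairs that do not include the scored node itself.
    pairs = (n - 1) * (n - 2) / 2
    # Every undirected pair was visited from both ends, hence the division by 2.
    return {
        node: (score / 2) / pairs if pairs else 0.0
        for node, score in scores.items()
    }


scores = betweenness_centrality(GRAPH)
print(round(scores["nodeC"], 2))  # 0.67
print(scores["nodeA"])            # 0.0
```

In the example network, nodeC lies on the shortest paths from nodeA to nodeD and from nodeB to nodeD, but not on the one between nodeA and nodeB, so it scores 2 out of 3 pairs. No other node lies between any pair.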
- Closeness Centrality
Closeness centrality scores each node based on its ‘closeness’ to all other nodes in the network. For a node v, closeness centrality is the reciprocal of its average farness (shortest-path distance) to all other nodes. In other words, the higher the closeness centrality of v, the closer it is located to the center of the network.
poetry run graphctl centrality closeness network.csv closeness-centrality.csv
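To illustrate the relationship between distance and closeness, the following sketch computes closeness centrality with a breadth-first search from every node. It assumes a connected, unweighted, undirected graph and is an illustration only, not graphctl's own code; the function name is hypothetical.

```python
from collections import deque

# Undirected example network from this README as an adjacency mapping.
GRAPH = {
    "nodeA": {"nodeB", "nodeC"},
    "nodeB": {"nodeA", "nodeC"},
    "nodeC": {"nodeA", "nodeB", "nodeD"},
    "nodeD": {"nodeC"},
}


def closeness_centrality(graph):
    """Number of reachable nodes divided by the summed distance to them."""
    scores = {}
    for start in graph:
        # Breadth-first search yields shortest-path distances from `start`.
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        total = sum(distance.values())
        scores[start] = (len(distance) - 1) / total if total else 0.0
    return scores


scores = closeness_centrality(GRAPH)
for node, score in sorted(scores.items()):
    print(f"{node}: {score:.2f}")
# nodeA: 0.75
# nodeB: 0.75
# nodeC: 1.00
# nodeD: 0.60
```

nodeD is two steps away from nodeA and nodeB, which gives it the largest total distance and therefore the lowest score.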
- Eigenvector Centrality
Eigenvector centrality shows how well connected a node is to other important nodes in the network. It measures a node’s influence based on how well it is connected inside the network, how many links its connections have, and so on. This measure can identify the nodes with the most influence over the whole network. A high eigenvector centrality means that the node is connected to other nodes that themselves have high eigenvector centralities. The measure is associated with a user’s ability to influence the whole graph, and thus the users with the highest eigenvector centralities are the most important nodes in this network.
poetry run graphctl centrality eigenvector network.csv eigenvector-centrality.csv
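The recursive idea above (a node is important if its neighbors are important) can be sketched with power iteration: every node repeatedly takes the sum of its neighbors' scores, followed by a normalization step. This is a simplified illustration under the assumption of a connected, undirected graph; it is not taken from graphctl, and the function name and parameters are hypothetical.

```python
import math

# Undirected example network from this README as an adjacency mapping.
GRAPH = {
    "nodeA": {"nodeB", "nodeC"},
    "nodeB": {"nodeA", "nodeC"},
    "nodeC": {"nodeA", "nodeB", "nodeD"},
    "nodeD": {"nodeC"},
}


def eigenvector_centrality(graph, iterations=100, tolerance=1e-9):
    """Power iteration over the adjacency structure of an undirected graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        # A node's new score is the sum of its neighbors' current scores.
        updated = {
            node: sum(scores[neighbor] for neighbor in graph[node])
            for node in graph
        }
        norm = math.sqrt(sum(value * value for value in updated.values()))
        if norm == 0:
            return scores
        # Normalize so the scores do not grow without bound.
        updated = {node: value / norm for node, value in updated.items()}
        change = sum(abs(updated[node] - scores[node]) for node in graph)
        scores = updated
        if change < tolerance:
            break
    return scores


scores = eigenvector_centrality(GRAPH)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0])   # nodeC
print(ranking[-1])  # nodeD
```

nodeC ranks first because it is connected to every other node, while nodeD ranks last because its only connection is nodeC.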
- K-clique
poetry run graphctl community k-clique network.csv k-clique-communities.csv
- Louvain
poetry run graphctl community louvain network.csv louvain-communities.csv
- Label Propagation
poetry run graphctl community label-propagation network.csv label-propagation-communities.csv
- Graph
Render the whole network graph.
poetry run graphctl plot graph --iterations 15 network.csv graph.png
- Bridges
Render the graph and mark its bridges.
poetry run graphctl plot bridges network.csv bridges.png
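A bridge is an edge whose removal splits the network into more connected components. The following sketch finds bridges in the example network by removing each edge in turn and checking whether the graph stays connected. It is a deliberately simple approach for illustration, assuming a connected, undirected graph; it says nothing about how graphctl detects bridges, and both function names are hypothetical.

```python
# Undirected example network from this README as an adjacency mapping.
GRAPH = {
    "nodeA": {"nodeB", "nodeC"},
    "nodeB": {"nodeA", "nodeC"},
    "nodeC": {"nodeA", "nodeB", "nodeD"},
    "nodeD": {"nodeC"},
}


def is_connected(graph, skip_edge):
    """Depth-first search that ignores one edge in both directions."""
    nodes = list(graph)
    if not nodes:
        return True
    skipped = set(skip_edge)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        node = stack.pop()
        for neighbor in graph[node]:
            if {node, neighbor} == skipped:
                continue
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(graph)


def find_bridges(graph):
    """Edges whose removal disconnects a connected, undirected graph."""
    edges = {tuple(sorted((a, b))) for a in graph for b in graph[a]}
    return sorted(edge for edge in edges if not is_connected(graph, edge))


bridges = find_bridges(GRAPH)
print(bridges)  # [('nodeC', 'nodeD')]
```

The three edges between nodeA, nodeB and nodeC form a triangle, so removing any one of them leaves a detour. Only the edge between nodeC and nodeD is a bridge.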
- Degree Centrality Distribution
Plot the distribution of degree centrality as a bar chart.
poetry run graphctl plot degree-centrality-distribution network.csv degree-centrality.png
- Betweenness Centrality Distribution
Plot the distribution of betweenness centrality as a bar chart.
poetry run graphctl plot betweenness-centrality-distribution network.csv betweenness-centrality.png
- Eigenvector Centrality Distribution
Plot the distribution of eigenvector centrality as a bar chart.
poetry run graphctl plot eigenvector-centrality-distribution network.csv eigenvector-centrality.png
- Closeness Centrality Distribution
Plot the distribution of closeness centrality as a bar chart.
poetry run graphctl plot closeness-centrality-distribution network.csv closeness-centrality.png
- Label Propagation Community
Plot the graph with the communities computed by label propagation.
poetry run graphctl plot label-propagation-community network.csv label-propagation.png
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the GPL-3.0 License. See LICENSE.txt for more information.
Christo Buschek - @christo_buschek - christo.buschek@proton.me
Project Link: https://github.com/critocrito/graphctl