Use Python to explore a website's internal links. Then apply D3 to visualize those connections as an interactive network graph with scorecards.
- Git
- Python (When installing on Windows, make sure you check the "Add python 3.xx to PATH" box.)
- Install the above programs.
- Open a shell window (on Windows open PowerShell, on macOS open Terminal, and on Linux open your distro's terminal emulator).
- Clone this repository using `git` by running the following command: `git clone git@github.com:devbret/website-internal-links.git`
- Navigate to the repo's directory by running: `cd website-internal-links`
- Install the dependencies needed to run the script: `pip install -r requirements.txt`
- Edit line 115 of the app.py file to include the website you would like to visualize.
- Run the script with the command `python3 app.py`
- To view the website's connections using the index.html file, you will need to run a local web server. To do this, run `python3 -m http.server` and then open http://localhost:8000 in your browser (8000 is the default port).
- Once the network map has been launched, hover over any node for more information about that web page. Clicking a node opens the related URL.
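For readers curious about what "exploring internal links" involves, here is a minimal sketch using only the standard library. It is not the code in app.py; the names `LinkCollector` and `internal_links` are hypothetical, and it assumes that "internal" means "same host as the page being scanned".

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url, html):
    """Return the sorted, de-duplicated same-host links found in html."""
    host = urlparse(page_url).netloc
    collector = LinkCollector()
    collector.feed(html)
    found = set()
    for href in collector.hrefs:
        # Resolve relative links against the page URL and drop "#fragment".
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc == host:
            found.add(absolute)
    return sorted(found)


# Sample markup stands in for a downloaded page (assumption for illustration).
sample_html = """
<a href="/about">About</a>
<a href="contact.html#form">Contact</a>
<a href="https://other.example.org/">Elsewhere</a>
<a href="mailto:hi@example.com">Mail</a>
"""
links = internal_links("https://example.com/index.html", sample_html)
print(links)
```

Here the external link and the `mailto:` link are filtered out, leaving only the two pages on `example.com`. A real crawler would also need to fetch each page and handle errors, which this sketch leaves out.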
Generating visualizations for this app takes an unexpectedly large amount of processing power. It is therefore advisable to start by mapping fewer than one hundred pages per run.
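One way to keep a run under a page limit is to cap a breadth-first crawl. The sketch below is an illustration under stated assumptions, not the logic in app.py: `crawl`, `get_links`, and `max_pages` are hypothetical names, and a small in-memory dictionary stands in for real HTTP requests.

```python
from collections import deque


def crawl(start_url, get_links, max_pages=99):
    """Breadth-first crawl that stops after visiting max_pages pages.

    get_links is a hypothetical callable: given a URL, it returns the
    internal links found on that page.
    """
    visited = []          # pages in the order they were crawled
    seen = {start_url}    # pages already queued, to avoid repeats
    queue = deque([start_url])
    edges = []            # (source, target) pairs for the graph
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for target in get_links(url):
            edges.append((url, target))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited, edges


# A tiny in-memory "site" (assumption for illustration only).
fake_site = {
    "/": ["/a", "/b"],
    "/a": ["/", "/c"],
    "/b": ["/c"],
    "/c": [],
}
pages, edges = crawl("/", lambda url: fake_site.get(url, []), max_pages=3)
print(pages)
print(len(edges))
```

With `max_pages=3` the crawl visits `/`, `/a`, and `/b` and never fetches `/c`, although edges pointing to `/c` are still recorded. Whether to keep or drop edges to unvisited pages before drawing the graph is a design choice.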