TikTok StitchGraph is a collection of 36 graphs representing user and video relationships through TikTok's stitch feature. Built using the TikTok research API and web scraping, these graphs explore stitch patterns of users.
The TikTok API documentation describes how to use the API. We have created a notebook that explores its use.
To access the TikTok API, you need a client_key and a client_secret. These go in a /secrets/ directory. To set this up, copy /secrets_template/ as follows:
cp -r secrets_template secrets
Then fill out the /secrets/tiktok.json file with your client_key and client_secret.
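As a rough illustration of how the credentials file might be used, the sketch below loads it and builds a client-credentials token request. The key names inside tiktok.json and the helper names load_secrets and build_token_request are assumptions made for this example, not taken from the repository; the endpoint and form fields follow TikTok's published client access token flow, so check them against the current API documentation.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path

# Token endpoint of the client-credentials flow, per TikTok's API documentation.
TOKEN_URL = "https://open.tiktokapis.com/v2/oauth/token/"


def load_secrets(path="secrets/tiktok.json"):
    """Read the credentials file. The key names are assumed, not confirmed."""
    creds = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = {"client_key", "client_secret"} - set(creds)
    if missing:
        raise KeyError(f"missing keys in {path}: {sorted(missing)}")
    return creds


def build_token_request(creds):
    """Build (but do not send) the POST request that asks for an access token."""
    body = urllib.parse.urlencode({
        "client_key": creds["client_key"],
        "client_secret": creds["client_secret"],
        "grant_type": "client_credentials",
    }).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return urllib.request.Request(TOKEN_URL, data=body, headers=headers, method="POST")
```

Passing the result to urllib.request.urlopen would send the request. Building it separately keeps the credentials handling easy to inspect without touching the network.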
The script get_hashtag.py scrapes TikTok stitch videos tagged with a given hashtag.
python src/get_hashtag.py HASHTAG_NAME
HASHTAG_NAME: (Required) The hashtag to scrape.
The script scrapes videos between 2024-05-01 and 2024-05-31 and saves them as {hashtag}.json in the data/ directory.
python src/get_hashtag.py cooking
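For orientation, here is a minimal sketch of the kind of request body the research API's video query endpoint accepts for a hashtag and date window. It is not the code in get_hashtag.py: the function name build_hashtag_query is hypothetical, and the field names (hashtag_name, start_date, end_date, max_count, cursor) are written from the public API documentation as we understand it, so verify them before relying on this. How the script itself restricts results to stitches is not shown here.

```python
import json
from datetime import date

# Video query endpoint of the research API, per TikTok's documentation.
QUERY_URL = "https://open.tiktokapis.com/v2/research/video/query/"


def build_hashtag_query(hashtag, start, end, max_count=100, cursor=0):
    """Return the JSON body for one page of results for a single hashtag."""
    if end < start:
        raise ValueError("end date must not precede start date")
    return {
        "query": {
            "and": [
                {
                    "operation": "EQ",
                    "field_name": "hashtag_name",
                    "field_values": [hashtag],
                }
            ]
        },
        # The API expects dates as YYYYMMDD strings.
        "start_date": start.strftime("%Y%m%d"),
        "end_date": end.strftime("%Y%m%d"),
        "max_count": max_count,
        "cursor": cursor,
    }


if __name__ == "__main__":
    body = build_hashtag_query("cooking", date(2024, 5, 1), date(2024, 5, 31))
    print(json.dumps(body, indent=2))
```

To page through results, a caller would resend the body with the cursor value returned by the previous response until the API reports that no more data is available.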
The script get_edges.py scrapes stitch relationships between TikTok videos using previously downloaded data.
python src/get_edges.py HASHTAG_NAME [START_INDEX]
HASHTAG_NAME: (Required) The hashtag to use.
START_INDEX: (Optional) Index to resume scraping from.
The script processes videos from {hashtag}_.json and outputs the edges (stitcher -> stitchee) to {hashtag}_edges.txt.
To repair incomplete edges:
python src/get_edges.py HASHTAG_NAME repair
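Once an edge file exists, it can be loaded into a simple directed graph. The exact line format of {hashtag}_edges.txt is not documented here, so this sketch assumes one edge per line with the stitcher ID and stitchee ID separated by whitespace; adjust the parsing if the real file differs. The names read_edges and stitch_counts are hypothetical.

```python
from collections import Counter, defaultdict


def read_edges(path):
    """Return {stitcher_id: set of stitchee_ids}.

    Assumes two whitespace-separated IDs per line (stitcher first).
    """
    graph = defaultdict(set)
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.split()
            if len(parts) != 2:
                continue  # skip blank or incomplete lines
            stitcher, stitchee = parts
            graph[stitcher].add(stitchee)
    return dict(graph)


def stitch_counts(graph):
    """Count how many distinct stitchers point at each stitchee (in-degree)."""
    counts = Counter()
    for stitchees in graph.values():
        counts.update(stitchees)
    return counts
```

Skipping malformed lines rather than raising keeps the loader usable on a file that was interrupted mid-scrape, at the cost of silently ignoring the incomplete edges.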
The get_targets.py script processes a list of TikTok video URLs to extract stitchee video IDs—the videos that have been stitched by other users (stitchers). It then collects detailed data about these stitchee videos over specified date intervals using the TikTok API. The aggregated data is saved into a JSON file for further analysis.
HASHTAG_NAME (Required): Specifies the hashtag for which video data will be retrieved.
BATCH_SIZE (Optional): Specifies the number of video IDs to process in each API request batch. Defaults to 10,000.
python get_targets.py cooking 5000
This will retrieve information for the stitchee videos associated with the cooking hashtag, processing 5,000 video IDs per API request batch.
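The batching described above amounts to splitting the list of stitchee IDs into fixed-size chunks before each request. A minimal sketch follows; the function name batched_ids is hypothetical, and get_targets.py may well do this differently.

```python
def batched_ids(video_ids, batch_size=10_000):
    """Yield consecutive slices of video_ids, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(video_ids), batch_size):
        yield video_ids[start:start + batch_size]


if __name__ == "__main__":
    ids = [str(n) for n in range(12_000)]
    for batch in batched_ids(ids, batch_size=5_000):
        print(len(batch))
```

A smaller batch size means more requests but smaller payloads per request, which can help if large requests are rejected or time out.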