Characterizing the structure of communication on TikTok using frequent subgraph mining, graph embeddings, and sentiment analysis.

Marcus-Friis/StitchGraph

TikTok StitchGraph 🎵

TikTok StitchGraph is a collection of 36 graphs representing user and video relationships through TikTok's stitch feature. Built using the TikTok research API and web scraping, these graphs explore stitch patterns of users.

TikTok API

Getting started

The TikTok API documentation describes how to use the API. We have created a notebook for exploring it.

Setup TikTok API access

To access the TikTok API, you need a client_key and a client_secret. These go in a /secrets/ directory. To set it up, copy /secrets_template/ as follows:

cp -r secrets_template secrets

Then fill in the /secrets/tiktok.json file with your client_key and client_secret.
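As a quick check that the secrets file is in place, a minimal sketch of loading it could look like the following. The function name load_tiktok_secrets is hypothetical, and the JSON field names client_key and client_secret are an assumption based on the description above; the template in secrets_template/ is the authority on the actual layout.

```python
import json
from pathlib import Path


def load_tiktok_secrets(path="secrets/tiktok.json"):
    """Read the API credentials from the secrets file (hypothetical helper)."""
    with Path(path).open(encoding="utf-8") as f:
        secrets = json.load(f)
    # Assumed field names; check secrets_template/tiktok.json for the real ones.
    missing = [k for k in ("client_key", "client_secret") if not secrets.get(k)]
    if missing:
        raise ValueError(f"{path} is missing values for: {', '.join(missing)}")
    return secrets["client_key"], secrets["client_secret"]
```

Raising early on an empty or missing value makes a forgotten credential show up before any API request is sent.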

Get Hashtag stitches

The script get_hashtag.py scrapes TikTok stitch videos tagged with a given hashtag.

Usage

python src/get_hashtag.py HASHTAG_NAME
  • HASHTAG_NAME: (Required) The hashtag to scrape.

The script scrapes videos between 2024-05-01 and 2024-05-31 and saves them as {hashtag}.json in the data/ directory.

Example
python src/get_hashtag.py cooking
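To illustrate what a hashtag search over this date window might look like, here is a minimal sketch of a request body in the shape used by the TikTok Research API video query endpoint. The function name build_hashtag_query and the max_count default are assumptions for illustration; get_hashtag.py may build its request differently, and the filtering for stitches is not shown.

```python
from datetime import date


def build_hashtag_query(hashtag, start=date(2024, 5, 1), end=date(2024, 5, 31), max_count=100):
    """Build a request body for a hashtag search over a date window (sketch)."""
    if end < start:
        raise ValueError("end must not be before start")
    return {
        "query": {
            "and": [
                # Match videos carrying exactly this hashtag.
                {"operation": "EQ", "field_name": "hashtag_name", "field_values": [hashtag]}
            ]
        },
        # Dates are formatted as YYYYMMDD strings.
        "start_date": start.strftime("%Y%m%d"),
        "end_date": end.strftime("%Y%m%d"),
        "max_count": max_count,
    }
```

The resulting dictionary would be sent as the JSON body of an authenticated POST request; authentication and pagination are left out of this sketch.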

Stitch Edge Scraper

The script get_edges.py scrapes stitch relationships between TikTok videos using previously downloaded data.

Usage

python src/get_edges.py HASHTAG_NAME [START_INDEX]
  • HASHTAG_NAME: (Required) The hashtag to use.
  • START_INDEX: (Optional) Index to resume scraping from.

The script processes videos from {hashtag}_.json and outputs the edges (stitcher -> stitchee) to {hashtag}_edges.txt.
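Once the edge file exists, it can be loaded into a simple directed graph. The sketch below assumes one edge per line in the form "stitcher,stitchee"; the actual layout of {hashtag}_edges.txt is not documented here, so the separator may need adjusting. The function name read_edges is hypothetical.

```python
from collections import defaultdict


def read_edges(lines):
    """Build a stitcher -> set-of-stitchees mapping from edge-list lines (sketch)."""
    graph = defaultdict(set)
    for line in lines:
        # Assumed layout: "stitcher,stitchee" on each line.
        parts = line.strip().split(",")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        stitcher, stitchee = (p.strip() for p in parts)
        if stitcher and stitchee:
            graph[stitcher].add(stitchee)
    return dict(graph)
```

Because the function accepts any iterable of lines, an open file handle can be passed to it directly.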

Repair Mode

To repair incomplete edges:

python src/get_edges.py HASHTAG_NAME repair

Extract targets

The get_targets.py script processes a list of TikTok video URLs to extract stitchee video IDs, that is, the IDs of videos that other users (stitchers) have stitched. It then collects detailed data about these stitchee videos over specified date intervals using the TikTok API. The aggregated data is saved to a JSON file for further analysis.

Arguments

  • HASHTAG_NAME (Required): Specifies the hashtag for which video data will be retrieved.
  • BATCH_SIZE (Optional): Specifies the number of video IDs to process in each API request batch. Defaults to 10,000.

Usage

python get_targets.py cooking 5000

This will retrieve information for the stitchee videos associated with the cooking hashtag, processing 5,000 video IDs per API request batch.
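The batching described above can be sketched as a small generator that slices the list of video IDs into chunks of at most BATCH_SIZE. The name batched is hypothetical and this is not necessarily how get_targets.py implements it.

```python
def batched(ids, batch_size=10_000):
    """Yield successive slices of at most batch_size IDs (sketch)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]
```

With 12,000 IDs and a batch size of 5,000, this yields three batches of 5,000, 5,000, and 2,000 IDs.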
