🤖 A Facebook robot that scrapes data from your favorite group. It can retrieve statistics per user and publish or edit posts automatically, without an API key.
Use this project only on data from people who have consented to share it, or on publicly available data. The developers of this tool are not responsible for any misuse of data collected with it.
The purpose of this API is to extract statistics from a group, for example the user who published the best post (the post with the most reactions). You can check the list of built-in statistics for more details.
This bot retrieves data from public or private groups. You must be a member of the group to scrape its content. You can then retrieve textual data and post information (user, date, reactions and their types, comments, replies) within the group.
Development | Status | Feature |
---|---|---|
Group | finished | |
Post | finished | |
Comment | finished | |
API | finished | |
Statistics | finished | |
You can also generate statistics with built-in functions and templates:
𝐌𝐞𝐦𝐞𝐬 𝐒𝐭𝐚𝐭𝐮𝐬
Here is a template example for a meme group.
𝗚𝗹𝗼𝗯𝗮𝗹 𝗥𝗮𝗻𝗸𝗶𝗻𝗴
𝘽𝙚𝙨𝙩 𝙈𝙚𝙢𝙚𝙨
Top 1 User
Top 2 User
Top 3 User
...
𝗛𝗼𝗻𝗼𝗿𝘀
𝙈𝙤𝙨𝙩 𝘼𝙘𝙩𝙞𝙫𝙚
Top 1 User
Top 2 User
Top 3 User
...
𝗥𝗲𝗮𝗰𝘁𝗶𝗼𝗻𝘀
𝙁𝙪𝙣𝙣𝙞𝙚𝙨𝙩
Top 1 User
Top 2 User
Top 3 User
...
Message generated at 2020-12-12T19:44:10.600157
You can of course create your own template.
The bot needs a valid account to extract information. Once logged in, it scrapes all posts in a Facebook group feed. For each post, it extracts the reactions, comments, and replies, along with the reactions to each of them.
The scraping process relies on a Firefox webdriver, a.k.a. `geckodriver`. You can download one here.
BeautifulSoup
Selenium
Python 3.8
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
Before using this project, make sure you have Python 3.8 and Git. You should also download a webdriver. You can find geckodriver here.
First, clone this repository using `git` in your terminal:
git clone https://github.com/arthurdjn/facebook-hall-of-fame
Then, install the dependencies using `pip` (it should already be installed with Anaconda) in your Anaconda terminal:
pip install -r requirements.txt
If you have any issues using the above command, try installing each package separately, using:
pip install NameOfPackage
To hinder scraping and protect users' data, Facebook uses dynamic CSS. As a result, the ids of HTML elements change whenever a Facebook page is loaded or refreshed. To work around this, the package keeps a table of known elements, so the bot can tell when an id has changed after a page refresh.
You will need to provide URLs to unique reactions. To do so, create a group, publish several posts, and give each post a single, distinct reaction (`LOVE`, `AHAH`, `LIKE`, etc.). Then, click on the reaction and copy its URL. Note that you should use the mobile version of Facebook to retrieve the URL, not the standard version.
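The `REACTION2HREF` values shown in this README are relative links, so one way to turn a copied URL into such an entry is to keep only its path and query string. The helper below is a minimal sketch using only the standard library. The function name `to_relative_href`, the example URL, and the identifier `123` are hypothetical, and the exact format the package expects is an assumption.

```python
from urllib.parse import urlsplit


def to_relative_href(url: str) -> str:
    """Keep only the path and query string of a copied reaction URL."""
    parts = urlsplit(url)
    href = parts.path
    if parts.query:
        href += "?" + parts.query
    return href


# Hypothetical URL copied from the mobile site; "123" is a placeholder identifier.
copied = "https://m.facebook.com/ufi/reaction/profile/browser/?ft_ent_identifier=123"
print(to_relative_href(copied))
```

Dropping the scheme and host keeps the entry independent of which Facebook domain the URL was copied from.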
Connect to the API:
# Global parameters
EXECUTABLE_PATH = "driver/geckodriver.exe"
REACTION2HREF = {
    "LIKE": "/ufi/reaction/profile/browser/?ft_ent_identifier=",
    "LOVE": "/ufi/reaction/profile/browser/?ft_ent_identifier=",
    "CARE": "/ufi/reaction/profile/browser/?ft_ent_identifier=",
    "AHAH": "/ufi/reaction/profile/browser/?ft_ent_identifier=",
    "WOW": "/ufi/reaction/profile/browser/?ft_ent_identifier=",
    "SAD": "/ufi/reaction/profile/browser/?ft_ent_identifier=",
    "ANGER": "/ufi/reaction/profile/browser/?ft_ent_identifier=",
}
EMAIL = "your_email"
PASSWORD = "your_password"
# Connect to the API
from halloffame import HallOfFameAPI
api = HallOfFameAPI(executable_path=EXECUTABLE_PATH, reaction2href=REACTION2HREF)
api.login(EMAIL, PASSWORD)
# Initialize the table of reactions
api.init_reactions()
Then, connect to a group and start scraping:
# To retrieve everything (posts, comments, reactions)
posts = api.get_posts("your_group_id")
# To retrieve comments
comments = api.get_comments("your_group_id", "your_post_id")
# To retrieve reactions
reactions = api.get_reactions("your_post_id")
Statistics | Description |
---|---|
BEST-POST-REACTION | Ordered list of posts by their number of reactions (all categories). |
BEST-COMMENT-REACTION | Ordered list of comments by their number of reactions (all categories). |
BEST-REPLY-REACTION | Ordered list of replies by their number of reactions (all categories). |
POST-COUNT | Ordered list of users by their number of posts. |
REACTION-COUNT | Ordered list of users by their number of reactions. |
COMMENT-REPLY-COUNT | Ordered list of users by their number of comments and replies. |
COMMENT-COUNT | Ordered list of users by their number of comments only. |
REPLY-COUNT | Ordered list of users by their number of replies only. |
REACTION-AHAH | Ordered list of users by their number of AHAH reactions. |
REACTION-LOVE | Ordered list of users by their number of LOVE reactions. |
REACTION-CARE | Ordered list of users by their number of CARE reactions. |
REACTION-WOW | Ordered list of users by their number of WOW reactions. |
REACTION-SAD | Ordered list of users by their number of SAD reactions. |
REACTION-ANGER | Ordered list of users by their number of ANGER reactions. |
REACTION-LIKE | Ordered list of users by their number of LIKE reactions. |
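To make the idea of an "ordered list" concrete, here is a minimal sketch of how two of these rankings could be computed. It is not the package's implementation: the posts are hypothetical dictionaries with made-up `user` and `reactions` fields, and the real `Post` objects may expose different attributes.

```python
from collections import Counter

# Hypothetical, simplified posts: the real Post objects may expose different fields.
posts = [
    {"user": "alice", "reactions": 12},
    {"user": "bob", "reactions": 30},
    {"user": "alice", "reactions": 5},
]

# POST-COUNT: users ordered by how many posts they published.
post_count = Counter(post["user"] for post in posts).most_common()

# BEST-POST-REACTION: posts ordered by their total number of reactions.
best_posts = sorted(posts, key=lambda post: post["reactions"], reverse=True)

print(post_count)
print(best_posts[0]["user"])
```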
You can compute the statistics from a list `posts` of `Post` objects using:
from halloffame import get_top_stats
stats = get_top_stats(posts)
stats = {
    "BEST-POST-REACTION": [
        {
            "user_id": 97987,
            ...
        },
        ...
    ],
    ...
}
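Given the structure shown above, reading the top entries of one statistic is a matter of slicing its list. The helper below is a small sketch: `top_user_ids` is a hypothetical name, the second user id is made up, and only the `user_id` key is taken from the example above.

```python
def top_user_ids(stats: dict, name: str, n: int = 3) -> list:
    """Return the user ids of the first n entries of one statistic."""
    return [entry["user_id"] for entry in stats.get(name, [])[:n]]


# Hypothetical data following the shape shown above; only "user_id" is taken from it.
example_stats = {
    "BEST-POST-REACTION": [{"user_id": 97987}, {"user_id": 12345}],
}
print(top_user_ids(example_stats, "BEST-POST-REACTION", n=1))
```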
To insert statistics into a Facebook post, you can use a template, which will speed up your workflow.
Simply write the general structure of your text and wrap the elements that will change (either stats or fonts) with `<< >>` tags.
For example, to apply a bold font to the text `this is a text`, simply use `<<BOLD>>this is a text<<BOLD>>`.
The same goes for statistics: `<<TOP1-BEST-POST-REACTION>>`. Note that `BEST-POST-REACTION` is a list, so to get the first user, add the token `TOP1` (`TOP2` for the second, and so on).
You can also use both together: <<BOLD>><<TOP1-BEST-POST-REACTION>><<BOLD>>
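Facebook posts are plain text, so a "bold font" is usually obtained by swapping ASCII characters for styled Unicode look-alikes. The sketch below illustrates that idea with the Unicode sans-serif bold letters and digits. The function `to_bold` is hypothetical, and which Unicode range the package's `BOLD` tag actually uses is an assumption.

```python
def to_bold(text: str) -> str:
    """Map ASCII letters and digits to Unicode sans-serif bold characters."""
    out = []
    for char in text:
        if "A" <= char <= "Z":
            out.append(chr(0x1D5D4 + ord(char) - ord("A")))
        elif "a" <= char <= "z":
            out.append(chr(0x1D5EE + ord(char) - ord("a")))
        elif "0" <= char <= "9":
            out.append(chr(0x1D7EC + ord(char) - ord("0")))
        else:
            out.append(char)  # spaces and punctuation are left unchanged
    return "".join(out)


print(to_bold("this is a text"))
```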
template = """
<<BOLD-SERIF>>Hall Of Fame<<BOLD-SERIF>>
Here is a template example for a meme group.
<<BOLD>>Rank<<BOLD>>
<<BOLD-ITALIC>>Best Memes<<BOLD-ITALIC>>
<<TOP1-BEST-POST-REACTION>>
<<TOP2-BEST-POST-REACTION>>
<<TOP3-BEST-POST-REACTION>>
...
<<BOLD>>Honors<<BOLD>>
<<BOLD-ITALIC>>Most Active<<BOLD-ITALIC>>
<<TOP1-POST-COUNT>>
<<TOP2-POST-COUNT>>
<<TOP3-POST-COUNT>>
...
<<BOLD>>Reactions<<BOLD>>
<<BOLD-ITALIC>>Funniest<<BOLD-ITALIC>>
<<TOP1-REACTION-AHAH>>
<<TOP2-REACTION-AHAH>>
<<TOP3-REACTION-AHAH>>
...
Message generated at <<DATE-NOW>>
"""
Then, apply the template to the computed statistics:
from halloffame import apply_template, get_top_stats
stats = get_top_stats(posts)
generated_text = apply_template(template, stats)
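For intuition, here is a minimal sketch of how `<<TOPn-NAME>>` tags could be substituted with a regular expression. It is not the implementation of `apply_template`: the function `fill_top_tags` is hypothetical, the statistics are made-up data in the shape shown earlier in this README, and font tags such as `<<BOLD>>` are deliberately left untouched.

```python
import re

# Matches tags such as <<TOP1-BEST-POST-REACTION>>: a rank, then a statistic name.
TOP_TAG = re.compile(r"<<TOP(\d+)-([A-Z-]+)>>")


def fill_top_tags(template: str, stats: dict) -> str:
    """Replace each <<TOPn-NAME>> tag with the n-th user id of that statistic."""

    def replace(match):
        rank, name = int(match.group(1)), match.group(2)
        entries = stats.get(name, [])
        if 1 <= rank <= len(entries):
            return str(entries[rank - 1]["user_id"])
        return match.group(0)  # leave tags without data untouched

    return TOP_TAG.sub(replace, template)


# Hypothetical statistics; only the "user_id" key comes from the README example.
example_stats = {"POST-COUNT": [{"user_id": 97987}, {"user_id": 12345}]}
print(fill_top_tags("1st: <<TOP1-POST-COUNT>>, 3rd: <<TOP3-POST-COUNT>>", example_stats))
```

Leaving unmatched tags in place makes missing data easy to spot in the generated text.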