This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Commit

Use Facebook user agent for openGraph queries
We found that some websites return opengraph information based
on the user agent.  Since Facebook is the creator of opengraph,
using the Facebook user agent when requesting the opengraph metadata
should work in the widest variety of situations.
https://developers.facebook.com/docs/sharing/webmasters/#user-agent

Signed-off-by: Andrew Ryan <andrewryanchama@clover.club>
AndrewRyanChama authored and tenpura-shrimp committed Feb 14, 2022
1 parent 55113dd commit 6530e91
Showing 3 changed files with 9 additions and 6 deletions.
1 change: 1 addition & 0 deletions changelog.d/11985.misc
@@ -0,0 +1 @@
Use Facebook user agent for openGraph queries.
4 changes: 1 addition & 3 deletions synapse/res/providers.json
@@ -5,13 +5,11 @@
"endpoints": [
{
"schemes": [
"https://twitter.com/*/status/*",
"https://*.twitter.com/*/status/*",
"https://twitter.com/*/moments/*",
"https://*.twitter.com/*/moments/*"
],
"url": "https://publish.twitter.com/oembed"
}
]
}
]
]
10 changes: 7 additions & 3 deletions synapse/rest/media/v1/preview_url_resource.py
@@ -326,8 +326,9 @@ async def _do_preview(self, url: str, user: UserID, ts: int) -> bytes:

# Compile the Open Graph response by using the scraped
# information from the HTML and overlaying any information
-# from the oEmbed response.
-og = {**og_from_html, **og_from_oembed}
+# from the oEmbed response. og tags from the original html
+# have priority over oEmbed data.
+og = {**og_from_oembed, **og_from_html}

await self._precache_image_url(user, media_info, og)
else:
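
The hunk above only swaps the order of the two dictionaries being unpacked. As a minimal sketch of why that matters (the sample keys and values below are hypothetical, not taken from Synapse): when the same key appears in both dictionaries, the one unpacked last wins.

```python
# Hypothetical sample data standing in for the two metadata sources.
og_from_html = {
    "og:title": "Title from HTML",
    "og:description": "Description from HTML",
}
og_from_oembed = {
    "og:title": "Title from oEmbed",
    "og:image": "https://example.com/thumb.png",
}

# Old order: oEmbed is unpacked last, so it overrides HTML on shared keys.
og_before = {**og_from_html, **og_from_oembed}

# New order: HTML is unpacked last, so its tags take priority.
og_after = {**og_from_oembed, **og_from_html}

print(og_before["og:title"])
print(og_after["og:title"])
```

Keys present in only one source (here `og:description` and `og:image`) survive in both orderings; only conflicting keys are affected.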
@@ -402,7 +403,10 @@ async def _download_url(self, url: str, output_stream: BinaryIO) -> DownloadResu
url,
output_stream=output_stream,
max_size=self.max_spider_size,
-headers={"Accept-Language": self.url_preview_accept_language},
+headers={
+"Accept-Language": self.url_preview_accept_language,
+b"User-Agent": ["Synapse (bot)"],
+},
is_allowed_content_type=_is_previewable,
)
except SynapseError:
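
The second hunk adds a `User-Agent` header next to the existing `Accept-Language` header on the download request. The following is a rough sketch of the same idea using only the Python standard library, not Synapse's own HTTP client; the helper name `build_preview_request` and the example URL are assumptions made for illustration, and the request is built but never sent.

```python
from urllib.request import Request


def build_preview_request(url: str, accept_language: str, user_agent: str) -> Request:
    """Build (but do not send) a GET request carrying URL-preview headers.

    Hypothetical helper for illustration; Synapse's client API differs.
    """
    return Request(
        url,
        headers={
            "Accept-Language": accept_language,
            "User-Agent": user_agent,
        },
    )


req = build_preview_request("https://example.com/article", "en", "Synapse (bot)")

# urllib normalizes header names with str.capitalize(), so the stored
# names are "User-agent" and "Accept-language".
print(req.get_header("User-agent"))
print(req.get_header("Accept-language"))
```

Because some sites vary their response by user agent, as the commit message notes, the string passed here can change which Open Graph metadata comes back.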

0 comments on commit 6530e91
