A metadata scraper with support for oEmbed, Twitter Cards and Open Graph Protocol for Node.js (>=v8.0.0).
Note: unfurl.js will not work in the browser.
Unfurl (spread out from a furled state) takes a URL and some options, fetches the URL, extracts the metadata we care about, and formats the result in a sane way. It supports all major metadata providers, and extending it to work with others should be straightforward.
When you link to something on Slack, Facebook, or Twitter, they typically show a preview of the link. To do so, they crawl the linked website for metadata and enrich the link with more context about it, which usually means grabbing its title, description, and image or player embed.
npm install unfurl.js
- oembed?: boolean - support retrieving oEmbed metadata
- timeout?: number - req/res timeout in ms; it resets on redirect. 0 to disable (OS limit applies)
- follow?: number - maximum redirect count. 0 to not follow redirects
- compress?: boolean - support gzip/deflate content encoding
- size?: number - maximum response body size in bytes. 0 to disable
- headers?: Headers | Record<string, string> | Iterable<readonly [string, string]> | Iterable<Iterable<string>> - map of request headers, overrides the defaults
Default headers:
{
'Accept': 'text/html, application/xhtml+xml',
'User-Agent': 'facebookexternalhit'
}
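As a rough sketch of how these options fit together, the snippet below builds an options object that skips oEmbed lookups, caps the timeout and body size, and supplies custom headers. The `UnfurlOptions` type name and all of the values are assumptions made for illustration. The type simply mirrors the fields listed above, with `headers` narrowed to a plain record for brevity.

```typescript
// Hypothetical local type mirroring the options documented above.
// `headers` is narrowed to a plain record to keep the sketch self-contained.
type UnfurlOptions = {
  oembed?: boolean
  timeout?: number
  follow?: number
  compress?: boolean
  size?: number
  headers?: Record<string, string>
}

// Example values only; tune them for your own use case.
const options: UnfurlOptions = {
  oembed: false,       // skip oEmbed retrieval
  timeout: 5000,       // give up after 5 s (resets on redirect)
  follow: 3,           // follow at most 3 redirects
  compress: true,      // accept gzip/deflate responses
  size: 1024 * 1024,   // cap the response body at 1 MiB
  headers: {
    'Accept': 'text/html, application/xhtml+xml',
    'User-Agent': 'my-link-previewer' // assumed custom agent string
  }
}

// With the package installed, and assuming options go in as the second
// argument, the call would look like:
//   import { unfurl } from 'unfurl.js'
//   const result = await unfurl('https://github.com/trending', options)
```

Because `headers` overrides the defaults, the sketch repeats the `Accept` header explicitly rather than relying on it being kept.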
import { unfurl } from 'unfurl.js'
// unfurl returns a Promise, so await the result (or use .then)
const result = await unfurl('https://github.com/trending')
type Metadata = {
  title?: string
  description?: string
  keywords?: string[]
  favicon?: string
  author?: string
  theme_color?: string
  canonical_url?: string
  oEmbed?: OEmbedPhoto | OEmbedVideo | OEmbedLink | OEmbedRich
  twitter_card: {
    card: string
    site?: string
    creator?: string
    creator_id?: string
    title?: string
    description?: string
    players?: {
      url: string
      stream?: string
      height?: number
      width?: number
    }[]
    apps: {
      iphone: {
        id: string
        name: string
        url: string
      }
      ipad: {
        id: string
        name: string
        url: string
      }
      googleplay: {
        id: string
        name: string
        url: string
      }
    }
    images: {
      url: string
      alt: string
    }[]
  }
  open_graph: {
    title: string
    type: string
    images?: {
      url: string
      secure_url?: string
      type: string
      width: number
      height: number
      alt?: string
    }[]
    url?: string
    audio?: {
      url: string
      secure_url?: string
      type: string
    }[]
    description?: string
    determiner?: string
    site_name?: string
    locale: string
    locale_alt: string
    videos: {
      url: string
      stream?: string
      height?: number
      width?: number
      tags?: string[]
    }[]
    article: {
      published_time?: string
      modified_time?: string
      expiration_time?: string
      author?: string
      section?: string
      tags?: string[]
    }
  }
}
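To illustrate how a result of this shape might be turned into a link preview (title, description, and image), here is a minimal sketch. The `PreviewSource` and `Preview` types and the `toPreview` helper are hypothetical names, not part of unfurl.js. The sketch reads only a subset of the fields above and treats `twitter_card` and `open_graph` as optional, on the assumption that a page may not provide them. The fallback order is likewise an assumption.

```typescript
// Hypothetical subset of the Metadata type above, limited to the fields this sketch reads.
type PreviewSource = {
  title?: string
  description?: string
  twitter_card?: {
    title?: string
    description?: string
    images?: { url: string; alt?: string }[]
  }
  open_graph?: {
    title?: string
    description?: string
    images?: { url: string }[]
  }
}

type Preview = { title?: string; description?: string; image?: string }

// Prefer Open Graph, then Twitter Card, then the plain page-level fields.
// This precedence is an assumption for illustration, not something unfurl.js prescribes.
function toPreview(meta: PreviewSource): Preview {
  const og = meta.open_graph
  const tc = meta.twitter_card
  return {
    title: og?.title ?? tc?.title ?? meta.title,
    description: og?.description ?? tc?.description ?? meta.description,
    image: og?.images?.[0]?.url ?? tc?.images?.[0]?.url
  }
}

// Made-up sample data standing in for a real unfurl result.
const sample: PreviewSource = {
  title: 'Example page',
  description: 'Plain meta description',
  twitter_card: { title: 'Card title', images: [{ url: 'https://example.com/card.png' }] },
  open_graph: { description: 'Open Graph description' }
}

const preview = toPreview(sample)
```

Here the title falls back to the Twitter Card because the sample has no Open Graph title, while the description comes from Open Graph.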
type OEmbedBase = {
  type: "photo" | "video" | "link" | "rich"
  version: string
  title?: string
  author_name?: string
  author_url?: string
  provider_name?: string
  provider_url?: string
  cache_age?: number
  thumbnails?: [
    {
      url?: string
      width?: number
      height?: number
    }
  ]
}
type OEmbedPhoto = OEmbedBase & {
  type: "photo"
  url: string
  width: number
  height: number
}
type OEmbedVideo = OEmbedBase & {
  type: "video"
  html: string
  width: number
  height: number
}
type OEmbedLink = OEmbedBase & {
  type: "link"
}
type OEmbedRich = OEmbedBase & {
  type: "rich"
  html: string
  width: number
  height: number
}
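Since each oEmbed variant carries a distinct literal `type`, the union can be narrowed with a `switch`. The sketch below shows one possible way to do that, using trimmed copies of the types above so it stands on its own. The `OEmbed` alias and the `embedMarkup` helper are hypothetical names introduced here, and the sample objects are made up.

```typescript
// Trimmed copies of the oEmbed types above so the sketch is self-contained.
type OEmbedBase = {
  type: 'photo' | 'video' | 'link' | 'rich'
  version: string
  title?: string
}
type OEmbedPhoto = OEmbedBase & { type: 'photo'; url: string; width: number; height: number }
type OEmbedVideo = OEmbedBase & { type: 'video'; html: string; width: number; height: number }
type OEmbedLink = OEmbedBase & { type: 'link' }
type OEmbedRich = OEmbedBase & { type: 'rich'; html: string; width: number; height: number }
type OEmbed = OEmbedPhoto | OEmbedVideo | OEmbedLink | OEmbedRich

// Hypothetical helper: returns embeddable markup, or undefined when the type carries none.
// Note: values are interpolated unescaped; sanitize before rendering untrusted data.
function embedMarkup(oembed: OEmbed): string | undefined {
  switch (oembed.type) {
    case 'photo':
      // Narrowed to OEmbedPhoto: url, width and height are available.
      return `<img src="${oembed.url}" width="${oembed.width}" height="${oembed.height}">`
    case 'video':
    case 'rich':
      // Both variants provide ready-made HTML.
      return oembed.html
    case 'link':
      // A link has no embeddable payload of its own.
      return undefined
  }
}

// Made-up sample values for illustration.
const photo: OEmbed = { type: 'photo', version: '1.0', url: 'https://example.com/cat.jpg', width: 640, height: 480 }
const link: OEmbed = { type: 'link', version: '1.0' }

const photoMarkup = embedMarkup(photo)
const linkMarkup = embedMarkup(link)
```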
Projects using unfurl.js (if you use it too, feel free to add your project):
- vapid/vapid - A template-driven content management system
- beeman/micro-unfurl - a small microservice that unfurls a URL and returns the Open Graph metadata
- probot/unfurl - a GitHub App built with probot that unfurls links on Issues and Pull Request discussions