Config syntax
The config file is written in YAML. Here's an example:
    auth:
      username: admin
      password: 'secr3tp@ssw0rd'
    endpoints:
      - path: /tokio-blog.xml
        note: Full text of Tokio blog
        source: https://tokio.rs/_next/static/feed.xml
        filters:
          - full_text: {}
          - simplify_html: {}
      - path: /hackernews.xml
        note: Full text of Hacker News
        source: https://news.ycombinator.com/rss
        filters:
          - full_text:
              simplify: true
              append_mode: true
The most crucial part of the configuration is the definition of endpoints. Each endpoint corresponds to an RSS feed ready for consumption.
Properties:
- `path` (required): The path of the endpoint. The path should start with a forward slash `/`.
- `note` (optional): A note describing the endpoint. Used for display purposes only.
- `on_the_fly_filters` (optional): Enable On-the-Fly filters. Defaults to false.
- `source` (optional): The source URL of the RSS feed.
  - If not specified, the source is dynamic. To use this endpoint, you must include a `?source=<url>` query in the request. This allows applying the same filters to different feeds.
  - If the source points to an HTML page, `rss-funnel` will attempt to generate an RSS feed from the page with a single article. You can then use the `split` filter to divide the single article into multiple articles. See Cookbook: Hacker News Top Links for an example.
- `filters` (required): A list of filters to apply to the feed.
  - The feed from the `source` goes through the filters in the specified order. Each filter corresponds to a transformation on the `Feed` object.
  - Each filter is specified as a YAML object with the filter's name as the key and its configuration as the value.
  - For example, in the filter definition:

        - keep_element: .p_mainnew

    - The filter's name is `keep_element`.
    - The configuration is the string value `.p_mainnew`. The configuration type depends on the filter.
  - The `Feed` object from the last filter is returned as the response.
- `client` (optional): The configuration for the HTTP client used to fetch the source, such as the user agent. See Client config for details.
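To make the filter ordering concrete, here is a minimal sketch of one endpoint that chains two filters. The path, note, and source URL are placeholders chosen for illustration, and pairing `full_text` with `keep_element` is only an assumption about a plausible pipeline, not a recommended setup.

```yaml
endpoints:
  - path: /example.xml                    # hypothetical path; must start with "/"
    note: Filter order example            # display only
    source: https://example.com/feed.xml  # placeholder URL
    filters:
      # Filters run top to bottom; each one transforms the Feed object
      # produced by the previous one.
      - full_text: {}                     # object-valued configuration (empty here)
      - keep_element: .p_mainnew          # string-valued configuration
```

The feed returned to the client is whatever the last filter in the list produces.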
Requests need to be made to remote servers in three places:
- The initial fetch of the `source`.
- In the `full_text` filter.
- In the `merge_feed` filter.
You may want to specify certain HTTP configurations for these requests. You can specify these configurations through the `client` field.
Available fields:
- `timeout` (optional, Duration): The timeout for each individual request. You can specify a string value like `20s` (supported formats). Defaults to 10 seconds.
- `user_agent` (optional, String): The user agent used for fetching the URL. Defaults to `rss-funnel/<version>`.
- `accept` (optional, String): The `Accept` header value.
- `referer` (optional, String): The `Referer` header value.
- `cookie` (optional, String): The value for the `Cookie` header.
- `assume_content_type` (optional, String): Assume the server returns this `Content-Type`. Useful for feeds whose server returns an incorrect `Content-Type` that prevents proper parsing.
- `proxy` (optional, String): The proxy server to use for requests. The format is `http://<host>:<port>` or `socks5://<host>:<port>` (see Proxy in reqwest - Rust).
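As an illustration, the fields above might be combined on an endpoint roughly as follows. This is a sketch only: the path, URLs, user agent string, content type, and proxy address are made-up values, and every `client` field is optional.

```yaml
endpoints:
  - path: /slow-feed.xml                      # hypothetical endpoint
    source: https://example.com/feed.xml      # placeholder URL
    client:
      timeout: 20s                            # overrides the 10-second default
      user_agent: my-feed-reader/1.0          # made-up value
      referer: https://example.com/           # sent as the Referer header
      assume_content_type: application/xml    # for servers that report a wrong Content-Type
      proxy: socks5://127.0.0.1:1080          # made-up local proxy address
    filters:
      - full_text: {}
```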
The `source` is a configuration data structure that describes a feed to be fetched or created later. This configuration is currently used in two places:
- In the `source` field inside the endpoint's configuration.
- In the `merge` filter, either implicitly as the configuration's value or via a `source` field. For the `merge` filter, you can specify multiple sources in an array, and they will be fetched in parallel.
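The two ways of passing a source to the `merge` filter could look like the sketch below. The URLs are placeholders, and placing the array under the `source` key is an assumption based on the description above; check the Filter config page for the exact shape the filter accepts.

```yaml
filters:
  # Implicit form: the filter's configuration value is the source itself.
  - merge: https://example.com/other-feed.xml
  # Explicit form: sources listed under a `source` field
  # (assumed array placement; fetched in parallel).
  - merge:
      source:
        - https://example.com/feed-a.xml
        - https://example.com/feed-b.xml
```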
The source can be written in four formats:
- An absolute URL to the RSS source, e.g., `source: https://example.com/feed.xml`.
- A relative URL, starting with a forward slash `/`, that refers to another endpoint on the instance, e.g., `source: /another-endpoint.xml?source=https://example.com/feed.xml`.
- A "from scratch" structure, which is an object with the following string fields. Such a source represents a blank feed created from scratch:
  - `format` (required, enum: "rss" or "atom")
  - `title` (required, string)
  - `link` (optional, string)
  - `description` (optional, string)
- A templated URL, which is an object containing the following fields. This allows the endpoint to accept additional query parameters for the placeholders defined below and fill the template accordingly.
  - `template` (required, string): The template should contain one or more placeholders in the form `${NAME}`, where `NAME` is the name of a placeholder defined below.
  - `placeholders` (object, key: placeholder name, value: placeholder config)
    - `default` (optional, string): The default value for the placeholder if unspecified in the endpoint request.
    - `validation` (optional, string): A regular expression to validate against.
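For instance, a "from scratch" source might be written as in this sketch. The path, title, link, and description are invented for illustration, and following the blank feed with a `merge` filter is just one assumed way to populate it.

```yaml
endpoints:
  - path: /combined.xml                  # hypothetical endpoint
    source:
      format: rss                        # "rss" or "atom"
      title: My combined feed            # required
      link: https://example.com          # optional
      description: Feeds merged into one # optional
    filters:
      # The blank feed has no posts of its own; here another feed is merged in.
      - merge: https://example.com/feed.xml
```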
The URL can point to a feed in RSS or Atom format. It can also point to an HTML page, in which case a feed is generated with the page's body as the only post, which you can then customize further with filters.
Note that the source can be omitted from the config to enable a dynamic source on the endpoint. In that case, the source must be specified in the request itself, as in `/endpoint.xml?source=https://website.com/rss.xml`, for the endpoint to function.
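A dynamic-source endpoint is simply one without a `source` field, as in this minimal sketch. The path and note are hypothetical.

```yaml
endpoints:
  - path: /full-text.xml                 # hypothetical endpoint
    note: Full text for any feed passed in the request
    # No `source` here, so it must come from the query string.
    filters:
      - full_text: {}
```

With this config, a request such as `/full-text.xml?source=https://website.com/rss.xml` would apply the same filter to whichever feed the caller names.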
With a templated source defined, you can request the endpoint like `/endpoint.xml?NAME=value` to fill the placeholders with specific values. See https://github.com/shouya/rss-funnel/pull/139 for more examples.
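A templated source might be configured along these lines. This is a sketch under assumptions: the URL, the placeholder name `NAME`, its default, and the regular expression are all invented for illustration.

```yaml
endpoints:
  - path: /endpoint.xml
    source:
      # ${NAME} is replaced with the value of the NAME query parameter.
      template: "https://example.com/${NAME}/feed.xml"   # placeholder URL
      placeholders:
        NAME:
          default: news                # used when the request omits NAME
          validation: '^[a-z0-9-]+$'   # example pattern for accepted values
    filters:
      - full_text: {}
```

Under this config, `/endpoint.xml?NAME=blog` would be expected to fetch `https://example.com/blog/feed.xml`, while a request without `NAME` would fall back to the default.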
See Filter config.
You can specify the authentication info in the configuration file to protect the inspector UI behind a login page. The configuration syntax is as follows:
    auth:
      username: admin
      password: hunter2
    endpoints: ...
If the `auth` config is not specified, the inspector UI will be public.