Proposal: A DNR rule type to intercept top-document navigation #744
Motivation
Interception of top-level navigation is a key feature of uBlock Origin, called "strict-blocking":
Screenshot
Recently, I have worked on porting the feature to uBO Lite (uBOL), which is an MV3-based extension.
Current approach
Intercepting top-level navigation to URLs matching a rule is currently possible, but in my opinion it is cumbersome and quickly runs into DNR-imposed limitations.
In the current state of the DNR API, the only way to intercept top-level document navigation is to use `regexSubstitution`-based `redirect` rules.
Here are two examples of such rules (actually used by uBOL):
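As an illustration of the general shape of such a rule, here is a minimal sketch. It is not one of the uBOL rules: the rule id, the hostname `example.com`, and the document path `/strictblock.html` are hypothetical. In `regexSubstitution`, `\0` stands for the entire text matched by `regexFilter`, so the filter has to match the whole URL. The `simulateRedirect` helper is also an assumption; it only imitates the substitution locally with a JavaScript `RegExp` (the browser uses RE2) and ignores `requestDomains` and `resourceTypes`.

```javascript
// Hypothetical rule: id, hostname and document path are placeholders.
const strictBlockRule = {
  id: 1,
  priority: 1,
  action: {
    type: 'redirect',
    redirect: {
      // Relative path; `\0` is replaced with the whole matched URL.
      regexSubstitution: '/strictblock.html#\\0',
    },
  },
  condition: {
    // Must match the entire URL so that `\0` captures all of it.
    regexFilter: '^https?://.*',
    requestDomains: ['example.com'],
    resourceTypes: ['main_frame'],
  },
};

// Local imitation of the substitution, for illustration only.
function simulateRedirect(rule, url) {
  const match = new RegExp(rule.condition.regexFilter).exec(url);
  if (match === null) return url; // the rule does not apply
  const template = rule.action.redirect.regexSubstitution;
  // Expand `\0`..`\9` to the corresponding match or capture group.
  const replacement = template.replace(
    /\\(\d)/g,
    (_, digit) => match[Number(digit)] ?? ''
  );
  return (
    url.slice(0, match.index) +
    replacement +
    url.slice(match.index + match[0].length)
  );
}

console.log(simulateRedirect(strictBlockRule, 'https://example.com/a?b=1'));
```

The relative path is what makes a rule like this unusable as-is: it still has to be expanded to the full extension URL before the rule is registered.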
Issues with current approach
Rules must be dynamic- or session-based
The `regexSubstitution` approach requires that the rules be created dynamically and added to either the `_dynamic` or the `_session` ruleset. The reason is that the `regexSubstitution` property must point to the full path of the extension document used as the replacement for the original URL.
The two example rules above show a relative path for `regexSubstitution`, but such rules are unusable in a static ruleset: the actual path must be computed using JS code, and thus these rules can only work as dynamic or session rules.
There are complications arising from the need to patch and add these rules as dynamic or session rules. The extension document must be declared as a `web_accessible_resources` entry in `manifest.json`.
The use of the `use_dynamic_url` property in that declaration ensures that websites will be unable to detect the extension by trying to fetch a resource known to be exposed by the extension.
However, this also means that the value to use for `regexSubstitution` will change each time the extension is launched, so the rules can't be added to the `_dynamic` ruleset and must be added to the `_session` ruleset instead. In turn, the extension might be unable to intercept in time a navigation that matches a strict-block rule, because all the session rules are constructed and added at extension launch. (Extension wake-up is fine, since the session rules are left untouched when the extension's worker is suspended.)
A possible solution is to not use `use_dynamic_url` and instead add a random part to the name of the exposed resource, in which case the strict-block rules can be patched and added at extension install or update time only, since dynamic rules persist across extension launches. This makes it more difficult to detect the extension, but detection is still possible between updates, should an adversary closely watch when the extension updates in order to modify the detection code with the latest resource name.
Max number of regex-based rules
Since these rules require the use of a regex-based filter (`regexFilter`), the maximum number of regex rules, `declarativeNetRequest.MAX_NUMBER_OF_REGEX_RULES`, which is currently 1000 in Chromium, is quickly reached.
For rules which consist only of a hostname (e.g. `||example.com^`), this is not really an issue, since all these hostnames can be collated in the `condition.requestDomains` array of a single rule.
However, the pattern-based rules, those having a `condition.urlFilter` or `condition.regexFilter`, cannot be collated into a single DNR rule; each must be its own rule. This quickly hits the `declarativeNetRequest.MAX_NUMBER_OF_REGEX_RULES` limit.
Because of this, once the limit is reached, choices have to be made about whether strict-block rules have priority over other, non-strict-block regex-based rules. This is not ideal, given that one of the main purposes of strict-block rules is to prevent navigation to undesirable webpages and let the user decide whether to proceed.
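The difference between the two cases can be sketched as follows. The `buildRules` helper, the filter strings, and the `/strictblock.html` path are hypothetical, and the limit is hard-coded from the figure quoted above rather than read from `declarativeNetRequest.MAX_NUMBER_OF_REGEX_RULES`. Hostname-only filters collapse into one rule, while each pattern-based filter consumes one regex rule of its own.

```javascript
// Value quoted above for Chromium; in an extension, read it from
// chrome.declarativeNetRequest.MAX_NUMBER_OF_REGEX_RULES instead.
const MAX_REGEX_RULES = 1000;

// Hypothetical helper: `hostnames` are plain hostnames, `regexes` are
// already-converted regexFilter strings that match the whole URL.
function buildRules(hostnames, regexes, docPath = '/strictblock.html') {
  const redirect = { regexSubstitution: `${docPath}#\\0` };
  const rules = [];
  let id = 1;
  if (hostnames.length !== 0) {
    // All hostname-only filters share a single regex rule.
    rules.push({
      id: id++,
      action: { type: 'redirect', redirect },
      condition: {
        regexFilter: '^https?://.*',
        requestDomains: hostnames,
        resourceTypes: ['main_frame'],
      },
    });
  }
  // Each pattern-based filter needs its own regex rule.
  for (const regexFilter of regexes) {
    rules.push({
      id: id++,
      action: { type: 'redirect', redirect },
      condition: { regexFilter, resourceTypes: ['main_frame'] },
    });
  }
  return rules;
}

const rules = buildRules(
  ['example.com', 'example.org'],
  ['^https?://.*/ads/.*', '^https?://.*/track\\?id=.*']
);
const regexRuleCount = rules.filter(
  (rule) => rule.condition.regexFilter !== undefined
).length;
console.log(`${regexRuleCount} of ${MAX_REGEX_RULES} regex rules used`);
```

Adding more hostnames to the first argument leaves the count unchanged, whereas every additional pattern adds one to it.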
Retrofitting into `regexSubstitution` is cumbersome
Also, having to convert a `urlFilter` or `regexFilter` to a proper `regexFilter` for the sake of capturing the whole URL being navigated to is cumbersome in my opinion. For example, the original `urlFilter`-based filter from which the second strict-block rule above was derived is `/bdv_rd.dbm?ownid=`. Obviously we need to convert `/bdv_rd.dbm?ownid=` so that it both matches as originally intended and captures the whole URL, so that it can be fed into the `regexSubstitution` property.
Better approach
Because of all of the above, I conclude it's not possible to enforce all of the strict-block related uBO filters. I do not know the exact details of a potential solution to be discussed, but as a start I think having a new `redirect` property might be the way.
I picked `redirect.interceptURL`, but I am not good with naming; I am sure something better can be proposed. The `\\0` part of the property value simply tells where the full URL of the intercepted navigation would go, so that it is exposed to the intercepting extension document. The DNR API would internally apply and use the actual full extension path, the same way it's done for the `redirect.extensionPath` property.
Also, it might be safer to have a new type of redirect which can apply only to network requests related to top-level navigation, in which case it would not be necessary to declare `"resourceTypes": [ "main_frame" ]`, as this would be implicit.
Benefits of a `redirect.interceptURL`-like approach:
- No `regexFilter` required, hence no consequence on the `MAX_NUMBER_OF_REGEX_RULES` limit
- No need to convert `urlFilter`/`regexFilter` patterns to enable full URL capture for the sake of `regexSubstitution`
- … `redirect.extensionPath`
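
To make the pattern conversion mentioned above concrete, here is a minimal sketch of the kind of rewriting that is needed today and that an `interceptURL`-like property would make unnecessary. The `urlFilterToFullMatchRegex` helper is hypothetical and deliberately simplified: it only handles literal text and the `*` wildcard, not the `|`, `||`, or `^` tokens of `urlFilter`, and the regex it produces is not necessarily the one uBOL generates.

```javascript
// Hypothetical, simplified converter: literal text and `*` only.
function urlFilterToFullMatchRegex(urlFilter) {
  const body = urlFilter
    .split('*')
    // Escape regex metacharacters in each literal chunk.
    .map((chunk) => chunk.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    // A urlFilter wildcard becomes "any run of characters".
    .join('.*');
  // Pad both sides so the match spans the whole URL, which is what
  // `\0` in regexSubstitution will then expand to.
  return `^.*${body}.*$`;
}

const regexFilter = urlFilterToFullMatchRegex('/bdv_rd.dbm?ownid=');
console.log(regexFilter);
```

The resulting rule also counts against the regex rule limit, even though the original filter was a plain `urlFilter`.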