feat: $removeparam #4528

Open
wants to merge 24 commits into
base: master

@seia-soto (Member) commented Dec 13, 2024

This adds experimental $removeparam support for adblocker: https://github.com/gorhill/ublock/wiki/static-filter-syntax#removeparam

With this PR, the proposed matching order is:

  1. $important (not subject to exceptions)
  2. redirection ($removeparam and $redirect=resource)
  3. normal filters
  4. exceptions

The removeparam filter priorities are listed below and are also encoded in the exception test:

  • s1) global removal + exception => global removal wins
@@||example.com$removeparam=utm
||example.com$removeparam

result.filter=global removal
result.exception=(none)

  • s2) removal + global exception => exception wins
||example.com$removeparam=utm
@@||example.com$removeparam

result.filter=removal
result.exception=global exception

  • s3) removal + exception => exception wins
||example.com$removeparam=utm
@@||example.com$removeparam=utm

result.filter=removal
result.exception=exception


Resulting priority, highest first:

  1. global exception
  2. global removal
  3. exception
  4. removal
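
The three scenarios and the priority list above can be sketched as a small resolver. This is only an illustration under stated assumptions: `ParamFilter`, `Resolution`, and `resolveRemoveParam` are hypothetical names, not the library's API, and the input is assumed to contain only filters that already matched the request.

```typescript
// Hypothetical, simplified filter shape; not the library's NetworkFilter.
interface ParamFilter {
  exception: boolean; // true for `@@` filters
  param: string | null; // null models a bare `$removeparam` (global)
}

interface Resolution {
  filter?: ParamFilter;
  exception?: ParamFilter;
}

// Assumes `filters` only holds filters that already matched the request.
function resolveRemoveParam(filters: ParamFilter[]): Resolution {
  const globalException = filters.find((f) => f.exception && f.param === null);
  const globalRemoval = filters.find((f) => !f.exception && f.param === null);
  const removal = filters.find((f) => !f.exception && f.param !== null);

  // 1. A global exception overrides any removal (s2).
  if (globalException !== undefined) {
    return { filter: globalRemoval ?? removal, exception: globalException };
  }
  // 2. A global removal wins over parameter-specific exceptions (s1).
  if (globalRemoval !== undefined) {
    return { filter: globalRemoval };
  }
  // 3. A specific exception cancels the removal of the same parameter (s3).
  const exception = filters.find(
    (f) => f.exception && f.param !== null && f.param === removal?.param,
  );
  // 4. Otherwise the plain removal applies on its own.
  return { filter: removal, exception };
}

const utmRemoval: ParamFilter = { exception: false, param: 'utm' };
const utmException: ParamFilter = { exception: true, param: 'utm' };
const bareRemoval: ParamFilter = { exception: false, param: null };
const bareException: ParamFilter = { exception: true, param: null };

const s1 = resolveRemoveParam([utmException, bareRemoval]);
const s2 = resolveRemoveParam([utmRemoval, bareException]);
const s3 = resolveRemoveParam([utmRemoval, utmException]);
```

Here `s1`, `s2`, and `s3` correspond to the scenarios of the same name; in `s1` the `exception` field stays undefined because the global removal wins.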

TODO

  • Parsing regex as value (in separate PR with refactor and clean up of regex util funcs)

@seia-soto added the PR: New Feature 🚀 Increment minor version when merged label Dec 13, 2024
@seia-soto self-assigned this Dec 13, 2024
@seia-soto added the WIP label Dec 13, 2024
@chrmod (Member) left a comment

@seia-soto (Member Author)

Note: we're going to apply filters one by one

@seia-soto removed the WIP label Dec 25, 2024
@seia-soto (Member Author)

Leaving a note: regexp wrapping: [\\?&](${value!.slice(1, flagSeparatorIndex)}[^=]*)
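
As a rough illustration of the wrapping in the note above, the sketch below builds a `RegExp` from a `/pattern/flags` value and captures the matching parameter name. The helper name `wrapRemoveParamRegex`, the way `flagSeparatorIndex` is computed, and the sample value are assumptions; regex values are not supported by this PR.

```typescript
// Hypothetical helper; the `/pattern/flags` input form is an assumption.
function wrapRemoveParamRegex(value: string): RegExp {
  // Assumed: the flags start after the last slash of the value.
  const flagSeparatorIndex = value.lastIndexOf('/');
  const source = value.slice(1, flagSeparatorIndex);
  const flags = value.slice(flagSeparatorIndex + 1);
  // Anchor on `?` or `&`, then capture the parameter name up to `=`.
  return new RegExp(`[\\?&](${source}[^=]*)`, flags);
}

const wrapped = wrapRemoveParamRegex('/utm_/i');
const found = wrapped.exec('https://example.com/?a=1&UTM_source=x');
// Captured group 1 holds the full parameter name, if any.
const matchedName = found === null ? undefined : found[1];
```

Note that a pattern starting with `^` would never match once wrapped this way, because the wrapper always consumes a leading `?` or `&` first.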

test: add `isRemoveParam` and `isRedirectable`

chore: reject on `removeparam` negation

test: parsing removeparam

test: removeparam

test: removeparam
@seia-soto marked this pull request as ready for review December 27, 2024 11:03
@seia-soto requested a review from remusao as a code owner December 27, 2024 11:03
@seia-soto requested a review from chrmod December 27, 2024 11:03
packages/adblocker/src/filters/network.ts (resolved)
packages/adblocker/src/filters/network.ts (outdated, resolved)
packages/adblocker/src/filters/network.ts (resolved)
packages/adblocker/test/parsing.test.ts (resolved)
packages/adblocker/src/engine/engine.ts (outdated, resolved)
@seia-soto requested a review from remusao January 6, 2025 06:02
@chrmod (Member) left a comment

Does this support exceptions like $removeparam=igshid,domain=instagram.com|threads.net?

packages/adblocker/src/filters/network.ts (resolved)
@seia-soto requested a review from chrmod January 16, 2025 08:10
packages/adblocker/test/engine/engine.test.ts (resolved)
@@ -1151,6 +1162,8 @@ export default class NetworkFilter implements IFilter {
}
}

mask >>>= 0;
Member Author

@chrmod @remusao Should we move this operator to the constructor or somewhere else? For safety, we need to make sure the mask is a uint32 before it gets serialized. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Unsigned_right_shift_assignment
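
A minimal sketch of why the coercion matters: JavaScript bitwise operators work on signed 32-bit integers, so setting bit 31 yields a negative number until `>>>= 0` reinterprets it as unsigned. The choice of bit 31 and the variable names are for illustration only.

```typescript
// Bit 31 is an arbitrary choice that triggers the sign issue.
const HIGH_BIT = 1 << 31;

let mask = 0;
mask |= HIGH_BIT;
// Bitwise OR returns a signed 32-bit integer, so the value is negative here.
const signedMask = mask;

// Unsigned right shift by zero reinterprets the same bits as a uint32.
mask >>>= 0;
const unsignedMask = mask;
```

After this runs, `signedMask` is -2147483648 and `unsignedMask` is 2147483648, which is the value a uint32 serializer would expect.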

Member

@seia-soto can this be removed now?

@chrmod (Member) left a comment

Let's move the bit-operation safety changes out of this PR. Otherwise, it looks good to me.

@seia-soto (Member Author)

Waiting for #4634

@seia-soto marked this pull request as draft January 31, 2025 10:24
@chrmod marked this pull request as ready for review February 6, 2025 16:53
@remusao (Collaborator) left a comment

The AdGuard documentation mentions:

"Please note that such rules are only applied to GET, HEAD, OPTIONS, and sometimes POST requests."

Where do we intend to perform those checks?

Comment on lines +779 to +780
if (filter.isRemoveParam()) {
  redirects.push(filter);
Collaborator

Why do we store removeparam as redirect rules? Aren't they of different nature?

for (const filter of redirectableFilters) {
  if (filter.isRemoveParam()) {
    if (filter.isException()) {
      removeparamExceptionFilters.set(filter.removeparam!, filter);
Collaborator

I think it might be possible to inform the type checker from isRemoveParam that filter.removeparam is not undefined/null.
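
One way this could look is a `this`-based type predicate, which lets the compiler narrow `removeparam` and makes the `!` assertion unnecessary. The class below is a hypothetical stand-in, not the real `NetworkFilter`; only the method name `isRemoveParam` and the `removeparam` property come from the diff.

```typescript
// Hypothetical stand-in for NetworkFilter with only the relevant members.
class SketchFilter {
  readonly removeparam: string | undefined;

  constructor(removeparam: string | undefined) {
    this.removeparam = removeparam;
  }

  // The predicate tells the type checker that `removeparam` is a string
  // whenever this method returns true.
  isRemoveParam(): this is this & { removeparam: string } {
    return this.removeparam !== undefined;
  }
}

const removeparamFilters = new Map<string, SketchFilter>();
const candidates = [new SketchFilter('utm'), new SketchFilter(undefined)];

for (const filter of candidates) {
  if (filter.isRemoveParam()) {
    // No non-null assertion needed: `filter.removeparam` is typed as string.
    removeparamFilters.set(filter.removeparam, filter);
  }
}
```

The narrowing only holds inside the guarded branch, so call sites that skip the `isRemoveParam()` check would still see `string | undefined`.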

if (filter.isException()) {
  removeparamExceptionFilters.set(filter.removeparam!, filter);
} else {
  removeparamFilters.set(filter.removeparam!, filter);
Collaborator

Is it safe to use removeparam as a key in a Map? What about other possible constraints, such as 1p/3p or the request type (script, xhr, etc.)?
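
The concern can be illustrated with a small sketch: when two filters share a parameter name but differ in another constraint, a `Map` keyed by the name alone keeps only the last one. The `KeyedFilter` shape and its `requestType` field are hypothetical, and grouping into arrays is just one possible alternative, not what the PR does.

```typescript
// Hypothetical shape; `requestType` stands in for any extra constraint.
interface KeyedFilter {
  removeparam: string;
  requestType: 'script' | 'xhr';
}

const sameNameFilters: KeyedFilter[] = [
  { removeparam: 'utm', requestType: 'script' },
  { removeparam: 'utm', requestType: 'xhr' },
];

// Keyed by name only: the second filter silently replaces the first.
const byName = new Map<string, KeyedFilter>();
for (const f of sameNameFilters) {
  byName.set(f.removeparam, f);
}

// Possible alternative: keep every filter that shares a name.
const groupedByName = new Map<string, KeyedFilter[]>();
for (const f of sameNameFilters) {
  const bucket = groupedByName.get(f.removeparam);
  if (bucket === undefined) {
    groupedByName.set(f.removeparam, [f]);
  } else {
    bucket.push(f);
  }
}
```

Whether the collision can happen in practice depends on whether both filters reach this loop for the same request, which the excerpt above does not show.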

// * If redirect=none is found, then cancel all redirects.
// * Else if redirect-rule is found, only redirect if request would be blocked.
// * Else if redirect is found, redirect.
if (result.filter === undefined && redirectFilters.length !== 0) {
Collaborator

What is the rationale behind giving removeparam priority over redirects? Wouldn't it make sense to do it the other way around? A redirect rule is a block rule plus a redirect rule (as per uBO semantics). Hence, blocking the request should take priority over keeping the request and removing some of its parameters.

@@ -888,6 +889,16 @@ export default class NetworkFilter implements IFilter {
mask = setBit(mask, NETWORK_FILTER_MASK.isReplace);
optionValue = value;

break;
case 'removeparam':
// TODO: Support regex
Collaborator

How about the ~ negation supported by AdGuard? I see it might be handled below, but it may be worth documenting the limitation (and having a unit test for each).

}

if (redirectUrl.searchParams.has(key)) {
result.filter = filter;
Collaborator

That seems invalid when more than one removeparam filter applies to the URL. I also think that, in terms of semantics, result.filter means a blocking filter, but removeparam filters don't block requests. Maybe it would make more sense to add an additional key on result with a list of removeparams (or, if we want to generalize, a list of URL/request modifiers).

Comment on lines +686 to +690
for (let i = 0; i < params.length; i++) {
  const { redirect } = engine.match(request);
  expect(redirect).not.to.be.undefined;
  request = urlToDocumentRequest(redirect!.dataUrl);
}
Collaborator

It is surprising to me that we need to call .match(...) multiple times to remove all params. Shouldn't a single call to .match(...) remove all matching params in one pass?
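
A single-pass removal could look roughly like the sketch below, which uses the standard `URL` and `URLSearchParams` APIs. The function name `removeParams` is hypothetical, and the list of parameter names is assumed to have been collected from all matching filters beforehand.

```typescript
// Hypothetical helper: strips every listed parameter in one pass.
function removeParams(url: string, names: Iterable<string>): string {
  const parsed = new URL(url);
  for (const name of names) {
    // `delete` removes all occurrences of the parameter and is a no-op
    // when the parameter is absent.
    parsed.searchParams.delete(name);
  }
  return parsed.toString();
}

const cleanedUrl = removeParams(
  'https://example.com/?utm_source=a&id=1&utm_medium=b',
  ['utm_source', 'utm_medium'],
);
```

With this input, `cleanedUrl` is `https://example.com/?id=1`, obtained without re-running the matcher once per parameter.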

Member Author

We didn't know how to return all the matched filters, since the return type allows only one filter. Unless we change the shape of the result, match cannot return every matched filter.

Comment on lines +1588 to +1592
result.redirect = {
  body: '',
  contentType: 'text/plain',
  dataUrl: redirectUrl!.toString(),
};
Collaborator

It seems a bit counter-intuitive to me to reuse the dataUrl in order to express the removeparam result. Why not return a list of URL modifiers to be interpreted by the client code and adapted based on available capability of the platform? (iOS, Manifest v2, Manifest v3, might not have the best way to modify request parameters) Ideally the adblocker library should not need to think about these considerations and return a more "abstract" kind of response.
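
A more abstract result could be sketched as a list of modifiers that the client interprets according to what its platform allows. Everything here is hypothetical (`UrlModifier`, `collectModifiers`, `applyModifiers`); it only illustrates the separation between deciding what to change and performing the change.

```typescript
// Hypothetical modifier type; more kinds could be added to the union later.
interface RemoveParamModifier {
  kind: 'removeparam';
  name: string;
}
type UrlModifier = RemoveParamModifier;

// Engine side (sketch): report what should change, without rewriting the URL.
function collectModifiers(url: string, paramNames: string[]): UrlModifier[] {
  const { searchParams } = new URL(url);
  const modifiers: UrlModifier[] = [];
  for (const name of paramNames) {
    if (searchParams.has(name)) {
      modifiers.push({ kind: 'removeparam', name });
    }
  }
  return modifiers;
}

// Client side (sketch): one possible interpretation of the modifiers.
function applyModifiers(url: string, modifiers: UrlModifier[]): string {
  const parsed = new URL(url);
  for (const modifier of modifiers) {
    if (modifier.kind === 'removeparam') {
      parsed.searchParams.delete(modifier.name);
    }
  }
  return parsed.toString();
}

const requestUrl = 'https://example.com/?utm=1&id=2';
const modifiers = collectModifiers(requestUrl, ['utm', 'ref']);
const rewrittenUrl = applyModifiers(requestUrl, modifiers);
```

A client that cannot rewrite URLs could ignore `modifiers` or translate them into whatever mechanism its platform provides, while the engine stays unaware of those constraints.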

Collaborator

Can we add more tests for corner cases? For example: if we have a global $removeparam and also @@||example.org$removeparam=utm, do we remove all parameters or skip utm?

Collaborator

Let's add a test case covering the regexp removeparam case (unsupported for now).

3 participants