Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: $removeparam #4528

Open
wants to merge 26 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 22 commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
94f5868
feat: `$removeparam`
seia-soto Dec 13, 2024
29e9c39
fix: check `parametersRemoved` to prevent infinite loop
seia-soto Dec 13, 2024
3ec1181
fix: parsing and restoration
seia-soto Dec 13, 2024
fd43ad1
feat: removeparam without option value
seia-soto Dec 13, 2024
9497047
refactor: drop use of URLSearchParams
seia-soto Dec 25, 2024
56f90cc
feat: introduce `NetworkFilter.prototype.isRedirectable`
seia-soto Dec 25, 2024
47b1302
refactor: improve matching search param key
seia-soto Dec 25, 2024
7ea4488
test: removeparam
seia-soto Dec 27, 2024
29e9022
test: removeparam removes parameter sequentially
seia-soto Dec 27, 2024
3cb0861
fix(test): unintentional isRedirectable definition
seia-soto Dec 27, 2024
9c1e12b
refactor: use `URL` constructor
seia-soto Jan 6, 2025
c8da3bb
refactor: clarify `removeparam` getter
seia-soto Jan 6, 2025
533110e
refactor: move heavy calls to `before`
seia-soto Jan 14, 2025
1c07025
fix(test): invalid scopes
seia-soto Jan 15, 2025
90d051b
refactor: remove url transforms outside of before hook
seia-soto Jan 15, 2025
ec917aa
fix: removeparam exception
seia-soto Jan 16, 2025
c1db908
fix(test): make global removeparam work with malformed urls
seia-soto Jan 16, 2025
d48fe63
fix(test): correct bucket location
seia-soto Jan 16, 2025
88f35a1
fix: convert to unsigned int on set
seia-soto Jan 23, 2025
160febd
fix: convert to unsigned int on bit clear
seia-soto Jan 23, 2025
4040a89
fix: modify mask safely when applying cpt mask
seia-soto Jan 23, 2025
90ce5c2
fix: make mask unsigned at the end of parse
seia-soto Jan 23, 2025
da507f4
chore: drop unnecessary operation
seia-soto Jan 31, 2025
3c6236b
Merge branch 'master' into removeparam
chrmod Feb 6, 2025
09ea7e7
chore: drop unnecessary unsigned right shift by #4634
seia-soto Feb 10, 2025
76e6a51
feat: removeparam with request substitution bucket
seia-soto Feb 10, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
127 changes: 106 additions & 21 deletions packages/adblocker/src/engine/engine.ts
Original file line number Diff line number Diff line change
Expand Up @@ -776,10 +776,14 @@ export default class FilterEngine extends EventEmitter<EngineEventHandlers> {
} else if (filter.isGenericHide() || filter.isSpecificHide()) {
hideExceptions.push(filter);
} else if (filter.isException()) {
exceptions.push(filter);
if (filter.isRemoveParam()) {
redirects.push(filter);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we store removeparam as redirect rules? Aren't they of different nature?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're different but removeparam is implemented using redirection.

} else {
exceptions.push(filter);
}
} else if (filter.isImportant()) {
importants.push(filter);
} else if (filter.isRedirect()) {
} else if (filter.isRedirectable()) {
redirects.push(filter);
} else {
filters.push(filter);
Expand Down Expand Up @@ -1443,29 +1447,97 @@ export default class FilterEngine extends EventEmitter<EngineEventHandlers> {
if (request.isSupported) {
// Check the filters in the following order:
// 1. $important (not subject to exceptions)
// 2. redirection ($redirect=resource)
// 2. redirection ($removeparam and $redirect=resource)
// 3. normal filters
// 4. exceptions
result.filter = this.importants.match(request, this.isFilterExcluded.bind(this));

let redirectNone: NetworkFilter | undefined;
let redirectRule: NetworkFilter | undefined;

// 1st priority: @@||<entity>$removeparam
// 2nd priority: ||<entity>$removeparam
// 3rd priority: @@||<entity>$removeparam=x
// 4th priority: ||<entity>$removeparam=x
let removeparamExceptionFilter: NetworkFilter | undefined;
let redirectUrl: URL | undefined;

// If `result.filter` is `undefined`, it means there was no $important
// filter found so far. We look for a $redirect filter. There is some
// extra logic to handle special cases like redirect-rule and
// redirect=none.
//
// * If redirect=none is found, then cancel all redirects.
// * Else if redirect-rule is found, only redirect if request would be blocked.
// * Else if redirect is found, redirect.
// filter found so far. We look for $removeparam and $redirect filter.
if (result.filter === undefined) {
const redirects = this.redirects
.matchAll(request, this.isFilterExcluded.bind(this))
// highest priorty wins
.sort((a, b) => b.getRedirectPriority() - a.getRedirectPriority());
const redirectableFilters = this.redirects.matchAll(
request,
this.isFilterExcluded.bind(this),
);
const redirectFilters: NetworkFilter[] = [];
const removeparamFilters: Map<string, NetworkFilter> = new Map();
const removeparamExceptionFilters: Map<string, NetworkFilter> = new Map();
for (const filter of redirectableFilters) {
if (filter.isRemoveParam()) {
if (filter.isException()) {
removeparamExceptionFilters.set(filter.removeparam!, filter);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be possible to inform the type checker from isRemoveParam that filter.removeparam is not undefined/null.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's happening because I defined it as getter.

} else {
removeparamFilters.set(filter.removeparam!, filter);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to use removeparam as a key in a Set? What about other possible constraints like 1p/3p, or request type such as script/xhr/etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those conditions (1p, 3p, and request types) should be handled in the network filter matching process.

}
} else {
redirectFilters.push(filter);
}
}

// We don't need to match `removeparam` at all if:
// * `removeparam` filters not matched
// * URL doesn't have search parameter
if (removeparamFilters.size !== 0) {
redirectUrl = new URL(request.url);
if (redirectUrl.searchParams.size !== 0) {
for (const [key, filter] of removeparamFilters) {
// Remove all params in case of option value is empty
// We will not match individual exceptions since it has a higher priority than them
if (key === '') {
result.filter = filter;
removeparamExceptionFilter = removeparamExceptionFilters.get('');

// In case of non-existence of global exception, we will remove params
if (removeparamExceptionFilter === undefined) {
redirectUrl.search = '';
}

break;
}

if (redirectUrl.searchParams.has(key)) {
result.filter = filter;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems invalid in case there is more than one removeparam filter applying to the URL. I also think that in terms of semantics, the result.filter means a blocking filter, but removeparam filters aren't blocking requests. Maybe would it make more sense to add an additional key on result with a list of removeparams (or if we want to generalize, a list of URL/request modifiers)

removeparamExceptionFilter =
removeparamExceptionFilters.get('') ?? removeparamExceptionFilters.get(key);

redirectUrl.searchParams.delete(key);

// Stop if
// * the filter was effective
// * the filter was excepted with global exception (1st priority)
if (
removeparamExceptionFilter === undefined ||
removeparamExceptionFilter.removeparam === ''
) {
break;
}
}
}
}
}

// If `result.filter` still remains `undefined`, there
// is some extra logic to handle special cases like
// redirect-rule and redirect=none.
//
// * If redirect=none is found, then cancel all redirects.
// * Else if redirect-rule is found, only redirect if request would be blocked.
// * Else if redirect is found, redirect.
if (result.filter === undefined && redirectFilters.length !== 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the rational behind giving priority to removeparam over redirects? Wouldn't it make sense to do it the other way around? A redirect rule is a block rule + a redirect rule (as per uBO semantics). Hence if a request is blocked it should have priority over keeping the request and remove some of the parameters.

const redirects = redirectFilters
// highest priorty wins
.sort((a, b) => b.getRedirectPriority() - a.getRedirectPriority());

if (redirects.length !== 0) {
for (const filter of redirects) {
if (filter.getRedirectResource() === 'none') {
redirectNone = filter;
Expand Down Expand Up @@ -1503,17 +1575,30 @@ export default class FilterEngine extends EventEmitter<EngineEventHandlers> {
// If there was a redirect match and no exception was found, then we
// proceed and process the redirect rule. This means two things:
//
// 1. Check if a redirect=none rule was found, which acts as exception.
// 2. If no exception was found, prepare `result.redirect` response.
// 1. Check if there's a removeparam exception was found, which acts as exception.
// 2. Check if a redirect=none rule was found, which acts as exception.
// 3. If no exception was found, prepare `result.redirect` response.
if (
result.filter !== undefined &&
result.exception === undefined &&
result.filter.isRedirect()
result.filter.isRedirectable()
) {
if (redirectNone !== undefined) {
result.exception = redirectNone;
if (result.filter.isRemoveParam()) {
if (removeparamExceptionFilter === undefined) {
result.redirect = {
body: '',
contentType: 'text/plain',
dataUrl: redirectUrl!.toString(),
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a bit counter-intuitive to me to reuse the dataUrl in order to express the removeparam result. Why not return a list of URL modifiers to be interpreted by the client code and adapted based on available capability of the platform? (iOS, Manifest v2, Manifest v3, might not have the best way to modify request parameters) Ideally the adblocker library should not need to think about these considerations and return a more "abstract" kind of response.

} else {
result.exception = removeparamExceptionFilter;
}
} else {
result.redirect = this.resources.getResource(result.filter.getRedirectResource());
if (redirectNone !== undefined) {
result.exception = redirectNone;
} else {
result.redirect = this.resources.getResource(result.filter.getRedirectResource());
}
}
}
}
Expand Down
37 changes: 37 additions & 0 deletions packages/adblocker/src/filters/network.ts
Original file line number Diff line number Diff line change
Expand Up @@ -153,6 +153,7 @@ export const enum NETWORK_FILTER_MASK {
isHostnameAnchor = 1 << 28,
isRedirectRule = 1 << 29,
isRedirect = 1 << 30,
isRemoveParam = 1 << 31,
// IMPORTANT: the mask is now full, no more options can be added
// Consider creating a separate fitler type for isReplace if a new
// network filter option is needed.
Expand Down Expand Up @@ -887,6 +888,16 @@ export default class NetworkFilter implements IFilter {
mask = setBit(mask, NETWORK_FILTER_MASK.isReplace);
optionValue = value;

break;
case 'removeparam':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. Since it would be invalid to have filters with a combination of removeparam and another optionValue, can we add a check to reject filters that try to combine at least two of these options?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's good idea, I think we can do it in a separate PR instead. removeparam is not only the filter option involving optionValue.

// TODO: Support regex
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about ~ negation supported by AdGuard? I see it might be handled below but maybe worth documenting the limitation (and having a unit test for each)

if (negation || value.startsWith('/')) {
seia-soto marked this conversation as resolved.
Show resolved Hide resolved
return null;
}

mask = setBit(mask, NETWORK_FILTER_MASK.isRemoveParam);
optionValue = value;

break;
default: {
// Handle content type options separatly
Expand Down Expand Up @@ -1151,6 +1162,8 @@ export default class NetworkFilter implements IFilter {
}
}

mask >>>= 0;
seia-soto marked this conversation as resolved.
Show resolved Hide resolved
seia-soto marked this conversation as resolved.
Show resolved Hide resolved

return new NetworkFilter({
filter,
hostname,
Expand Down Expand Up @@ -1265,6 +1278,14 @@ export default class NetworkFilter implements IFilter {
return this.optionValue;
}

public get removeparam(): string | undefined {
if (!this.isRemoveParam()) {
return undefined;
}

return this.optionValue;
}

public isCosmeticFilter(): this is CosmeticFilter {
return false;
}
Expand Down Expand Up @@ -1535,6 +1556,14 @@ export default class NetworkFilter implements IFilter {
options.push('badfilter');
}

if (this.isRemoveParam()) {
if (this.removeparam!.length > 0) {
seia-soto marked this conversation as resolved.
Show resolved Hide resolved
options.push(`removeparam=${this.optionValue}`);
Comment on lines +1565 to +1567
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. Maybe we don't need to mix the use of three different methods/getters.

Suggested change
if (this.isRemoveParam()) {
if (this.removeparam!.length > 0) {
options.push(`removeparam=${this.optionValue}`);
const removeparam = this.removeparam;
if (removeparam !== undefined) {
if (removeparam.length > 0) {
options.push(`removeparam=${removeparam}`);

} else {
options.push('removeparam');
seia-soto marked this conversation as resolved.
Show resolved Hide resolved
}
}

if (options.length > 0) {
if (typeof modifierReplacer === 'function') {
filter += `$${options.map(modifierReplacer).join(',')}`;
Expand Down Expand Up @@ -1608,6 +1637,14 @@ export default class NetworkFilter implements IFilter {
return getBit(this.getMask(), NETWORK_FILTER_MASK.isReplace);
}

public isRemoveParam(): boolean {
return getBit(this.getMask(), NETWORK_FILTER_MASK.isRemoveParam);
}

public isRedirectable(): boolean {
return this.isRedirect() || this.isRemoveParam();
}

// Expected to be called only with `$replace` modifiers
public getHtmlModifier(): HTMLModifier | null {
// Empty `$replace` modifier is to disable all replace modifiers on exception
Expand Down
2 changes: 1 addition & 1 deletion packages/adblocker/src/utils.ts
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ export function getBit(n: number, mask: number): boolean {
}

export function setBit(n: number, mask: number): number {
return n | mask;
return (n | mask) >>> 0;
}

export function clearBit(n: number, mask: number): number {
Expand Down
Loading
Loading