$removeparam doesn't work well with UrlEncoded gb2312 Chinese word #1717
Comments
I can reproduce. The issue seems to arise when parsing the query parameters with URLSearchParams. This is surprising; I would expect this API to properly handle encoded query values. So, if I understand correctly, the URL is encoded using the page encoding.
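The behavior described in this comment can be reproduced outside the extension. The sketch below is only an illustration of how URLSearchParams treats the two query strings from the report, not uBO's actual code path; the helper name removeParamViaSearchParams is made up for this example.

```javascript
// Hypothetical helper: remove a parameter by round-tripping the query
// string through URLSearchParams, which decodes every value as UTF-8
// and re-encodes it on serialization.
function removeParamViaSearchParams(query, name) {
  const params = new URLSearchParams(query);
  params.delete(name);
  return params.toString();
}

// gb2312-encoded value from the report: the bytes are not valid UTF-8,
// so each one is replaced by U+FFFD and re-encoded as %EF%BF%BD.
const gb2312Result = removeParamViaSearchParams('wd=%D6%D0%CE%C4&oq=test', 'oq');

// UTF-8-encoded value from the report: survives the round trip unchanged.
const utf8Result = removeParamViaSearchParams('wd=%E4%B8%AD%E6%96%87&oq=test', 'oq');

console.log(gb2312Result); // wd=%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD
console.log(utf8Result);   // wd=%E4%B8%AD%E6%96%87
```

The corruption happens during parsing, before the oq parameter is deleted, which is why the untouched wd value is the one that changes.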
The fix is in 1.37.3rc2, but the rc1 dev build is still pending review in the Chrome store, so I don't know when rc2 will be available there.
@MkQtS where did you originally get this gb2312-encoded "中文"?
It looks like a similar (I'm not sure if the protocol is to file a new ticket, but I thought I'd at least mention it here first for confirmation.) |
I am pretty sure this is a site issue, since the site appears to custom-encode some parts of the URL that should be encoded using encodeURIComponent(). Still, uBO will have to be ready for decodeURIComponent() failing on invalid input.
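To make the point about invalid input concrete: decodeURIComponent() throws a URIError when the percent-encoded bytes are not valid UTF-8. A minimal defensive sketch follows; safeDecodeURIComponent is a hypothetical helper for illustration and is not taken from uBO's source.

```javascript
// Hypothetical helper: decode a percent-encoded component, falling back
// to the raw, still-encoded text when the bytes are not valid UTF-8.
function safeDecodeURIComponent(raw) {
  try {
    return decodeURIComponent(raw);
  } catch (err) {
    if (err instanceof URIError) {
      return raw;
    }
    throw err;
  }
}

// UTF-8 bytes decode normally.
const utf8Decoded = safeDecodeURIComponent('%E4%B8%AD%E6%96%87');

// gb2312 bytes make decodeURIComponent() throw, so the raw text is kept.
const gb2312Decoded = safeDecodeURIComponent('%D6%D0%CE%C4');

console.log(utf8Decoded);   // 中文
console.log(gb2312Decoded); // %D6%D0%CE%C4
```

Returning the raw text is only one possible fallback; the important part is that the exception is handled rather than allowed to abort the filtering logic.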
@gwarser I am Chinese. A long time ago, I found that sometimes Chinese characters would be shown as
@MkQtS I ask because I tried various ways and I always get the correct UTF-8 encoding. There is even
Prerequisites
I tried to reproduce the issue when...
Description
The URL is
https://www.baidu.com/s?wd=%D6%D0%CE%C4&oq=test
Here, %D6%D0%CE%C4 is the URL-encoded form of the Chinese word 中文, encoded with gb2312.
After I add a filter like this:
||baidu.com/s^$removeparam=oq
the URL becomes
https://www.baidu.com/s?wd=%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD
which makes no sense.
However, it works well for
https://www.baidu.com/s?wd=%E4%B8%AD%E6%96%87&oq=test
where %E4%B8%AD%E6%96%87 is also a URL-encoded form of 中文, this time encoded with UTF-8. With the same filter applied, the URL becomes
https://www.baidu.com/s?wd=%E4%B8%AD%E6%96%87
I am not sure whether this is related to the system or the browser. When I type exactly
https://www.baidu.com/s?wd=%E4%B8%AD%E6%96%87&oq=test
the Omnibox shows
https://www.baidu.com/s?wd=中文&oq=test
When I type
https://www.baidu.com/s?wd=%D6%D0%CE%C4&oq=test
the Omnibox shows it unchanged:
https://www.baidu.com/s?wd=%D6%D0%CE%C4&oq=test
A specific URL where the issue occurs
https://www.baidu.com/s?wd=%D6%D0%CE%C4&oq=test
Steps to Reproduce
1. Open
https://www.baidu.com/s?wd=%D6%D0%CE%C4&oq=test
2. Add a filter:
||baidu.com/s^$removeparam=oq
3. Refresh the page.
Expected behavior
The URL becomes
https://www.baidu.com/s?wd=%D6%D0%CE%C4
Actual behavior
The URL becomes
https://www.baidu.com/s?wd=%EF%BF%BD%EF%BF%BD%EF%BF%BD%EF%BF%BD
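One way to obtain the expected result is to drop the unwanted parameter from the raw query string without ever decoding the remaining values. The sketch below illustrates that idea only; it is not the fix that shipped in uBO. removeParamRaw is a hypothetical helper, and it assumes the parameter name is plain ASCII that appears unencoded in the URL.

```javascript
// Hypothetical helper: remove a query parameter by name while leaving
// every other parameter byte-for-byte as it appeared in the URL.
// Assumption: `name` is plain ASCII and is not percent-encoded in `url`.
function removeParamRaw(url, name) {
  const qPos = url.indexOf('?');
  if (qPos === -1) {
    return url;
  }
  // Keep any fragment out of the query processing.
  const hashPos = url.indexOf('#', qPos);
  const end = hashPos === -1 ? url.length : hashPos;
  const kept = url
    .slice(qPos + 1, end)
    .split('&')
    .filter(pair => {
      const eq = pair.indexOf('=');
      const rawKey = eq === -1 ? pair : pair.slice(0, eq);
      return rawKey !== name;
    });
  const query = kept.length > 0 ? '?' + kept.join('&') : '';
  return url.slice(0, qPos) + query + url.slice(end);
}

const cleaned = removeParamRaw('https://www.baidu.com/s?wd=%D6%D0%CE%C4&oq=test', 'oq');
console.log(cleaned); // https://www.baidu.com/s?wd=%D6%D0%CE%C4
```

Because the wd value is never decoded, its gb2312 bytes pass through untouched, which matches the expected behavior stated above.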
uBlock Origin version
1.37.2
Browser name and version
Chrome 93.0.4577.63
Operating System and version
Windows 10, 21H1