[FormAnalyzer] scoring password hints #720
Is there any risk here that this causes a performance issue when text content is very large?
Potentially! I'm actually using this existing utility: duckduckgo-autofill/src/Form/matching.js, line 836 in 85b0b7b.
I've added a cutoff limit of 1000, and had to modify the utility a bit so it accepts a custom cutoff limit.
66bfaad#diff-29e05a441db72f7ca04df5e51bdccfcdebc988497045a25fd89fc587247c4ed2R836
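For readers without the diff open, here is a minimal sketch of what a whitespace-collapsing utility with a configurable cutoff could look like. It is not the code from `src/Form/matching.js`: the name `collapseWhitespace`, the constant `DEFAULT_TEXT_LENGTH_CUTOFF`, and the choice to return an empty string for over-long input are all assumptions made for illustration.

```javascript
// Hypothetical sketch, NOT the actual utility from src/Form/matching.js.
// The default of 100 mirrors the TEXT_LENGTH_CUTOFF value mentioned in this thread.
const DEFAULT_TEXT_LENGTH_CUTOFF = 100;

/**
 * Collapse runs of whitespace into single spaces.
 * Strings longer than `cutoff` are skipped (assumed behaviour: return '')
 * so that no regex ever runs on very large text content.
 * @param {string} text
 * @param {number} cutoff
 * @returns {string}
 */
function collapseWhitespace(text = '', cutoff = DEFAULT_TEXT_LENGTH_CUTOFF) {
    const trimmed = text.trim();
    // Bail out early on empty or over-long input
    if (!trimmed || trimmed.length > cutoff) return '';
    return trimmed.replace(/\s+/g, ' ');
}

// Callers that need longer text (such as password hints) pass their own cutoff
const hint = collapseWhitespace('  At least 8 characters,\n\n   one number  ', 1000);
```

The point of the extra parameter is that existing callers keep the short default while the hint scoring can opt in to a larger limit.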
@dbajpeyi Do we have a sense if 1000 is a solid number to rely on? (too big, too small, just right?)
In #720 (comment) it looks like the Apple form has 1800+ characters (mostly whitespace), so 1000 might be too conservative given it won't be run on that form. However, this question comes out of not having a good sense of where that balance lies between performance and usefulness -- do we have data we can lean on? (e.g. we've a bunch of forms in this repo -- what character size are they generally? Or do we have a rule of thumb on the largest size of string we want to run a regex on based on how often this code executes?)
`removeExcessWhitespaces` calls from the test run:
With the 1000 limit, it is called on ~196/555 (35%) of the forms.
With the 1800 limit, it is called on ~233/555 (41%) of the forms in our suite, which is only a bit more.
I see a small difference in test suite run time (~10.5s -> 11.4s). I wouldn't say it's that bad, though.
The hints are not scored with the 1000 limit for the Apple form, but that one is already scored fairly strongly (score: 9), and the other forms have still improved with 1000 (1 new and 2 existing forms). To me this seems like a good enough improvement, and I'd rather stay on the lower side. If we see more examples, we can increase the threshold to accommodate them. But I think 1000 is a good threshold for now, as it covers most forms that can be improved without much overhead.
We currently cut texts to `TEXT_LENGTH_CUTOFF`, which is set to 100. But in this case, I think we want to go longer. I don't see a specific reason for the 100 number; I think it was set arbitrarily.
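The comparison above can be reproduced with a small helper that counts how many collected form texts fall within each candidate cutoff. This is only a sketch under assumptions: `coverageByCutoff` is a hypothetical name, and the input is assumed to be an array of text-content strings gathered from the test forms, which is not necessarily how the numbers above were measured.

```javascript
// Hypothetical helper for comparing candidate cutoffs; not part of the repo.
/**
 * For each cutoff, count the non-empty texts whose trimmed length fits within it.
 * @param {string[]} texts - text content collected from test forms (assumed input)
 * @param {number[]} cutoffs - candidate limits, e.g. [750, 1000, 1800]
 */
function coverageByCutoff(texts, cutoffs) {
    return cutoffs.map((cutoff) => {
        const covered = texts.filter((text) => {
            const length = text.trim().length;
            return length > 0 && length <= cutoff;
        }).length;
        const share = texts.length ? covered / texts.length : 0;
        return { cutoff, covered, total: texts.length, share };
    });
}

// Made-up sample lengths, for illustration only
const sampleTexts = ['a'.repeat(500), 'b'.repeat(900), 'c'.repeat(1500), 'd'.repeat(2000)];
const report = coverageByCutoff(sampleTexts, [750, 1000, 1800]);
```

Running this against the real form fixtures would show how much coverage each extra step in the cutoff buys, which can then be weighed against the change in test run time.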
Gave it another thought and did some more test runs, and I noticed slightly slower tests vs `main`. I am sticking to 750 now, which still works for the 3 forms I mentioned: 642d018.