Update Parser Regex #43

Merged
merged 5 commits into from
Jul 28, 2022
Conversation

peterjan
Collaborator

PULL REQUEST

Overview

This PR updates the parser regex to catch a skylink that was missed by the parser in an abuse report.
I've updated the regexes and the parser so that they detect the missed skylink format, along with other formats that were undetectable until now, without causing any false positives. We definitely do not want false positives, since we have automated replies now.

The skylink was reported in this format:
https:// siasky [.]netEABg4mZrsNcedNPazZ4kSFAYBzf7f8ZgHO1Tu1L-NN8Gjg
It's hard to believe the scanner didn't catch this, but the link is missing the / after net.
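
As a rough illustration of the missing-slash case, here is a minimal Python sketch. It is not the regex merged in this PR. It assumes a skylink is 46 URL-safe base64 characters, and the names PORTAL_URL and extract_portal_skylinks are made up for the example. The only point is that making the / after net optional lets the reported form match.

```python
import re

# Assumption: a skylink is 46 URL-safe base64 characters.
SKYLINK = r"[A-Za-z0-9_-]{46}"

# Hypothetical pattern, not the one merged in this PR: the portal domain,
# optionally defanged as "siasky [.]net", an optional "/", then the skylink.
# The trailing lookahead rejects candidates that run on past 46 characters.
PORTAL_URL = re.compile(
    r"siasky\s*\[?\.\]?\s*net/?(" + SKYLINK + r")(?![A-Za-z0-9_-])"
)


def extract_portal_skylinks(text):
    """Return every skylink that directly follows the portal domain."""
    return [m.group(1) for m in PORTAL_URL.finditer(text)]


reported = "https:// siasky [.]netEABg4mZrsNcedNPazZ4kSFAYBzf7f8ZgHO1Tu1L-NN8Gjg"
print(extract_portal_skylinks(reported))
```

The same function also accepts the ordinary form with the slash present, and returns an empty list when the domain appears without a skylink after it.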

Extracting skylinks from these texts is harder than it seems, because the email body sometimes contains headers that look like this:

X-UI-Out-Filterresults: notjunk:1;V03:K0:sQbC5Bf/7VA=:BVBvnd1QjaGT0MiZL1Ho9A
	 IfQpxAOa2PG7BhMwdjkSKRkIi/0Xi320ptoRVrfdAAfeBr+OlbE7g1lSC70AY1aq/+Fpbv4wK
	 3w2N9ynN89sZ8DCaJdB7ly3XgvTsG63gsWdX8Qx0neby0Ej1pajsGSgib3Zm8tezcKH7kM+uH
	 8vULEwVR983S1CyJCBaD2LqZ2TmObmdS+5OJ/edFn2tq2WoPNrpgdm2AFO0gTOwQJ7h7ZG7Cw
	 C51GLljzSwED8mirSv3crcZeIBAS1Id6HFLPoaPWp4PveU/v0K8KtULYo7z19AK6hQgwViBiU
	 Xq2l7J/I405Ww4d83HRzSQk5RYrUot3RK7Z1kuWHlS2xZrnuwbD/O/2jZ1wqm8ODWogMHSGkU
	 I98W13ylJ0OsjeGFO+nsutUv3MjInhjUV3BBvOsnOMPOEOB6O6XEm1wr4UtjHcc9NUBPBvNh9
	 H+gscpw0FrvBbZa+9XSyucw0nXv8ux6AcRDIkceD/k7QPuQ9qF7tieTcu08DuYDQn9NyBefCl
	 RgFTNK0mc/IGzqsAmjjLJjN3Or8ZFb9AGX4Km12EJu5AVmgaX8HWNy7TkwU/G/8fRhwNm1MZA
	 tvKIzaih0+MQ3vhyhX68w4FaCyw03DtqUuXiWc/B+ieWBognxojBZW8fnl6gh1JAtvlo0LKQp
	 GMyXa9CB0//7vKj4QzhelXKBJJgYM8711kf0IFnD84KydbfFnV0LupfaJ57SHxX6EQpsO8YE5
	 Q3y3pDDyLVRM6fCl4EjRAoVRJTN+cWfVrqR2XbR8PzsEhgLpvc0oqDoNuLLFLc9tNZyVRm+3M
	 NDkpXctNC4+MD8zqzyiDiRUOZ27w9qeZqUIEqMlbnpmYnILxrfZL8A5WXYajQ5BDUYi1oMT4W
	 UT47J3cxaP66B+03lzJqMDPAxGGzBoH4buNH0ku66gi0xcmhQtBcWhfDsGM9V9RSXeG/2FmHI
	 i4y3714s6I4zN5G7Fr7EPgg61IkFB+swtoo1O5WrNJ+jFWe5nIsCXWCinXRZgaD4Q2/+57VP5
	 idJHzNoSCPhRv6mwO/9+ia/4pVxgU8wVX6huAHRsFD2WkmpU42jsBGiWOwFj43HTwPuBxfBH9
	 VhQDFA5VMxSpI+4TBiXX9ZYWqnKGpBoBtfKDHqGxF5C1JqWv2xMsiUD9c43po1Z9SsfBEC2A5
	 cfV/KfZ5odL68cjZ0s7OQXt36o

Such a header would cause YM8711kf0IFnD84KydbfFnV0LupfaJ57SHxX6EQpsO8YE5 to be labelled as a valid skylink.
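
To make the false positive concrete, the following Python sketch runs two illustrative patterns against one line of the header above. Neither is taken from the parser: the loose pattern is only one way such a match can arise, the guarded pattern is one possible mitigation, and the 46-character skylink alphabet is an assumption.

```python
import re

# Assumption: a skylink is 46 URL-safe base64 characters.
SKYLINK = r"[A-Za-z0-9_-]{46}"

# One line of the X-UI-Out-Filterresults header quoted above.
header_line = "\t GMyXa9CB0//7vKj4QzhelXKBJJgYM8711kf0IFnD84KydbfFnV0LupfaJ57SHxX6EQpsO8YE5"

# Illustrative loose pattern: any 46 matching characters at the end of a line.
loose = re.compile(SKYLINK + r"$")

# Illustrative guarded pattern: the candidate may not be glued to further
# skylink-alphabet characters on either side.
guarded = re.compile(r"(?<![A-Za-z0-9_-])" + SKYLINK + r"(?![A-Za-z0-9_-])")

loose_hit = loose.search(header_line)
guarded_hit = guarded.search(header_line)

print(loose_hit.group(0))  # YM8711kf0IFnD84KydbfFnV0LupfaJ57SHxX6EQpsO8YE5
print(guarded_hit)         # None
```

A boundary guard like this is not a complete answer. It would also reject a skylink glued directly to a domain name, and a header line could still happen to contain a run of exactly 46 matching characters, so it should be read as a demonstration of the problem rather than a fix.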

The parser had a second shortcoming, which I've also fixed: until now it could not detect a raw skylink on its own line. That is also hard to believe, but people typically report skylinks as obfuscated URLs instead.
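
A raw skylink on its own line could be picked up with a line-anchored pattern along these lines. This is a sketch under the assumption that a skylink is 46 URL-safe base64 characters. RAW_LINE and extract_raw_skylinks are hypothetical names, not identifiers from the parser.

```python
import re

# Hypothetical pattern: a line holding nothing but a 46-character skylink,
# allowing surrounding spaces or tabs and an optional carriage return.
RAW_LINE = re.compile(r"^[ \t]*([A-Za-z0-9_-]{46})[ \t]*\r?$", re.MULTILINE)


def extract_raw_skylinks(body):
    """Return skylinks that occupy an entire line of the email body."""
    return RAW_LINE.findall(body)


body = (
    "Please remove:\n"
    "EABg4mZrsNcedNPazZ4kSFAYBzf7f8ZgHO1Tu1L-NN8Gjg\n"
    "thanks\n"
)
print(extract_raw_skylinks(body))
```

Because the whole line has to consist of the skylink, a header continuation line that contains / or + characters does not match this pattern.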

Example for Visual Changes

N/A

Checklist

  • All git commits are signed. (REQUIRED)
  • All new methods or updated methods have clear docstrings.
  • Testing added or updated for new methods.
  • Verified if any changes impact the WebPortal Health Checks.
  • Appropriate documentation updated.
  • Changelog file created.

Issues Closed

Closes SKY-1207

@peterjan peterjan requested a review from ChrisSchinnerl as a code owner July 13, 2022 14:38

linear bot commented Jul 13, 2022

SKY-1207 Update Parser

The Namecheap abusive skylink list contained certain skylinks the parser was unable to handle. We need to update the parser to ensure it catches the following links:

https:// siasky [.]netEABg4mZrsNcedNPazZ4kSFAYBzf7f8ZgHO1Tu1L-NN8Gjg

ChrisSchinnerl previously approved these changes Jul 14, 2022
@peterjan peterjan enabled auto-merge (squash) July 14, 2022 07:42
@peterjan peterjan requested a review from kwypchlo July 14, 2022 07:42
@peterjan peterjan merged commit b8f2c2e into main Jul 28, 2022
@peterjan peterjan deleted the pj/parser-update branch July 28, 2022 08:12