Change regex usage to manual parsing #1094
Wow, this is an incredible result. I'm looking forward to this PR.
Hey @tonyqus, I believe this is ready for review. I've updated the benchmarks table to reflect the latest version. I'm quite confident that this PR shouldn't break much and public API only has additions of To improve read/load speed I think it would require changing from
I did think of changing XmlDocument to XmlReader, but it's a tradeoff between convenience and performance. XmlDocument is much easier to use than XmlReader in most cases.
Can you move CellReferenceParser.cs into the SS/Util folder instead of the main folder?
Rebased, with tweaks in the last commit. The Linux build still fails because the fix PR is still pending.
* use spans when possible
LGTM
I started profiling the "worst case" of evaluating formulas, which seemed to take a lot of resources. The culprit turned out to be the use of regex parsing to get column/row/cell information. I created a custom parser that should fulfill the current needs and is a lot faster, which matters because it is called in hot code paths. I also updated the APIs to accept ReadOnlySpan<char> for operations that previously needed a substring to work with. Based on running the benchmarks in the other PR, here are the improvement numbers. There is still room to improve, but this was the low-hanging fruit as a first step.
NPOI.Benchmarks.LargeExcelFileBenchmark
NPOI.Benchmarks.RangeValuesBenchmark
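For readers who want a feel for what "manual parsing instead of regex" looks like, here is a minimal sketch of the general technique. It is not the CellReferenceParser from this PR: the class name CellRefSketch, the TryParse signature, and the limits of three column letters and seven row digits (Excel's XFD1048576 maximum) are assumptions made for illustration, and absolute references with $ are deliberately not handled.

```csharp
using System;

// Hypothetical helper for illustration only; not the parser added in this PR.
public static class CellRefSketch
{
    // Parses a plain A1-style reference such as "AB12" into zero-based
    // column and row indexes without allocating strings or using regex.
    // Returns false unless the input is 1-3 letters followed by 1-7 digits.
    public static bool TryParse(ReadOnlySpan<char> text, out int column, out int row)
    {
        column = -1;
        row = -1;
        int i = 0, col = 0, r = 0;

        // Column letters form a base-26 number where A = 1 (case-insensitive).
        while (i < text.Length && IsAsciiLetter(text[i]))
        {
            if (i == 3) return false; // assumed limit: at most three letters
            col = col * 26 + ((text[i] | 0x20) - 'a' + 1);
            i++;
        }
        int letters = i;

        // Row digits form an ordinary one-based decimal number.
        while (i < text.Length && text[i] >= '0' && text[i] <= '9')
        {
            if (i - letters == 7) return false; // assumed limit: at most seven digits
            r = r * 10 + (text[i] - '0');
            i++;
        }

        // Require letters, then digits, nothing else, and a row of at least 1.
        if (letters == 0 || i == letters || i != text.Length || r == 0) return false;

        column = col - 1;
        row = r - 1;
        return true;
    }

    private static bool IsAsciiLetter(char c) =>
        (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}
```

Because the input is a span, a caller can parse part of a larger string without a Substring call, for example CellRefSketch.TryParse("Sheet1!C7".AsSpan(7), out int col, out int row), which is the kind of allocation the ReadOnlySpan<char> overloads are meant to avoid.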