Speedup: PrettyPrint, Render, and Parse #12

Merged
merged 13 commits into from
Aug 2, 2023

matthinrichsen-wf
Contributor

@matthinrichsen-wf matthinrichsen-wf commented Aug 1, 2023

PR Details

Using strings.Builder and []rune|rune instead of string can dramatically reduce time, bytes, and allocations.
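As a rough illustration of the strings.Builder part of that claim, here is a minimal standalone sketch. It is not the efp code: `joinConcat` and `joinBuilder` are hypothetical helpers written only to contrast repeated `+=` concatenation with appending into one growing buffer.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// joinConcat builds the result with repeated concatenation. Each += may
// allocate a new string and copy everything written so far.
// Hypothetical helper, for illustration only.
func joinConcat(parts []string) string {
	out := ""
	for _, p := range parts {
		out += p
	}
	return out
}

// joinBuilder appends every part into a single strings.Builder buffer,
// which grows geometrically instead of reallocating on every append.
// Hypothetical helper, for illustration only.
func joinBuilder(parts []string) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

func main() {
	parts := strings.Split(strings.Repeat("SUM(A:C)+ ", 100), " ")

	// testing.AllocsPerRun reports the average number of heap allocations
	// per call; exact numbers depend on the Go version and platform.
	concat := testing.AllocsPerRun(100, func() { _ = joinConcat(parts) })
	builder := testing.AllocsPerRun(100, func() { _ = joinBuilder(parts) })

	fmt.Printf("concat:  %.0f allocs/op\n", concat)
	fmt.Printf("builder: %.0f allocs/op\n", builder)
}
```

Both helpers return the same string; only the allocation pattern differs, which is the effect the benchmark numbers in this PR measure on the real parser.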

Benchmark used:

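// Assumed to live in an external test package next to the library.
package efp_test

import (
	"strconv"
	"strings"
	"testing"

	"github.com/xuri/efp"
)

// benchFormulas holds one trivial and one nested formula.
var benchFormulas = []string{"=0", "=SUM(A3+B9*2)/2"}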

func BenchmarkExcelParser(b *testing.B) {
	for _, formula := range benchFormulas {
		b.ResetTimer()

		ps := efp.ExcelParser()
		b.Run(formula, func(b *testing.B) {
			for n := 0; n < b.N; n++ {
				ps.Parse(formula)
			}
		})

		b.Run(formula+`.PrettyPrint()`, func(b *testing.B) {
			for n := 0; n < b.N; n++ {
				ps.PrettyPrint()
			}
		})
		b.Run(formula+`.Render()`, func(b *testing.B) {
			for n := 0; n < b.N; n++ {
				ps.Render()
			}
		})
	}

	for i := 10; i < 10000; i *= 10 {
		f := `=` + strings.Repeat(`SUM(A:C)+`, i) + `0`
		b.Run(`long`+strconv.Itoa(i), func(b *testing.B) {
			for n := 0; n < b.N; n++ {
				ps := efp.ExcelParser()
				ps.Parse(f)
			}
		})
	}
}
benchcmp is deprecated in favor of benchstat: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat
benchmark                                                 old ns/op      new ns/op     delta
BenchmarkExcelParser/=0-10                                371855         79.2          -99.98%
BenchmarkExcelParser/=0.PrettyPrint()-10                  117320611      62.2          -100.00%
BenchmarkExcelParser/=0.Render()-10                       7286553        19.0          -100.00%
BenchmarkExcelParser/=SUM(A3+B9*2)/2-10                   362986         285           -99.92%
BenchmarkExcelParser/=SUM(A3+B9*2)/2.PrettyPrint()-10     95657053       269           -100.00%
BenchmarkExcelParser/=SUM(A3+B9*2)/2.Render()-10          8838701        96.4          -100.00%
BenchmarkExcelParser/long10-10                            264109         5111          -98.06%
BenchmarkExcelParser/long100-10                           21699004       44848         -99.79%
BenchmarkExcelParser/long1000-10                          1719943709     469958        -99.97%

benchmark                                                 old allocs     new allocs     delta
BenchmarkExcelParser/=0-10                                31             3              -90.32%
BenchmarkExcelParser/=0.PrettyPrint()-10                  10137          3              -99.97%
BenchmarkExcelParser/=0.Render()-10                       10103          1              -99.99%
BenchmarkExcelParser/=SUM(A3+B9*2)/2-10                   31             3              -90.32%
BenchmarkExcelParser/=SUM(A3+B9*2)/2.PrettyPrint()-10     10123          6              -99.94%
BenchmarkExcelParser/=SUM(A3+B9*2)/2.Render()-10          10110          2              -99.98%
BenchmarkExcelParser/long10-10                            1542           44             -97.15%
BenchmarkExcelParser/long100-10                           14961          317            -97.88%
BenchmarkExcelParser/long1000-10                          149192         3022           -97.97%

benchmark                                                 old bytes      new bytes     delta
BenchmarkExcelParser/=0-10                                2033207        136           -99.99%
BenchmarkExcelParser/=0.PrettyPrint()-10                  1107914861     56            -100.00%
BenchmarkExcelParser/=0.Render()-10                       54198420       8             -100.00%
BenchmarkExcelParser/=SUM(A3+B9*2)/2-10                   2036515        1088          -99.95%
BenchmarkExcelParser/=SUM(A3+B9*2)/2.PrettyPrint()-10     1109796115     504           -100.00%
BenchmarkExcelParser/=SUM(A3+B9*2)/2.Render()-10          54331333       24            -100.00%
BenchmarkExcelParser/long10-10                            552290         13656         -97.53%
BenchmarkExcelParser/long100-10                           55977446       123840        -99.78%
BenchmarkExcelParser/long1000-10                          5574307520     1316311       -99.98%

Description

Related Issue

Motivation and Context

While using the library, it was discovered that the string -> []rune conversions were causing many allocations.

How Has This Been Tested

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
    • Removal of Token string on the Parser struct
    • Character constants changed type from string to rune
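For callers, the second breaking change might look roughly like the sketch below. The constant names `parenOpenOld` and `parenOpenNew` are made up for illustration and are not taken from the efp source; the point is only how comparisons and concatenation differ between a string-typed and a rune-typed constant.

```go
package main

import "fmt"

// Hypothetical constants for illustration; not the names used in efp.
const (
	parenOpenOld string = "(" // string-typed, as before the change
	parenOpenNew rune   = '(' // rune-typed, as after the change
)

// countOpenOld compares against a string constant, so each rune has to be
// converted to a string before the comparison.
func countOpenOld(formula string) int {
	n := 0
	for _, r := range formula {
		if string(r) == parenOpenOld {
			n++
		}
	}
	return n
}

// countOpenNew compares runes directly, with no conversion.
func countOpenNew(formula string) int {
	n := 0
	for _, r := range formula {
		if r == parenOpenNew {
			n++
		}
	}
	return n
}

// callPrefix shows the caller-side migration: a rune constant cannot be
// concatenated to a string directly and needs an explicit conversion.
func callPrefix(name string) string {
	return name + string(parenOpenNew)
}

func main() {
	formula := "=SUM(A3+B9*2)/2"
	// prints: 1 1 SUM(
	fmt.Println(countOpenOld(formula), countOpenNew(formula), callPrefix("SUM"))
}
```

Code that only compared against the constants keeps working after dropping the `string(...)` wrapper; code that built strings from them needs the explicit conversion shown in `callPrefix`.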

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@matthinrichsen-wf matthinrichsen-wf changed the title from "Speedup: PrettyPrint and Render" to "Speedup: PrettyPrint, Render, and Parse" Aug 1, 2023
Owner

@xuri xuri left a comment

Thanks for your PR. I found that the code coverage has decreased (it was formerly 100%), and the code is not formatted with the latest gofmt -s linter.

@matthinrichsen-wf
Contributor Author

@xuri I believe I've addressed the formatting issue - and as far as I can tell the code coverage should now be at 100% (using go test -v -race -coverprofile=coverage.txt -covermode=atomic ./...)

Owner

@xuri xuri left a comment

Thanks for your contribution. Sorry, the lint issues were not introduced by your changes. gofmt uses tabs for indentation in comments; could you help fix it?

diff --git a/efp.go b/efp.go
index b9b2880..e62c3c1 100644
--- a/efp.go
+++ b/efp.go
@@ -108,9 +108,8 @@ type Token struct {
 // Tokens directly maps the ordered list of tokens.
 // Attributes:
 //
-//    items - Ordered list
-//    index - Current position in the list
-//
+//	items - Ordered list
+//	index - Current position in the list
 type Tokens struct {
 	Index int
 	Items []Token

efp.go Outdated
@@ -224,14 +266,16 @@ func ExcelParser() Parser {

// getTokens return a token stream (list).
func (ps *Parser) getTokens() Tokens {

Owner

This line can be removed (lint by gofumpt -l -d -w ./).

Owner

@xuri xuri left a comment

In addition, after upgrading the Excelize library's efp dependency to commit 8b2cfd9, the unit test TestCalcCellValue failed. We need to investigate the cause before merging this PR.

@matthinrichsen-wf
Contributor Author

@xuri Good to know - I'll investigate to determine the cause of the failure.

@matthinrichsen-wf
Contributor Author

matthinrichsen-wf commented Aug 2, 2023

@xuri I believe the issues should be addressed now - should I squash the commits on this branch down to one?

As an aside, it's really great to see such an exhaustive set of tests in https://github.com/qax-os/excelize. I had my own set I was working with but that set is much more expansive.

@matthinrichsen-wf matthinrichsen-wf requested a review from xuri August 2, 2023 14:09
- Add 3 formula function errors
- Add comments for the helper function
- Define helper functions after constants, variables, and data types
- Simplify data type declaration in the function parameters
- Remove outdated URL links in the documentation
Owner

@xuri xuri left a comment

Thanks very much for your efforts. Much appreciated. I have made some changes based on your branch.

@xuri xuri merged commit ad255f2 into xuri:master Aug 2, 2023
@xuri xuri added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 19, 2023