Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DecodeDraw to allow custom image types #3

Merged
merged 1 commit into from
Sep 5, 2020

Conversation

diamondburned
Copy link
Contributor

Closes #2

@codecov
Copy link

codecov bot commented Jul 29, 2020

Codecov Report

Merging #3 into master will decrease coverage by 0.57%.
The diff coverage is 62.50%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master       #3      +/-   ##
==========================================
- Coverage   81.31%   80.74%   -0.58%     
==========================================
  Files           6        6              
  Lines         182      187       +5     
==========================================
+ Hits          148      151       +3     
- Misses         21       22       +1     
- Partials       13       14       +1     
Impacted Files Coverage Δ
decode.go 79.41% <62.50%> (-1.55%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b1b1c81...9fa9a88. Read the comment docs.

@bbrks
Copy link
Owner

bbrks commented Jul 29, 2020

Hey, thanks for the contribution!

The detailed writeup in the issue makes sense to me, as well as the cursory glance over the PR.
I'll have a play around with this a little bit in my day tomorrow and merge if all looks good 👍

@bbrks
Copy link
Owner

bbrks commented Aug 5, 2020

Sorry, this is taking a little longer than expected. I ran into an issue when testing this where the new interface-based image.Set is causing a huge increase in allocations for the default Decode method (for NRGBA images).

I can work around it by doing a type check to switch to the faster paths for known image types, but want to be sure I've not screwed anything up in the process of switching between alpha and non-alpha premultiplied colors when doing so.

16:50 $ benchcmp bench-master.txt bench-new.txt
benchcmp is deprecated in favor of benchstat: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat
benchmark                                          old ns/op     new ns/op     delta
BenchmarkDecode/LFE.@D9F01_2%L%MIVD*9Goe-;WB-8     820101        815481        -0.56%
BenchmarkDecode/LNAdApj[00aymkj[TKay9}ay-Sj[-8     750371        761578        +1.49%
BenchmarkDecode/LNMF%n00%#MwS|WCWEM{R*bbWBbH-8     816556        823240        +0.82%
BenchmarkDecode/KJG8_@Dgx]_4V?xuyE%NRj-8           687708        699058        +1.65%

benchmark                                          old allocs     new allocs     delta
BenchmarkDecode/LFE.@D9F01_2%L%MIVD*9Goe-;WB-8     3              1027           +34133.33%
BenchmarkDecode/LNAdApj[00aymkj[TKay9}ay-Sj[-8     3              1027           +34133.33%
BenchmarkDecode/LNMF%n00%#MwS|WCWEM{R*bbWBbH-8     3              1027           +34133.33%
BenchmarkDecode/KJG8_@Dgx]_4V?xuyE%NRj-8           3              1027           +34133.33%

benchmark                                          old bytes     new bytes     delta
BenchmarkDecode/LFE.@D9F01_2%L%MIVD*9Goe-;WB-8     4448          8544          +92.09%
BenchmarkDecode/LNAdApj[00aymkj[TKay9}ay-Sj[-8     4448          8544          +92.09%
BenchmarkDecode/LNMF%n00%#MwS|WCWEM{R*bbWBbH-8     4448          8544          +92.09%
BenchmarkDecode/KJG8_@Dgx]_4V?xuyE%NRj-8           4384          8480          +93.43%

@diamondburned
Copy link
Contributor Author

diamondburned commented Aug 5, 2020

That's an interesting observation. I'm guessing it has to do with the type assertion here, which possibly creates a copy of color.NRGBA.

I think a fast path approach is reasonable. Something like this, I assume:

setter, isNRGBA := dst.(*image.NRGBA)

for .. {
    for .. {
        if isNRGBA {
            setter.SetNRGBA(x, y, color.NRGBA{})
        } else {
            dst.Set(x, y, color.NRGBA{})
        }
    }
}

Edit: It seems like inlining color.NRGBA doesn't have any benefits compiler-wise. Below are 2 Godbolt links that possibly demonstrate this:

As we can see, the Go code and Assembly outputs for both pastes are different. However, what's more important is that both have the same amount of movq instructions, thus in either cases, they will treat the struct values the same.

Given this, I can rewrite the above example as such:

for .. {
    for .. {
        var c = color.NRGBA{}

        if isNRGBA {
            setter.SetNRGBA(x, y, c)
        } else {
            dst.Set(x, y, c)
        }
    }
}

@bbrks
Copy link
Owner

bbrks commented Aug 5, 2020

I think this is going to end up being just as alloc heavy when using an RGBA image instead of an NRGBA image. I've played around a little and got a fast path low-alloc approach for both cases. Only downside that I can see is the type switch is inside the loop, but the benchmark doesn't show any impact.

What do you think?

type drawImageNRGBA interface {
	SetNRGBA(x, y int, c color.NRGBA)
}

type drawImageRGBA interface {
	SetRGBA(x, y int, c color.RGBA)
}
for .. {
    for .. {

			switch d := dst.(type) {
			case *image.NRGBA:
				d.SetNRGBA(x, y, color.NRGBA{
					R: uint8(linearTosRGB(r)),
					G: uint8(linearTosRGB(g)),
					B: uint8(linearTosRGB(b)),
					A: 255,
				})
			case *image.RGBA:
				d.SetRGBA(x, y, color.RGBA{
					R: uint8(linearTosRGB(r)),
					G: uint8(linearTosRGB(g)),
					B: uint8(linearTosRGB(b)),
					A: 255,
				})
			default:
				d.Set(x, y, color.NRGBA{
					R: uint8(linearTosRGB(r)),
					G: uint8(linearTosRGB(g)),
					B: uint8(linearTosRGB(b)),
					A: 255,
				})
			}
22:01 $ benchstat orig.txt pr.txt pr-switch2.txt
name \ time/op                             orig.txt     pr.txt        pr-switch2.txt
Decode/LFE.@D9F01_2%L%MIVD*9Goe-;WB-8       874µs ± 7%    825µs ± 1%      795µs ± 1%
Decode/LNAdApj[00aymkj[TKay9}ay-Sj[-8       801µs ± 3%    765µs ± 1%      733µs ± 1%
Decode/LNMF%n00%#MwS|WCWEM{R*bbWBbH-8       881µs ±10%    831µs ± 0%      798µs ± 1%
Decode/KJG8_@Dgx]_4V?xuyE%NRj-8             756µs ±11%    702µs ± 1%      669µs ± 1%
DecodeDraw/LFE.@D9F01_2%L%MIVD*9Goe-;WB-8                 851µs ± 1%      813µs ± 3%
DecodeDraw/LNAdApj[00aymkj[TKay9}ay-Sj[-8                 787µs ± 0%      742µs ± 3%
DecodeDraw/LNMF%n00%#MwS|WCWEM{R*bbWBbH-8                 852µs ± 0%      797µs ± 1%
DecodeDraw/KJG8_@Dgx]_4V?xuyE%NRj-8                       726µs ± 1%      668µs ± 0%

name \ alloc/op                            orig.txt     pr.txt        pr-switch2.txt
Decode/LFE.@D9F01_2%L%MIVD*9Goe-;WB-8      4.45kB ± 0%   8.54kB ± 0%     4.45kB ± 0%
Decode/LNAdApj[00aymkj[TKay9}ay-Sj[-8      4.45kB ± 0%   8.54kB ± 0%     4.45kB ± 0%
Decode/LNMF%n00%#MwS|WCWEM{R*bbWBbH-8      4.45kB ± 0%   8.54kB ± 0%     4.45kB ± 0%
Decode/KJG8_@Dgx]_4V?xuyE%NRj-8            4.38kB ± 0%   8.48kB ± 0%     4.38kB ± 0%
DecodeDraw/LFE.@D9F01_2%L%MIVD*9Goe-;WB-8                8.48kB ± 0%     0.29kB ± 0%
DecodeDraw/LNAdApj[00aymkj[TKay9}ay-Sj[-8                8.48kB ± 0%     0.29kB ± 0%
DecodeDraw/LNMF%n00%#MwS|WCWEM{R*bbWBbH-8                8.48kB ± 0%     0.29kB ± 0%
DecodeDraw/KJG8_@Dgx]_4V?xuyE%NRj-8                      8.42kB ± 0%     0.23kB ± 0%

name \ allocs/op                           orig.txt     pr.txt        pr-switch2.txt
Decode/LFE.@D9F01_2%L%MIVD*9Goe-;WB-8        3.00 ± 0%  1027.00 ± 0%       3.00 ± 0%
Decode/LNAdApj[00aymkj[TKay9}ay-Sj[-8        3.00 ± 0%  1027.00 ± 0%       3.00 ± 0%
Decode/LNMF%n00%#MwS|WCWEM{R*bbWBbH-8        3.00 ± 0%  1027.00 ± 0%       3.00 ± 0%
Decode/KJG8_@Dgx]_4V?xuyE%NRj-8              3.00 ± 0%  1027.00 ± 0%       3.00 ± 0%
DecodeDraw/LFE.@D9F01_2%L%MIVD*9Goe-;WB-8                 2.05k ± 0%      0.00k ± 0%
DecodeDraw/LNAdApj[00aymkj[TKay9}ay-Sj[-8                 2.05k ± 0%      0.00k ± 0%
DecodeDraw/LNMF%n00%#MwS|WCWEM{R*bbWBbH-8                 2.05k ± 0%      0.00k ± 0%
DecodeDraw/KJG8_@Dgx]_4V?xuyE%NRj-8                       2.05k ± 0%      0.00k ± 0%
func BenchmarkDecodeDraw(b *testing.B) {
	for _, test := range testFixtures {
		// skip tests without hashes
		if test.hash == "" {
			continue
		}

		b.Run(test.hash, func(b *testing.B) {
			dst := image.NewRGBA(image.Rect(0, 0, 32, 32))
			for i := 0; i < b.N; i++ {
				_ = blurhash.DecodeDraw(dst, test.hash, 1)
			}
		})
	}
}

@diamondburned
Copy link
Contributor Author

I think the color should be created outside the type switch to make it shorter. Other than that, I think it's fine.

@bbrks bbrks merged commit 9fa9a88 into bbrks:master Sep 5, 2020
@bbrks
Copy link
Owner

bbrks commented Sep 5, 2020

Sorry this took a while to merge! I've been tied up with work stuff recently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add DecodeDraw to allow custom image types
2 participants