[WIP] Implement Tiff codec #119

Andy-Wilkinson · 2017-03-02T12:58:25Z

Prerequisites

I have written a descriptive pull-request title
I have verified that there are no overlapping pull-requests open
I have verified that I am following the existing coding patterns and practice as demonstrated in the repository. These follow strict StyleCop rules 👮.
I have provided test coverage for my change (where applicable)

Description

This pull request implements (or will implement when complete) a TIFF decoder and encoder for ImageSharp. A new library and NuGet package will be introduced (ImageSharp.Formats.Tiff) in line with the existing formats.

Feature Checklist

Reading TIFF file structure
Parsing of relevant metadata
Extraction of image data blocks
Construction of image (from strip data)
Construction of image (from tile data)

Compression Types

Photometric Interpretation Formats

codecov-io · 2017-03-02T13:10:52Z

Codecov Report

❗ No coverage uploaded for pull request base (master@dce781d).
The diff coverage is 93.94%.

@@            Coverage Diff            @@
##             master     #119   +/-   ##
=========================================
  Coverage          ?   89.59%           
=========================================
  Files             ?     1038           
  Lines             ?    45854           
  Branches          ?     3255           
=========================================
  Hits              ?    41081           
  Misses            ?     4058           
  Partials          ?      715

Impacted Files	Coverage Δ
...ts/PixelBlenders/DefaultPixelBlenders.Generated.cs	`77.77% <ø> (ø)`
src/ImageSharp/Formats/Tiff/TiffEncoder.cs	`0% <0%> (ø)`
...ImageSharp/Formats/Tiff/TiffConfigurationModule.cs	`0% <0%> (ø)`
src/ImageSharp/Formats/Tiff/ImageExtensions.cs	`0% <0%> (ø)`
...PhotometricInterpretation/BlackIsZero8TiffColor.cs	`100% <100%> (ø)`
...PhotometricInterpretation/WhiteIsZero1TiffColor.cs	`100% <100%> (ø)`
...PhotometricInterpretation/BlackIsZero4TiffColor.cs	`100% <100%> (ø)`
src/ImageSharp/Formats/Tiff/TiffIfd/TiffIfd.cs	`100% <100%> (ø)`
...rc/ImageSharp/PixelFormats/PixelBlender{TPixel}.cs	`100% <100%> (ø)`
...s/Tiff/Compression/PackBitsTiffCompressionTests.cs	`100% <100%> (ø)`
... and 61 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dce781d...a8d25ef.

Andy-Wilkinson · 2017-03-02T13:13:37Z

Just seeding this [WIP] PR with the initial project structure. It's a bit of a mutant right now, as I've only got the dotnet-preview4 tooling (.csproj based) on my dev PC, but this should all converge well before this gets merged. I've also introduced a separate unit testing project for rapid TDD, but this can be combined into the main test project before merging.

Quick comment on the TiffGen code in the unit testing project - this is a set of classes for generating in-memory TIFF files for unit testing purposes. They are designed to be simple rather than performant (and therefore hopefully correct!), to easily generate varied TIFF structures (including invalid TIFF files to show that the decoder correctly handles/throws on these), and to allow checking that the decoder handles padding/oddly ordered blocks/etc.

JimBobSquarePants · 2017-03-05T22:17:15Z

Great to see this opened!

antonfirsov · 2017-03-13T23:58:32Z

src/ImageSharp.Formats.Tiff/TiffDecoderCore.cs

+
+        public TiffIfd ReadIfd(uint offset)
+        {
+            InputStream.Seek(offset, SeekOrigin.Begin);


I think we can't assume that the TIFF image starts at the beginning of the stream. (Or at least we don't do this in other codecs.)

No we can't assume that. Someone could have multiple images in a stream or other information. Best we can do is look for the identifier at the position we are currently at.

Okay... The easy fix is to store the offset that we start at, and do all Seeks relative to this, i.e. InputStream.Seek(StartOffset + offset, SeekOrigin.Begin).

I'd really love to get rid of the Seeks entirely, but unfortunately the TIFF file-format relies heavily on referencing offsets within the file. There is no way you can tell what the next bit of data is otherwise. As an example a file layout could be,

0-8 -> TIFF file header (8 bytes) - references first IFD at offset 30000
8-10000 -> Raw image data here
30000-... -> IFD (basically an image header) - references image data at offset 8

Until you read the IFD at offset 30000, you have no way of telling what data is at offset 8 - it could be image data in an unknown compression/format, another IFD, strings from the metadata. The format doesn't have any identifiers to let you know until you have read the IFD itself.

Any clever ideas to avoid Seeks would be great though! 😄
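The "store the start offset" fix suggested above can be sketched roughly as follows. This is only an illustration under stated assumptions: `OffsetStreamReader`, `SeekTo` and `ReadByteAt` are hypothetical names, not members of the PR's `TiffDecoderCore`, and the stream is assumed to be seekable.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of the PR): remembers where the TIFF data begins
// in the stream and resolves every TIFF-internal offset against that position.
public sealed class OffsetStreamReader
{
    private readonly Stream stream;
    private readonly long startOffset;

    public OffsetStreamReader(Stream stream)
    {
        this.stream = stream;

        // The TIFF header is assumed to start at the stream's current position,
        // which is not necessarily position 0.
        this.startOffset = stream.Position;
    }

    // Seeks to an offset expressed relative to the start of the TIFF data.
    public void SeekTo(uint tiffOffset)
    {
        this.stream.Seek(this.startOffset + tiffOffset, SeekOrigin.Begin);
    }

    // Reads a single byte at a TIFF-relative offset.
    public byte ReadByteAt(uint tiffOffset)
    {
        this.SeekTo(tiffOffset);
        int value = this.stream.ReadByte();
        if (value < 0)
        {
            throw new EndOfStreamException();
        }

        return (byte)value;
    }
}
```

With this shape, a TIFF embedded after other data in the same stream is read correctly as long as the reader is constructed while the stream is positioned at the TIFF header.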

antonfirsov · 2017-03-14T00:28:39Z

src/ImageSharp.Formats.Tiff/TiffDecoderCore.cs

+                throw new ImageFormatException($"A value of type '{entry.Type}' cannot be converted to a SignedRational.");
+
+            byte[] bytes = ReadBytes(ref entry);
+            SignedRational[] result = new SignedRational[entry.Count];


Isn't it possible to do buffered read without allocating new arrays?
You can store them as pre-allocated members, if the arrays are expected to have a known upper limit. If the limit is not known and/or the arrays are too big, we should introduce the following utility class to reuse buffers:

    class AutoGrowBuffer<T> : IDisposable
    {
        private PinnedBuffer<T> currentBuffer;

        public AutoGrowBuffer(int initialSize)
        {
            this.currentBuffer = new PinnedBuffer<T>(initialSize);
        }

        // BufferSpan<T> should be very similar to BufferPointer<T> (or a replacement!), having a Length property.
        // I will implement it, if we agree on the design.
        public BufferSpan<T> GetBufferSpan(int length)
        {
            if (this.currentBuffer.Array.Length < length)
            {
                this.currentBuffer.Dispose();
                this.currentBuffer = new PinnedBuffer<T>(length);
            }

            return new BufferSpan<T>(this.currentBuffer, length);
        }

        public void Dispose()
        {
            this.currentBuffer.Dispose();
        }
    }

    // USAGE:
    private AutoGrowBuffer<SignedRational> signedRationalBuffer;

    // BufferSpan<T> should be very similar to BufferPointer<T> (or a replacement!), having a Length property.
    // I will implement it, if we agree on the design.
    public BufferSpan<SignedRational> ReadSignedRationalSpan(ref TiffIfdEntry entry)
    {
        ....
        BufferSpan<SignedRational> result = this.signedRationalBuffer.GetBufferSpan(entry.Count);
        ....
    }

/CC @JimBobSquarePants

I've used known max values in the past. That would be my favored approach

https://github.com/JimBobSquarePants/ImageSharp/blob/0b5396d0df399dc0eb7968d1479a3e93378de1aa/src/ImageSharp.Formats.Png/PngDecoderCore.cs#L50

For additional larger values I've rented/returned the buffer and made sure I was careful reading them.

I do like that class though. Clever idea!

This is definitely a place for Span-like stuff. Better encapsulation for data-chunk operations without unnecessary allocations! I think I need to turn my 'BufferPointer' into 'BufferSpan'. It will be almost System.Span compatible then!

That would be great! 👍

Interesting idea... There's definitely a place for sharing buffers in managing the performance of the existing code. I'm not sure how much benefit there will be for SignedRational arrays - they are pretty rare in TIFF files and I included them mainly for completeness. I'll get more of a feel for this as I try out a number of sample files.

On the other hand - I'm allocating a new byte[] in the GetBytes(...) method that should really be allocated once and reused. The returned byte[] doesn't even need to be the correct length, so I can make a good guess of the maximum likely size, and revert to a new byte[] only if there is a need to exceed this... I'll look into it.

Great! We don't need to bother with uncommon cases, I just wanted to give some general hints based on the code :)
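The "allocate once, grow only when needed" idea for the byte[] discussed in this thread could look something like the sketch below. It uses plain managed arrays only; `ReusableByteBuffer` and `Rent` are hypothetical names for illustration and are not the `PinnedBuffer<T>`/`BufferSpan<T>` design proposed above.

```csharp
using System;

// Hypothetical sketch: a byte buffer that is allocated once and replaced only
// when a caller asks for more space than it currently holds.
public sealed class ReusableByteBuffer
{
    private byte[] buffer;

    public ReusableByteBuffer(int initialCapacity)
    {
        this.buffer = new byte[initialCapacity];
    }

    // Returns an array with at least 'length' bytes. The array may be longer than
    // requested, so callers must track the valid length themselves.
    public byte[] Rent(int length)
    {
        if (this.buffer.Length < length)
        {
            // Uncommon path: the guessed maximum was too small, so grow.
            this.buffer = new byte[length];
        }

        return this.buffer;
    }
}
```

The trade-off is that the returned array is shared between calls, so its contents are only valid until the next `Rent`.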

antonfirsov · 2017-03-14T00:44:26Z

src/ImageSharp.Formats.Tiff/TiffDecoderCore.cs

+
+        private Int32 ToInt32(byte[] bytes, int offset)
+        {
+            if (IsLittleEndian)


@JimBobSquarePants has introduced several classes to deal with endianness in streams. Shouldn't we use them in Tiff decoder too?
Their implementations definitely need some optimization by eliminating super-frequent virtual calls, but we could follow an "optimize once, win everywhere" approach by using them!

I would like this to be a thing.

There's definitely a good rationale for centralising this logic. I'll see if I can do this in the future once I've got the main features working.
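For reference, the endianness-dependent conversion under discussion amounts to something like the following. This is a standalone sketch, not the shared endianness classes mentioned above (their API is not shown in this thread); `EndianConverter` is a hypothetical name, and the byte order is passed in explicitly rather than read from an `IsLittleEndian` member.

```csharp
using System;

// Hypothetical sketch of a centralised, endian-aware Int32 conversion.
public static class EndianConverter
{
    // Assembles four bytes starting at 'offset' into an Int32, using the byte
    // order declared by the file rather than that of the host machine.
    public static int ToInt32(byte[] bytes, int offset, bool isLittleEndian)
    {
        if (isLittleEndian)
        {
            // Least significant byte comes first.
            return bytes[offset]
                | (bytes[offset + 1] << 8)
                | (bytes[offset + 2] << 16)
                | (bytes[offset + 3] << 24);
        }

        // Most significant byte comes first.
        return (bytes[offset] << 24)
            | (bytes[offset + 1] << 16)
            | (bytes[offset + 2] << 8)
            | bytes[offset + 3];
    }
}
```

Keeping the conversion in one static, non-virtual method is one way to get the "optimize once, win everywhere" effect without per-call virtual dispatch.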

antonfirsov · 2017-03-14T00:47:04Z

tests/ImageSharp.Formats.Tiff.Tests/Formats/Tiff/TiffDecoderHeaderTests.cs

+
+        [Theory]
+        [MemberData(nameof(IsLittleEndianValues))]
+        public void ReadHeader_ReadsEndianness(bool isLittleEndian)


I definitely <3 this approach in testing!
I wish all our codecs were under this level of unit testing.

Agreed, great stuff. The others really should be. We should add a chore tag for issues so we can do this 😝

antonfirsov · 2017-03-14T01:07:24Z

@Andy-Wilkinson I wanted to try the code, but I couldn't find a working solution.
@JimBobSquarePants will finish switching all our stuff in main branch to VS2017 soon. You can integrate your work better then :)

Andy-Wilkinson · 2017-03-14T11:35:27Z

Yeah - Sorry about that @antonfirsov . It is a bit of a mutant project at the moment, with a mix of the project.json/.csproj approaches. I see that @JimBobSquarePants has merged the VS2017 branch, so I'll try to get everything consistent soon!

JimBobSquarePants · 2017-03-17T23:33:30Z

@Andy-Wilkinson Hoping the merge makes things a lot easier for you!

Andy-Wilkinson · 2017-03-18T09:05:34Z

Thanks @JimBobSquarePants . The VS2017 merge has allowed me to bring everything together into the main ImageSharp solution. Just wishing now that I'd enabled StyleCop and Doc comments from the start! 😉 Almost there with fixing the warnings, and everything will be right going forward.

JimBobSquarePants · 2018-02-08T12:01:56Z

@Andy-Wilkinson I've had a go at fixing those merge conflicts for you but I could have made a mistake. Please let me know if I've broken things for you.

CLAassistant · 2018-02-08T12:25:04Z

All committers have signed the CLA.

JimBobSquarePants · 2018-03-04T13:10:09Z

Hi @Andy-Wilkinson

I managed to update your fork with the latest code from our master with all tests passing.

I had a look at the effort so far. It's absolutely incredible! 👍

The API is solidifying now so we should be able to make the effort to help you out a lot more to complete this (We should have helped more earlier). There's a lot of new performance goodies on offer within the codebase that we can apply in some places (memory management for example).

I'd love to get the ball rolling again so let me know when you have some free time.

JimBobSquarePants · 2018-10-17T23:59:33Z

Just trying to keep this so that it builds and doesn't get too out of date.

JimBobSquarePants · 2018-12-06T18:57:43Z

Hi @Andy-Wilkinson

Just dropping you a message to see whether this PR is something that you will be able to continue with or whether we should merge it now into our own separate branch to continue working on? Ideally we would love to have your help but if that's not possible that's totally cool also.

Additionally would it be possible to re-sign the CLA? We lost a lot of signatures by accident when we globalized the tool across all our repositories.

Cheers

James

Andy-Wilkinson · 2018-12-07T15:57:08Z

Hi @JimBobSquarePants

Apologies that there's not been much progress recently - to be honest, I've not had time for much dev work so I've tended to drop in and out of my own experimental mini-projects when I get chance. The TIFF codec is going to require more of a concerted effort so it is probably best if you merge into a separate branch. It will open it up to multiple contributors to provide smaller PRs (I might get time for small chunks too).

Cheers,
Andy

PS. I've signed the CLA.

JimBobSquarePants · 2018-12-10T05:12:56Z

Hi @Andy-Wilkinson

Thanks for the update, I appreciate it and all your amazing efforts so far.

I'll merge this as-is into a new tiff-codec branch, which will allow other contributors to help out.

Andy-Wilkinson added 2 commits February 26, 2017 20:47

Add initial Tiff codec projects, and TiffGen

a74d143

Add Tiff implementation of IImageFormat

8cbfc8e

Andy-Wilkinson added 7 commits March 2, 2017 21:57

Reference ImageSharp (temporary code sharing)

0196499

Add stub Tiff encoders/decoders

4f6c75a

Read and validate the TIFF file header.

7e3d2c0

Make Tiff implementation details internal

e976a2c

Make TiffDecoderCore more unit-testable

be842dc

Read raw data from TIFF IFDs

5e0fc3b

More comprehensive testing of header checks

e4e2b4c

Andy-Wilkinson added 6 commits March 6, 2017 22:17

Read raw bytes for IFD entries

0534ce0

Read integer types from TIFF IFDs

e3d7436

Read strings from TIFF IFDs

a637ef4

Read rational numbers from TIFF IFDs

fe6d5f6

Use constants for data type sizes

6792d88

Read floating-point numbers from TIFF IFDs

90a3dc9

antonfirsov reviewed Mar 14, 2017

Andy-Wilkinson added 5 commits March 14, 2017 20:59

Add a number of TIFF specific constants/enums

8cec9d6

Merge branch 'master' into tiff-codec

1cb02a4

Incorporate Tiff codec into new project structure

736019e

Remove unneeded using statements

94847e8

Fix many StyleCop warnings!

1646c67

Andy-Wilkinson added 2 commits March 18, 2017 15:24

Add documentation to all elements.

3b51c36

Run TIFF unit tests in CI

aef9284

JimBobSquarePants mentioned this pull request Aug 3, 2017

No Tiff support. #12

Closed

25 tasks

ajryan referenced this pull request in ajryan/tesseract Dec 30, 2017

netcore port

f1cad6f

Merge branch 'master' into tiff-codec

5ca5f77

JimBobSquarePants added 4 commits February 9, 2018 15:24

Merge branch 'master' into tiff-codec

e4bb5e4

Merge branch 'master' of https://github.com/SixLabors/ImageSharp into…

affd736

… tiff-codec

Update codebase to catch up with changes to main repo.

64094c5

Use environment specific newline in test

5fa27fb

JimBobSquarePants added 3 commits March 5, 2018 00:17

Merge branch 'master' into tiff-codec

3bd7f72

Merge branch 'master' of https://github.com/SixLabors/ImageSharp into…

6b212c6

… tiff-codec

Fix namespace reference in tests

554318b

JimBobSquarePants added this to the Future milestone Mar 21, 2018

JimBobSquarePants added 2 commits August 31, 2018 21:11

Merge branch 'master' into tiff-codec

8a513a5

Merge remote-tracking branch 'upstream/master' into tiff-codec

bc7bafe

JimBobSquarePants added 3 commits October 18, 2018 09:52

Update tests/Images/External

ca00f9b

Merge remote-tracking branch 'upstream/master' into tiff-codec

f0ead2d

Update IPixel method calls to match new signatures

baf6239

Update external references

a8d25ef

JimBobSquarePants changed the base branch from master to tiff-codec December 10, 2018 05:09

JimBobSquarePants merged commit a61e83a into SixLabors:tiff-codec Dec 10, 2018

JimBobSquarePants added a commit that referenced this pull request Feb 17, 2021

Merge pull request #119 from Andy-Wilkinson/tiff-codec

d1942f8

[WIP] Implement Tiff codec

JimBobSquarePants added a commit that referenced this pull request Feb 17, 2021

Merge pull request #119 from Andy-Wilkinson/tiff-codec

c46fc03

[WIP] Implement Tiff codec
