Mosaic / composite item type #150

Closed
mojodna opened this issue Aug 14, 2018 · 11 comments

@mojodna
Collaborator

mojodna commented Aug 14, 2018

Grouped images should be representable within a STAC Catalog. These may range from multiple parts of a DigitalGlobe strip (multiple assets, but logically and effectively treated as a single asset alongside additional (metadata, thumbnail, etc.) assets) to a curated collection of items that an entity wishes to share.

Properties of an element within such a collection of assets:

  • layer index
  • range of valid resolutions (i.e. zooms) for an individual asset
  • asset validity footprint (potentially represented using quad keys corresponding to Web Mercator tiles at a resolution greater than that of the image)
  • resampling method to use when necessary
  • whether to stretch values to shift into a visible range
  • min/max values for each band when stretching
  • custom NODATA value
  • band selections

In the case of components of a DG strip, this would be included in the list of assets (ideally with some indication that it should be used in preference to individual components, perhaps using q values: "Quality factors allow the user or user agent to indicate the relative degree of preference for that media-range, using the qvalue scale from 0 to 1 (section 3.9). The default value is q=1.")

A curated collection of items potentially equates to a STAC Item in terms of usage, so this is something to reconcile.

Many of these things provide hints for display purposes and aren't necessarily descriptors for the item itself, so that also merits consideration.

The combination of resolution ranges and quad keys allows a tiler to know when a given asset should be included (and in what order) in a composite image.
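As a rough illustration of that selection logic, here is a minimal Python sketch. The `MosaicAsset` class, its field names, and the convention that a higher layer index is drawn on top are all assumptions made for this example; nothing here is part of STAC or of any existing tiler.

```python
from dataclasses import dataclass


@dataclass
class MosaicAsset:
    # Hypothetical structure, not a STAC definition.
    href: str
    layer_index: int   # assumption: higher index is drawn on top
    min_zoom: int
    max_zoom: int
    quadkeys: tuple    # validity footprint as Web Mercator quadkeys


def assets_for_tile(assets, tile_quadkey):
    """Return the assets to composite for a tile, bottom layer first."""
    zoom = len(tile_quadkey)  # a quadkey has one digit per zoom level
    selected = []
    for asset in assets:
        if not asset.min_zoom <= zoom <= asset.max_zoom:
            continue
        # Two tiles overlap when one quadkey is a prefix of the other,
        # i.e. one tile contains the other.
        if any(tile_quadkey.startswith(qk) or qk.startswith(tile_quadkey)
               for qk in asset.quadkeys):
            selected.append(asset)
    return sorted(selected, key=lambda a: a.layer_index)


assets = [
    MosaicAsset("a.tif", layer_index=1, min_zoom=8, max_zoom=14, quadkeys=("0231",)),
    MosaicAsset("b.tif", layer_index=0, min_zoom=0, max_zoom=10, quadkeys=("02",)),
]
hrefs = [a.href for a in assets_for_tile(assets, "0231012301")]
```

For the zoom 10 tile above both assets qualify and come back ordered by layer index; at zoom 6 only `b.tif` would be within its valid zoom range.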

We envision this as a small-ish JSON file that is HTTP-accessible and can be used in place of a COG URL (or POSTed) with something like tiles.rdnt.io.

/cc @sharkinsspatial

@matthewhanson
Collaborator

@mojodna
Most of these fields look to me like they could apply to any EO data, for example NODATA and gain/offset values. This is actually a problem right now when dealing with Landsat data, since each band has different gains/offsets to get to TOA' (note: not true TOA, but TOA without sun angle correction), and that info is only available through the MTL metadata file, not in the STAC item.

@simonff

simonff commented Aug 20, 2018

Yes, none of these are mosaic-specific. We will add them to the raster extension(s) of the Dataset spec.

@mojodna : What's a sample use case for quality factors?

@matthewhanson : For reference, in EE almost all properties from the MTL file are stored in each Landsat asset's metadata, and thus TOA can be computed on the fly. In Collection 1 average sun angle can be taken into account: https://landsat.usgs.gov/using-usgs-landsat-8-product

However, it's not clear if storing them in the STAC catalog is necessary if STAC is intended just for listing/retrieving/visualizing assets and not for providing input for computations.

@matthewhanson
Collaborator

@simonff The problem we have run into is that, in the case of Landsat, you need to apply the gains and offsets in order to visualize the data. It is easier if those gains and offsets are in the STAC record rather than in the data file, because otherwise you need to read the metadata from the file headers, which is more overhead (we are reading just windowed pieces of the files remotely from S3, so the overhead of reading additional metadata is not small).

In the case of sun angle for Landsat there are two problems:
1 - This special case of handling Landsat means specific processing code just for Landsat, whereas if it were already in TOA reflectance (or surface reflectance) you could use the same processing code as for Sentinel and other sensors. We're currently working with USGS and pushing for them to distribute it as such, because right now many people are using Landsat data incorrectly since they aren't correcting it.
2 - While you can use the average sun angle (i.e. the scene center angle), it is not ideal when you visualize two adjacent rows in the same path. You will see an artifact at the scene border. The sun angle really should be calculated per pixel and applied as an array.
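For readers unfamiliar with the correction being discussed, here is a minimal sketch using NumPy. It assumes the usual Landsat 8 rescaling, where the per-band multiplicative and additive factors come from the MTL file (the `REFLECTANCE_MULT_BAND_x` and `REFLECTANCE_ADD_BAND_x` entries) and the sun angle correction divides by the sine of the sun elevation. The function name and the sample numbers are illustrative only.

```python
import numpy as np


def toa_reflectance(dn, mult, add, sun_elevation_deg=None):
    """Scale digital numbers to top-of-atmosphere reflectance.

    mult / add: per-band gain and offset (assumed to come from the MTL file).
    sun_elevation_deg: None for TOA' (no sun angle correction), a scalar for
    the scene center angle, or an array with the same shape as dn for a
    per-pixel correction.
    """
    dn = np.asarray(dn, dtype="float64")
    reflectance = mult * dn + add  # TOA', without sun angle correction
    if sun_elevation_deg is None:
        return reflectance
    return reflectance / np.sin(np.radians(sun_elevation_deg))


dn = np.array([[10000, 20000]])
toa_prime = toa_reflectance(dn, mult=2.0e-5, add=-0.1)
toa_scene = toa_reflectance(dn, 2.0e-5, -0.1, sun_elevation_deg=30.0)
toa_pixel = toa_reflectance(dn, 2.0e-5, -0.1,
                            sun_elevation_deg=np.array([[30.0, 90.0]]))
```

Because the elevation argument broadcasts against the pixel array, the same function covers both the scene center case and the per-pixel case described in point 2.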

@matthewhanson
Collaborator

matthewhanson commented Aug 20, 2018

@simonff Also, while some of these fields do apply to the dataset as a whole, some of them (such as gain/offset) would be per Item as they can change across scenes.

@cholmes cholmes added prio: should-have would be very good to have in the release and removed new extension labels Aug 23, 2018
@cholmes cholmes added this to the 0.6.0-RC1 milestone Aug 24, 2018
@m-mohr m-mohr modified the milestones: 0.6.0-RC1, future Oct 9, 2018
@vincentsarago

👋 @mojodna @matthewhanson
I'd love to see this moving.

About the proposed properties, IMO (and for my use cases) the most important thing is to have the zoom range and the quadkey coverage for each item.

A few comments about the proposed items:

  • layer index

Not sure what it means

  • range of valid resolutions (i.e. zooms) for an individual asset

👍 (or resolution + number of overviews, if present)

  • asset validity footprint (potentially represented using quad keys corresponding to Web Mercator tiles at a resolution greater than that of the image)

👍 Quadkey is perfect. Having the full list of quadkeys might be a bit expensive (in processing/storage/response), so maybe just the list of quadkeys at the lowest resolution.

  • resampling method to use when necessary

😐 I see this as optional and implementation-specific, IMO.

  • whether to stretch values to shift into a visible range

😐 I see this as optional and implementation-specific, IMO.

  • min/max values for each band when stretching

😐 I see this as optional and implementation-specific, IMO.

  • custom NODATA value

😐 I see this as optional and implementation-specific, IMO.

  • band selections

👍

With our recent work on COG mosaics (https://medium.com/devseed/cog-talk-part-2-mosaics-bbbf474e66df) we use intermediate quadkey index files to link a tile request to a COG, so having quadkey info directly in the STAC metadata will make it easier to create those.
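To make the quadkey idea concrete, here is a small Python sketch. `tile_to_quadkey` follows the standard Web Mercator quadkey encoding (one base-4 digit per zoom level, interleaving the x and y bits). `build_index` is a hypothetical helper written for this example, not the index format from the linked post: it coarsens each file's quadkeys to a single index zoom by truncating them, then maps every coarse quadkey to the files that cover it.

```python
from collections import defaultdict


def tile_to_quadkey(x, y, zoom):
    """Encode tile column x and row y at the given zoom as a quadkey string."""
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def build_index(footprints, index_zoom):
    """Map quadkeys at index_zoom to the files covering them.

    footprints: {href: iterable of quadkeys at zoom >= index_zoom}
    (hypothetical input shape, chosen for this sketch).
    """
    index = defaultdict(list)
    for href, quadkeys in footprints.items():
        # Truncating a quadkey yields its ancestor tile at a lower zoom.
        for parent in sorted({qk[:index_zoom] for qk in quadkeys}):
            index[parent].append(href)
    return dict(index)


index = build_index(
    {"a.tif": ["2130", "2131"], "b.tif": ["2132", "2200"]},
    index_zoom=3,
)
```

A tiler could then look up `index[tile_to_quadkey(x, y, z)[:3]]` to find candidate files for a request at zoom 3 or deeper, which keeps the stored list short at the cost of a coarser footprint.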

@m-mohr m-mohr added new extension and removed stac-sprint-3-discuss prio: should-have would be very good to have in the release labels Jul 18, 2019
@palmerj

palmerj commented Dec 16, 2019

Any more interest in something like this?

I'm interested in an extension that provides for a collection of GeoTIFF files that make up a mosaic dataset. Looking at the existing data model, I think the existing STAC Collection almost provides this. I would be interested in the following additional fields:

  • Thumbnail asset - providing thumbnail link for the whole mosaic. Maybe this can be extended?
  • CRS for the mosaic. Maybe adding to properties once this proj extension (Projection Extension #485) is accepted
  • Metadata asset for the mosaic dataset - e.g ISO XML link
  • Geometry for the footprint of the mosaic dataset

Collections already support the following fields I need:

  • description
  • keywords
  • extent
  • providers
  • licence (including link)
  • links (provide references to all the Geotiff tile items)
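
As a rough picture of how those existing fields could describe a mosaic, here is a minimal Python sketch that assembles a Collection-like dictionary. The function name, the sample values, and the exact nesting of `extent` are assumptions for illustration (the layout follows later STAC releases as I understand them); check the spec version you target before relying on it. Note that the spec spells the field `license`.

```python
def mosaic_collection(collection_id, description, bbox, tile_hrefs):
    """Build a Collection-like dict whose links point at the GeoTIFF tile items."""
    return {
        "id": collection_id,
        "description": description,
        "keywords": ["mosaic"],           # placeholder value
        "license": "proprietary",         # placeholder value
        "providers": [],
        "extent": {
            "spatial": {"bbox": [list(bbox)]},
            "temporal": {"interval": [[None, None]]},
        },
        # One "item" link per tile of the mosaic.
        "links": [{"rel": "item", "href": href} for href in tile_hrefs],
    }


collection = mosaic_collection(
    "example-mosaic",
    "Hypothetical aerial imagery mosaic",
    bbox=(174.0, -41.5, 175.0, -41.0),
    tile_hrefs=["./tile-001.json", "./tile-002.json"],
)
```

The additional fields requested above (thumbnail, CRS, metadata asset, footprint geometry) are deliberately left out, since where they would live is exactly the open question.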

@palmerj

palmerj commented Dec 18, 2019

I would also be interested in having a geometry for the footprint of the mosaic dataset, but I understand that might not be possible, since Collections are not GeoJSON features.

@palmerj

palmerj commented Dec 19, 2019

Also just came across this catalog layout best practice:

"Items should be stored in subdirectories of their parent catalog. This means that each item and its assets are contained in a unique subdirectory"

I think that, for mosaics, it is best that tile items are not stored in subdirectories. I understand this practice for very large catalogs of datasets that contain one TIFF file per band.

@palmerj

palmerj commented Jan 10, 2020

Anyone interested in this?

@m-mohr
Collaborator

m-mohr commented Jan 10, 2020

@palmerj From previous work on other extensions, it's often a good idea to just start a draft and put it up as PR. Afterwards, we can get people to review it, asking explicitly for help from domain experts and asking for help on Twitter. Asking here may not get you enough attention.

@cholmes cholmes modified the milestones: future, new extensions Feb 26, 2021
@m-mohr
Collaborator

m-mohr commented Apr 4, 2023

There's an extension now: https://github.com/stac-extensions/composite Please continue the discussion there.

@m-mohr m-mohr closed this as completed Apr 4, 2023