This repository has been archived by the owner on May 28, 2022. It is now read-only.

DLDatasets.jl

This package is archived. Please use MLDatasets.jl and FastAI.jl instead.

Quickly load datasets for deep learning.

DLDatasets.jl provides a convenient way to download and load large deep learning datasets that do not fit into memory. It also provides building blocks, such as FileDataset, for creating your own datasets.

This package uses MLDataPattern.jl's data container interface. The underlying abstractions are inspired by fast.ai's DataBlock API. To iterate over data containers quickly, you can use DataLoaders.jl.
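As a rough illustration of the data container interface, a container only needs to report how many observations it holds (nobs) and return one observation by index (getobs). The sketch below is not DLDatasets.jl's FileDataset: the LazyFiles type, its fields, and the file paths are hypothetical, and nobs and getobs are declared locally as stand-ins so the snippet runs without any packages. In real code you would extend the functions provided by the interface package instead.

```julia
# Local stand-ins for the two interface functions (assumption: in real code
# these would be extended from the interface package, not declared here).
function nobs end
function getobs end

# Hypothetical toy container: holds only file paths and produces each
# observation lazily, so nothing is loaded until it is requested.
struct LazyFiles{F}
    paths::Vector{String}
    loadfn::F   # called on a path only when that observation is requested
end

nobs(ds::LazyFiles) = length(ds.paths)
getobs(ds::LazyFiles, idx::Integer) = ds.loadfn(ds.paths[idx])

# Made-up paths; the label is taken from the parent folder name.
paths = ["train/dog/1.jpg", "train/cat/2.jpg"]
files = LazyFiles(paths, path -> (path, basename(dirname(path))))

for i in 1:nobs(files)
    path, label = getobs(files, i)
    println(label, " <- ", path)
end
```

Because the container stores paths rather than loaded data, its memory footprint stays small no matter how large the files on disk are.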

DLDatasets.jl is still WIP. Support for tabular datasets is coming.

Usage

Install and import:

]add https://github.com/lorenzoh/DLDatasets.jl
using DLDatasets

Load the low-resolution version of ImageNette and load an observation:

ds = loaddataset(ImageNette, "v2_160px")
image, label = getobs(ds, 1)

Load different dataset splits:

trainds = loaddataset(ImageNette, "v2_160px", split = "train")
trainds, valds = loaddataset(ImageNette, "v2_160px", split = ("train", "val"))

List available datasets and their tags and splits:

datasets()

Iterate over observations fast as the wind:

]add https://github.com/lorenzoh/DataLoaders.jl
using DataLoaders: eachobsparallel

for obs in eachobsparallel(ds)
    # do stuff
end
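
To give a feel for why loading in the background speeds up iteration, here is a minimal sketch of buffered loading using only Julia's standard library. It is not how DataLoaders.jl is implemented: the buffered function and the slowload loader are hypothetical names made up for this example, and a single producer task stands in for a pool of workers.

```julia
# Hypothetical loader standing in for an expensive getobs call.
slowload(i) = (sleep(0.01); i^2)

# A producer task fills a bounded Channel while the consumer iterates over it,
# so loading the next observations overlaps with processing the current one.
function buffered(loadfn, indices; buffersize = 4)
    return Channel{Any}(buffersize; spawn = true) do ch
        for i in indices
            put!(ch, loadfn(i))   # blocks when the buffer is full
        end
    end
end

results = Int[]
for obs in buffered(slowload, 1:5)
    push!(results, obs)
end
```

The channel closes automatically when the producer finishes, which ends the for loop. With a single producer, observations arrive in their original order.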

Datasets

Use DLDatasets.datasets() to get a list of all datasets.

The following datasets with the corresponding tags are implemented.

  • ImageNette
    • "v2_160px"
    • "v2_320px"
  • ImageWoof
    • "v2_160px"
    • "v2_320px"