
LINCS dataset

Michael Bornholdt edited this page Jun 22, 2021 · 2 revisions



Level 1-3 data, with different normalizations:

Level 5 consensus data:


Broad sample ID, perturbation, MOA mapping:

Plate map log (i.e. which plate map has which perturbation in which well):

Plate code, Plate map, Batch mapping:

Cell centers

Cell centers for the LINCS data can be found on the DGX. However, those come from the U-Net and may differ from what CellProfiler/Cytominer uses.

The extracted locations can be found on S3. backup_locations holds the large CSV files extracted directly from the SQLite files (the CellProfiler output), and locations holds the CSV location files that DeepProfiler needs. s3://imaging-platform/projects/2015_10_05_DrugRepurposing_AravindSubramanian_GolubLab_Broad/workspace/deep_learning/locations/

These can be extracted via the script from Juan.

The extraction scripts used to generate these files can be found in this repo under pre-trained/data.


  1. The input location files are named Plate/Well-Site-Nuclei.csv (e.g. SQ0014812/B02-4-Nuclei.csv).
  2. The output files (the profiles per site) are named Plate/Well_Site.npz (e.g. SQ0014812/B02_4.npz).
  3. The image size in my current experiment is 1080x1080. This may change, so always check the size of your images and the values in your location files. Plot them to be sure that they are correct!
  4. The filenames of the location files on the DGX are incorrect: they follow the same naming convention as the images.
  5. Sometimes location files or images are simply missing or empty. This is the case for: SQ00015208/B22-5, ...
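
The naming rules and the bounds check from the notes above can be sketched in a few lines of Python. This is only an illustration and is not taken from the extraction scripts. The helper names (`profile_name`, `out_of_bounds`) are hypothetical. The column names `Nuclei_Location_Center_X` and `Nuclei_Location_Center_Y`, the well pattern (one letter plus two digits), and the 1080 default are assumptions, so check them against your own files first.

```python
import csv
import io
import re

# Assumed pattern for input location files: Plate/Well-Site-Nuclei.csv
LOCATION_RE = re.compile(
    r"^(?P<plate>[^/]+)/(?P<well>[A-Z]\d{2})-(?P<site>\d+)-Nuclei\.csv$"
)


def profile_name(location_path):
    """Map 'Plate/Well-Site-Nuclei.csv' to 'Plate/Well_Site.npz'."""
    match = LOCATION_RE.match(location_path)
    if match is None:
        raise ValueError(f"unexpected location filename: {location_path}")
    return f"{match['plate']}/{match['well']}_{match['site']}.npz"


def out_of_bounds(rows, size=1080,
                  x_col="Nuclei_Location_Center_X",
                  y_col="Nuclei_Location_Center_Y"):
    """Return the rows whose center lies outside a size x size image.

    The column names are an assumption; adjust them to your CSV header.
    """
    bad = []
    for row in rows:
        x, y = float(row[x_col]), float(row[y_col])
        if not (0 <= x < size and 0 <= y < size):
            bad.append(row)
    return bad


# Small in-memory stand-in for one location file (made-up coordinates).
sample = io.StringIO(
    "Nuclei_Location_Center_X,Nuclei_Location_Center_Y\n"
    "10.5,20.0\n"
    "1200.0,30.0\n"
)
rows = list(csv.DictReader(sample))

print(profile_name("SQ0014812/B02-4-Nuclei.csv"))  # SQ0014812/B02_4.npz
print(len(rows) == 0)                              # False: file is not empty
print(len(out_of_bounds(rows)))                    # 1
```

An empty `rows` list corresponds to the empty-file case in point 5. A non-empty result from `out_of_bounds` suggests that the locations and the image size do not match. Plotting the coordinates is still the more reliable check.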