Data and code availability
loom file with osmFISH data
Here is the loom file with the osmFISH data: osmFISH_SScortex_mouse_all_cell.loom
The loom file contains the gene expression data in the main matrix as counts and stores various attributes about the cells and genes as metadata. Read more about the loom file format here.
The cell (column) metadata contains:
- CellID, A unique number per cell.
- ClusterID, Cluster label. 0 for excluded cells. (Note: labels were scrambled in an earlier version of the file; this was corrected on 15 May 2018.)
- ClusterName, The name given to the cluster.
- Region, The location of the cell in one of the inferred regions.
- Total_molecules, Total molecules per cell.
- Valid, 1 if the cell was part of the final dataset, 0 if excluded (see methods).
- X, X coordinate of the cell in the tissue (unit = pixels, 1 pixel = 0.065μm).
- Y, Y coordinate of the cell in the tissue (unit = pixels, 1 pixel = 0.065μm).
- _tSNE_1, tSNE 1 coordinate of the cell.
- _tSNE_2, tSNE 2 coordinate of the cell.
- size_pix, Cell area in number of pixels (1 pixel = 0.0042255 μm²).
- size_um2, Cell area in μm².
The gene (row) metadata contains:
- Gene, The name of the gene.
- Hybridization, Number of the round in which the gene was labeled.
- Fluorophore, The fluorophore used to label the gene.
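As a rough illustration of how these attributes might be used, the Python sketch below reads the cell metadata and converts the pixel coordinates of the valid cells to micrometers. It assumes the third-party loompy package is installed and that the loom file sits in the working directory. The helper names `load_cell_attributes` and `valid_positions_um` are invented for this example and are not part of the osmFISH code.

```python
PIXEL_SIZE_UM = 0.065  # 1 pixel = 0.065 μm, as stated for the X and Y attributes


def load_cell_attributes(path):
    """Read the column (cell) attributes of a loom file into a plain dict.

    Assumption: the third-party `loompy` package is installed.
    """
    import loompy  # imported lazily so the helper below works without it

    with loompy.connect(path, mode="r") as ds:
        return {name: ds.ca[name] for name in ds.ca.keys()}


def valid_positions_um(x_px, y_px, valid, pixel_size_um=PIXEL_SIZE_UM):
    """Return (x, y) in micrometers for the cells whose Valid flag is 1."""
    return [
        (float(x) * pixel_size_um, float(y) * pixel_size_um)
        for x, y, v in zip(x_px, y_px, valid)
        if int(v) == 1
    ]


# Possible usage, with the file in the working directory:
# cells = load_cell_attributes("osmFISH_SScortex_mouse_all_cell.loom")
# positions = valid_positions_um(cells["X"], cells["Y"], cells["Valid"])
# print(len(positions), "valid cells")
```

The gene (row) attributes can be read the same way through `ds.ra`, for example `ds.ra["Gene"]`.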
hdf5 file with raw counting
Download the file with the coordinates of all RNA molecules identified in the raw counting here: osmFISH raw coords
The coordinates are grouped by gene and hybridization.
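The internal layout of the hdf5 file is not described here beyond the grouping by gene and hybridization, so a reasonable first step is to list what it contains. The sketch below walks an open HDF5 file and collects the path and shape of every dataset. It assumes the h5py package is installed. The function name `list_datasets` and the file name in the usage comment are placeholders.

```python
def list_datasets(h5_group):
    """Collect (path, shape) for every dataset below an open HDF5 group.

    Works on any object offering h5py's `visititems` method. Nothing is
    assumed about how genes and hybridizations are nested in the file.
    """
    found = []

    def visit(name, obj):
        # Datasets have a shape attribute; groups do not.
        if hasattr(obj, "shape"):
            found.append((name, tuple(obj.shape)))

    h5_group.visititems(visit)
    return found


# Possible usage (the file name is a placeholder for the downloaded file):
# import h5py
# with h5py.File("osmFISH_raw_coords.hdf5", "r") as f:
#     for path, shape in list_datasets(f):
#         print(path, shape)
```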
Python dictionaries with segmentation data
Download the files with all the segmented regions here: segmented regions. The cell IDs used in the downstream processing can be extracted from the loom file or from this dictionary, where 1 = used cells and 0 = discarded cells: cells dict. Segmented cell metadata can be found in the object properties dictionary. Use the centroids of the cells in this dictionary when you want to match the RNA signal to the cell segmentation. The loom file contains a rotated version of the centroids for visualization purposes.
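A minimal sketch of selecting the used cells from the cells dictionary is given below. It assumes the dictionaries are stored as Python pickle files and that the cells dictionary maps each cell ID to 1 or 0. Neither assumption is confirmed here, and the helper names are invented for this example. The structure of the object properties dictionary is not described, so it is not touched.

```python
import pickle


def load_pickled_dict(path):
    """Load a dictionary from a pickle file (only open files you trust)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def used_cell_ids(cells_dict):
    """Return the IDs flagged 1 (used), dropping those flagged 0 (discarded)."""
    return sorted(cell_id for cell_id, flag in cells_dict.items() if flag == 1)


# Possible usage (the file name is a placeholder for the downloaded file):
# cells_dict = load_pickled_dict("cells_dict.pkl")
# print(len(used_cell_ids(cells_dict)), "cells used downstream")
```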
All code is available on GitHub:
The raw image dataset contains roughly 6,000,000 images. The total size is over 5 TB and the dataset is available upon request. However, we do not yet have a convenient way to transfer it. The fastest option may be to send a couple of hard drives through snail mail.
A detailed protocol can be found on Protocols.io.