loompy package

Submodules

loompy.loompy module

class loompy.loompy.LoomConnection(filename: str, mode: str = 'r+')[source]

Bases: object

mode
last_modified() → str[source]

Return an ISO 8601 timestamp indicating when the file was last modified.

Note: if the file has no timestamp and mode is ‘r+’, a new timestamp is created and returned. If the file has no timestamp and mode is not ‘r+’, the current time in UTC is returned.
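
The fallback in the note, returning the current time in UTC, can be sketched with the standard library alone. This is an illustration only: the helper name utc_timestamp is hypothetical, and the exact ISO 8601 variant that loompy writes is not specified here.

```python
from datetime import datetime, timezone


def utc_timestamp() -> str:
    """Return the current UTC time as an ISO 8601 string (hypothetical helper)."""
    return datetime.now(timezone.utc).isoformat()


stamp = utc_timestamp()
# The string round-trips through the standard parser and keeps its UTC offset.
parsed = datetime.fromisoformat(stamp)
```

Because ISO 8601 strings in a single fixed format and time zone sort lexicographically in time order, timestamps of this kind can be compared as plain strings.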

get_changes_since(timestamp: str) → Dict[str, List][source]
sparse(rows: numpy.ndarray = None, cols: numpy.ndarray = None, layer: str = None) → scipy.sparse.coo.coo_matrix[source]

Return the main matrix as a scipy.sparse.coo_matrix, without loading the dense matrix into RAM

Parameters:
  • rows – Rows to include, or None to include all
  • cols – Columns to include, or None to include all
  • layer – Layer to return, or None to return the default layer
Returns:

scipy.sparse.coo_matrix
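
For readers unfamiliar with the COO (coordinate) format, the sketch below shows what it stores: one (row, column, value) triplet per nonzero entry. It is a dependency-free illustration, not loompy's implementation. The helper to_coo is hypothetical, it scans a small in-memory list (which the real method avoids), and renumbering coordinates relative to the selection is an assumption of this sketch.

```python
from typing import List, Optional, Sequence, Tuple


def to_coo(dense: Sequence[Sequence[float]],
           rows: Optional[Sequence[int]] = None,
           cols: Optional[Sequence[int]] = None
           ) -> Tuple[List[int], List[int], List[float]]:
    """Collect the nonzero entries of the selected rows/cols as COO triplets."""
    row_sel = range(len(dense)) if rows is None else rows
    col_sel = range(len(dense[0])) if cols is None else cols
    r_out: List[int] = []
    c_out: List[int] = []
    data: List[float] = []
    # Assumption: output coordinates are positions within the selection.
    for i, r in enumerate(row_sel):
        for j, c in enumerate(col_sel):
            value = dense[r][c]
            if value != 0:
                r_out.append(i)
                c_out.append(j)
                data.append(value)
    return r_out, c_out, data


matrix = [[0, 2, 0],
          [1, 0, 0],
          [0, 0, 3]]
coo_all = to_coo(matrix)
coo_subset = to_coo(matrix, rows=[0, 2], cols=[1, 2])
```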

close(suppress_warning: bool = False) → None[source]

Close the connection. After this, the connection object becomes invalid. Warns user if called after closing.

Parameters:
  • suppress_warning – Suppresses the warning message if True (defaults to False)
closed
set_layer(name: str, matrix: numpy.ndarray, chunks: Tuple[int, int] = (64, 64), chunk_cache: int = 512, dtype: str = 'float32', compression_opts: int = 2) → None[source]

DEPRECATED - Use ds.layer.Name = matrix or ds.layer["Name"] = matrix instead

add_columns(layers: Union[numpy.ndarray, typing.Dict[str, numpy.ndarray], loompy.layer_manager.LayerManager], col_attrs: Dict[str, numpy.ndarray], *, fill_values: Dict[str, numpy.ndarray] = None) → None[source]

Add columns of data and attribute values to the dataset.

Parameters:
  • layers (dict or numpy.ndarray or LayerManager) – One of: (1) an N-by-M matrix of float32 values (N rows, M columns), in which case the columns are added to the default layer; (2) a dict {layer_name: matrix}, in which case each N-by-M matrix is added to the layer layer_name; (3) a LayerManager object (such as the one returned by view.layers)
  • col_attrs (dict) – Column attributes, where keys are attribute names and values are numpy arrays (float or string) of length M
  • fill_values – dictionary of values to use if a column attribute is missing, or “auto” to fill with zeros or empty strings
Returns:

Nothing.

Notes

  • This will modify the underlying HDF5 file, which will interfere with any concurrent readers.
  • Column attributes in the file that are NOT provided will be deleted (unless a fill value is provided).
  • Arrays containing NaN should not be provided.
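
The interaction between missing column attributes and fill_values can be sketched as follows. This is a simplified model built only from the description above, not loompy's code. The helper merge_col_attrs is hypothetical, plain lists stand in for numpy arrays, and the "auto" defaults (zero or empty string) are inferred from the existing values.

```python
from typing import Dict, Union

Attrs = Dict[str, list]


def merge_col_attrs(existing: Attrs, incoming: Attrs, n_new: int,
                    fill_values: Union[None, str, Dict[str, object]] = None
                    ) -> Attrs:
    """Extend each existing column attribute with values for n_new added columns."""
    merged: Attrs = {}
    for name, values in existing.items():
        if name in incoming:
            merged[name] = values + list(incoming[name])
        elif fill_values == "auto":
            # Assumed defaults: empty string for text attributes, zero otherwise.
            default = "" if values and isinstance(values[0], str) else 0
            merged[name] = values + [default] * n_new
        elif isinstance(fill_values, dict) and name in fill_values:
            merged[name] = values + [fill_values[name]] * n_new
        # Otherwise the attribute is dropped, as the notes above describe.
    return merged


existing = {"CellID": ["c1", "c2"], "Age": [3, 5]}
incoming = {"CellID": ["c3"]}
dropped = merge_col_attrs(existing, incoming, n_new=1)
auto = merge_col_attrs(existing, incoming, n_new=1, fill_values="auto")
explicit = merge_col_attrs(existing, incoming, n_new=1, fill_values={"Age": -1})
```

In this model, omitting "Age" without a fill value removes it from the result, which is why supplying fill_values matters when appended batches carry different attribute sets.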
add_loom(other_file: str, key: str = None, fill_values: Dict[str, numpy.ndarray] = None, batch_size: int = 1000, convert_attrs: bool = False) → None[source]

Add the content of another loom file

Parameters:
  • other_file (str) – filename of the loom file to append
  • key – Primary key to use to align rows in the other file with this file
  • fill_values (dict) – default values to use for missing attributes (or None to drop missing attrs, or ‘auto’ to fill with sensible defaults)
  • batch_size (int) – the batch size used by batchscan (limits the number of rows/columns read in memory)
  • convert_attrs (bool) – convert file attributes that differ between files into column attributes
Returns:

Nothing, but adds the contents of the other loom file to this one. The other loom file must have exactly the same number of rows and exactly the same column attributes. All contents are added, including layers, except that layers in other_file that are not already present in self are ignored.

delete_attr(name: str, axis: int = 0) → None[source]

DEPRECATED - Use del ds.ra.key or del ds.ca.key instead, where key is replaced with the attribute name

set_attr(name: str, values: numpy.ndarray, axis: int = 0, dtype: str = None) → None[source]

DEPRECATED - Use ds.ra.key = values or ds.ca.key = values instead

list_edges(*, axis: int) → List[str][source]

DEPRECATED - Use ds.row_graphs.keys() or ds.col_graphs.keys() instead

get_edges(name: str, *, axis: int) → Tuple[[numpy.ndarray, numpy.ndarray], numpy.ndarray][source]

DEPRECATED - Use ds.row_graphs[name] or ds.col_graphs[name] instead

set_edges(name: str, a: numpy.ndarray, b: numpy.ndarray, w: numpy.ndarray, *, axis: int) → None[source]

DEPRECATED - Use ds.row_graphs[name] = g or ds.col_graphs[name] = g instead

scan(*, items: numpy.ndarray = None, axis: int = None, layers: Iterable = None, key: str = None, batch_size: int = 512) → Iterable[Tuple[[int, numpy.ndarray], loompy.loom_view.LoomView]][source]

Scan across one axis and return batches of rows (columns) as LoomView objects

Parameters:
  • items (np.ndarray) – either the indexes [0, 2, 13, … ,973] of the rows/cols to include along the axis, or a boolean mask array selecting the rows/cols to include
  • axis (int) – 0:rows or 1:cols
  • batch_size (int) – the size of the chunks returned at each step of the iterator
  • layers (iterable) – if specified, only the listed layers are scanned; if layers is None, all layers are scanned; if layers is [“”] or “”, only the default layer is scanned
  • key – Name of primary key attribute. If specified, return the values sorted by the key
Returns:

Iterable that yields triplets (ix, indexes, view):

  • ix (int) – first position, i.e. how many rows/cols have been yielded already
  • indexes (np.ndarray[int]) – the indexes of the rows/cols in the current batch, using the same numbering as the items argument (i.e. np.arange(ds.shape[axis])); this is ix + selection
  • view (LoomView) – a view corresponding to the current chunk
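
The batching pattern described above can be sketched without loompy. This is one reading of the description, not the library's implementation. The generator scan_rows is hypothetical, a list of rows stands in for a LoomView, ix is taken to be the number of rows already yielded, and boolean masks are not handled.

```python
from typing import Iterator, List, Optional, Sequence, Tuple


def scan_rows(matrix: Sequence[Sequence[float]],
              items: Optional[Sequence[int]] = None,
              batch_size: int = 512
              ) -> Iterator[Tuple[int, List[int], List[Sequence[float]]]]:
    """Yield (ix, indexes, view) triplets over the selected rows in batches."""
    selected = list(range(len(matrix))) if items is None else sorted(items)
    for ix in range(0, len(selected), batch_size):
        indexes = selected[ix:ix + batch_size]
        # The "view" here is simply the list of selected rows in this batch.
        view = [matrix[i] for i in indexes]
        yield ix, indexes, view


data = [[r * 10 + c for c in range(3)] for r in range(5)]
batches = list(scan_rows(data, items=[0, 2, 3, 4], batch_size=3))
```

Each iteration touches at most batch_size rows, which is what keeps memory use bounded when the full matrix is too large to load.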

batch_scan(cells: numpy.ndarray = None, genes: numpy.ndarray = None, axis: int = 0, batch_size: int = 1000, layer: str = None) → Iterable[Tuple[[int, numpy.ndarray], numpy.ndarray]][source]

DEPRECATED - Use scan instead

batch_scan_layers(cells: numpy.ndarray = None, genes: numpy.ndarray = None, axis: int = 0, batch_size: int = 1000, layers: Iterable = None) → Iterable[Tuple[[int, numpy.ndarray], Dict]][source]

DEPRECATED - Use scan instead

map(f_list: List[Callable[numpy.ndarray, int]], *, axis: int = 0, chunksize: int = 1000, selection: numpy.ndarray = None) → List[numpy.ndarray][source]

Apply a function along an axis without loading the entire dataset in memory.

Parameters:
  • f_list (list of func) – Function(s) that take a numpy ndarray as argument
  • axis (int) – Axis along which to apply the function (0 = rows, 1 = columns)
  • chunksize (int) – Number of rows (columns) to load per chunk
  • selection (array of bool) – Columns (rows) to include
Returns:

A list of numpy arrays holding the results of function application, one array per function supplied in f_list. This is more efficient than repeatedly calling map() with one function at a time.
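
Passing several functions at once is cheaper because each chunk is read once and shared by all of them. The sketch below illustrates this for axis 0. The helper map_rows is hypothetical, plain lists stand in for numpy arrays, and it is not loompy's implementation.

```python
from typing import Callable, List, Sequence


def map_rows(matrix: Sequence[Sequence[float]],
             f_list: List[Callable[[Sequence[float]], float]],
             chunksize: int = 1000) -> List[List[float]]:
    """Apply every function in f_list to each row, reading chunksize rows at a time."""
    results: List[List[float]] = [[] for _ in f_list]
    for start in range(0, len(matrix), chunksize):
        chunk = matrix[start:start + chunksize]  # only this slice is loaded
        for row in chunk:
            # Every function sees the row while it is already in memory.
            for out, f in zip(results, f_list):
                out.append(f(row))
    return results


data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
sums, maxima = map_rows(data, [sum, max], chunksize=2)
```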

permute(ordering: numpy.ndarray, axis: int) → None[source]

Permute the dataset along the indicated axis.

Parameters:
  • ordering (list of int) – The desired order along the axis
  • axis (int) – The axis along which to permute
Returns:

Nothing.
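
A minimal sketch of what permuting along an axis means, assuming that position i of the result holds the element that was at index ordering[i]. That convention is an assumption of this example. The helper is hypothetical and returns a copy, whereas the method above modifies the dataset and returns nothing.

```python
from typing import List, Sequence


def permute(matrix: List[List[float]], ordering: Sequence[int],
            axis: int) -> List[List[float]]:
    """Return a reordered copy of matrix along axis (0 = rows, 1 = columns)."""
    size = len(matrix) if axis == 0 else len(matrix[0])
    if sorted(ordering) != list(range(size)):
        raise ValueError("ordering must be a permutation of the axis indexes")
    if axis == 0:
        return [list(matrix[i]) for i in ordering]
    return [[row[j] for j in ordering] for row in matrix]


data = [[1, 2, 3],
        [4, 5, 6]]
rows_swapped = permute(data, [1, 0], axis=0)
cols_rotated = permute(data, [2, 0, 1], axis=1)
```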

export(out_file: str, layer: str = None, format: str = 'tab') → None[source]

Export the specified layer and row/col attributes as a tab-delimited file

loompy.loompy.create_append(filename: str, layers: Union[numpy.ndarray, typing.Dict[str, numpy.ndarray], loompy.layer_manager.LayerManager], row_attrs: Dict[str, numpy.ndarray], col_attrs: Dict[str, numpy.ndarray], *, file_attrs: Dict[str, str] = None, fill_values: Dict[str, numpy.ndarray] = None) → None[source]

Append columns to a loom file, or create a new loom file if it doesn’t exist

Parameters:
  • filename (str) – The filename (typically using a .loom file extension)
  • layers (np.ndarray or Dict[str, np.ndarray] or LayerManager) – A two-dimensional (N-by-M) numpy ndarray of float values, or a dictionary of named layers, each an N-by-M ndarray, or a LayerManager, each layer an N-by-M ndarray
  • row_attrs (dict) – Row attributes, where keys are attribute names and values are numpy arrays (float or string) of length N
  • col_attrs (dict) – Column attributes, where keys are attribute names and values are numpy arrays (float or string) of length M
  • file_attrs (dict) – Global attributes, where keys are attribute names and values are strings
Returns:

Nothing

loompy.loompy.create(filename: str, layers: Union[numpy.ndarray, typing.Dict[str, numpy.ndarray], loompy.layer_manager.LayerManager], row_attrs: Dict[str, numpy.ndarray], col_attrs: Dict[str, numpy.ndarray], *, file_attrs: Dict[str, str] = None) → None[source]

Create a new .loom file from the given data.

Parameters:
  • filename (str) – The filename (typically using a .loom file extension)
  • layers (np.ndarray or scipy.sparse or Dict[str, np.ndarray] or LayerManager) – A two-dimensional (N-by-M) numpy ndarray of float values, or a sparse matrix, or a dictionary of named layers, each an N-by-M ndarray, or a LayerManager, each layer an N-by-M ndarray
  • row_attrs (dict) – Row attributes, where keys are attribute names and values are numpy arrays (float or string) of length N
  • col_attrs (dict) – Column attributes, where keys are attribute names and values are numpy arrays (float or string) of length M
  • file_attrs (dict) – Global attributes, where keys are attribute names and values are strings
Returns:

Nothing

Remarks:
If the file exists, it will be overwritten. See create_append for a function that will append to existing files.
loompy.loompy.create_from_cellranger(indir: str, outdir: str = None, genome: str = None) → str[source]

Create a .loom file from 10X Genomics cellranger output

Parameters:
  • indir (str) – path to the cellranger output folder (the one that contains ‘outs’)
  • outdir (str) – output folder where the new loom file should be saved (defaults to indir)
  • genome (str) – genome build to load (e.g. ‘mm10’; if None, determine species from outs folder)
Returns:

Path to the created loom file.

Return type:

path (str)

loompy.loompy.combine(files: List[str], output_file: str, key: str = None, file_attrs: Dict[str, str] = None, batch_size: int = 1000, convert_attrs: bool = False) → None[source]

Combine two or more loom files and save as a new loom file

Parameters:
  • files (list of str) – the list of input files (full paths)
  • output_file (str) – full path of the output loom file
  • key (string) – Row attribute to use to verify row ordering
  • file_attrs (dict) – file attributes (title, description, url, etc.)
  • batch_size (int) – limits the number of cols/rows read into memory per batch (default: 1000)
  • convert_attrs (bool) – convert file attributes that differ between files into column attributes
Returns:

Nothing, but creates a new loom file combining the input files.

The input files must (1) have exactly the same number of rows, (2) have exactly the same sets of row and column attributes.
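
The preconditions above can be checked before combining. The sketch below covers row attributes only and is not part of loompy. The helper combine_problems is hypothetical, plain dicts of lists stand in for the attributes of two files, and key is used only to verify row ordering.

```python
from typing import Dict, List, Optional


def combine_problems(a: Dict[str, list], b: Dict[str, list],
                     key: Optional[str] = None) -> List[str]:
    """List the reasons why two sets of row attributes cannot be combined."""
    problems: List[str] = []
    if set(a) != set(b):
        problems.append("row attribute names differ")
    # Every attribute has one value per row, so any one of them gives the count.
    rows_a = len(next(iter(a.values()), []))
    rows_b = len(next(iter(b.values()), []))
    if rows_a != rows_b:
        problems.append("row counts differ")
    if key is not None and key in a and key in b and a[key] != b[key]:
        problems.append(f"rows are ordered differently by {key!r}")
    return problems


first = {"Gene": ["g1", "g2"]}
same = {"Gene": ["g1", "g2"]}
reordered = {"Gene": ["g2", "g1"]}
mismatched = {"Gene": ["g1"], "Chr": ["1"]}

ok = combine_problems(first, same, key="Gene")
order_issue = combine_problems(first, reordered, key="Gene")
shape_issue = combine_problems(first, mismatched)
```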

loompy.loompy.connect(filename: str, mode: str = 'r+') → loompy.loompy.LoomConnection[source]

Establish a connection to a .loom file.

Parameters:
  • filename (str) – Name of the .loom file to open
  • mode (str) – read/write mode, accepts ‘r+’ (read/write) or ‘r’ (read-only), defaults to ‘r+’
Returns:

A LoomConnection instance.

Remarks:

This function should typically be called as a context manager:

with loompy.connect(filename) as ds:
    …do something…

This ensures that the file will be closed automatically when the context block ends.
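
The automatic closing relies on Python's context manager protocol. The sketch below shows the mechanism with a stand-in class, so it runs without loompy or a .loom file. DemoConnection is hypothetical and only mimics the close() method and closed attribute documented above.

```python
class DemoConnection:
    """Hypothetical stand-in for a connection object; not loompy's class."""

    def __init__(self, filename: str) -> None:
        self.filename = filename
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "DemoConnection":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Runs when the block ends, whether normally or through an exception.
        self.close()
        return False  # do not swallow exceptions raised inside the block


with DemoConnection("example.loom") as ds:
    open_inside = not ds.closed
closed_after = ds.closed
```

Because __exit__ also runs when the block raises, the connection is released even if the work inside the with statement fails.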

Module contents