utils

Collection of utility functions.

pysmFISH.utils.add_coords_to_yaml(folder, hyb_nr, hyb_key='Hyb')

Read tile number and coordinates and add them to the yaml file.

Read the tile number and microscope coordinates for each tile from the microscope file called “coord_file_name” in “folder”. Then insert them in the dictionary “TilesPositions” in the yaml metadata file called Experimental_metadata.yaml.

folder: str
Exact path to the folder, including trailing “/”
hyb_nr: int
Hybridization number denoting for which hybridization we should read and insert the coordinates
hyb_key: str
Possible values: ‘Hyb’ or ‘Strip’. Use ‘Strip’ to add the coordinates for a stripping round, if necessary.
pysmFISH.utils.check_trailing_slash(dir_path, os_windows)

This function checks whether there is a trailing slash at the end of a directory path and adds it if missing.

dir_path: str
Path to the directory
os_windows: bool
True if the os is windows.
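The check described above can be sketched as follows. This is a minimal illustration, not the pysmFISH implementation: the function name `ensure_trailing_separator` is hypothetical, and the choice of a backslash on Windows and a forward slash elsewhere is an assumption.

```python
def ensure_trailing_separator(dir_path, os_windows):
    """Return dir_path, guaranteed to end with the OS path separator."""
    # Assumption: backslash on Windows, forward slash on any other system.
    separator = "\\" if os_windows else "/"
    if not dir_path.endswith(separator):
        dir_path += separator
    return dir_path


# A path without the separator gets one; a path that has it is unchanged.
print(ensure_trailing_separator("/data/experiment", os_windows=False))
print(ensure_trailing_separator("/data/experiment/", os_windows=False))
```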
pysmFISH.utils.combine_gene_pos(hybridizations_infos, converted_positions, hybridization)

Gather info about the imaging at each hybridization.

This function creates a dictionary that lists, for each hybridization, the genes and the number of positions imaged for each gene. It is useful for creating distribution lists for parallel processing of the datasets.

hybridizations_infos: dict
Dictionary with parsed Hybridizations metadata
converted_positions: dict
Dictionary with the coords of the images for all hybridization. The coords are a list of floats
hybridization: str
Selected hybridization to process
genes_and_positions: dict
Dictionary that lists, for each hybridization, the genes and the number of positions imaged for each gene.
pysmFISH.utils.create_single_directory(hyb_dir, gene, hybridization, processing_hyb, suffix, add_slash, analysis_name=None)

Function used to create a subdirectory

hyb_dir: str
Path to the directory of the hybridization currently processed.
gene: str
Gene name to be included in the directory.
processing_hyb: str
Name of the hybridization processed (ex. ‘EXP-17-BP3597_hyb2’).
suffix: str
Extra info to add to the directory name (ex. blended).
add_slash: str
Slash added according to the os.
analysis_name: str
Name of the analysis associated to the folder
sufx_dir_path: str
Path to the created directory
pysmFISH.utils.create_subdirectory_tree(hyb_dir, hybridization, hybridizations_infos, processing_hyb, suffix, add_slash, skip_tags=None, skip_genes=None, analysis_name=None)

Function that creates the directory tree in which the temporary data are saved.

hyb_dir: str
Path of the hyb to process
hybridization: str
Name of the hybridization to process (ex. Hybridization2)
hybridizations_infos: dict
Dictionary containing the hybridizations info parsed from the Experimental_metadata.yaml file
processing_hyb: str
Name of the processing experiment (ex. EXP-17-BP3597_hyb2)
suffix: str
Suffix to add to the folder with useful description (ex. tmp)
add_slash: str
‘\’ for win and ‘/’ for linux
skip_tags: list
tags that won’t be processed (ex. _IF)
skip_genes: list
list of genes to skip
analysis_name: str
Name of the analysis run
sufx_dir_path: str
Path of the sufx directory of the processed hybridization
sufx_gene_dirs: list
List of the paths of the sufx directory for the genes to process
pysmFISH.utils.determine_os()

This function checks whether the system is running windows and returns the correct slash type to use.

os_windows: bool
True if the os is windows.
add_slash: str
‘\’ for windows or ‘/’ for any other system
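One way to produce the two return values is sketched below. This is an assumption-laden illustration rather than the library's code: the name `detect_os` and its optional `platform` argument are hypothetical (the argument exists only so the logic can be exercised without changing machines), and the backslash for Windows is assumed.

```python
import sys


def detect_os(platform=None):
    """Return (os_windows, add_slash) for the given or current platform."""
    # Hypothetical parameter: defaults to the interpreter's own platform string.
    platform = sys.platform if platform is None else platform
    # startswith is deliberate: "darwin" contains "win" but is not Windows.
    os_windows = platform.startswith("win")
    # Assumption: backslash on Windows, forward slash on any other system.
    add_slash = "\\" if os_windows else "/"
    return os_windows, add_slash


os_windows, add_slash = detect_os()
print(os_windows, repr(add_slash))
```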
pysmFISH.utils.experimental_metadata_parser(hyb_dir)

Parse the yaml file containing all the metadata of the experiment

The file must be located inside the experimental folder.

hyb_dir: str
Path to the .yaml file containing the metadata of the experiment
experiment_infos: dict
Dictionary with the information on the experiment.
HybridizationInfos: dict
Dictionary with the information on the hybridization.
converted_positions: dict
Dictionary with the coords of the images for all hybridization. The coords are a list of floats
microscope_parameters: dict
Dictionary with the microscope parameters for all hybridization
pysmFISH.utils.filtering_raw_counting_config_parser(hyb_dir)

Parse the yaml file containing all configurations for running the analysis

The file must be located inside the experimental folder.

hyb_dir: str
Path to the .yaml file containing the metadata of the experiment
config_parameters: dict
Dictionary with all the configuration parameters
pysmFISH.utils.general_yaml_parser(file_path)

Parse a general yaml file and return the dictionary with all the content

The file must be located inside the experimental folder.

file_path: str
Path to the .yaml file containing the metadata of the experiment
parameters: dict
Dictionary with all the configuration parameters
pysmFISH.utils.identify_nodes(client)

Function used to determine the address of the nodes in order to better split the work

client: dask.obj
Dask.distributed client.
node_addresses: OrderedDict
Ordered dictionary. The keys are the addresses of the nodes and the values are the full addresses of the workers of a specific node.
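The shape of the returned dictionary can be illustrated without a running cluster. The sketch below groups a plain list of worker addresses by host; the function name `group_workers_by_node` and the sample addresses are hypothetical, and with a real client the addresses would presumably come from the scheduler (for example the keys of `client.scheduler_info()["workers"]`), which is an assumption about how the function obtains them.

```python
from collections import OrderedDict
from urllib.parse import urlsplit


def group_workers_by_node(worker_addresses):
    """Map each node (host) to the full addresses of its workers."""
    nodes = OrderedDict()
    for address in worker_addresses:
        # "tcp://10.0.0.1:40001" -> "10.0.0.1"
        host = urlsplit(address).hostname
        nodes.setdefault(host, []).append(address)
    return nodes


# Made-up addresses, for illustration only.
workers = [
    "tcp://10.0.0.1:40001",
    "tcp://10.0.0.2:40001",
    "tcp://10.0.0.1:40002",
]
for node, addresses in group_workers_by_node(workers).items():
    print(node, addresses)
```

Grouping by node in this way makes it possible to hand each node a share of the work and then split that share among the node's own workers.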
pysmFISH.utils.init_console_logger()

Send the logging output to the stderr stream. After running this function, log messages will typically end up in your console output.

pysmFISH.utils.init_file_logger(log_path, maxBytes=0, backupCount=0)

Send the logging output to a file.

The logs are placed in a directory called “logs”. If this directory is not found at the location denoted by log_path, it is created. Each run of the program produces a new log file. maxBytes and backupCount are passed directly to RotatingFileHandler, so a new file is started when the file size reaches maxBytes. If either maxBytes or backupCount is zero, rollover never occurs and everything is logged to one file.

log_path: str
Full path to the directory where the log directory is/should go.
maxBytes: int
Maximum size in bytes of a single log file. A new log file is started when this size is reached. If zero, all logging is written to one file. (Default: 0)
backupCount: int
The maximum number of log files that will be produced. If zero, all logging is written to one file. (Default: 0)
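The behaviour described above maps onto the standard library's `logging.handlers.RotatingFileHandler`. The following is a minimal sketch under stated assumptions, not the pysmFISH code: the function name `make_file_logger`, the logger name, the timestamped file name, and the message format are all hypothetical.

```python
import logging
import os
import time
from logging.handlers import RotatingFileHandler


def make_file_logger(log_path, maxBytes=0, backupCount=0):
    """Create <log_path>/logs if needed and log to a new file inside it."""
    log_dir = os.path.join(log_path, "logs")
    os.makedirs(log_dir, exist_ok=True)
    # Hypothetical naming scheme: one timestamped file per run.
    log_file = os.path.join(log_dir, time.strftime("%Y-%m-%d_%H-%M-%S") + ".log")
    # With maxBytes=0 or backupCount=0 the handler never rolls over.
    handler = RotatingFileHandler(log_file, maxBytes=maxBytes, backupCount=backupCount)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")
    )
    logger = logging.getLogger("file_logger_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger, log_file
```

Calling `make_file_logger("/path/to/experiment", maxBytes=10_000_000, backupCount=5)` would, under these assumptions, cap each file at roughly 10 MB and keep up to five rolled-over files next to the active one.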
pysmFISH.utils.list_chunking(list_to_chunk, num_chunks)

Helper function used to chunk a list into a number of sublists equal to num_chunks

list_to_chunk: list
List to be chunked
num_chunks: int
Number of sublists to obtain
chunked_list: list
List containing the chunked lists
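A possible chunking strategy is sketched below. It splits the list into contiguous sublists whose lengths differ by at most one; whether pysmFISH uses contiguous or interleaved chunks is not stated here, so treat the strategy and the name `chunk_list` as assumptions.

```python
def chunk_list(list_to_chunk, num_chunks):
    """Split a list into num_chunks contiguous sublists of near-equal size."""
    base, extra = divmod(len(list_to_chunk), num_chunks)
    chunked_list = []
    start = 0
    for i in range(num_chunks):
        # The first `extra` chunks each take one additional element.
        size = base + (1 if i < extra else 0)
        chunked_list.append(list_to_chunk[start:start + size])
        start += size
    return chunked_list


print(chunk_list(list(range(7)), 3))  # [[0, 1, 2], [3, 4], [5, 6]]
```

If the list has fewer elements than num_chunks, the trailing sublists in this sketch are empty, which keeps the number of sublists equal to num_chunks.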
pysmFISH.utils.partial_image_mean(img_paths)

Helper function used to calculate the mean of a set of images. It runs on a worker and helps parallelize image processing.

img_paths: list
List of paths to the images saved as *.npy
ImgMean: np.array
Array storing the calculated image mean