utils

Collection of utility functions.

pysmFISH.utils.add_coords_to_yaml(folder, hyb_nr, hyb_key='Hyb')

Read the tile numbers and coordinates and add them to the yaml file. The tile number and microscope coordinates of each tile are read from the microscope file called “coord_file_name” in “folder”, then inserted into the “TilesPositions” dictionary of the yaml metadata file Experimental_metadata.yaml.

folder: str: Exact path to the folder, including trailing “/”
hyb_nr: int: Number of the hybridization for which the coordinates are read and inserted
hyb_key: str: Either ‘Hyb’ or ‘Strip’. Use ‘Strip’ to add coordinates for stripping, if necessary.

pysmFISH.utils.check_trailing_slash(dir_path, os_windows)

Check whether a directory path ends with a trailing slash and add one if it is missing.

dir_path: str: Path to the directory
os_windows: bool: True if the OS is Windows

dir_path: str: Path to the directory, with the trailing slash in place
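
The check described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name `ensure_trailing_slash` is hypothetical, and the sketch assumes the separator is a backslash on Windows and a forward slash elsewhere.

```python
def ensure_trailing_slash(dir_path, os_windows):
    """Return dir_path ending with the separator for the given OS (hypothetical helper)."""
    # Assumption: backslash on Windows, forward slash on any other system.
    sep = "\\" if os_windows else "/"
    if not dir_path.endswith(sep):
        dir_path += sep
    return dir_path


print(ensure_trailing_slash("/data/experiment", os_windows=False))
```

A path that already ends with the separator is returned unchanged, so the helper can be applied repeatedly without side effects.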

pysmFISH.utils.combine_gene_pos(hybridizations_infos, converted_positions, hybridization)

Gather info about the imaging at each hybridization.

This function creates a dictionary that lists, for each hybridization, the genes and the number of positions imaged for each gene. It is useful for creating distribution lists when running parallel processing of the datasets.

hybridizations_infos: dict: Dictionary with parsed Hybridizations metadata
converted_positions: dict: Dictionary with the coords of the images for all hybridizations. The coords are a list of floats
hybridization: str: Selected hybridization to process

genes_and_positions: dict: Dictionary listing, for each hybridization, the genes and the number of positions imaged for each gene.

pysmFISH.utils.create_single_directory(hyb_dir, gene, hybridization, processing_hyb, suffix, add_slash, analysis_name=None)

Function used to create a subdirectory

hyb_dir: str: Path to the directory of the hybridization currently processed.
gene: str: Gene name to be included in the directory.
processing_hyb: str: Name of the hybridization processed (ex. ‘EXP-17-BP3597_hyb2’).
suffix: str: Extra info to add to the directory name (ex. blended).
add_slash: str: Slash added according to the os.
analysis_name: str: Name of the analysis associated with the folder

sufx_dir_path: str: Path to the created directory

pysmFISH.utils.create_subdirectory_tree(hyb_dir, hybridization, hybridizations_infos, processing_hyb, suffix, add_slash, skip_tags=None, skip_genes=None, analysis_name=None)

Create the directory tree in which the temporary data are saved.

hyb_dir: str: Path of the hyb to process
hybridization: str: Name of the hybridization to process (ex. Hybridization2)
hybridizations_infos: dict: Dictionary containing the hybridizations info parsed from the Experimental_metadata.yaml file
processing_hyb: str: Name of the processing experiment (ex. EXP-17-BP3597_hyb2)
suffix: str: Suffix to add to the folder with useful description (ex. tmp)
add_slash: str: ‘\’ for Windows and ‘/’ for Linux
skip_tags: list: Tags that will not be processed (ex. _IF)
skip_genes: list: List of genes to skip
analysis_name: str: Name of the analysis run

sufx_dir_path: str: Path of the sufx directory of the processed hybridization
sufx_gene_dirs: list: List of the paths of the sufx directory for the genes to process

pysmFISH.utils.determine_os()

Check whether the system is running Windows and return the correct slash type to use.

os_windows: bool: True if the OS is Windows.
add_slash: str: ‘\’ for Windows or ‘/’ for any other system
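
One way to obtain the two return values is sketched below. The function name `detect_os` is hypothetical, and the sketch assumes that Windows is detected through `os.name == "nt"` and that the Windows separator is a backslash. The library may use a different test.

```python
import os


def detect_os():
    """Hypothetical stand-in returning (os_windows, add_slash)."""
    # Assumption: os.name is "nt" on Windows and something else on other systems.
    os_windows = os.name == "nt"
    # Assumption: backslash on Windows, forward slash otherwise.
    add_slash = "\\" if os_windows else "/"
    return os_windows, add_slash


os_windows, add_slash = detect_os()
print(os_windows, repr(add_slash))
```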

pysmFISH.utils.experimental_metadata_parser(hyb_dir)

Parse the yaml file containing all the metadata of the experiment

The file must be located inside the experimental folder.

hyb_dir: str: Path to the .yaml file containing the metadata of the experiment

experiment_infos: dict: Dictionary with the information on the experiment.
HybridizationInfos: dict: Dictionary with the information on the hybridization.
converted_positions: dict: Dictionary with the coords of the images for all hybridizations. The coords are a list of floats
microscope_parameters: dict: Dictionary with the microscope parameters for all hybridizations

pysmFISH.utils.filtering_raw_counting_config_parser(hyb_dir)

Parse the yaml file containing all configurations for running the analysis

The file must be located inside the experimental folder.

hyb_dir: str: Path to the .yaml file containing the metadata of the experiment

config_parameters: dict: Dictionary with all the configuration parameters

pysmFISH.utils.general_yaml_parser(file_path)

Parse a general yaml file and return a dictionary with all of its content

The file must be located inside the experimental folder.

file_path: str: Path to the .yaml file containing the metadata of the experiment

parameters: dict: Dictionary with all the configuration parameters

pysmFISH.utils.identify_nodes(client)

Determine the addresses of the nodes so that the work can be split more effectively

client: dask.obj: Dask.distributed client.

node_addresses: OrderedDict: Ordered dictionary. The keys are the addresses of the nodes and the values are the full addresses of the workers of each node.
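
The shape of the returned dictionary can be illustrated without a running cluster. The sketch below is an assumption-laden example, not the library's code: the helper `group_workers_by_node` is hypothetical, and it assumes worker addresses of the form `tcp://<host>:<port>`, with the host part taken as the node address. In a live session the worker addresses could, for instance, be taken from the keys of `client.scheduler_info()['workers']`.

```python
from collections import OrderedDict


def group_workers_by_node(worker_addresses):
    """Group full worker addresses by node (hypothetical helper)."""
    node_addresses = OrderedDict()
    for address in worker_addresses:
        # Drop the scheme and the port to obtain the node (host) part.
        host = address.split("://")[-1].rsplit(":", 1)[0]
        node_addresses.setdefault(host, []).append(address)
    return node_addresses


# Made-up addresses used only for illustration.
workers = ["tcp://10.0.0.1:4001", "tcp://10.0.0.1:4002", "tcp://10.0.0.2:4001"]
nodes = group_workers_by_node(workers)
for node, node_workers in nodes.items():
    print(node, node_workers)
```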

pysmFISH.utils.init_console_logger()

Send the logging output to the stderr stream. After running this function, logging messages will typically appear in your console output.
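
A comparable setup with the standard `logging` module might look like the sketch below. The function `make_console_logger`, the logger name and the message format are all hypothetical. Only the use of a stream handler attached to stderr follows the description above.

```python
import logging
import sys


def make_console_logger(name="console_sketch"):
    """Attach a stderr handler to a named logger (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    # StreamHandler writes to stderr by default; it is passed explicitly for clarity.
    handler = logging.StreamHandler(sys.stderr)
    # Assumption: this message format is illustrative only.
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


logger = make_console_logger()
logger.info("Console logging is active")
```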

pysmFISH.utils.init_file_logger(log_path, maxBytes=0, backupCount=0)

Send the logging output to a file.

The logs are placed in a directory called “logs”. If this directory is not found at the location given by log_path, it is created. Each run of the program produces a new log file. maxBytes and backupCount are passed directly to RotatingFileHandler, so a new file is started when the file size reaches maxBytes. If either maxBytes or backupCount is zero, rollover never occurs and everything is logged to one file.

log_path: str: Full path to the directory where the log directory is/should go.
maxBytes: int: Maximum size in bytes of a single log file; a new log file is started when this size is reached. If zero, all logging is written to one file. (Default: 0)
backupCount: int: The maximum number of log files that will be produced. If zero, all logging is written to one file. (Default: 0)
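
The behaviour described above can be approximated with the standard library as follows. This is a sketch under stated assumptions rather than the library's code: `make_file_logger` is a hypothetical name, and the timestamped file name is an assumed way of getting one new log file per run.

```python
import logging
import os
import tempfile
import time
from logging.handlers import RotatingFileHandler


def make_file_logger(log_path, maxBytes=0, backupCount=0, name="file_sketch"):
    """Log to <log_path>/logs/<timestamp>.log (hypothetical helper)."""
    # Create the "logs" directory under log_path if it does not exist yet.
    logs_dir = os.path.join(log_path, "logs")
    os.makedirs(logs_dir, exist_ok=True)
    # Assumption: a timestamped file name gives a new file on each run.
    log_file = os.path.join(logs_dir, time.strftime("%Y-%m-%d_%H-%M-%S") + ".log")
    # With maxBytes=0 or backupCount=0 the handler never rolls over.
    handler = RotatingFileHandler(log_file, maxBytes=maxBytes, backupCount=backupCount)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger, log_file


base_dir = tempfile.mkdtemp()
file_logger, log_file = make_file_logger(base_dir)
file_logger.info("Written to a single file because maxBytes is 0")
```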

pysmFISH.utils.list_chunking(list_to_chunk, num_chunks)

Helper function used to split a list into num_chunks sublists

list_to_chunk: list: List to be chunked
num_chunks: int: Number of sublists to obtain

chunked_list: list: List containing the chunked lists
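
A minimal sketch of such a chunking helper is given below. The name `chunk_list` is hypothetical, and the sketch assumes near-equal chunk sizes with any leftover elements assigned to the earliest chunks. The library may distribute the remainder differently.

```python
def chunk_list(list_to_chunk, num_chunks):
    """Split a list into num_chunks contiguous sublists (hypothetical helper)."""
    # base: minimum chunk size; extra: number of chunks that get one more element.
    base, extra = divmod(len(list_to_chunk), num_chunks)
    chunked_list = []
    start = 0
    for i in range(num_chunks):
        size = base + (1 if i < extra else 0)
        chunked_list.append(list_to_chunk[start:start + size])
        start += size
    return chunked_list


print(chunk_list(list(range(7)), 3))  # [[0, 1, 2], [3, 4], [5, 6]]
```

When the list has fewer elements than num_chunks, the trailing sublists in this sketch are empty.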

pysmFISH.utils.partial_image_mean(img_paths)

Helper function used to calculate the mean of a set of images. It runs on a worker and helps with parallel image processing

img_paths: list: List of paths to the images saved as *.npy

ImgMean: np.array: Array storing the calculated image mean
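
The calculation can be sketched with NumPy as follows. The helper `mean_of_images` is hypothetical, and the sketch assumes that all images share the same shape and that the result is the element-wise (per-pixel) mean. Whether the library computes exactly this quantity is an assumption.

```python
import os
import tempfile

import numpy as np


def mean_of_images(img_paths):
    """Element-wise mean of same-shaped .npy images (hypothetical helper)."""
    # Accumulate a running sum so only one image is held in memory at a time.
    img_sum = None
    for path in img_paths:
        img = np.load(path).astype(np.float64)
        img_sum = img if img_sum is None else img_sum + img
    return img_sum / len(img_paths)


# Write two small synthetic images to a temporary directory for the demo.
tmp_dir = tempfile.mkdtemp()
img_paths = []
for i, value in enumerate([1.0, 3.0]):
    path = os.path.join(tmp_dir, "img_{}.npy".format(i))
    np.save(path, np.full((2, 2), value))
    img_paths.append(path)

img_mean = mean_of_images(img_paths)
print(img_mean)
```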