precellar.align
- precellar.align(assay, genome_index, *, modality, output, output_type='alignment', mito_dna=['chrM', 'M'], shift_left=4, shift_right=-5, aligner=None, compression=None, compression_level=None, temp_dir=None, num_threads=8, chunk_size=10000000)
Align FASTQ reads to the reference genome and generate unique fragments.
- Parameters:
assay (Assay | Path) – An Assay object or a file path to the YAML sequencing specification file; see pachterlab/seqspec.
genome_index (Path) – File path to the genome index. The genome index can be created with the make_genome_index function.
modality (str) – The modality of the sequencing data, e.g., “rna” or “atac”.
output (Path) – File path to the output file. The type of the output file is determined by the output_type parameter (see below).
output_type (Literal["alignment", "fragment", "gene_quantification"]) – The type of the output file. If “alignment”, the output is a BAM file containing the alignments. If “fragment”, the output is a fragment file containing the unique fragments. If “gene_quantification”, the output is an h5ad file containing the gene quantification.
shift_left (int) – The number of bases to shift the left end of the fragment. Available only when output_type='fragment'.
shift_right (int) – The number of bases to shift the right end of the fragment. Available only when output_type='fragment'.
aligner (str | None) – The aligner to use for the alignment. If None, the aligner will be inferred from the modality.
compression (str | None) – The compression algorithm to use for the output fragment file. If None, the compression algorithm will be inferred from the file extension.
compression_level (int | None) – The compression level to use for the output fragment file.
temp_dir (Path | None) – The temporary directory to use.
num_threads (int) – The number of threads to use.
chunk_size (int) – The number of bases processed per thread in each chunk. The total number of bases in each chunk is chunk_size * num_threads.
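
The relationship between chunk_size and num_threads can be checked with a little arithmetic. The sketch below is not part of precellar; the helper name total_bases_per_chunk is hypothetical, and it only restates the formula given for chunk_size using the default values from the signature.

```python
def total_bases_per_chunk(chunk_size: int = 10_000_000, num_threads: int = 8) -> int:
    """Total bases in one chunk, following the documented formula
    chunk_size * num_threads (hypothetical helper, not a precellar function)."""
    return chunk_size * num_threads


# With the defaults from the signature: 10,000,000 bases per thread across 8 threads.
default_total = total_bases_per_chunk()
print(default_total)  # 80000000
```

Lowering chunk_size or num_threads shrinks the total proportionally, for example total_bases_per_chunk(1_000_000, 4) gives 4,000,000 bases per chunk.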
- Returns:
A dictionary containing the QC metrics of the alignment and fragment generation.
- Return type:
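
A call that writes a fragment file might look like the sketch below. It uses only the parameters listed in the signature above; the helper names fragment_kwargs and run_alignment, the “atac” modality choice, and every file path are illustrative assumptions, not part of the library. precellar is imported inside the function so that the keyword-building part can be read and run without the package installed.

```python
from pathlib import Path


def fragment_kwargs(output: Path, num_threads: int = 8) -> dict:
    """Keyword arguments for a fragment-file run (hypothetical helper).

    shift_left and shift_right repeat the defaults from the signature;
    they only apply when output_type='fragment'.
    """
    return {
        "modality": "atac",
        "output": output,
        "output_type": "fragment",
        "shift_left": 4,
        "shift_right": -5,
        "num_threads": num_threads,
    }


def run_alignment(seqspec_yaml: Path, genome_index: Path, output: Path) -> dict:
    """Run precellar.align and return its QC metrics (hypothetical wrapper)."""
    import precellar  # requires the precellar package to be installed

    # assay and genome_index are positional; everything else is keyword-only.
    return precellar.align(seqspec_yaml, genome_index, **fragment_kwargs(output))
```

With real inputs, run_alignment(Path("seqspec.yaml"), Path("genome_index"), Path("fragments.tsv.gz")) would be expected to return the QC-metrics dictionary described under Returns; the paths here are placeholders.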