precellar.utils.extract_barcode_from_name

precellar.utils.extract_barcode_from_name(in_fq, *, out_fq=None, rename=None, get_barcode=None, barcode_fq=None, compression=None, compression_level=None, num_threads=16)

Remove barcodes from the read names of fastq records.

The old practice of storing barcodes in read names is not recommended. This function extracts barcodes from read names and writes them to a separate fastq file.

Parameters:
  • in_fq (str) – File path or URL to the input fastq file.

  • out_fq (Path | None) – File path to the output fastq file.

  • rename (Callable[[str], str] | None) – A function that takes the original read name and returns the new read name.

  • get_barcode (Callable[[str], str] | None) – A function that takes the original read name and returns the barcode sequence.

  • barcode_fq (Path | None) – File path to the output barcode fastq file.

  • compression (Literal['gzip', 'zst'] | None) – Compression algorithm to use. If None, the compression algorithm will be inferred from the file extension.

  • compression_level (int | None) – Compression level to use.

  • num_threads (int) – The number of threads to use.
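
Example:

The parameters above can be combined as in the following minimal sketch. It assumes a hypothetical read-name layout of `<barcode>:<original name>`; the separator, the helper names (`_split`, `split_barcodes`), and the exact string passed to the callables (for instance, whether it includes the leading `@` or the description field) are assumptions for illustration, not something this page specifies. Only the keyword arguments shown in the signature are used.

```python
from pathlib import Path

# Assumed separator between the barcode and the original read name.
# Real data may use a different layout; adjust accordingly.
SEP = ":"


def _split(name: str) -> tuple[str, str]:
    """Split '<barcode>:<rest>' into (barcode, rest); hypothetical helper."""
    barcode, sep, rest = name.partition(SEP)
    if not sep or not barcode or not rest:
        raise ValueError(f"read name does not match '<barcode>{SEP}<name>': {name!r}")
    return barcode, rest


def get_barcode(name: str) -> str:
    """Return the barcode sequence embedded in the read name."""
    return _split(name)[0]


def rename(name: str) -> str:
    """Return the read name with the barcode prefix removed."""
    return _split(name)[1]


def split_barcodes(in_fq: str, out_fq: Path, barcode_fq: Path) -> None:
    """Hypothetical wrapper that passes the two callables to precellar."""
    import precellar  # imported lazily so the helpers above work without it

    precellar.utils.extract_barcode_from_name(
        in_fq,
        out_fq=out_fq,
        rename=rename,
        get_barcode=get_barcode,
        barcode_fq=barcode_fq,
    )
```

A call such as `split_barcodes("reads.fastq.gz", Path("reads.clean.fastq.gz"), Path("barcodes.fastq.gz"))` (placeholder file names) would then be expected to write the renamed reads and the barcodes to separate files. Because `compression` is left as None, the compression is inferred from the file extension, as described above.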