Set up#

[22]:
import precellar
precellar.__version__
[22]:
'0.3.0-dev0'

Prepare seqspec and read#

We made a seqspec template based on https://teichlab.github.io/scg_lib_structs/methods_html/scifi-ATAC-seq.html

[8]:
assay = precellar.Assay("seqspec_template/scifi_atac.yaml")
Modality ATAC has nesting depth: 1
[9]:
assay
[9]:
scifi_atac
└── atac(200-2000)
    ├── illumina_p5(29) [↓I2(16)✗]
    ├── gem_barcode(16)
    ├── custom_read1(32) [↓R1(74-100)✗]
    ├── tn5_index_A(5)
    ├── mosaic_end_1(19)
    ├── gdna(1-1800)
    ├── mosaic_end_2(19)
    ├── tn5_index_B(5)
    ├── custom_read2(34) [↑R2(74-100)✗]
    ├── sample_index(8)
    └── illumina_p7(24) [↑I1(8)✗]

The data was downloaded from https://www.ncbi.nlm.nih.gov/sra/?term=SRR25320539.

[18]:
!zstdcat /data2/litian/precellar_temp/scifi-ATAC/SRR25320539/SRR25320539_2.fastq.zst | head -n 4
@SRR25320539.1 A00201R:718:HYKV2DSX3:3:1101:2437:1016 length=151
CGCGAAGATGTGTATAAGAGACAGATAAAACAAAAAACGCCCACATCAAAATGAGTTATTCAAAGCAACATTCCAAGTCATCTAGTTGACAAACTCTATCACTCACAACAAACTATATTAAAATATCCATCCTATACACAAGACTTAGACA
+SRR25320539.1 A00201R:718:HYKV2DSX3:3:1101:2437:1016 length=151
FFFFFFFFFFFFFFFFFFFFFFFFF,,,,F,,,,FF,:F,FFF,,,:F:FF,,,F:,F,::,:::F:F:F,,,F::,FFFF:F:FF,F:::,F:F:,:FF,F::,FFF,,:,,,::F,F,F,:FFFFFF,FFFFFF:F:F:,F,FFF,,FF
zstd: error 70 : Write error : cannot write block : Broken pipe
[19]:
!zstdcat /data2/litian/precellar_temp/scifi-ATAC/SRR25320539/SRR25320539_4.fastq.zst | head -n 4
@SRR25320539.1 A00201R:718:HYKV2DSX3:3:1101:2437:1016 length=151
CTATCAGATGTGTATAAGAGACAGGCTATAAATAGATTTCTCTAGCTTTTTATGCACTTTTTTCTGTTGTAGTGTACTACGCCTAGTGTATATGAGGTATCAATGATGATATGTGGATTGGATGTCTCGCGTATGTCAACTAGAGGGAATT
+SRR25320539.1 A00201R:718:HYKV2DSX3:3:1101:2437:1016 length=151
FFF:,:::FFFFFFFFFFFFFFF:,,,,:,FFF,,,,:,::,F,,F,,FF:,,,FF,FF,,,F,,F,,F:F:,,,F,F,,,,,,,,F,FF,,,FF:,,F:F:,,F:,,,FFF,,,F,,,:,F,,,F:F,,,FF,,,:FF,,F,,:,,,,F,
zstd: error 70 : Write error : cannot write block : Broken pipe
[20]:
!zstdcat /data2/litian/precellar_temp/scifi-ATAC/SRR25320539/SRR25320539_1.fastq.zst | head -n 4
@SRR25320539.1 A00201R:718:HYKV2DSX3:3:1101:2437:1016 length=8
CTCGTACT
+SRR25320539.1 A00201R:718:HYKV2DSX3:3:1101:2437:1016 length=8
,:F,FFFF
zstd: error 70 : Write error : cannot write block : Broken pipe
[21]:
!zstdcat /data2/litian/precellar_temp/scifi-ATAC/SRR25320539/SRR25320539_3.fastq.zst | head -n 4
@SRR25320539.1 A00201R:718:HYKV2DSX3:3:1101:2437:1016 length=16
GGTCTTCACGATCCTC
+SRR25320539.1 A00201R:718:HYKV2DSX3:3:1101:2437:1016 length=16
:F:,,,,,,,FFF,,,
zstd: error 70 : Write error : cannot write block : Broken pipe
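As a quick sanity check on the template, the leading bases of the R1 and R2 records printed above can be split by hand. The sketch below assumes the segment layout that the alignment log in this notebook reports for both reads (index at 0..5, mosaic end at 5..24, gDNA from 24 onward). `split_read` and `SEGMENTS` are hypothetical names for illustration, not part of the precellar API.

```python
# Segment layout assumed from the alignment log in this notebook:
# Tn5 index = bases 0..5, mosaic end = bases 5..24, gDNA = bases 24 onward.
SEGMENTS = [("tn5_index", 0, 5), ("mosaic_end", 5, 24), ("gdna", 24, None)]


def split_read(seq):
    """Slice a read into named segments (hypothetical helper, not a precellar call)."""
    return {name: seq[start:end] for name, start, end in SEGMENTS}


# First 28 bases of the R1 and R2 records shown above.
r1 = "CGCGAAGATGTGTATAAGAGACAGATAA"
r2 = "CTATCAGATGTGTATAAGAGACAGGCTA"

for name, read in (("R1", r1), ("R2", r2)):
    parts = split_read(read)
    print(name, parts["tn5_index"], parts["mosaic_end"], parts["gdna"])
```

Both reads carry the same 19 bp sequence at positions 5..24, which is consistent with the `mosaic_end_1` and `mosaic_end_2` regions of the template.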
[10]:
assay.update_read("R1", fastq="/data2/litian/precellar_temp/scifi-ATAC/SRR25320539/SRR25320539_2.fastq.zst")
assay.update_read("R2", fastq="/data2/litian/precellar_temp/scifi-ATAC/SRR25320539/SRR25320539_4.fastq.zst")
assay.update_read("I1", fastq="/data2/litian/precellar_temp/scifi-ATAC/SRR25320539/SRR25320539_1.fastq.zst")
assay.update_read("I2", fastq="/data2/litian/precellar_temp/scifi-ATAC/SRR25320539/SRR25320539_3.fastq.zst")
[11]:
!wget https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-60/fasta/zea_mays/dna/Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.toplevel.fa.gz
--2025-03-07 17:32:43--  https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-60/fasta/zea_mays/dna/Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.toplevel.fa.gz
Resolving ftp.ensemblgenomes.ebi.ac.uk (ftp.ensemblgenomes.ebi.ac.uk)... 193.62.193.161
Connecting to ftp.ensemblgenomes.ebi.ac.uk (ftp.ensemblgenomes.ebi.ac.uk)|193.62.193.161|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 645391865 (615M) [application/x-gzip]
Saving to: 'Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.toplevel.fa.gz'

Zea_mays.Zm-B73-REF 100%[===================>] 615.49M  4.35MB/s    in 1m 47s

2025-03-07 17:34:31 (5.75 MB/s) - 'Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.toplevel.fa.gz' saved [645391865/645391865]

[12]:
precellar.make_bwa_index(fasta="/data2/litian/database/genome/Zea_mays.Zm-B73-REFERENCE-NAM-5.0.dna.toplevel.fa.gz",
                         genome_prefix="/data2/litian/database/bwa/zm/zm")
ref_seq_len = 4364151988
[bwa_index] Pack FASTA... 15.04 sec
* Entering FMI_search
init ticks = 126244973136
ref seq len = 4364151988
binary seq ticks = 81294602774
build suffix-array ticks = 1536448616796
pos: 545518999, ref_seq_len__: 545518998
build fm-index ticks = 1348375543496
count = 0, 1161389254, 2182075994, 3202762734, 4364151988
BWT[3512089108] = 4
CP_SHIFT = 6, CP_MASK = 63
sizeof CP_OCC = 64
max_occ_ind = 68189874

Alignment#

[13]:
bwa = precellar.aligners.BWAMEM2("/data2/litian/database/bwa/zm/zm")
[15]:

precellar.align(
    assay,
    modality='atac',
    aligner=bwa,
    output='/data2/litian/precellar_data//processed_file/20250307_scifi_out.fragment.tsv.zst',
    output_type='fragment',
    num_threads=32,
)
[2025-03-07T10:13:37Z INFO] Starting alignment process
[2025-03-07T10:13:37Z INFO] Using provided Assay object
[2025-03-07T10:13:37Z INFO] Using modality: ATAC
[2025-03-07T10:13:37Z INFO] Initialized aligner: BWA-MEM2
[2025-03-07T10:13:37Z INFO] Initializing FastqProcessor with 32 threads and chunk size 10000000
[2025-03-07T10:13:37Z INFO] Adding mitochondrial DNA references: ["chrM", "M"]
[2025-03-07T10:13:37Z INFO] Generating alignments
[2025-03-07T10:13:37Z INFO] Using BWA-MEM2 aligner
[2025-03-07T10:13:37Z INFO] Starting gen_barcoded_alignments with chunk_size=320000000
[2025-03-07T10:13:37Z INFO] Starting gen_barcoded_fastq for modality: ATAC
[2025-03-07T10:13:37Z INFO] Counting barcodes...
[2025-03-07T10:16:22Z INFO] Processing read R1 with 3 segments, is_reverse: false
[2025-03-07T10:16:22Z INFO] Segment in read R1: type=Barcode, range=0..5, id=tn5_index_A
[2025-03-07T10:16:22Z INFO] Segment in read R1: type=CustomPrimer, range=5..24, id=mosaic_end_1
[2025-03-07T10:16:22Z INFO] Segment in read R1: type=Gdna, range=24..151, id=gdna
[2025-03-07T10:16:22Z INFO] Created annotator for read R1 with 2 subregions
[2025-03-07T10:16:23Z INFO] Processing read R2 with 3 segments, is_reverse: true
[2025-03-07T10:16:23Z INFO] Segment in read R2: type=Barcode, range=0..5, id=tn5_index_B
[2025-03-07T10:16:23Z INFO] Segment in read R2: type=CustomPrimer, range=5..24, id=mosaic_end_2
[2025-03-07T10:16:23Z INFO] Segment in read R2: type=Gdna, range=24..151, id=gdna
[2025-03-07T10:16:23Z INFO] Created annotator for read R2 with 2 subregions
[2025-03-07T10:16:23Z INFO] Processing read I1 with 1 segments, is_reverse: true
[2025-03-07T10:16:23Z INFO] Segment in read I1: type=Barcode, range=0..8, id=sample_index
[2025-03-07T10:16:23Z INFO] Created annotator for read I1 with 1 subregions
[2025-03-07T10:16:23Z INFO] Processing read I2 with 1 segments, is_reverse: false
[2025-03-07T10:16:23Z INFO] Segment in read I2: type=Barcode, range=0..16, id=gem_barcode
[2025-03-07T10:16:23Z INFO] Created annotator for read I2 with 1 subregions
[2025-03-07T10:16:23Z INFO] FastqReader created. Is paired-end: true
[2025-03-07T10:16:23Z INFO] Number of annotators: 4
[2025-03-07T10:16:23Z INFO] Number of readers: 4
[2025-03-07T10:16:23Z INFO] Barcodes found: [("tn5_index_A", 5), ("tn5_index_B", 5), ("sample_index", 8), ("gem_barcode", 16)]
[2025-03-07T10:16:23Z INFO] UMIs found: []
[2025-03-07T10:16:23Z INFO] Total reads reported: 65937541
[2025-03-07T10:16:23Z INFO] Aligning 65937541 reads...
[2025-03-07T10:16:23Z INFO] Processing output type: fragment
[2025-03-07T10:16:23Z INFO] Generating fragments
[2025-03-07T10:16:23Z INFO] Using compression: Some(Zstd) with level: None
[2025-03-07T10:16:24Z INFO] Created output file: "/data2/litian/precellar_data//processed_file/20250307_scifi_out.fragment.tsv.zst"
[2025-03-07T10:16:27Z INFO] Processing chunk with 981596 records
[2025-03-07T10:16:27Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
  1%|▍         |   981596/65937541 [00:53<59:14, 18273.95it/s][2025-03-07T10:17:21Z INFO] Processing chunk with 981596 records
[2025-03-07T10:17:21Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
  3%|▌         |  1963192/65937541 [01:44<56:52, 18745.66it/s][2025-03-07T10:18:12Z INFO] Processing chunk with 981596 records
[2025-03-07T10:18:12Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
  4%|▋         |  2944788/65937541 [02:35<55:18, 18983.43it/s][2025-03-07T10:19:02Z INFO] Processing chunk with 981596 records
[2025-03-07T10:19:02Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
  6%|▊         |  3926384/65937541 [03:26<54:22, 19007.44it/s][2025-03-07T10:19:55Z INFO] Processing chunk with 981596 records
[2025-03-07T10:19:55Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
  7%|▉         |  4907980/65937541 [04:18<53:29, 19018.15it/s][2025-03-07T10:20:45Z INFO] Processing chunk with 981596 records
[2025-03-07T10:20:45Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
  9%|█         |  5889576/65937541 [05:08<52:23, 19103.71it/s][2025-03-07T10:21:35Z INFO] Processing chunk with 981596 records
[2025-03-07T10:21:35Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 10%|█▎        |  6871172/65937541 [05:58<51:23, 19154.03it/s][2025-03-07T10:22:26Z INFO] Processing chunk with 981596 records
[2025-03-07T10:22:26Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 12%|█▍        |  7852768/65937541 [06:49<50:27, 19187.38it/s][2025-03-07T10:23:17Z INFO] Processing chunk with 981596 records
[2025-03-07T10:23:17Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 13%|█▌        |  8834364/65937541 [07:40<49:37, 19179.20it/s][2025-03-07T10:24:07Z INFO] Processing chunk with 981596 records
[2025-03-07T10:24:07Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 15%|█▋        |  9815960/65937541 [08:31<48:43, 19196.59it/s][2025-03-07T10:24:58Z INFO] Processing chunk with 981596 records
[2025-03-07T10:24:58Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 16%|█▊        | 10797556/65937541 [09:21<47:49, 19214.29it/s][2025-03-07T10:25:49Z INFO] Processing chunk with 981596 records
[2025-03-07T10:25:49Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 18%|█▉        | 11779152/65937541 [10:12<46:55, 19236.79it/s][2025-03-07T10:26:40Z INFO] Processing chunk with 981596 records
[2025-03-07T10:26:40Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 19%|██        | 12760748/65937541 [11:03<46:04, 19234.89it/s][2025-03-07T10:27:30Z INFO] Processing chunk with 981596 records
[2025-03-07T10:27:30Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 21%|██▎       | 13742344/65937541 [11:54<45:11, 19246.32it/s][2025-03-07T10:28:21Z INFO] Processing chunk with 981596 records
[2025-03-07T10:28:21Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 22%|██▍       | 14723940/65937541 [12:44<44:20, 19249.38it/s][2025-03-07T10:29:12Z INFO] Processing chunk with 981596 records
[2025-03-07T10:29:12Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 24%|██▌       | 15705536/65937541 [13:35<43:29, 19249.42it/s][2025-03-07T10:30:03Z INFO] Processing chunk with 981596 records
[2025-03-07T10:30:03Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 25%|██▋       | 16687132/65937541 [14:26<42:36, 19264.75it/s][2025-03-07T10:30:53Z INFO] Processing chunk with 981596 records
[2025-03-07T10:30:53Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 27%|██▊       | 17668728/65937541 [15:16<41:44, 19274.94it/s][2025-03-07T10:31:44Z INFO] Processing chunk with 981596 records
[2025-03-07T10:31:44Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 28%|██▉       | 18650324/65937541 [16:08<40:54, 19266.63it/s][2025-03-07T10:32:35Z INFO] Processing chunk with 981596 records
[2025-03-07T10:32:35Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 30%|███       | 19631920/65937541 [16:58<40:02, 19270.59it/s][2025-03-07T10:33:26Z INFO] Processing chunk with 981596 records
[2025-03-07T10:33:26Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 31%|███▎      | 20613516/65937541 [17:50<39:12, 19264.94it/s][2025-03-07T10:34:17Z INFO] Processing chunk with 981596 records
[2025-03-07T10:34:17Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 33%|███▍      | 21595112/65937541 [18:40<38:21, 19267.97it/s][2025-03-07T10:35:08Z INFO] Processing chunk with 981596 records
[2025-03-07T10:35:08Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 34%|███▌      | 22576708/65937541 [19:31<37:29, 19273.92it/s][2025-03-07T10:35:58Z INFO] Processing chunk with 981596 records
[2025-03-07T10:35:58Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 36%|███▊      | 23558304/65937541 [20:21<36:37, 19289.00it/s][2025-03-07T10:36:49Z INFO] Processing chunk with 981596 records
[2025-03-07T10:36:49Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 37%|███▉      | 24539900/65937541 [21:11<35:45, 19292.42it/s][2025-03-07T10:37:39Z INFO] Processing chunk with 981596 records
[2025-03-07T10:37:39Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 39%|████      | 25521496/65937541 [22:02<34:54, 19298.49it/s][2025-03-07T10:38:30Z INFO] Processing chunk with 981596 records
[2025-03-07T10:38:30Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 40%|████▎     | 26503092/65937541 [22:53<34:03, 19297.96it/s][2025-03-07T10:39:20Z INFO] Processing chunk with 981596 records
[2025-03-07T10:39:20Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 42%|████▍     | 27484688/65937541 [23:44<33:12, 19296.86it/s][2025-03-07T10:40:12Z INFO] Processing chunk with 981596 records
[2025-03-07T10:40:12Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 43%|████▌     | 28466284/65937541 [24:36<32:23, 19284.86it/s][2025-03-07T10:41:03Z INFO] Processing chunk with 981596 records
[2025-03-07T10:41:03Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 45%|████▋     | 29447880/65937541 [25:27<31:32, 19280.36it/s][2025-03-07T10:41:55Z INFO] Processing chunk with 981596 records
[2025-03-07T10:41:55Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 46%|████▊     | 30429476/65937541 [26:19<30:42, 19269.23it/s][2025-03-07T10:42:46Z INFO] Processing chunk with 981596 records
[2025-03-07T10:42:46Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 48%|████▉     | 31411072/65937541 [27:09<29:51, 19274.96it/s][2025-03-07T10:43:37Z INFO] Processing chunk with 981596 records
[2025-03-07T10:43:37Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 49%|█████     | 32392668/65937541 [28:00<29:00, 19276.10it/s][2025-03-07T10:44:27Z INFO] Processing chunk with 981596 records
[2025-03-07T10:44:27Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 51%|█████▎    | 33374264/65937541 [28:51<28:09, 19272.90it/s][2025-03-07T10:45:19Z INFO] Processing chunk with 981596 records
[2025-03-07T10:45:19Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 52%|█████▍    | 34355860/65937541 [29:42<27:18, 19276.19it/s][2025-03-07T10:46:09Z INFO] Processing chunk with 981596 records
[2025-03-07T10:46:09Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 54%|█████▌    | 35337456/65937541 [30:33<26:27, 19277.22it/s][2025-03-07T10:47:02Z INFO] Processing chunk with 981596 records
[2025-03-07T10:47:02Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 55%|█████▋    | 36319052/65937541 [31:25<25:37, 19260.43it/s][2025-03-07T10:47:53Z INFO] Processing chunk with 981596 records
[2025-03-07T10:47:53Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 57%|█████▊    | 37300648/65937541 [32:15<24:46, 19267.85it/s][2025-03-07T10:48:43Z INFO] Processing chunk with 981596 records
[2025-03-07T10:48:43Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 58%|█████▉    | 38282244/65937541 [33:06<23:55, 19271.10it/s][2025-03-07T10:49:33Z INFO] Processing chunk with 981596 records
[2025-03-07T10:49:33Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 60%|██████    | 39263840/65937541 [33:57<23:04, 19271.89it/s][2025-03-07T10:50:25Z INFO] Processing chunk with 981596 records
[2025-03-07T10:50:25Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 61%|██████▎   | 40245436/65937541 [34:48<22:12, 19274.19it/s][2025-03-07T10:51:15Z INFO] Processing chunk with 981596 records
[2025-03-07T10:51:15Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 63%|██████▍   | 41227032/65937541 [35:38<21:21, 19280.67it/s][2025-03-07T10:52:06Z INFO] Processing chunk with 981596 records
[2025-03-07T10:52:06Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 64%|██████▌   | 42208628/65937541 [36:28<20:30, 19284.96it/s][2025-03-07T10:52:56Z INFO] Processing chunk with 981596 records
[2025-03-07T10:52:56Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 66%|██████▋   | 43190224/65937541 [37:18<19:39, 19293.53it/s][2025-03-07T10:53:45Z INFO] Processing chunk with 981596 records
[2025-03-07T10:53:45Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 67%|██████▊   | 44171820/65937541 [38:08<18:47, 19298.24it/s][2025-03-07T10:54:36Z INFO] Processing chunk with 981596 records
[2025-03-07T10:54:36Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 68%|██████▉   | 45153416/65937541 [38:59<17:56, 19299.42it/s][2025-03-07T10:55:27Z INFO] Processing chunk with 981596 records
[2025-03-07T10:55:27Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 70%|███████   | 46135012/65937541 [39:51<17:06, 19294.05it/s][2025-03-07T10:56:18Z INFO] Processing chunk with 981596 records
[2025-03-07T10:56:18Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 71%|███████▍  | 47116608/65937541 [40:41<16:15, 19299.81it/s][2025-03-07T10:57:09Z INFO] Processing chunk with 981596 records
[2025-03-07T10:57:09Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 73%|███████▌  | 48098204/65937541 [41:32<15:24, 19299.72it/s][2025-03-07T10:57:59Z INFO] Processing chunk with 981596 records
[2025-03-07T10:57:59Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 74%|███████▋  | 49079800/65937541 [42:22<14:33, 19304.06it/s][2025-03-07T10:58:50Z INFO] Processing chunk with 981596 records
[2025-03-07T10:58:50Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 76%|███████▊  | 50061396/65937541 [43:14<13:42, 19298.42it/s][2025-03-07T10:59:41Z INFO] Processing chunk with 981596 records
[2025-03-07T10:59:41Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 77%|███████▉  | 51042992/65937541 [44:04<12:51, 19303.82it/s][2025-03-07T11:00:32Z INFO] Processing chunk with 981596 records
[2025-03-07T11:00:32Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 79%|████████  | 52024588/65937541 [44:55<12:00, 19300.44it/s][2025-03-07T11:01:22Z INFO] Processing chunk with 981596 records
[2025-03-07T11:01:22Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 80%|████████▎ | 53006184/65937541 [45:45<11:09, 19306.14it/s][2025-03-07T11:02:13Z INFO] Processing chunk with 981596 records
[2025-03-07T11:02:13Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 82%|████████▍ | 53987780/65937541 [46:35<10:18, 19314.14it/s][2025-03-07T11:03:02Z INFO] Processing chunk with 981596 records
[2025-03-07T11:03:02Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 83%|████████▌ | 54969376/65937541 [47:25<09:27, 19317.45it/s][2025-03-07T11:03:53Z INFO] Processing chunk with 981596 records
[2025-03-07T11:03:53Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 85%|████████▋ | 55950972/65937541 [48:16<08:36, 19317.56it/s][2025-03-07T11:04:43Z INFO] Processing chunk with 981596 records
[2025-03-07T11:04:43Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 86%|████████▊ | 56932568/65937541 [49:06<07:46, 19323.17it/s][2025-03-07T11:05:33Z INFO] Processing chunk with 981596 records
[2025-03-07T11:05:33Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 88%|████████▉ | 57914164/65937541 [49:56<06:55, 19326.31it/s][2025-03-07T11:06:24Z INFO] Processing chunk with 981596 records
[2025-03-07T11:06:24Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 89%|█████████ | 58895760/65937541 [50:46<06:04, 19331.85it/s][2025-03-07T11:07:14Z INFO] Processing chunk with 981596 records
[2025-03-07T11:07:14Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 91%|█████████▎| 59877356/65937541 [51:37<05:13, 19330.14it/s][2025-03-07T11:08:05Z INFO] Processing chunk with 981596 records
[2025-03-07T11:08:05Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 92%|█████████▍| 60858952/65937541 [52:28<04:22, 19329.95it/s][2025-03-07T11:08:56Z INFO] Processing chunk with 981596 records
[2025-03-07T11:08:56Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 94%|█████████▌| 61840548/65937541 [53:19<03:31, 19330.54it/s][2025-03-07T11:09:46Z INFO] Processing chunk with 981596 records
[2025-03-07T11:09:46Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 95%|█████████▋| 62822144/65937541 [54:09<02:41, 19332.68it/s][2025-03-07T11:10:37Z INFO] Processing chunk with 981596 records
[2025-03-07T11:10:37Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 97%|█████████▊| 63803740/65937541 [55:00<01:50, 19328.71it/s][2025-03-07T11:11:28Z INFO] Processing chunk with 981596 records
[2025-03-07T11:11:28Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
 98%|█████████▉| 64785336/65937541 [55:50<00:59, 19334.21it/s][2025-03-07T11:12:18Z INFO] Processing chunk with 981596 records
[2025-03-07T11:12:18Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
100%|██████████| 65766932/65937541 [56:41<00:08, 19333.12it/s][2025-03-07T11:13:06Z INFO] Processing chunk with 170609 records
[2025-03-07T11:13:06Z INFO] Sample record - Barcode present: true, UMI present: false, Read1 present: true, Read2 present: true
100%|██████████| 65937541/65937541 [56:53<00:00, 19317.63it/s][2025-03-07T11:13:17Z INFO] Alignment process completed in 3580.02s
[15]:
{'sequenced_reads': 131875082.0,
 'frac_properly_paired': 0.8498182393547251,
 'frac_mitochondrial': nan,
 'num_unique_fragments': 0.0,
 'frac_duplicates': nan,
 'frac_fragment_in_nucleosome_free_region': nan,
 'frac_confidently_mapped': nan,
 'frac_unmapped': nan,
 'frac_q30_bases_read1': 0.8275834390742883,
 'frac_q30_bases_read2': 0.8165559236264723,
 'frac_valid_barcode': 0.0,
 'frac_fragment_flanking_single_nucleosome': nan,
 'sequenced_read_pairs': 65937541.0,
 'frac_q30_bases_barcode': 0.8074974962385236}
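In this run `frac_valid_barcode` and `num_unique_fragments` are both 0.0 and several fractions are `nan`, so the result deserves a closer look before any downstream use. The following is a minimal sketch of how such a report could be screened. The values are copied from the output above, and `summarize` is a hypothetical helper, not a precellar function.

```python
import math

# A subset of the metrics returned by precellar.align above.
qc = {
    "sequenced_reads": 131875082.0,
    "sequenced_read_pairs": 65937541.0,
    "frac_properly_paired": 0.8498182393547251,
    "num_unique_fragments": 0.0,
    "frac_valid_barcode": 0.0,
    "frac_mitochondrial": float("nan"),
    "frac_duplicates": float("nan"),
}


def summarize(metrics):
    """Return the names of undefined (NaN) and zero-valued metrics (illustrative helper)."""
    undefined = sorted(k for k, v in metrics.items() if math.isnan(v))
    zero = sorted(k for k, v in metrics.items() if v == 0.0)
    return undefined, zero


undefined, zero = summarize(qc)
# Each read pair contributes two reads, so the two counts should agree.
paired_ok = qc["sequenced_reads"] == 2 * qc["sequenced_read_pairs"]

print("reads == 2 x pairs:", paired_ok)
print("undefined metrics:", undefined)
print("zero-valued metrics:", zero)
```

The read counts are internally consistent, while the zero and undefined entries point at the barcode and fragment steps as the place to investigate.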