regbench_data.eqtl.retrieve_eqtl

regbench_data.eqtl.retrieve_eqtl(ids)

Retrieves the datasets identified by the given ID or IDs.

Parameters:

ids (str | list[str]) – The ID or list of IDs of the datasets to retrieve.

Returns:

A dictionary mapping dataset IDs to Polars DataFrames. The DataFrames contain the following columns:

gene_id: GENCODE/Ensembl gene ID or RNAcentral URS ID
phenotype_id: Phenotype ID, e.g., intron coordinates and cluster combined with gene ID for sQTLs
gene_name: GENCODE gene name
biotype: gene or transcript classification (protein coding, lncRNA, etc.)
variant_id: variant ID in the format “chr_pos_ref_alt_b38” (1-based position)
pip: posterior inclusion probability (PIP)
af: allele frequency of the ALT allele (in-sample)
cs_id: credible set ID (number)
cs_size: credible set size
afc: allelic fold change (aFC) of the lead variant (highest PIP) in the credible set
afc_se: standard error of the aFC of the lead variant (highest PIP) in the credible set
var_chrom: chromosome of the variant
var_pos: 0-based position of the variant
var_ref: reference allele
var_alt: alternate allele
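
The relationship between variant_id and the var_chrom, var_pos, var_ref, and var_alt columns can be sketched as follows. This is a minimal illustration, not part of the library: parse_variant_id and the Variant tuple are hypothetical names, the example ID is made up, and the sketch assumes that var_pos is simply the 1-based position from variant_id minus one.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos0: int  # 0-based position, as described for var_pos
    ref: str
    alt: str


def parse_variant_id(variant_id: str) -> Variant:
    """Split a 'chr_pos_ref_alt_b38' string into its parts (hypothetical helper)."""
    # Split from the right so contig names that contain underscores stay intact.
    chrom, pos, ref, alt, build = variant_id.rsplit("_", 4)
    if build != "b38":
        raise ValueError(f"unexpected build suffix: {build!r}")
    # Assumption: the 0-based var_pos equals the 1-based ID position minus one.
    return Variant(chrom, int(pos) - 1, ref, alt)


# Made-up variant ID, for illustration only.
variant = parse_variant_id("chr1_12345_A_G_b38")
print(variant)
```

Splitting from the right is a design choice that tolerates underscores in the chromosome or contig name; whether such contigs appear in these datasets is not stated here.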

Return type:

dict[str, pl.DataFrame]
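
As an illustration of how the pip, cs_id, and phenotype_id columns relate, the sketch below picks the lead variant (highest PIP) of each credible set. It uses plain Python dictionaries as stand-ins for DataFrame rows so that it runs without any dependencies; with a real Polars DataFrame, comparable records could be obtained through DataFrame.to_dicts(). The function name lead_variants and all row values are hypothetical, and grouping by (phenotype_id, cs_id) assumes that cs_id is only unique within a phenotype.

```python
def lead_variants(rows):
    """Return the highest-PIP row per (phenotype_id, cs_id) pair (hypothetical helper)."""
    lead = {}
    for row in rows:
        # Assumption: cs_id is numbered per phenotype, so both fields form the key.
        key = (row["phenotype_id"], row["cs_id"])
        if key not in lead or row["pip"] > lead[key]["pip"]:
            lead[key] = row
    return lead


# Made-up rows, for illustration only.
rows = [
    {"phenotype_id": "geneA", "cs_id": 1, "variant_id": "chr1_100_A_G_b38", "pip": 0.20},
    {"phenotype_id": "geneA", "cs_id": 1, "variant_id": "chr1_150_C_T_b38", "pip": 0.75},
    {"phenotype_id": "geneA", "cs_id": 2, "variant_id": "chr1_900_G_A_b38", "pip": 0.99},
    {"phenotype_id": "geneB", "cs_id": 1, "variant_id": "chr2_500_T_C_b38", "pip": 0.60},
]

leads = lead_variants(rows)
for (phenotype_id, cs_id), row in sorted(leads.items()):
    print(phenotype_id, cs_id, row["variant_id"], row["pip"])
```

If two variants in a credible set share the same PIP, this sketch keeps the one that appears first.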