regbench_data.eqtl.retrieve_eqtl

regbench_data.eqtl.retrieve_eqtl(id)

Retrieves the eQTL datasets identified by the given ID or list of IDs.

Parameters:

ids (str | list[str]) – The ID or list of IDs of the datasets to retrieve.

Returns:

A dictionary mapping dataset IDs to Polars DataFrames. The DataFrames contain the following columns:
  • gene_id: GENCODE/Ensembl gene ID or RNAcentral URS ID

  • phenotype_id: Phenotype ID, e.g., intron coordinates and cluster combined with gene ID for sQTLs

  • gene_name: GENCODE gene name

  • biotype: gene or transcript classification (protein coding, lncRNA, etc.)

  • variant_id: variant ID in the format “chr_pos_ref_alt_b38” (1-based position)

  • pip: posterior inclusion probability (PIP)

  • af: allele frequency of the ALT allele (in-sample)

  • cs_id: credible set ID (number)

  • cs_size: credible set size

  • afc: allelic fold change (aFC) of the lead variant (highest PIP) in the credible set

  • afc_se: standard error of the aFC of the lead variant (highest PIP) in the credible set

  • var_chrom: chromosome of the variant

  • var_pos: 0-based position of the variant

  • var_ref: reference allele

  • var_alt: alternate allele

Return type:

dict[str, pl.DataFrame]
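
The column descriptions above imply a relationship between variant_id and the var_chrom, var_pos, var_ref, and var_alt columns. The following is a minimal, standard-library-only sketch of that relationship, not part of regbench_data: the helper parse_variant_id, the ParsedVariant type, and the example variant ID are all hypothetical, and it assumes var_pos is simply the 1-based position from variant_id shifted by one.

```python
from typing import NamedTuple


class ParsedVariant(NamedTuple):
    """Components of a variant ID (hypothetical helper type)."""

    chrom: str
    pos_1based: int  # position as written in variant_id
    pos_0based: int  # assumed to correspond to the var_pos column
    ref: str
    alt: str
    build: str


def parse_variant_id(variant_id: str) -> ParsedVariant:
    """Split a "chr_pos_ref_alt_b38" string into its parts."""
    # Split from the right so a chromosome name that itself contains
    # underscores stays in one piece.
    chrom, pos, ref, alt, build = variant_id.rsplit("_", 4)
    pos_1based = int(pos)
    # Assumption: var_pos is the same coordinate converted to 0-based.
    return ParsedVariant(chrom, pos_1based, pos_1based - 1, ref, alt, build)


# Made-up variant ID, for illustration only.
parsed = parse_variant_id("chr1_12345_A_G_b38")
print(parsed.chrom, parsed.pos_1based, parsed.pos_0based, parsed.ref, parsed.alt)
```

If that assumption holds for a given dataset, pos_0based should match var_pos for every row; checking a few rows of a retrieved DataFrame is a quick way to confirm the coordinate convention before joining against other resources.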