The Emirati T2T-level Pangenome: A complete Diploid Graph of 58 Genomes
This dataset is an Emirati telomere-to-telomere (T2T) pangenome graph in GBZ format built from 116 haplotype-resolved assemblies spanning 58 individuals (28 trio-based and 30 single-sample assemblies). Assemblies were generated with long-read sequencing (PacBio HiFi and ONT ultra-long) with standard polishing, then integrated into a graph representation. The genomes show high contiguity (median ≈150 Mb) and high consensus accuracy (median QV 59). The resulting GBZ graph captures globally shared and Emirati-enriched variation, including sequence in complex regions, and serves as a population-matched reference for downstream variant discovery and annotation.
- 10/09/2025
- 2 samples
- DAC: EGAC00001001544
- Technologies: PromethION, unspecified
Data access policy for UAE Genomes data.
Data access is granted upon written request for non-commercial research use only. Bona fide academic users can apply for data access under the following conditions:
1. Data are provided for non-clinical, non-commercial research purposes only.
2. Data are not to be distributed to any other individual or entity without the permission of the UAE Genomes Data Access Committee.
3. The data are experimental in nature and must not be used to make any clinical decisions.
4. The data are accessed with no warranties, expressed or implied, and employees or agents of Khalifa University of Science and Technology have no liability in connection with their use.

If you agree with the conditions above, please apply for access by email, indicating your consent.
Studies are experimental investigations of a particular phenomenon, e.g., case-control studies on a particular trait or cancer research projects reporting matching cancer normal genomes from patients.
| Study ID | Study Title | Study Type |
|---|---|---|
| EGAS50000001232 | Population Genomics | |
| EGAS50000001233 | Population Genomics | |
| EGAS50000001234 | Population Genomics | |
| EGAS50000001235 | Population Genomics | |
This table displays only public information pertaining to the files in the dataset. If you wish to access this dataset, please submit a request. If you already have access to these data files, please consult the download documentation.
| ID | File Type | Size | Quality Report | Located in |
|---|---|---|---|---|
| EGAF50000425212 | d12 | 4.5 GB | | |
| EGAF50000425213 | d12 | 4.5 GB | | |
| 2 Files (9.0 GB) | | | | |
