The Emirati T2T-level Pangenome: A complete Diploid Graph of 58 Genomes

This dataset is an Emirati telomere-to-telomere (T2T) pangenome graph in GBZ format built from 116 haplotype-resolved assemblies spanning 58 individuals (28 trio-based and 30 single-sample assemblies). Assemblies were generated with long-read sequencing (PacBio HiFi and ONT ultra-long) with standard polishing, then integrated into a graph representation. The genomes show high contiguity (median ≈150 Mb) and high consensus accuracy (median QV 59). The resulting GBZ graph captures globally shared and Emirati-enriched variation, including sequence in complex regions, and serves as a population-matched reference for downstream variant discovery and annotation.

Request Access

Data access policy for UAE Genomes data.

Data access is granted upon written request, for non-commercial research use only. Bona fide academic users can apply for data access under the following conditions:

1. Data is provided for non-clinical, non-commercial research purposes only.
2. Data is not distributed to any other individual or entity without the permission of the UAE Genomes Data Access Committee.
3. The data is experimental in nature and must not be used to make any clinical decisions.
4. The data is provided with no warranties, express or implied, and employees or agents of Khalifa University of Science and Technology have no liability in connection with its use.

If you agree with the conditions above, please apply for access by email, indicating your consent.

Studies are experimental investigations of a particular phenomenon, e.g., case-control studies on a particular trait or cancer research projects reporting matching cancer normal genomes from patients.

Study ID         Study Title  Study Type
EGAS50000001232               Population Genomics
EGAS50000001233               Population Genomics
EGAS50000001234               Population Genomics
EGAS50000001235               Population Genomics

This table displays only public information pertaining to the files in the dataset. If you wish to access this dataset, please submit a request. If you already have access to these data files, please consult the download documentation.

ID               File Type  Size    Quality Report  Located in
EGAF50000425212  d12        4.5 GB
EGAF50000425213  d12        4.5 GB

2 Files (9.0 GB)