Papua New Guinean Lowlanders dataset

The Papua New Guinean Lowlanders dataset includes 41 whole genome sequences for Papua New Guinean individuals sampled in Daru. DNA was extracted from saliva samples (Oragene kit). Sequencing libraries were prepared using the TruSeq DNA PCR-Free HT kit. 150 bp paired-end sequencing was performed on the Illumina HiSeq X5 sequencer. The PGAP dataset provides FASTQ files, mapped CRAM files (GRCh38) and phenotype measurements.

Papua New Guinean Lowlanders Dataset (PNGLD) – data access policy

DATA ACCESS AGREEMENT

1. The User Institution agrees to only use these Data for the purpose of the Project (described in Appendix II) and only for Research Purposes. The User Institution further agrees that it will only use these Data for Research Purposes which are within the limitations (if any) set out in Appendix I.

2. The User Institution agrees to preserve, at all times, the confidentiality of these Data. In particular, it undertakes not to use, or attempt to use these Data to compromise or otherwise infringe the confidentiality of information on Research Participants. Without prejudice to the generality of the foregoing, the User Institution agrees to use at least the measures set out in Appendix I to protect these Data.

3. The User Institution agrees to protect the confidentiality of Research Participants in any research papers or publications that they prepare by taking all reasonable care to limit the possibility of identification.

4. The User Institution agrees not to link or combine these Data to other information or archived data available in a way that could re-identify the Research Participants, even if access to that data has been formally granted to the User Institution or is freely available without restriction.

5. The User Institution agrees only to transfer or disclose these Data, in whole or part, or any material derived from these Data, to the Authorised Personnel. Should the User Institution wish to share these Data with an External Collaborator, the External Collaborator must complete a separate application for access to these Data.

6. The User Institution agrees that the Data Producers, and all other parties involved in the creation, funding or protection of these Data: a) make no warranty or representation, express or implied as to the accuracy, quality or comprehensiveness of these Data; b) exclude to the fullest extent permitted by law all liability for actions, claims, proceedings, demands, losses (including but not limited to loss of profit), costs, awards, damages and payments made by the Recipient that may arise (whether directly or indirectly) in any way whatsoever from the Recipient’s use of these Data or from the unavailability of, or break in access to, these Data for whatever reason; and c) bear no responsibility for the further analysis or interpretation of these Data.

7. The User Institution agrees to follow the Fort Lauderdale Guidelines (https://www.wtccc.org.uk/wtccc/assets/wtd003207.pdf) and the Toronto Statement (http://www.nature.com/nature/journal/v461/n7261/full/461168a.html). This includes but is not limited to recognising the contribution of the Data Producers and including a proper acknowledgement in all reports or publications resulting from the use of these Data.

8. The User Institution agrees to follow the Publication Policy in Appendix III. This includes respecting the moratorium period for the Data Producers to publish the first peer-reviewed report describing and analysing these Data.

9. The User Institution agrees not to make intellectual property claims on these Data and not to use intellectual property protection in ways that would prevent or block access to, or use of, any element of these Data, or conclusion drawn directly from these Data.

10. The User Institution can elect to perform further research that would add intellectual and resource capital to these data and decide to obtain intellectual property rights on these downstream discoveries. In this case, the User Institution agrees to implement licensing policies that will not obstruct further research and to follow the U.S. National Institutes of Health Best Practices for the Licensing of Genomic Inventions (2005) (https://www.icgc.org/files/daco/NIH_BestPracticesLicensingGenomicInventions_2005_en.pdf) in conformity with the Organisation for Economic Co-operation and Development Guidelines for the Licensing of the Genetic Inventions (2006) (http://www.oecd.org/science/biotech/36198812.pdf).

11. The User Institution agrees to destroy/discard the Data held, once it is no longer used for the Project, unless obliged to retain the data for archival purposes in conformity with audit or legal requirements.

12. The User Institution will notify the Papua New Guinean Lowlanders Dataset (PNGLD) committee within 30 days of any changes or departures of Authorised Personnel.

13. The User Institution will notify the Papua New Guinean Lowlanders Dataset (PNGLD) committee prior to any significant changes to the protocol for the Project.

14. The User Institution will notify the Papua New Guinean Lowlanders Dataset (PNGLD) committee as soon as it becomes aware of a breach of the terms or conditions of this agreement.

15. The Papua New Guinean Lowlanders Dataset (PNGLD) committee may terminate this agreement by written notice to the User Institution. If this agreement terminates for any reason, the User Institution will be required to destroy any Data held, including copies and backup copies. This clause does not prevent the User Institution from retaining these data for archival purposes in conformity with audit or legal requirements.

16. The User Institution accepts that it may be necessary for the Data Producers to alter the terms of this agreement from time to time. As an example, this may include specific provisions relating to the Data required by Data Producers other than the Papua New Guinean Lowlanders Dataset (PNGLD) committee. In the event that changes are required, the Data Producers or their appointed agent will contact the User Institution to inform it of the changes and the User Institution may elect to accept the changes or terminate the agreement.

17. If requested, the User Institution will allow data security and management documentation to be inspected to verify that it is complying with the terms of this agreement.

18. The User Institution agrees to distribute a copy of these terms to the Authorised Personnel. The User Institution will procure that the Authorised Personnel comply with the terms of this agreement.

19. This agreement (and any dispute, controversy, proceedings or claim of whatever nature arising out of this agreement or its formation) shall be construed, interpreted and governed by the laws of Papua New Guinea and shall be subject to the exclusive jurisdiction of Papua New Guinean courts.

PUBLICATION POLICY

The Papua New Guinean Lowlanders Dataset (PNGLD) intends to publish the results of its analysis of this dataset and does not consider its deposition into public databases to be the equivalent of such publications. The Papua New Guinean Lowlanders Dataset (PNGLD) anticipates that the dataset could be useful to other qualified researchers for a variety of purposes. However, some areas of work are subject to a publication moratorium. The publication moratorium covers any publications (including oral communications) that describe the use of the dataset. For research papers, submission for publication should not occur until the PNGLD committee has provided written consent for publication on or after a given date, either in a separate written document, or more commonly, as part of this agreement.

In any publications based on these data, please describe how the data can be accessed, including the name of the hosting database (e.g., The European Genome-phenome Archive at the European Bioinformatics Institute) and its accession numbers, and acknowledge its use in a form agreed by the User Institution with the Papua New Guinean Lowlanders Dataset (PNGLD) committee.

Specific limitations on areas of research:

  • Users must be formally affiliated with an officially recognized Institution.
  • The User can replicate existing studies published by the Papua New Guinean Lowlanders Dataset (PNGLD) research program, using similar techniques, approaches and methods, to ensure that the published science is reproducible. Approval will be automatically granted for such use.
  • The User can undertake new demographic studies, including studies focusing on the history of archaic hominins and modern humans, as long as this does not compete with ongoing studies by the Papua New Guinean Lowlanders Dataset (PNGLD) program. All research projects must be approved by the PNGLD committee.
  • The User can undertake studies of selection, including on alleles with archaic and modern ancestry, as long as this does not compete with ongoing studies by the Papua New Guinean Lowlanders Dataset (PNGLD) program. All research projects must be approved by the Papua New Guinean Lowlanders Dataset (PNGLD) committee.
  • The User cannot undertake studies of a medical or clinical nature without first seeking the approval of the Papua New Guinean Lowlanders Dataset (PNGLD) committee. Evidence of specific ethical approvals, including documentation from a Papua New Guinean ethics body, will likely be necessary for approval to be granted.
  • The User cannot undertake studies for personal use, such as family history research, or perform this research for others.
  • The User cannot publicly release Papua New Guinean Lowlanders Dataset (PNGLD) data. All rights to data release remain with the PNGLD committee.

Note that all uses of the data must have specific prior approval from the Papua New Guinean Lowlanders Dataset (PNGLD) committee. Evidence of ethical approvals, including documentation from a Papua New Guinean ethics body, may be necessary for approval to be granted in some cases. A moratorium on publication until a given date may be a condition of data access and use, primarily in cases where a study proposed by the User overlaps in part or in whole with ongoing studies by the Papua New Guinean Genome Diversity Project (PNGLD) program.

Minimum protection measures required:

  • Data can be held in unencrypted files on an institutional compute system behind a secure firewall, with Unix group read/write access for one or more appropriate groups but no Unix world read/write access.
  • Laptops holding these data should have password protected logins and screen locks (set to lock after 5 min of inactivity).
  • If held on USB keys or other portable hard drives, the data must be encrypted.
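The minimum protection measures rule out Unix world read/write access to the data. As a rough illustration only, the sketch below shows one way a site might audit a data directory for that condition using Python's standard library. The function names and the directory layout are hypothetical and are not part of the agreement, and a check like this does not cover the firewall, laptop or encryption requirements.

```python
import os
import stat


def world_accessible(path: str) -> bool:
    """Return True if 'other' (world) users can read or write the path."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IROTH | stat.S_IWOTH))


def find_world_accessible(root: str) -> list[str]:
    """List the root and every entry below it that has world read/write bits set."""
    flagged = []
    if world_accessible(root):
        flagged.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            try:
                if world_accessible(full):
                    flagged.append(full)
            except OSError:
                # Skip entries that cannot be inspected (e.g. broken symlinks).
                continue
    return sorted(flagged)
```

An empty list from `find_world_accessible` on the data directory would suggest that no world read/write bits are set; group permissions (for example mode `0o660` on files) would still need to be reviewed against the list of appropriate groups.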

Studies are experimental investigations of a particular phenomenon, e.g., case-control studies on a particular trait or cancer research projects reporting matching cancer normal genomes from patients.

Study ID Study Title Study Type
EGAS50000000033 Whole Genome Sequencing
  • Dataset Released

This table displays only public information pertaining to the files in the dataset. If you wish to access this dataset, please submit a request. If you already have access to these data files, please consult the download documentation.

ID File Type Size Located in
EGAF50000049717 fastq.gz 8.7 GB
EGAF50000049718 fastq.gz 9.0 GB
EGAF50000049719 cram 19.4 GB
EGAF50000049720 fastq.gz 9.2 GB
EGAF50000049721 fastq.gz 9.7 GB
EGAF50000049722 cram 22.1 GB
EGAF50000049723 fastq.gz 18.4 GB
EGAF50000049724 cram 21.2 GB
EGAF50000049725 fastq.gz 19.5 GB
EGAF50000049726 cram 16.5 GB
EGAF50000049727 fastq.gz 11.5 GB
EGAF50000049728 fastq.gz 12.1 GB
EGAF50000049729 cram 19.4 GB
EGAF50000049730 fastq.gz 10.2 GB
EGAF50000049731 fastq.gz 10.7 GB
EGAF50000049732 cram 19.6 GB
EGAF50000049733 fastq.gz 12.1 GB
EGAF50000049734 fastq.gz 12.8 GB
EGAF50000049735 cram 19.7 GB
EGAF50000049736 cram 15.2 GB
EGAF50000049737 cram 14.0 GB
EGAF50000049738 fastq.gz 37.6 GB
EGAF50000049739 cram 14.8 GB
EGAF50000049740 cram 14.6 GB
EGAF50000049741 cram 34.2 GB
EGAF50000049742 cram 20.8 GB
EGAF50000049743 fastq.gz 40.1 GB
EGAF50000049744 cram 17.3 GB
EGAF50000049745 cram 14.3 GB
EGAF50000049746 cram 5.7 GB
EGAF50000049747 cram 4.3 GB
EGAF50000049748 cram 18.1 GB
EGAF50000049749 cram 5.3 GB
EGAF50000049750 cram 15.6 GB
EGAF50000049751 cram 4.0 GB
EGAF50000049752 cram 8.5 GB
EGAF50000049753 cram 5.0 GB
EGAF50000049754 cram 3.7 GB
EGAF50000049755 fastq.gz 50.3 GB
EGAF50000049756 cram 6.4 GB
EGAF50000049757 cram 5.1 GB
EGAF50000049758 cram 6.4 GB
EGAF50000049759 cram 5.9 GB
EGAF50000049760 cram 6.9 GB
EGAF50000049761 cram 6.3 GB
EGAF50000049762 cram 5.3 GB
EGAF50000049763 cram 7.9 GB
EGAF50000049764 cram 5.5 GB
EGAF50000049765 cram 5.7 GB
EGAF50000049766 fastq.gz 54.0 GB
EGAF50000049767 cram 5.6 GB
EGAF50000049768 cram 6.2 GB
EGAF50000049769 cram 4.7 GB
EGAF50000049770 cram 5.2 GB
EGAF50000049771 cram 8.3 GB
EGAF50000049772 cram 4.7 GB
EGAF50000049773 fastq.gz 44.6 GB
EGAF50000049774 cram 5.6 GB
EGAF50000049775 fastq.gz 47.6 GB
EGAF50000049776 fastq.gz 36.1 GB
EGAF50000049777 fastq.gz 38.5 GB
EGAF50000049778 fastq.gz 36.5 GB
EGAF50000049779 fastq.gz 39.0 GB
EGAF50000049780 fastq.gz 41.4 GB
EGAF50000049781 fastq.gz 44.4 GB
EGAF50000049782 fastq.gz 43.2 GB
EGAF50000049783 fastq.gz 30.1 GB
EGAF50000049784 fastq.gz 46.3 GB
EGAF50000049785 fastq.gz 32.5 GB
EGAF50000049786 fastq.gz 30.4 GB
EGAF50000049787 fastq.gz 31.5 GB
EGAF50000049788 fastq.gz 37.0 GB
EGAF50000049789 fastq.gz 37.8 GB
EGAF50000049790 fastq.gz 34.2 GB
EGAF50000049791 fastq.gz 35.8 GB
EGAF50000049792 fastq.gz 30.8 GB
EGAF50000049793 fastq.gz 32.2 GB
EGAF50000049794 fastq.gz 38.4 GB
EGAF50000049795 fastq.gz 39.9 GB
EGAF50000049796 fastq.gz 38.2 GB
EGAF50000049797 fastq.gz 39.9 GB
EGAF50000049798 fastq.gz 32.0 GB
EGAF50000049799 fastq.gz 33.5 GB
EGAF50000049800 fastq.gz 39.4 GB
EGAF50000049801 fastq.gz 41.3 GB
EGAF50000049802 fastq.gz 34.3 GB
EGAF50000049803 fastq.gz 11.3 GB
EGAF50000049804 fastq.gz 12.2 GB
EGAF50000049805 fastq.gz 37.1 GB
EGAF50000049806 fastq.gz 9.6 GB
EGAF50000049807 fastq.gz 10.2 GB
EGAF50000049808 fastq.gz 11.5 GB
EGAF50000049809 fastq.gz 12.4 GB
EGAF50000049810 fastq.gz 12.8 GB
EGAF50000049811 fastq.gz 13.7 GB
EGAF50000049812 fastq.gz 8.7 GB
EGAF50000049813 fastq.gz 9.3 GB
EGAF50000049814 fastq.gz 11.0 GB
EGAF50000049815 fastq.gz 11.8 GB
EGAF50000049816 fastq.gz 14.7 GB
EGAF50000049817 fastq.gz 15.3 GB
EGAF50000049818 fastq.gz 12.5 GB
EGAF50000049819 fastq.gz 13.0 GB
EGAF50000049820 fastq.gz 14.8 GB
EGAF50000049821 fastq.gz 15.5 GB
EGAF50000049822 fastq.gz 14.4 GB
EGAF50000049823 fastq.gz 14.9 GB
EGAF50000049824 fastq.gz 14.9 GB
EGAF50000049825 fastq.gz 15.6 GB
EGAF50000049826 fastq.gz 14.4 GB
EGAF50000049827 fastq.gz 15.1 GB
EGAF50000049828 fastq.gz 11.6 GB
EGAF50000049829 fastq.gz 12.4 GB
EGAF50000049830 fastq.gz 15.3 GB
EGAF50000049831 fastq.gz 16.3 GB
EGAF50000049832 fastq.gz 13.0 GB
EGAF50000049833 fastq.gz 13.9 GB
EGAF50000049834 fastq.gz 13.5 GB
EGAF50000049835 fastq.gz 14.5 GB
EGAF50000049836 fastq.gz 12.0 GB
EGAF50000049837 fastq.gz 12.8 GB
EGAF50000049838 fastq.gz 14.4 GB
EGAF50000049839 fastq.gz 15.4 GB
EGAF50000049840 csv 3.2 kB
124 Files (2.4 TB)
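
For planning storage before a download, it can help to tally the listing by file type. The following is a minimal sketch that assumes the table has been saved as plain text in the `ID File Type Size` layout shown above and that the size units are decimal (1 GB = 10^9 bytes); both are assumptions about this page, and `summarise` is a hypothetical helper, not part of any EGA tool.

```python
import re
from collections import defaultdict

# Decimal unit multipliers (an assumption; the page does not say which convention it uses).
UNIT_BYTES = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}

# Matches rows such as "EGAF50000049717 fastq.gz 8.7 GB".
ROW = re.compile(r"^(EGAF\d+)\s+(\S+)\s+([\d.]+)\s+(B|kB|MB|GB|TB)\s*$")


def summarise(lines):
    """Return (file count per type, approximate total bytes per type)."""
    counts = defaultdict(int)
    sizes = defaultdict(float)
    for line in lines:
        match = ROW.match(line.strip())
        if not match:
            continue  # skip the header, the footer and blank lines
        _, file_type, value, unit = match.groups()
        counts[file_type] += 1
        sizes[file_type] += float(value) * UNIT_BYTES[unit]
    return dict(counts), dict(sizes)
```

Because the listed sizes are rounded to one decimal place, the per-type totals are approximate and may differ slightly from the overall figure reported by the archive.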