Abstract
BACKGROUND/AIMS
In the investigation of Mendelian diseases, rare variants with minor allele frequencies (MAFs) below 0.01% in a specific population may underlie a particular genetic disorder. Although MAFs can be obtained from global databases, the importance of a variant with a low MAF may vary across populations. Hence, documenting rare variants in healthy individuals could be valuable. However, such approaches have not yet been applied in the Turkish population. Thus, we aimed to identify rare germline variants in a healthy Turkish population.
MATERIALS AND METHODS
We re-analyzed whole-exome sequencing data from 80 healthy Turkish individuals and filtered variants using a MAF threshold of <0.01%. We further assessed the pathogenicity of the filtered variants according to the American College of Medical Genetics criteria.
RESULTS
There were numerous rare variants, some of which were common to all participants. Importantly, those variants were classified as variants of unknown significance or as likely pathogenic; however, this classification should be revised because the variants were observed in healthy individuals.
CONCLUSION
We propose that these variants could be benign, as they were detected in healthy Turkish participants. Finally, the rare variants in the Turkish population should still be reported to guide clinicians during routine molecular diagnosis.
INTRODUCTION
Molecular causes of genetic disorders have been identified using genetic variants. Although many variants have been reported for specific genetic disorders, additional variants remain to be discovered.1 Thus, studies in human genetics focus on exploring variants associated with, or resulting in, a particular phenotype.2 Here, using the single-gene single-phenotype approach, Mendelian diseases are generally diagnosed by detecting rare variants in a population.3
Rare variants are defined as deoxyribonucleic acid (DNA) alterations with minor allele frequencies (MAFs) of less than 0.01% in a specific population.4 According to the American College of Medical Genetics (ACMG) criteria5, a low allele frequency provides moderate evidence of pathogenicity (PM2). Thus, a low MAF can be informative when evaluating an unreported variant. However, the MAF cutoff may vary between populations.6 Moreover, variants that are unreported or lack molecular evidence, even those with low MAFs, are classified as variants of unknown significance (VUS), which limits molecular diagnosis.7
To determine the pathogenicity of a variant, population databases are required. Although global databases such as gnomAD8, ExAC9, and 1000 Genomes Project10 are available, some countries have established population-specific databases to reduce biases arising from population-based differences.11-13 Similarly, Türkiye has reported a whole-exome and whole-genome database14 and launched a portal for variant search (https://tgd.tuseb.gov.tr/) based on data from healthy participants. Nonetheless, the portal allows only single-variant searches, restricting cumulative evaluation of variants with low MAFs. Moreover, some variants of interest were not listed in the database. Hence, a more comprehensive report on variants with low frequencies in healthy cohorts is still needed.
In the present study, we re-analyzed whole-exome sequencing (WES) data from 80 healthy Turkish individuals with respect to rare variants. After filtering the variants according to MAF values (<0.01%) from universal databases, we used the Franklin by Genoox tool (https://franklin.genoox.com/) to assess variant pathogenicity according to ACMG criteria and listed the non-benign variants. Our results indicated that numerous variants had low MAFs, including variants in critical genes that were classified as VUS or likely pathogenic (LP). However, the proportion of those variants in the healthy cohort led to the conclusion that those variants, especially those linked to early-onset genetic disorders, should be re-categorized as benign in the Turkish population. The present study is the first to underscore the importance of rare variants in healthy individuals in the Turkish population, and such studies are still required.
MATERIALS AND METHODS
Ethics and Participants
In the present study, the WES data previously obtained from 80 healthy individuals in sports genetics studies15,16 were re-analyzed. In those studies, only the participants' polymorphisms were evaluated, using an exome-wide association study. Sixty participants were elite Turkish athletes, and twenty were healthy, unrelated Turkish individuals. In the present study, all participants (27 females, 33.75%; 53 males, 66.25%; age >18 years) were regarded as healthy individuals and were not grouped, as none had any known (declared) genetic disorders. The study was approved by the Gazi University Non-Interventional Clinical Research Ethics Committee (approval number: 09, date: 05.04.2021). This approval covered the analysis of all variants. Both written and verbal consent were obtained from the participants before the study. Moreover, the data are publicly available at https://doi.org/10.6084/m9.figshare.24496216.v1.
Variant Prioritization
We further examined WES data generated with the Twist Human Comprehensive Exome Panel (Twist Bioscience, USA) on the Illumina NextSeq 500 (Illumina Inc., USA) from genomic DNA isolated from the participants' peripheral blood samples. For the analysis of rare variants, each participant's variants in variant call format were annotated using the ANNOVAR tool17 with the hg19 human reference genome. Next, variants located in exons or at exon-intron boundaries were prioritized using VarAFT software18: variants with a read depth >10 were filtered by MAF <0.01%, and only variants in morbid genes were retained.19 To limit the number of variants and to avoid including possible de novo variants, only homozygous variants were evaluated; on chromosome X, hemizygous and homozygous variants were documented regardless of participant sex. Finally, the pathogenicity of the variants was determined using the Franklin by Genoox tool (https://franklin.genoox.com/) according to the ACMG criteria.5 Moreover, we queried the Turkish Genome Project (TGP) database (https://tgd.tuseb.gov.tr/) for each variant using the single-variant search and excluded variants reported in the database.
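The prioritization steps described above can be sketched as a sequence of simple filters. This is an illustrative Python outline only, not the ANNOVAR or VarAFT interface: the `Variant` record, its field names, the region labels, and the choice to keep variants with no database frequency are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

MAF_CUTOFF = 1e-4          # 0.01% expressed as a fraction
MIN_READ_DEPTH = 10        # variants must have read depth > 10
CODING_REGIONS = {"exonic", "splicing"}  # hypothetical region labels


@dataclass(frozen=True)
class Variant:
    """Hypothetical annotated variant record (field names are assumptions)."""
    gene: str
    chrom: str
    region: str
    read_depth: int
    zygosity: str            # "hom", "het", or "hem"
    maf: Optional[float]     # None when absent from the frequency databases
    listed_in_tgp: bool      # result of a manual single-variant lookup


def passes_filters(v: Variant, morbid_genes: Set[str]) -> bool:
    """Apply the filters in the order given in the text."""
    if v.region not in CODING_REGIONS:
        return False
    if v.read_depth <= MIN_READ_DEPTH:
        return False
    # Assumption: a variant missing from the databases is treated as rare.
    if v.maf is not None and v.maf >= MAF_CUTOFF:
        return False
    if v.gene not in morbid_genes:
        return False
    # Homozygous anywhere; hemizygous calls are kept only on chromosome X.
    on_x = v.chrom in {"X", "chrX"}
    if not (v.zygosity == "hom" or (v.zygosity == "hem" and on_x)):
        return False
    # Variants already reported in the population database are excluded.
    return not v.listed_in_tgp


def prioritize(variants: Iterable[Variant], morbid_genes: Set[str]) -> List[Variant]:
    """Return the variants that survive every filter."""
    return [v for v in variants if passes_filters(v, morbid_genes)]
```

Keeping each condition as a separate early return makes it easy to see which step removes a given variant, at the cost of some verbosity.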
Statistical Analysis
No statistical analysis was performed in the study.
RESULTS
Each participant had nearly 700,000 variants before filtering. Applying the MAF <0.01% filter to homozygous variants in morbid genes yielded approximately 15 variants per participant. Variant assessment using Franklin by Genoox according to the ACMG criteria showed approximately three variants per participant classified as VUS or LP; on average, one such variant per participant was not documented in the TGP database. The variants listed in Table 1 were each detected in at least one participant. Variants that met ACMG criteria BS2 (observed in a homozygous state in population databases more than expected for disease), BS3 (well-established functional studies show no damaging effect on protein function or splicing), BP3 (in-frame deletions/insertions in a repetitive region with no known function), or BP6 (reported as benign but lacking evidence for independent laboratory evaluation), as well as variants classified as benign by the Franklin tool, were not reported in the results.
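The exclusion by benign-evidence criteria and the per-variant carrier tally can be illustrated with a short sketch. The data layout (a set of ACMG criterion codes per variant, and a list of variant identifiers per participant) and the function names are assumptions for the example; they do not reflect the Franklin export format.

```python
from collections import Counter
from typing import Dict, Iterable, Set

# Benign-evidence criteria whose presence removed a variant from the report.
EXCLUDED_CRITERIA = {"BS2", "BS3", "BP3", "BP6"}


def keep_for_report(acmg_criteria: Set[str]) -> bool:
    """True when none of the excluded benign-evidence criteria were met."""
    return EXCLUDED_CRITERIA.isdisjoint(acmg_criteria)


def carrier_counts(calls_by_participant: Dict[str, Iterable[str]]) -> Counter:
    """Count, for each variant identifier, how many participants carry it."""
    counts: Counter = Counter()
    for variant_ids in calls_by_participant.values():
        # A set guards against counting one participant twice for a variant.
        counts.update(set(variant_ids))
    return counts


# Hypothetical toy data: three participants and placeholder variant identifiers.
toy_calls = {
    "P01": ["var_1", "var_2"],
    "P02": ["var_1"],
    "P03": ["var_1", "var_3", "var_3"],
}
toy_counts = carrier_counts(toy_calls)
```

In this toy example `var_1` is carried by all three participants, while `var_2` and `var_3` are each carried by one.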
DISCUSSION
In the present study, we re-analyzed the WES data from 80 healthy Turkish individuals and documented homozygous rare variants that were recurrently observed in the participants and not reported in the TGP database. Regarding the early-onset status of diseases linked to genes harboring detected variants, we documented 62 variants in 60 genes associated with diverse genetic disorders.
Rare variants are regarded as absent from control (healthy) individuals, and their frequencies are extremely low in population-based genome databases.20 Thus, they are annotated with moderate evidence of pathogenicity (PM2) according to the ACMG criteria.5 Even though the PM2 criterion is suggestive of pathogenicity, studies have reported benign or likely benign variants that meet this criterion.21 Hence, reporting rare variants in healthy individuals may be useful to determine whether variants meeting the PM2 criterion are benign or likely benign. In the present cohort, 60 variants were annotated with the PM2 criterion, indicating that their frequencies were markedly low in public genome databases.
The frequency of a variant of interest can be determined using various genomic databases. Nonetheless, those databases are population-specific, and the exact frequencies vary across populations.22 Recently, Türkiye initiated a genome project called TGP, containing genomic information from 557 so-called healthy individuals. The platform allows users to search for individual variants of interest and provides a valuable opportunity to determine variant frequencies in a manner specific to the Turkish population. However, we noted that the platform does not document variants on the sex chromosomes, particularly indel variants. Therefore, listing rare variants in healthy Turkish individuals is fundamental to the assessment of variant pathogenicity in medical genetics in Türkiye.
Among the rare variants detected in the healthy cohort, two variants (FAM20C:NM_020223.4:c.951_952insACAGGTGAGCCCTTCCTTCCTCCCTCCATCCGCG:p.Asp318Thrfs*118 and KRT10:NM_000421.5:c.1684_1685insAGCTCCGGCGGCGGATACGGCGGCGGCAGC:p.Ser562*) were classified as LP based on the PM2 and pathogenic very strong 1 (PVS1) criteria. The PVS1 criterion applies to variant types that result in loss of protein function, including frameshifts, altered splicing, and exonic deletions.23 Those variants were detected in at least three healthy participants in the Turkish population. FAM20C (Golgi-associated secretory pathway kinase) encodes a protein that functions as a Golgi casein kinase, regulates phosphorylation of secreted proteins, and is linked to Raine syndrome (Online Mendelian Inheritance in Man #259775), which is characterized by neonatal osteosclerotic bone dysplasia.24 However, the LP variant in the FAM20C gene was detected in five independent healthy individuals in the Turkish population and is therefore unlikely to be the cause of lethal Raine syndrome. Interestingly, the frameshift variant localizes to the kinase domain of the protein.25 Thus, further molecular studies are required to determine how protein function is conserved in the presence of this variant. The other variant, in KRT10, was observed in three participants. The KRT10 gene encodes a type I keratin that is mainly expressed in epidermal cells. Mutations in this gene have been associated with rare skin anomalies. Importantly, the C-terminus of the protein was identified as a mutation hotspot.26,27 According to the ENSEMBL database28, the KRT10 gene (NM_000421.5) encodes a 584-amino-acid protein. Hence, the variant detected in the healthy participants (p.Ser562*) lies near the C-terminal end of the protein and may not affect the protein’s structure and function.29
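Two pieces of arithmetic behind this paragraph can be checked directly from the reported nomenclature: whether the length of an inserted sequence is a multiple of three, and how much of the 584-residue KRT10 protein precedes a stop at position 562. The sketch below uses only the inserted sequences and positions quoted in the text; the function names are hypothetical, and the code does not consult any reference sequence.

```python
def insertion_shifts_frame(inserted_bases: str) -> bool:
    """True when the insertion length is not a multiple of three."""
    return len(inserted_bases) % 3 != 0


def fraction_before_stop(stop_position: int, protein_length: int) -> float:
    """Fraction of the reference protein located upstream of a stop codon."""
    if not 1 <= stop_position <= protein_length:
        raise ValueError("stop_position must lie within the protein")
    return (stop_position - 1) / protein_length


# Inserted sequences as reported for the two LP variants.
FAM20C_INSERTION = "ACAGGTGAGCCCTTCCTTCCTCCCTCCATCCGCG"
KRT10_INSERTION = "AGCTCCGGCGGCGGATACGGCGGCGGCAGC"

fam20c_shifts = insertion_shifts_frame(FAM20C_INSERTION)  # 34 bases
krt10_shifts = insertion_shifts_frame(KRT10_INSERTION)    # 30 bases

# p.Ser562* in a 584-residue protein: residues 1-561 would be retained.
krt10_retained = fraction_before_stop(562, 584)
```

The 34-base FAM20C insertion is not divisible by three, in line with the reported frameshift. The 30-base KRT10 insertion is divisible by three, so the reported p.Ser562* presumably reflects a stop codon formed at the insertion site itself; that reading is an inference from the notation rather than a statement from the source. About 96% of the KRT10 protein would lie upstream of the stop.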
During classification, unreported or unproven variants are classified as VUS.30 In the present study, 60 variants were classified as VUS. This classification may change when a variant is detected in healthy individuals.31 Therefore, we propose that the classification of those variants be re-evaluated in medical genetics in Türkiye.
In the present study, we report biallelic variants identified in healthy individuals. Among the 60 genes affected in the homozygous state, 14 have been linked exclusively to dominant genetic disorders. In specific cases, homozygous variants in genes linked to dominant diseases have been reported to be neutral32, which further supports the non-pathogenic nature of the listed variants.
Study Limitations
This study reports variant lists for a limited number of healthy individuals. Similar studies with larger groups of participants, or with a broader sequencing approach such as whole-genome sequencing, should be conducted to identify more variants that could be reclassified in the Turkish population.
CONCLUSION
The present study documented rare homozygous germline variants in genes linked to early-onset genetic disorders in healthy Turkish individuals. Although the variants were classified as VUS or LP, their detection in the healthy cohort may indicate that those variants are benign. Moreover, the listed variants were not reported in the TGP database. Thus, reporting rare variants in a population-specific manner, as in the present study, is still valuable for guiding medical geneticists when prioritizing variants. Nevertheless, possible multigenic or multifactorial inheritance patterns should be considered when referring to those variants.
MAIN POINTS
• Rare variants are common in the healthy Turkish cohort.
• The reported rare variants of unknown significance may be reclassified as benign in the Turkish population.
• Further population-based studies could facilitate variant prioritization.


