Gene mapping and molecular analysis of hereditary non-polyposis colorectal cancer (Lynch Syndrome) using systems biological approaches

Hereditary non-polyposis colorectal cancer (HNPCC), also known as Lynch Syndrome (LS), is a hereditary form of colorectal cancer (CRC). LS is caused by mutations in the mismatch repair (MMR) genes, mostly in MLH1, MSH2, MSH6 and PMS2. Identifying these mutations is essential for diagnosing CRC, especially at a young age, and thereby increasing the survival rate. Using the Open Targets Platform, we performed genetic association studies to analyze the genes involved in LS and to obtain evidence linking targets to the disease. We also analyzed upstream regulators as target molecules in the data sets. We found that MLH1, MSH2, MSH6, PMS2, MLH3, EPCAM, TGFBR2, FBXO11 and PRSS58 showed the strongest association with LS. Our findings may further enhance the understanding of the hereditary form of CRC.


Background:
Lynch syndrome is an autosomal dominant condition caused by mutations in mismatch repair genes, including four important genes: MLH1, PMS2, MSH2 and MSH6 [1]. LS was named in honor of Henry T. Lynch, who reported several families in detail during 1966-67 [2, 3]. LS accounts for 1-5% of all CRC and also presents an increased risk of many extracolonic cancer types [4]. Mutations in the MMR genes inactivate or reduce the efficiency of DNA mismatch repair, which leads to the accumulation of spontaneous mutations, mostly insertions and deletions in short repetitive DNA sequences termed microsatellites. These changes give rise to microsatellite instability (MSI), which is found in the majority of LS tumors (>90%) in patients with germline mutations in MMR genes [5]. Therefore, the current strategy is to perform MSI testing before sequencing the MMR genes: if the patient's tumor DNA shows MSI, sequencing is likely to yield a mutation in an MMR gene. Generally, five different microsatellite regions are examined. The tumor is considered highly unstable if instability is found in two or more regions, unstable-low if instability is found in only one region, and stable if no instability is found [6-9]. Identifying MMR gene status is very important for surveillance and early intervention, especially in carriers and family members of CRC patients, so that appropriate measures can be taken to limit the disease and improve the survival of patients and carriers. In addition, excluding family members as carriers of any mismatch repair gene mutation may reduce their worry and the burden of high-risk surveillance and preventive testing.
©Biomedical Informatics (2019)
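The MSI classification rule described above (five microsatellite regions; two or more unstable regions is highly unstable, one is unstable-low, none is stable) can be written as a minimal Python sketch. The function name, the marker names and the example calls are hypothetical and serve only as illustration; they are not taken from the study.

```python
from typing import Mapping


def classify_msi(marker_unstable: Mapping[str, bool]) -> str:
    """Classify a tumor from per-marker instability calls.

    marker_unstable maps a microsatellite marker name to True if that
    region shows instability in the tumor DNA, and False otherwise.
    """
    unstable = sum(1 for is_unstable in marker_unstable.values() if is_unstable)
    if unstable >= 2:
        return "MSI-high"   # instability in two or more regions
    if unstable == 1:
        return "MSI-low"    # instability in exactly one region
    return "MSS"            # no instability found (microsatellite stable)


# Hypothetical five-marker panel with placeholder marker names.
panel = {
    "marker_1": True,
    "marker_2": False,
    "marker_3": True,
    "marker_4": False,
    "marker_5": False,
}
print(classify_msi(panel))
```

Under this rule, a tumor classified as MSI-high would be the natural candidate for follow-up MMR gene sequencing.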
In this article, we used the Open Targets Platform for genetic association studies of the spectrum of genes involved in LS, their upstream regulators and canonical pathways.

Materials & Methods:
Data-Mining of Genetic Associations in Lynch Syndrome:
We used the Open Targets Platform (https://www.targetvalidation.org/), a free online integrated web resource of genetics, omics and chemical data that aids systematic drug target identification and ranking and links each association back to its underlying evidence and source. The platform prioritizes candidate drug targets (genes) based on the strength of their association with a disease such as LS [10, 11]. It assembles data types from multiple open sources and applies a scoring system to gene target-disease associations, helping users classify, recognize, and prioritize suitable drug targets for further examination. The Open Targets score for an association ranges from 0 (no association) to 1 (strong association). The platform displays scores in varying shades of blue (the darker the blue, the stronger the genetic association with a particular disease), and the overall association score results from combining all data source scores [10, 11].
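As an illustration of how several data source scores might be combined into one overall association score between 0 and 1, the sketch below assumes a harmonic-sum rule, in which the strongest score counts fully and weaker scores contribute progressively less, with the result capped at 1. The exact weighting and normalization used by the platform are not stated in this article, so the formula, the function name and the example scores are all assumptions.

```python
from typing import Iterable


def harmonic_sum_score(source_scores: Iterable[float], cap: float = 1.0) -> float:
    """Combine per-data-source scores (each in [0, 1]) into one score.

    Assumed rule: sort scores in descending order and sum s_i / i**2,
    where i is the 1-based rank, then cap the total at `cap`.
    """
    scores = list(source_scores)
    if any(s < 0.0 or s > 1.0 for s in scores):
        raise ValueError("each data source score must lie in [0, 1]")
    ranked = sorted(scores, reverse=True)
    total = sum(s / (rank ** 2) for rank, s in enumerate(ranked, start=1))
    return min(total, cap)


# Hypothetical data source scores for one gene-disease pair.
example_scores = [0.5, 0.2]
print(harmonic_sum_score(example_scores))
```

A rule of this shape lets one strong line of evidence dominate while still rewarding agreement between independent sources.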

Ingenuity Pathway Analysis:
The Knowledgebase in Ingenuity Pathway Analysis (IPA) software (Qiagen, USA) was used to obtain the list of genes implicated in LS. The canonical pathways, upstream regulators, and differential regulation of gene networks in LS were then derived by applying Fisher's Exact Test (P<0.05) in IPA. The log P values were plotted on the x-axis and the differentially expressed canonical pathways on the y-axis to identify the top canonical pathways implicated in LS.
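The enrichment step can be illustrated with a minimal sketch of a right-tailed Fisher's Exact Test, computed as the upper tail of the hypergeometric distribution. This is not IPA's implementation; the function name, the choice of a one-sided test, the negative log10 transformation and the gene counts in the example are assumptions for illustration only.

```python
from math import comb, log10


def fisher_right_tail(overlap: int, list_size: int,
                      pathway_size: int, background_size: int) -> float:
    """Probability of seeing at least `overlap` pathway genes in a gene list.

    overlap         : genes in both the gene list and the pathway
    list_size       : genes in the analyzed list
    pathway_size    : genes annotated to the pathway
    background_size : genes in the reference set
    """
    max_overlap = min(list_size, pathway_size)
    denominator = comb(background_size, list_size)
    # Sum hypergeometric probabilities for overlap, overlap + 1, ..., max_overlap.
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


# Hypothetical counts: 5 of 5 list genes fall in a 5-gene pathway,
# drawn from a 10-gene background.
p_value = fisher_right_tail(overlap=5, list_size=5, pathway_size=5, background_size=10)
significant = p_value < 0.05          # threshold used in the text
neg_log_p = -log10(p_value)           # value typically plotted per pathway
print(p_value, significant, neg_log_p)
```

Pathways can then be ranked by the transformed P value, so that the most strongly enriched pathways appear at the top of the plot.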

Discussion:
Lynch syndrome is the most common hereditary CRC and accounts for more than 3% of all colon cancer cases [12]. The genetic heterogeneity of this syndrome is related to mutations in different genes, especially in four mismatch repair genes: MLH1, MSH2, MSH6, and PMS2. The mismatch repair genes contribute to various cellular functions, including repair of double-stranded DNA breaks, repair of errors during DNA synthesis, antirecombination, destabilization of DNA, and apoptosis. MMR proteins maintain the integrity of the genetic material and are therefore vital for the regulation of the cell cycle. When an MMR protein is defective or lost altogether, apoptosis decreases and cell survival increases. This gives the cells a selective growth advantage, which in turn increases susceptibility to tissue-specific cancers [12].
The Open Targets Platform analysis allowed the prioritization of the genes based on the strength of their association with LS. The most important genes for hereditary LS cancer were found to be MLH1, MSH2, and MSH6. Together they account for more than 90% of the mutations in LS. MLH1 ranked first with 731 mutations (589 pathogenic and 142 likely pathogenic), followed by MSH2 with 653 mutations (546 pathogenic and 107 likely pathogenic) and MSH6 with 414 mutations (367 pathogenic and 47 likely pathogenic). The remaining genes, including PMS2, MLH3, EPCAM, PRSS58 and TGFBR2, accounted for less than 10% of the total mutations underlying LS. Canonical pathway analysis in LS also confirmed MMR involvement in the syndrome; other pathways included colorectal cancer metastasis signaling and ovarian cancer signaling (Figure 2). The analysis of upstream regulators of the target molecules identified MBD4, PTTG1, CHI3L1, TP53, and MYC (Table 2 shows the complete list).
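The variant counts reported above for MLH1, MSH2 and MSH6 can be tallied with a short Python sketch that checks the per-gene totals and computes each gene's share. The shares are relative to the combined count of these three genes only, because the counts for the remaining genes are not given in the text; the variable names are illustrative.

```python
# Pathogenic and likely pathogenic counts as reported in the text.
counts = {
    "MLH1": {"pathogenic": 589, "likely_pathogenic": 142},
    "MSH2": {"pathogenic": 546, "likely_pathogenic": 107},
    "MSH6": {"pathogenic": 367, "likely_pathogenic": 47},
}

# Total per gene, and the combined total across the three genes.
totals = {gene: sum(by_class.values()) for gene, by_class in counts.items()}
combined = sum(totals.values())

# Share of each gene among the three top-ranked genes (not among all LS genes).
shares = {gene: total / combined for gene, total in totals.items()}

for gene in sorted(totals, key=totals.get, reverse=True):
    print(f"{gene}: {totals[gene]} mutations, {shares[gene]:.1%} of the three-gene total")
```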

Conclusion:
The current study has highlighted the mutation spectrum of the genes involved in Lynch syndrome, their association with upstream regulators, and their involvement in canonical pathways. This study will further pave the way for bringing the available data and genetic studies together to improve prognostic and treatment options.