Aberrant expression analysis of LMNA gene in Hutchinson-Gilford progeria syndrome

Hutchinson-Gilford progeria syndrome (HGPS) is caused by de novo dominant point mutations of the genes encoding nuclear lamina proteins, leading to premature aging. Protein sequences are subject to mutations that can affect the function and folding pattern of the protein in different ways. Mutations involved in HGPS were identified and substituted into the seed sequence retrieved from the UniProt database to obtain the mutated versions. The tertiary structure of the Lamin A protein had not previously been predicted, so it was predicted for the seed protein as well as for all the mutated versions in order to analyze the effects of the mutations on protein structure, folding and interactions. All the predicted models were refined and validated through multiple servers for multiple parameters. The validated 3D structure of the seed protein was then submitted to the Protein Model Database and assigned the PMDB ID PM0077829. All the predicted structures were superimposed, with a root mean square deviation value of 7.0 Å and a Dali Z-score of 1.9. The mutations were observed to affect the physiochemical properties as well as the instability index, thereby affecting the domains in particular and the whole structure in general. It was further inferred that HGPS results from altered interactions of Lamin A with other integral and binding proteins in the inner nuclear membrane, which affect the link between the nuclear membrane and the network of the lamina.


Background:
Hutchinson-Gilford progeria syndrome (HGPS) is a rare disorder of the genes that encode the proteins of the nuclear lamina (a laminopathy) [1]. It is caused by de novo dominant point mutations in the lamin A gene, leading to premature aging [2-4]. Affected individuals resemble elderly people: they have minimal body fat, wrinkled faces, cardiac dysfunction, weak bones and osteoporosis, a phenotype that is both distressing and very conspicuous. Most affected individuals who survive childhood die in their teens due to heart failure [5].
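The LMNA gene is present on chromosome 1q21.1-21.2, having 12 exons encompassing around 25 kb [6, 7] and encoding a protein of 664 amino acids with a molecular weight of 70 kDa [8]. Lamins are classified as type A and type B. Type B lamins are expressed in most cell types during development, whereas type A lamins are expressed only in differentiated cells and are three in number: lamin A (LA), lamin C and lamin A delta 10. They are involved in multiple functions such as nuclear envelope assembly, DNA synthesis, transcription and apoptosis [9]. Lamin A is produced as prelamin A, which has a motif called the CAAX box at its C-terminus. This motif undergoes farnesylation, followed by an internal proteolytic cleavage that removes the terminal 18 coding amino acids, producing mature lamin A [8]. A permanently farnesylated mutant form of prelamin A (progerin) is thought to act as a dominant negative that drives the progressive defects in nuclear structure seen in HGPS [2]. Lamin A forms heterodimers with lamin C through their rod domains in order to form the filamentous structure found in the nuclear lamina [10]. This filamentous network directly underlies the inner nuclear membrane at the fringe of the nucleus in higher eukaryotes. A number of integral proteins in the inner nuclear membrane bind to the lamins and form a link between the nuclear membrane and the network of the lamina [11]. Proteins are subject to mutations that can affect them directly or indirectly in different ways. Protein properties are affected by changes in their amino acids, interactions or folding pattern. This study provides an insight into the molecular-level and 3D-level changes and folding patterns in Lamin A proteins, which may provide a platform for future therapeutics against HGPS.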

Methodology:
Changes at the molecular level in a protein can affect the phenotype of cells, tissues and, finally, organisms. The sequence of the Lamin A protein was retrieved from the UniProt database (http://www.uniprot.org/) as a prerequisite for analyzing the effect of mutations on Lamin A protein structure, function and folding pattern. Mutations were identified from the literature and substituted into the seed sequence of Lamin A using MUTATE_MODEL [12] to obtain the mutated versions of the protein for investigating changes in structure, function and physiochemical properties. Patterns, which are useful tools for identifying short and well-conserved regions such as catalytic sites, binding sites, post-translational modifications (PTMs) or zinc fingers [13], were predicted through ScanProsite (http://prosite.expasy.org/scanprosite). Domains were identified using the Pfam online tool [14] to check whether the mutations lie in a domain, an active site or other non-binding sites.
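The substitution step described above can be illustrated with a minimal Python sketch. It is not the MUTATE_MODEL procedure itself, only the underlying idea of applying a named point substitution to a sequence string while checking that the expected wild-type residue is present. The function name `apply_mutation`, the 10-residue toy sequence and the toy mutation names are hypothetical and stand in for the real Lamin A sequence and the mutations E145K, R471C and R527C.

```python
import re


def apply_mutation(sequence: str, mutation: str) -> str:
    """Return a copy of `sequence` with one point substitution such as 'E145K'.

    Positions are 1-based, as in the mutation names used in the paper.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation format: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    index = position - 1
    # Guard against applying a mutation to the wrong residue or numbering.
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


# Hypothetical toy sequence, NOT the real Lamin A sequence (UniProt P02545).
toy_seed = "MAERKLCGRT"
toy_mutations = ["E3K", "R4C", "R9C"]

variants = {name: apply_mutation(toy_seed, name) for name in toy_mutations}
for name, variant in variants.items():
    print(name, variant)
```

With the real seed sequence loaded from the FASTA file, the same function could be called with "E145K", "R471C" and "R527C"; the wild-type check would then flag any mismatch in residue numbering.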
Assignment of the secondary structural elements is an essential step in characterizing three-dimensional structures, and also serves as a departure point for many theoretical studies devoted to modeling, description of folding motifs, etc. [15]. Secondary structures of the seed and of the mutated versions were predicted through the Hierarchical Neural Network (HNN) tool (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_hnn.html). All the predicted secondary structures were compared to analyze the changes and their impact on the structures. The tertiary structure of the Lamin A protein had not previously been predicted, so the 3D structure of the Lamin A protein and of its filament domain was predicted as a seed using a combination of threading and ab initio techniques through the I-TASSER web-based server [16]. 3D structures for the mutated sequences and for both domains were predicted through the same technique.
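The comparison of predicted secondary structures can be sketched as follows. This is an illustrative Python example, not the procedure used by the HNN server. It assumes that each prediction is available as a string with one state letter per residue (h for helix, e for extended strand, c for coil), and the two short strings are hypothetical rather than actual predictions for Lamin A.

```python
from collections import Counter


def composition(ss: str) -> dict:
    """Percentage of residues in each state (h = helix, e = strand, c = coil)."""
    counts = Counter(ss)
    return {state: 100.0 * counts.get(state, 0) / len(ss) for state in "hec"}


def changed_positions(seed_ss: str, mutant_ss: str) -> list:
    """List (position, seed_state, mutant_state) for every residue that differs."""
    if len(seed_ss) != len(mutant_ss):
        raise ValueError("predictions must cover the same number of residues")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(seed_ss, mutant_ss), start=1)
        if a != b
    ]


# Hypothetical predictions for a 10-residue stretch (assumed state alphabet).
seed_ss = "cchhhhhhcc"
mutant_ss = "cchhhhcccc"

print(composition(seed_ss))
print(composition(mutant_ss))
print(changed_positions(seed_ss, mutant_ss))
```

Reporting both the overall composition and the individual changed positions makes it possible to see whether a substitution alters the structure only locally or shifts the helix and coil content of the whole protein.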
All the predicted structures were refined using the Chiron server [17], evaluated for Z-score through ProSA-web [18], further validated for different parameters through WHAT IF [19], and checked for backbone dihedral angles through RAMPAGE [20]. The Ramachandran plot describes the two angles, phi and psi, that describe the rotation of the polypeptide backbone. The fewer residues in the disallowed region, the lower the steric hindrance and hence the lower the energy. As a final check, physiochemical properties were predicted through ProtParam (http://web.expasy.org/protparam) for all the structures to analyze the effect of the mutations on them, because these properties can affect protein interactions. The computed parameters include the molecular weight, theoretical pI (isoelectric point), amino acid composition, atomic composition, extinction coefficient, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY).

Discussion:
The sequence of the Lamin A protein, consisting of 664 amino acids, was retrieved from UniProt with accession No. P02545 in FASTA format. This protein is the product of the LMNA gene and belongs to the intermediate filament family; it is involved in forming a two-dimensional filamentous network at the periphery of the nucleus that maintains the shape of the nucleus. Mutation analysis of the protein shows the type of changes and how they affect the properties and structures of the protein. For this purpose, the sequence of the Lamin A protein was subjected to substitution of the identified mutations, i.e. E145K, R471C and R527C. The structure of the Lamin A protein is divided into 2 domains: residues 30 to 386, called the filament domain (domain A), and residues 425 to 540, the IF_tail domain (domain B). Mutation E145K lies in the filament domain, while mutations R471C and R527C lie in the IF_tail domain. Secondary structures were predicted for the seed protein as well as for the mutated Lamin A proteins, and significant changes were observed in them (Table 1, see supplementary material). It can therefore be inferred that these changes may carry over into the tertiary structures through their effects on the folding pattern, and thus tertiary structures were predicted for all the sequences (Figure 1).
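The assignment of each mutation to a domain follows directly from the residue ranges given above. A minimal Python sketch of this lookup is shown below; the domain boundaries and mutation names are taken from the text, while the dictionary `DOMAINS` and the function `domain_of` are hypothetical names introduced for illustration.

```python
# Domain boundaries as stated in the text (1-based, inclusive residue ranges).
DOMAINS = {
    "filament": (30, 386),
    "IF_tail": (425, 540),
}


def domain_of(position: int):
    """Return the name of the domain containing `position`, or None."""
    for name, (start, end) in DOMAINS.items():
        if start <= position <= end:
            return name
    return None


for mutation in ("E145K", "R471C", "R527C"):
    # The residue position is the number between the two amino acid letters.
    position = int(mutation[1:-1])
    print(mutation, domain_of(position))
```

A position that falls in neither range, for example in the linker between residues 387 and 424, returns None rather than being forced into a domain.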
I-TASSER predicted 5 models for each sequence, out of which the best one was selected on the basis of the C-score, a confidence score for estimating the quality of predicted models that is calculated from the significance of the threading template alignments and the convergence parameters of the structure assembly simulations. The structures of the Lamin A seed protein and the mutated versions were compared and superimposed using DALILITE (http://www.ebi.ac.uk/Tools/dalilite/), giving a root mean square deviation (RMSD) value of 7.0 Å and a Dali Z-score of 1.9 (48 Cα atoms aligned, 10% sequence identity to the Lamin A seed protein). E145K was observed with an RMSD of 1.9 Å and a Z-score of 27.2 (611 Cα atoms, 95% sequence identity); R527C with an RMSD of 7.0 Å and a Z-score of 1.9 (48 Cα atoms, 10% sequence identity). As the mutations lie in the domains, the domain structures were also predicted through the M4T online tool to further analyze and compare the changes taking place in the mutated domains. Slight changes were observed in the energy profiles of the IF_tail domain: in E145K the energy increased from residue 64 to 72, while in R471C the energy decreased from residue 30 to 50 and then increased from residue 64 to 72, as compared to the seed protein.
The accuracy of the predicted models was acceptable, but to avoid any inconsistency all the models were refined and validated for multiple parameters. The structure of the seed protein after validation was submitted to the Protein Model Database (PMDB) and assigned the PMDB ID PM0077829. The Z-score calculated through ProSA-web is -4.2. The numbers of residues of the seed protein, E145K, R471C and R527C in the favored region are 519, 581, 519 and 581, in the allowed region 110, 56, 103 and 56, and in the outlier region 33, 35, 40 and 25, respectively. According to the physiochemical properties, the mutations affected the atomic composition and molecular weight as well as the isoelectric point, GRAVY and instability index. The instability index and isoelectric point of R471C are the lowest (5.11 and 6.44), but its hydropathicity is the highest (-0.852) (Tab). The isoelectric point and hydropathicity are both important properties. Any change in them can affect protein function, the folding pattern and, finally, protein-protein interactions. It was also observed that the mutations under study are non-synonymous in nature. It was further confirmed that the mutation E145K is a non-synonymous conservative substitution, while R471C and R527C are non-synonymous radical substitutions.
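To make two of the ProtParam parameters concrete, the following Python sketch computes GRAVY and the aliphatic index from a sequence string. It assumes the standard definitions: GRAVY as the mean Kyte-Doolittle hydropathy value over all residues, and the aliphatic index as X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)), with X in mole percent. The function names and the 5-residue toy sequences are hypothetical and are not the Lamin A sequence.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index computed from mole percentages of Ala, Val, Ile and Leu."""
    def mole_percent(residue: str) -> float:
        return 100.0 * sequence.count(residue) / len(sequence)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


# Hypothetical toy sequences that differ by one Arg-to-Cys substitution.
toy_seed = "MAERK"
toy_mutant = "MAECK"

print(round(gravy(toy_seed), 3), round(gravy(toy_mutant), 3))
print(round(aliphatic_index(toy_seed), 1))
```

On this scale arginine has the lowest value (-4.5) and cysteine a positive one (2.5), so an arginine-to-cysteine substitution raises the GRAVY value of a sequence, in line with the higher hydropathicity reported for R471C.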

Conclusion:
The current study focuses on the structural analysis of Lamin A and the types of changes imposed by the mutations under study. The Lamin A protein is divided into 2 domains: the filament domain, from residues 30 to 386, and the IF_tail domain, from residues 425 to 540. Mutation E145K lies in the filament domain, while mutations R471C and R527C lie in the IF_tail domain. This study bridges computational biology with molecular, structural and experimental biology, which may deepen our understanding and support novel therapeutics against genetic disorders. Structure- and sequence-based computations were systematically evaluated and provide a structural explanation for the impact of the mutations under study. These changes affect the secondary and tertiary structure of the protein as well as its energy profiles and physiochemical properties. All these changes affect the interactions of the Lamin A protein with other integral and binding proteins in the inner nuclear membrane, thus affecting the link between the nuclear membrane and the network of the lamina and causing HGPS.
ISSN 0973-2063 (online) 0973-8894 (print) Bioinformation 8(5): 221-224 (2012) © 2012 Biomedical Informatics

Figure 1: Tertiary structure of the refined seed and mutated Lamin A proteins and their comparison. Arrows indicate the major points of effect. (A) Tertiary structure of the seed protein; (B) Tertiary structure of mutated Lamin A protein E145K; (C) Tertiary structure of mutated Lamin A protein R471C; (D) Tertiary structure of mutated Lamin A protein R527C.