Protein molecular modeling of genetic markers for thyroid cancer

Introduction: The advances in thyroid molecular biology studies provide not only insight into thyroid diseases but accurate diagnosis of thyroid cancer. Objective: Design a tutorial on protein molecular modeling of genetic markers for thyroid cancer. Methods: The proteins were selected using the Protein Data Bank sequence and the basic local alignment search tool (BLAST) algorithm. The obtained sequences were aligned with the Clustal W multiple alignment algorithms. For the molecular modeling, three-dimensional structures were generated from this set of constraints with the SWISS-MODEL, which is a fully automated protein structure homology-modeling server, accessible via the ExPASy web server. Results: We demonstrated protein analysis, projection of the molecular structure and protein homology of the following molecular markers of thyroid cancer: receptor tyrosine kinase (RET) proto-oncogene; neurotrophic tyrosine kinase receptor 1 (NTRK1) proto-oncogene; phosphatase and tensin homolog (PTEN); tumor protein p53 (TP53) gene; phosphoinositide 3-kinase/threonine protein kinase (PI3K/AKT); catenin beta 1 (CTNNB1); paired box 8-peroxisome proliferator-activated receptor gamma (PAX8-PPARG); rat sarcoma viral oncogene (RAS); B-raf proto-oncogene, serine/threonine kinase (BRAF); and thyroid-stimulating hormone receptor (TSHR). Conclusion: This study shows the importance of understanding the molecular structure of the markers for thyroid cancer through bioinformatics, and consequently, the development of more effective new molecules as alternative tools for thyroid cancer treatment.

Thyroid cancer incidence is increasing worldwide, while mortality is stable or decreasing. Advances in thyroid molecular biology studies not only provide insight into development of thyroid diseases but offer an accurate diagnosis of thyroid cancer.
Knowledge of the molecular diagnosis of thyroid cancer has advanced rapidly in recent years, especially with the advent of high-throughput sequencing technologies, which have revealed new molecular abnormalities. The molecular markers of thyroid cancer are found in more than 70% of differentiated carcinomas, and understanding the molecular mechanisms of thyroid cancer offers new perspectives for its diagnosis and treatment (1,2).
Progress in protein sequencing has led to an increase of protein databases, with a variety of protein sequences never seen before. Many molecular markers of thyroid cancer are proteins present at very low levels. Similar sequences and protein structures enable their evaluation with the use of computational comparative methods (3).
The bioinformatics programs currently available for molecular modeling and protein analysis provide tools for assembling the molecular markers of thyroid cancer and for understanding its molecular mechanisms. The purpose of this study is to present a tutorial on protein molecular modeling of genetic markers for thyroid cancer.
The proteins were selected using the Protein Data Bank sequence and the basic local alignment search tool (BLAST) algorithm (4). The obtained sequences were aligned with the Clustal W multiple alignment algorithms. For molecular modeling, three-dimensional (3-D) structures were generated from this set of constraints with the SWISS-MODEL (http://swissmodel.expasy.org), which is a fully automated protein structure homology-modeling server, accessible via the ExPASy web server.
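As an illustration of what can be read off a pairwise alignment such as those produced in the alignment step, the following minimal Python sketch computes percent identity over aligned columns. It is not part of the Clustal W program; the function name `percent_identity`, the gap symbol `-`, and the two toy sequences are assumptions made for this example only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy aligned sequences, invented for illustration (not real protein data).
target = "MKT-AYIAK"
template = "MKTGAYLAK"
identity = percent_identity(target, template)
print(f"{identity:.1f}% identity")
```

Other definitions of identity exist (for example, dividing by the full alignment length or by the shorter sequence), so the denominator chosen here is one convention among several.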

Protein database search and sequence analysis - BLAST
BLAST is a widely used protein sequence analysis tool available in the public domain (http://www.ncbi.nlm.nih.gov/Blast/), and a wide choice of BLAST algorithms can be used to search many different sequence databases. BLAST was introduced as a sequence alignment heuristic that was an order of magnitude faster than earlier approaches for analyzing biological information. Very quickly, this software became a landmark enabling technique for bioinformatics. The BLAST 2p refers to a program used to generate alignments between a nucleotide or protein sequence, referred to as the query, and the sequences of a database, referred to as subject sequences.
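The speed of a heuristic search of this kind comes largely from looking for short shared words first and extending only those hits. The sketch below illustrates that seed-and-extend idea in Python under strong simplifications: it uses exact word matches and ungapped, rightward-only extension, whereas the real BLAST programs use scored neighborhood words, substitution matrices, and statistical evaluation. All function names and both toy sequences are hypothetical.

```python
from collections import defaultdict


def index_words(subject: str, k: int = 3) -> dict:
    """Map every k-letter word in the subject sequence to its start positions."""
    index = defaultdict(list)
    for i in range(len(subject) - k + 1):
        index[subject[i:i + k]].append(i)
    return index


def find_seeds(query: str, subject: str, k: int = 3) -> list:
    """Return (query_pos, subject_pos) pairs where an exact k-letter word is shared."""
    index = index_words(subject, k)
    seeds = []
    for q in range(len(query) - k + 1):
        for s in index.get(query[q:q + k], []):
            seeds.append((q, s))
    return seeds


def extend_seed(query: str, subject: str, q: int, s: int, k: int = 3) -> int:
    """Extend a seed to the right while residues keep matching; return match length."""
    length = k
    while (q + length < len(query) and s + length < len(subject)
           and query[q + length] == subject[s + length]):
        length += 1
    return length


# Toy sequences, invented for illustration (not real protein data).
query = "MKTAYIAK"
subject = "GGMKTAYLAKQ"
seeds = find_seeds(query, subject)
best = max(extend_seed(query, subject, q, s) for q, s in seeds)
print(seeds, best)
```

Indexing the subject once and looking up each query word avoids comparing every query position against every subject position, which is the source of the speed-up in this simplified picture.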

Building 3-D molecular models
The structure and function of proteins are determined by their amino acid sequences, and 3-D structure prediction remains a significant challenge, with a great demand for high-resolution structure prediction methods.
Homology modeling is currently the most accurate computational method to generate reliable structural models and is routinely used in many biological applications.

Modeling with SWISS-MODEL
SWISS-MODEL is a server for automated comparative modeling of 3-D protein structures. The process begins with the identification of suitable template structures based on their sequence similarity with the target sequence. This is achieved by searching a database of sequences with known 3-D structures using the sequence alignment tools BLAST and FASTA.
SWISS-MODEL provides several levels of user interaction through its World Wide Web interface: the "first approach mode", the "alignment mode", and the "project mode" using DeepView (Swiss-PdbViewer), an integrated sequence-to-structure workbench. The Swiss-PdbViewer not only provides advanced molecular display features and real-time visual feedback during structure modeling, but also features direct submission to the SWISS-MODEL model server.

RESULTS
We demonstrated a search for protein sequence, the projection of the molecular structure and protein homology of the following molecular markers of thyroid cancer: RET proto-oncogene, NTRK1 proto-oncogene, PTEN, TP53 gene, PI3K/AKT, CTNNB1, PAX8-PPARG, RAS, BRAF, and TSHR.

Search for protein sequence of RET proto-oncogene
The main data source used for reconstructing the RET proto-oncogene was the protein sequence file in FASTA format. The full-length protein of RET proto-oncogene was obtained from the GenBank database under the identifier NCBI: AAH04257.1. The RET proto-oncogene was predicted to encode a 1072-amino acid protein. All coding sequences were selected and exported as amino acids in FASTA format, using the annotation of the SWISS-MODEL model server. RET proto-oncogene [Homo sapiens] protein analysis is shown in Figure 1.
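Because every marker in this tutorial starts from a protein sequence file in FASTA format, a short sketch of how such a file can be read and its sequence length checked may be useful. The parser below is a minimal illustration, not the tool used in the study; the function name `parse_fasta`, the record names, and the sequences are invented placeholders and do not correspond to any accession cited here.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA text into {header: sequence}; a header line starts with '>'."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {h: "".join(parts) for h, parts in records.items()}


# Toy FASTA content, invented for illustration (not real protein data).
example = """>toy_record_1 hypothetical example
MKTAYIAKQR
QISFVKSHFS
>toy_record_2 hypothetical example
MADEEKLPPG
"""

records = parse_fasta(example)
lengths = {header.split()[0]: len(seq) for header, seq in records.items()}
print(lengths)
```

Applied to a downloaded record, the reported length can be compared with the number of amino acids stated for the protein before the sequence is submitted for modeling.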

Building a 3-D molecular model of RET proto-oncogene
Protein sequences of RET proto-oncogene [Homo sapiens] were obtained using BLAST; 3-D modeling was conducted using the Swiss-PdbViewer suite programs, which were adjusted and optimized for alignment between RET proto-oncogene protein and structural templates. On the basis of a sequence alignment between the RET proto-oncogene protein and the template structure, a 3-D model for the target protein was generated. Model quality assessment tools were used to estimate the reliability of the resulting model. Thus, using the SWISS-MODEL automated comparative protein modeling server, we constructed a homology model of the RET proto-oncogene [Homo sapiens] (Figure 2).

Search for protein sequence of NTRK1 proto-oncogene
The NTRK1 proto-oncogene was predicted to encode a 786-amino acid transmembrane tyrosine kinase expressed in neural tissues. The reconstruction of the NTRK1 proto-oncogene was made from a protein sequence file in FASTA format. The full-length protein of NTRK1 proto-oncogene was obtained from the GenBank database under the identifier NCBI: P04629.4. All coding sequences were selected and exported as amino acids in FASTA format, using the annotation of the SWISS-MODEL model server. NTRK1 proto-oncogene [Homo sapiens] protein analysis is shown in Figure 3.

Building a 3-D molecular model of NTRK1 proto-oncogene
Template search with BLAST was performed against the SWISS-MODEL template library. The NTRK1 sequence was searched with BLAST (5). A total of 500 templates were found. An initial profile was built using the procedure outlined by Remmert et al. (2011) (6). The templates with the highest quality were then selected for model building. Ligands present in the template structure were transferred by homology to the model when the proposed criteria were met. The homo-oligomeric structure of the NTRK1 protein is predicted based on the analysis of pairwise interfaces of the identified template structures. According to the above-described criteria, a 3-D model for the NTRK1 proto-oncogene was generated (Figure 4).
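Selecting the best of several hundred candidate templates is, in essence, a ranking problem. The sketch below shows one simple way such a ranking could be expressed in Python. It is not the quality estimate computed by the SWISS-MODEL server: the score (sequence identity multiplied by target coverage), the coverage cutoff, and the template names are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Template:
    name: str        # hypothetical template label
    identity: float  # percent sequence identity to the target
    coverage: float  # fraction of the target covered by the alignment (0 to 1)


def rank_templates(templates, min_coverage=0.5):
    """Drop low-coverage templates, then sort by identity * coverage (assumed score)."""
    eligible = [t for t in templates if t.coverage >= min_coverage]
    return sorted(eligible, key=lambda t: t.identity * t.coverage, reverse=True)


# Hypothetical candidates with invented values.
candidates = [
    Template("tmpl_A", identity=95.0, coverage=0.30),
    Template("tmpl_B", identity=80.0, coverage=0.90),
    Template("tmpl_C", identity=60.0, coverage=0.95),
]

ranked = rank_templates(candidates)
best_template = ranked[0].name
print([t.name for t in ranked])
```

The example shows why identity alone can mislead: the candidate with the highest identity covers too little of the target to be useful and is filtered out before ranking.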

Search for protein sequence of PTEN
We conducted BLAST analysis using the PTEN phosphatase domain as a query, and subsequently identified a predicted

Building a 3-D molecular model of PTEN
For molecular modeling, a sequence alignment was obtained by using Clustal W and a threading approach. Template search with BLAST was conducted against the SWISS-MODEL template library. On the basis of a sequence alignment between the PTEN protein and the template structure, a 3-D model for the target protein was generated. Model quality assessment tools were used to estimate the reliability of the resulting model. Thus, using the SWISS-MODEL automated comparative protein modeling server, we constructed a homology model of the PTEN protein [Homo sapiens] (Figure 6).

Search for protein sequence of TP53 gene
Protein analysis of the TP53 gene was performed using BLAST, which subsequently identified a predicted protein (NCBI accession: AEX20383.1). The TP53 gene [Homo sapiens] from all sources has 354 basic amino acids in the highlighted positions. The reconstruction of TP53 gene [Homo sapiens] was made from a protein sequence file in FASTA format. All coding sequences were selected and exported as amino acids in FASTA format, using the annotation of the SWISS-MODEL model server. TP53 gene [Homo sapiens] protein analysis is shown in Figure 7.

Building a 3-D molecular model of TP53 gene
The target sequence was searched with BLAST against the primary amino acid sequence. A total of 97 templates were found. For each identified template, the template quality was predicted from features of the target-template alignment. The template with the highest quality was used for model building.
Structural homology modeling was used to model the TP53 gene [Homo sapiens] amino acid sequence, based on the high-resolution crystal structure, and this structural model was calculated using the SWISS-MODEL, producing a high quality 3-D structure with a model-template C-α root mean square deviation of 2.9 angstroms (Figure 8).
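The model-template C-α root mean square deviation quoted above is the square root of the mean squared distance between matched C-α atoms. The following Python sketch computes that quantity for two coordinate lists. It assumes the two structures have already been superposed and that atoms are paired in the same order; the function name `ca_rmsd` and the toy coordinates are hypothetical and unrelated to the TP53 model.

```python
import math


def ca_rmsd(coords_a, coords_b) -> float:
    """RMSD in the input units between paired (x, y, z) coordinates, no superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in angstroms, invented for illustration.
model_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
template_ca = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

rmsd = ca_rmsd(model_ca, template_ca)
print(f"{rmsd:.2f} angstroms")
```

In practice an optimal superposition is computed first, since the raw value depends on how the two structures happen to be oriented.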

Search for protein sequence of PI3K/AKT
We conducted BLAST analysis using the PI3K/AKT domain as a query, and subsequently identified a predicted protein by NCBI accession: AGC00787.1. The PI3K/AKT from all sources has 527 basic amino acids in the highlighted positions. The reconstruction of the PI3K/AKT was made from a protein sequence file in FASTA format. All coding sequences were selected and exported as amino acids in FASTA format, using the annotation of the SWISS-MODEL model server. PI3K/AKT protein analysis is shown in Figure 9.

Building a 3-D molecular model of PI3K/AKT
A 3-D model of PI3K/AKT (NCBI GI number: AGC00787.1) comprising 527 amino acids was built using the SWISS-MODEL program based on homology modeling (Figure 10). The SWISS-MODEL program automatically provides an all-atom model using alignments between the query sequence and known homologous structures.

Search for protein sequence of CTNNB1
Protein analysis of catenin beta 1 [Homo sapiens] was performed using BLAST, which subsequently identified a predicted protein (NCBI Reference Sequence: NP_001091679.1). The catenin beta 1 [Homo sapiens] from all sources has 781 basic amino acids in the highlighted positions. The reconstruction of catenin beta 1 [Homo sapiens] was made from a protein sequence file in FASTA format. All coding sequences were selected and exported as amino acids in FASTA format, using the annotation of the SWISS-MODEL model server. CTNNB1 catenin beta 1 [Homo sapiens] protein analysis is shown in Figure 11.

Building a 3-D molecular model of PAX8/PPARG
For molecular modeling, a sequence alignment was obtained by using Clustal W and a threading approach. Template search with BLAST was performed against the SWISS-MODEL template library. On the basis of a sequence alignment between the PAX8/PPARG fusion oncogene and the template structure, a 3-D model for the target protein was generated. Model quality assessment tools were used to estimate reliability of the resulting model. Thus, using the SWISS-MODEL automated comparative protein modeling server, we constructed a homology model of the chimeric PAX8/PPARG protein (Figure 14).

Search for protein sequence of RAS
The protein sequence was accessed with the NCBI Reference Sequence: P62834.1 Protein Data Bank using the BLAST tool. Sequence alignments for this protein were generated with the Clustal W program and the alignments scrutinized. The RAS gene was predicted to encode a 184-amino acid protein. The reconstruction of the RAS gene was made from a protein sequence file in FASTA format. All coding sequences were selected and exported as amino acids in FASTA format, using the annotation of the SWISS-MODEL model server. The RAS gene protein analysis is shown in Figure 15.

Building a 3-D molecular model of RAS
For molecular modeling, a sequence alignment was obtained by using Clustal W and a threading approach. Template search with BLAST was performed against the SWISS-MODEL template library. On the basis of a sequence alignment between the RAS gene and the template structure, a 3-D model for the target protein was generated. Model quality assessment tools were used to estimate the reliability of the resulting model. Thus, using the SWISS-MODEL automated comparative protein modeling server, a homology model of the RAS gene (Figure 16) was constructed.

Search for protein sequence of BRAF
The main data source used for reconstructing BRAF [Homo sapiens] was a protein sequence file in FASTA format. The full-length protein of BRAF [Homo sapiens] was obtained from the GenBank database under the identifier NCBI: NP_004324.2. BRAF was predicted to encode a 766-amino acid protein. All coding sequences were selected and exported as amino acids in FASTA format, using the annotation of the SWISS-MODEL model server. BRAF [Homo sapiens] protein analysis is shown in Figure 17.

Building a 3-D molecular model of BRAF
The target sequence was searched with BLAST against the primary amino acid sequence. A total of 500 templates were found. For each identified template, the template quality was predicted from features of the target-template alignment. The template with the highest quality was then selected for model building.
Structural homology modeling was used to model BRAF [Homo sapiens] amino acid sequence, based on the high-resolution crystal structure, and this structural model was calculated using the SWISS-MODEL, producing a high-quality 3-D structure. This document lists the results for the homology modeling project submitted to the SWISS-MODEL workspace. The submitted primary amino acid sequence is given in Figure 18.

Search for protein sequence of TSHR
The main data source used for reconstructing the TSHR protein [Homo sapiens] was a protein sequence file in FASTA format. The full-length protein of TSHR [Homo sapiens] was obtained from the GenBank database under the identifier NCBI: AAI27629.1. The TSHR [Homo sapiens] was predicted to encode a 274-amino acid protein. All coding sequences were selected and exported as amino acids in FASTA format, using the annotation of the SWISS-MODEL model server. TSHR [Homo sapiens] protein analysis is shown in Figure 19.

Building a 3-D molecular model of TSHR
For the molecular modeling, a sequence alignment was obtained by using Clustal W and a threading approach. Template search with BLAST was performed against the SWISS-MODEL template library. On the basis of a sequence alignment between the TSHR protein [Homo sapiens] and the template structure, a 3-D model for the target protein was generated. Model quality assessment tools were used to estimate reliability of the resulting model. Thus, using the SWISS-MODEL automated comparative protein modeling server, we constructed a homology model of the TSHR protein [Homo sapiens] (Figure 20).

DISCUSSION
In this study, we developed a tutorial on protein molecular modeling of genetic markers of thyroid cancer. One of the most important objectives of evaluation of the molecular markers in thyroid cancer is accurate diagnosis, besides the possible identification of individuals at greatest risk for thyroid cancer, in order to allow better management, recommendation concerning the most appropriate therapy, and prognosis (2).
The molecular markers for thyroid cancer diagnosis have been an investigative focus that can likely lead to diagnostic improvements, as well as provide more accurate prognostic information before and after surgery.
The RET proto-oncogene, located on chromosome 10q11.2, encodes a membrane receptor with an extracellular domain and tyrosine kinase activity. RET, an abbreviation for "rearranged during transfection", exhibits oncogenic potential and plays an important role in human thyroid cancers. In papillary thyroid carcinoma (PTC), genomic rearrangements juxtapose the RET kinase and COOH-terminus encoding domains (exons 11-21) to unrelated genes, thereby creating dominantly transforming oncogenes called RET/PTC (4,7).
A correlation between specific RET mutation type and organ-specific tumor development has been described (8). The RET proto-oncogene is involved in the molecular evolution of sporadic medullary and PTC, but has also been involved in three subtypes of the inherited cancer syndrome multiple endocrine neoplasia type 2 (MEN2), as well as in Hirschsprung's disease, and each variant of MEN2 results from different RET gene mutations (9).
The post-genomic era is characterized by biological information filling the databases, and large-scale biology projects such as the sequencing of the human genome and gene expression surveys using ribonucleic acid (RNA)-seq, microarrays and other technologies have created a wealth of data for biologists. However, the challenge facing scientists is to access and analyze these data to extract useful information pertaining to the system being studied. We researched the RET proto-oncogene sequences in the database using BLAST software, identifying all the proteins encoded in RET proto-oncogene genome, and predicted their structures using domain analysis tools.
A study using bioinformatics tools investigated the structural organization of the extracellular region of the RET receptor tyrosine kinase. Using the BLAST tool, multiple sequence alignments of seven vertebrate sequences and one invertebrate RET sequence delineated four distinct N-terminal domains, each of about 110 residues (10). In our analysis, we carried out extensive BLAST searches of the current databases, and a 3-D model of RET proto-oncogene was built with the SWISS-MODEL template library.
The use of proteomic technologies in drug discovery and development is called pharmacoproteomics. Proteomic technologies have contributed to molecular diagnosis, which is a basis of personalized medicine, and the individualized therapy may be based on differential protein expression rather than a genetic polymorphism (11). Currently, several kinds of therapeutic approaches have been developed for the treatment of RET-associated cancers, including tyrosine kinase inhibitors, and molecular models have been potential targets for the development of more selective RET inhibitors.
The NTRK1 proto-oncogene is located on the 1q21-q22 chromosome and encodes the high affinity transmembrane receptor for nerve growth factor. Constitutive activation of NTRK1 has been detected in several tumor types. The fusion of the NTRK1 TK domain to 5' sequences with at least three different genes (TPM3 gene, TPR gene and TFG gene) leads to oncogenic activation of NTRK1. Somatic rearrangements of NTRK1, producing chimeric oncogenes (an activated version of the proto-oncogene was generated by a somatic intrachromosomal rearrangement fusing the tyrosine kinase domain of NTRK1 with 5' sequences of the non-muscular tropomyosin gene) with constitutive tyrosine kinase activity, have been frequently found in a consistent fraction of PTC (12). However, PTC arising in patients with a history of exposure to elevated levels of ionizing irradiation does not carry these known abnormalities. Likewise, NTRK1 rearrangements are rare in cases of sporadic PTC (13). The frequency of NTRK1 rearrangements in post-Chernobyl papillary thyroid tumors was observed to be equivalent to that in sporadic tumors (14).
Among different genetic factors involved in the pathogenesis of the PTC, rearrangements of NTRK1 proto-oncogene are best known. An explanation for the probability of thyrocytes to undergo gene rearrangements of NTRK1 was proposed when the recombination between RET and H4 was shown to be favored by the loci proximity in interphase nuclei (15).
Reviewing the literature, we observed that in only one study a 3-D structural model of the NTRK1 (16) was built. In the study, the sequencing analysis of the NTRK1 was evaluated, protein modeling was conducted based on the recent data of NTRK1 structure in the Protein Data Bank, and mutation-related residues were positioned in the 3-D structural model using the PyMOL Molecular Graphics System (17). In our study, the protein analysis of NTRK1 was performed in BLAST, and the 3-D modeling used the Swiss-PdbViewer suite program, which was adjusted and optimized for alignment between NTRK1 and structural templates. Thus, in order to build a 3-D structural model of NTRK1 we used the Protein Data Bank sequence and a structural homology analysis strategy that recognizes structural similarity among proteins with low sequence identity based on 3-D position-specific scoring algorithms. The increased use of 3-D molecular models should promote advances in drug engineering and could also facilitate the development of new therapies and screening tests.
The tumor suppressor phosphatase PTEN, identified in 1997 as a tumor-suppressor gene located on 10q23.3, is a member of the protein tyrosine phosphatase family and, following activating mutations or amplifications of the genes encoding the effector proteins of PI3K/AKT pathway, inhibits PI3K signaling, thereby reducing the level of activated AKT. Structurally, PTEN protein is composed of an N-terminal dual specificity phosphatase-like enzyme domain and a C-terminal regulatory domain, which binds to phospholipid membranes (18). The dysregulation of the PI3K/AKT signaling pathway contributes to many cancers in humans with antiapoptotic action. Recent findings have demonstrated that PTEN also plays a critical role in deoxyribonucleic acid (DNA) damage repair and DNA damage response (19). The importance of PTEN catalytic activity in its tumor suppressor function is underscored by the fact that the majority of PTEN missense mutations detected in tumor specimens target the phosphatase domain and cause a loss in PTEN phosphatase activity. These data suggest that genetic loss of PTEN is sufficient to induce thyroid cancer in vivo.
A study showed that the 3-D structure of PTEN provides a number of insights into the potential mechanisms by which the phosphatase recognizes and dephosphorylates its 3-phosphate-containing phospholipid substrates (20), and the current model of PTEN, together with its crystal structure, should prove useful for structural predictions that can be tested in enzymatic assays (21). We conducted NCBI-BLAST database searches using the PTEN domain as a query, which was identified by sequencing, and a high-quality structural model of PTEN was built with the SWISS-MODEL program (22). The 3-D models have also been making important contributions to the growing appreciation of oncologic therapeutics, including thyroid cancer.
The TP53 gene is located on chromosome 17p13.1 and consists of 11 exons, coding for a nuclear phosphoprotein that can bind to specific DNA sequences, and acts as a transcription factor. The TP53 is recognized as a tumor-suppressor gene because it encodes protein p53 participating in the processes of cell-cycle arrest in the G1 phase of the cell cycle via DNA repair, and also in apoptosis (23). Inactivation of TP53 in immortalized cells results in an important instability of chromosome structure, including translocations, deletions, telomeric associations, and ring chromosomes (24). The TP53 gene is frequently affected by loss of alleles and by point mutations in almost all cancers. A principle of this function for cancer progression is provided by the fact that mice without functional TP53 gene breed and develop (almost) normally, but die at an early age from multiple cancers (25). Over 20,000 alterations in TP53 have been discovered in human tumors. The role of TP53 in cancer has been studied, and the presence of a TP53 mutation may be predictive of the tumor response to treatment and patient survival (26).
The TP53 mutation is a common event in many cancers, including thyroid carcinoma, and it also has an effect on infiltration, lymphatic metastasis and prognosis of thyroid carcinoma. The mutations of TP53 are detectable in 15% of malignant thyroid tumors and are associated with the evolution from differentiated to anaplastic carcinoma (27). Thus, inactivation of p53 has been considered a hallmark of advanced thyroid tumors.
A study evaluated the TP53 structure through a 3-D computer model constructed with the NOC program (28). In that study, TP53 (residues 219-292) was modeled with the SWISS-MODEL software using the crystal structure of human TP53 (PDB accession code 2qxa, chain B) as a template (22). Similarly, we built a 3-D structural model of TP53 gene [Homo sapiens] based on the high-resolution crystal structure, and this structural model was elaborated using SWISS-MODEL.
Computational 3-D screening has been employed for the discovery of novel small-molecule inhibitors of the MDM2-p53 interaction. It includes 3-D pharmacophore searching, structure-based database searching and the combination of both approaches (29).
There are several classes of PI3Ks, among which class I is the best characterized and most important in human tumorigenesis. The PI3K is a lipid kinase that generates a messenger essential for the translocation of AKT to the plasma membrane (30). AKT is a Ser/Thr kinase, and three of its isoforms are found in human tissues: AKT-1, AKT-2, and AKT-3, with AKT-1 and AKT-2 being the most abundant and important in thyroid cancer. Activation of AKT plays a pivotal role in cellular functions such as cell proliferation and survival by phosphorylating a variety of substrates (31). The PI3K/AKT signaling pathway plays an important role in transmission of cell signals through transduction systems to the cell nucleus, where they influence the expression of genes that regulate crucial cellular processes such as cell growth, proliferation and apoptosis. Therefore, a major consequence of activating PI3K/AKT signaling is the inhibition of cell cycle inhibitors. Genetic and epigenetic alterations concerning PI3K/AKT signaling pathways contribute to their activation and interaction in consequence of malignant cell transformation (32). Thus, the alterations to the PI3K/AKT signaling pathway are frequent in human cancer.
Structural templates for the PI3K domain were identified by searching the PDB database with HHpred using default parameters. A study created alignments from the template structure of human phosphatidylinositol 3-kinase catalytic subunit type 3, which resulted in the highest probability score. Next, alignments were manually converted to FASTA alignments, and 3-D models were built with the SWISS-MODEL workspace using the alignment mode. Surface representations of the PI3K domain were created in PyMOL (17,33). In our study, BLAST searches of current databases were followed by the construction of a 3-D model of PI3K/AKT signaling pathway with the SWISS-MODEL template library.
The CTNNB1 is a dominantly acting cancer gene located on chromosome 3p22-p21.3, composed of 16 exons, which has two main functions in cell regulation: as a cadherin-mediated adhesion regulator and as a mediator of WNT/CTNNB1 signaling (2).
Catenin beta 1 is an oncoprotein, encoded by the CTNNB1 gene, and is responsible for anchoring the cadherin with the cytoskeleton. Loss of cadherin-mediated adhesion and activation of the WNT/catenin beta 1 signaling pathway are important steps in the development and progression of many neoplasms (34).
The basic aspects of the role of catenin beta 1 in malignant transformation have been studied in various tumors. Catenin beta 1 is well-characterized as a major player in canonical WNT signaling, and once activated, it goes into the nucleus and stimulates transcription of target genes. The aberrant activation of WNT/catenin beta 1 signaling related to diverse mutations in catenin beta 1 has been demonstrated in thyroid cancer (35).
We built the 3-D molecular model of CTNNB1 by homology modeling with the SWISS-MODEL web server, using the CTNNB1 protein sequence and a high-homology structure as the template. The computational method used can be of significant value for knowledge of thyroid cancer pathogenesis and may provide useful structural insights to facilitate rational drug design with personalized therapies. A study using the IntFOLD2, a recognition method (36), built a full atomic 3-D model for the CTNNB1 sequence, and the phosphorylated tyrosine side chain structure was modeled using the Builder tool in PyMOL (37).
The Pax8 belongs to the mammalian Pax protein family, a group of important developmental regulators characterized by the presence of a highly conserved DNA-binding motif of 128 amino acids (38). The peroxisome proliferator activated receptor gamma (PPARG) is one of the nuclear receptors that play an important role in insulin sensitivity and in adipocyte differentiation (39).
The PAX8/PPARG fusion oncogene was created by a balanced translocation between chromosomes 2 and 3, where the 2q13-qter region is translocated to 3p25, resulting in an in-frame fusion between most of the coding sequence of the thyroid-specific paired-box transcription factor PAX8 (2q13) and the entire translated reading-frame of the gene of the liganded nuclear receptor-family member PPARG (3p25). Thus, the fusion of the PAX8 gene with the PPARG gene results in strong overexpression of the chimeric PAX8-PPARG protein (40).
Studies have shown that PAX8-PPARG may be useful in the diagnosis and treatment of thyroid carcinoma. The PAX8/PPARG fusion oncogene is a common genetic alteration in follicular thyroid carcinoma; it is found in 30%-40% of thyroid follicular carcinoma and in 2%-10% of follicular adenomas, and has been reported at low frequency in papillary thyroid carcinoma, with < 5% in non-classical thyroid papillary carcinoma (41). However, the PAX8/PPARG rearrangement in itself may not be sufficient for the development of a malignant phenotype, and additional genetic or epigenetic studies may be required to enable the full phenotypic expression of follicular thyroid carcinoma.
In this study, the amino acid sequence of PAX8/PPARG was retrieved from NCBI sequence viewer, and this sequence was converted to FASTA format. The 3-D structure of PAX8/PPARG was built using SWISS-MODEL. We found no studies demonstrating the 3-D structure of PAX8/PPARG fusion oncogene. Thus, in absence of three-dimensional structures for most of the sequenced protein, homology modeling experimentally forms the basis for the resolution of the structure.
The RAS mutations are common in thyroid cancer; however, the incidence of RAS mutations in thyroid tumors and their frequency in specific histologic types vary widely in different series. The RAS mutations appear to be associated with aggressive tumors, and point mutations have been identified and localized at codons 12, 13 or 61 of three RAS genes (Harvey rat sarcoma viral oncogene: H-RAS, neuroblastoma rat sarcoma viral oncogene: N-RAS, and Kirsten rat sarcoma viral oncogene: K-RAS), with mutations of N-RAS and H-RAS at codon 61, and of K-RAS at codon 12/13 being the most common (42). The three members of the RAS gene family (H-RAS, N-RAS, and K-RAS) were identified more than 25 years ago because of their frequent oncogenic activation in human tumors. Constitutive activation of H-RAS, N-RAS, and K-RAS mutations has been reported as a marker for aggressive thyroid cancer behavior. However, some studies have shown a similar prevalence of RAS mutations in benign and malignant thyroid neoplasm, suggesting that RAS activation may represent an early event (43). The RAS signaling through downstream ERK and AKT pathways stimulates epithelial to mesenchymal transition in various cells (44). These data suggest that RAS is involved in both thyroid tumor formation and in thyroid cancer progression.
A bio-computational analysis was performed using web-based tools and servers. Multiple sequence alignment of selected human RAS subfamily proteins with other homologous sequences revealed highly conserved regions, and the 3-D structure was predicted using the SWISS-MODEL (45) online tools.
In the present study, RAS protein sequences were taken from NCBI, and we built the 3-D molecular model of RAS from its protein sequence by homology modeling with SWISS-MODEL.
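The notion of highly conserved regions in a multiple sequence alignment, mentioned above for the RAS subfamily, can be sketched in a few lines. This is only an illustrative simplification: the function `conserved_columns` and the toy aligned strings are invented for this example, they are not output from Clustal W or real RAS sequences, and a column is counted as conserved only when every sequence carries the same non-gap residue.

```python
def conserved_columns(aligned):
    """Return 0-based indices of alignment columns that are identical in all sequences."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in aligned}
        # Conserved: one distinct symbol in the column, and it is not a gap.
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Toy aligned fragments, invented for illustration ('-' marks a gap).
toy_alignment = [
    "ACDEFGHIKL",
    "ACDEFGHIKL",
    "ACD-FGHIRL",
]
positions = conserved_columns(toy_alignment)
print(positions)
```

Alignment programs typically report graded conservation rather than this strict all-or-nothing criterion, so the sketch should be read as a lower bound on what such tools highlight.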
The BRAF gene is located on chromosome 7q34 and encodes a serine-threonine kinase that plays a critical role in cell signaling. It acts through the mitogen-activated protein kinase (MAPK) signaling pathway and has been implicated in human cancers, with mutations occurring in approximately 40%-45% of papillary cancers, 10%-15% of poorly differentiated carcinomas, and 20%-30% of anaplastic carcinomas (46). Thus, BRAF has occupied a central role in thyroid cancer pathogenesis since the original identification of BRAF mutations in papillary and anaplastic thyroid cancer. Activated versions of BRAF can also be generated by intrachromosomal inversions that fuse the kinase domain of BRAF to the NH2-terminal portion of A-kinase anchoring protein 9 (AKAP9), resulting in BRAF-AKAP9 fusion proteins that are similar in structure to the RET/PTC fusion proteins and are found in about 11% of patients whose thyroid cancers are thought to have been caused by the Chernobyl nuclear power station disaster in 1986 (47).
There are numerous BRAF mutations, but BRAF V600E, which results from the T1799A transversion in exon 15 and substitutes glutamic acid for valine at position 600, is the most prevalent. The BRAF mutation and the RET/PTC rearrangement are almost mutually exclusive (48). A large number of studies have tried to evaluate the relevance and function of the V600E mutation in controlling oncogenesis and progression of thyroid cancer.
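The relationship between the nucleotide change T1799A and the amino acid change V600E follows from simple codon arithmetic, sketched below. The helper names are hypothetical, and the wild-type codon GTG for position 600 is an assumption introduced for this illustration (it is not stated in the text above); the codon table is the standard genetic code.

```python
# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def locate_in_codon(nucleotide_position):
    """Map a 1-based coding position to (1-based codon number, 0-based offset in codon)."""
    index = nucleotide_position - 1
    return index // 3 + 1, index % 3


def apply_substitution(codon, offset, new_base):
    """Return the codon with the base at the given offset replaced."""
    return codon[:offset] + new_base + codon[offset + 1:]


codon_number, offset = locate_in_codon(1799)
wild_type_codon = "GTG"  # assumed wild-type codon at position 600 (valine)
mutant_codon = apply_substitution(wild_type_codon, offset, "A")

print(codon_number, offset)                       # codon 600, middle base
print(CODON_TABLE[wild_type_codon], "->", CODON_TABLE[mutant_codon])
```

Nucleotide 1799 is the second base of codon 600, so replacing T with A in the assumed GTG codon gives GAG, which changes valine (V) to glutamic acid (E).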
BRAF is composed of three conserved domains characteristic of the Raf kinase family: conserved region 1, a Ras-GTP-binding self-regulatory domain; conserved region 2, a serine-rich hinge region; and conserved region 3, a catalytic protein kinase domain that phosphorylates a consensus sequence on protein substrates. In its active conformation, BRAF forms dimers via hydrogen-bonding and electrostatic interactions of its kinase domains (49). In 2004, the crystal structure of BRAF in complex with Bay 43-9006 was resolved and revealed a distinctive protein-ligand interaction mode (50).
A BRAF alignment study performed using the automatic mode of the SWISS-MODEL server built a 3-D molecular model of BRAF (51). In our study, we constructed a 3-D model of BRAF with the SWISS-MODEL template library. Thus, the molecular model may help predict diverse BRAF inhibitors for clinical use.
The TSHR is a key protein in the control of thyroid function and a major thyroid autoantigen. TSHR regulates thyroid growth and differentiation at late developmental stages. TSHR molecules in the membrane are quite stable, and signaling in the thyrocyte is controlled mainly through circulating TSH levels. The mature TSHR is encoded by a single gene with 10 exons. The protein contains two subunits: a large ectodomain, also called the A subunit, which is encoded by exons 1-8 and binds TSH, and an intracellular domain, called the B subunit, which is encoded by exons 9-10 and interacts with G proteins to initiate signaling (52).
Expression of the TSHR gene is not an infrequent feature in thyroid cancers during the process of dedifferentiation, which involves perturbation of several nuclear transcription factors (53).
Somatic mutations in the TSHR gene have been identified in benign and malignant thyroid tumors. As a result of these somatic mutations, the TSHR is continuously activated, which could prompt the overgrowth of thyroid cells. Mutations of the TSHR gene are possibly related to thyroid cancer. The first reported case of clear cell follicular carcinoma of the thyroid analyzed by next-generation sequencing showed that a gain-of-function mutation of TSHR can overstimulate the thyroid follicular cells, as an elevated level of TSH does, and might have contributed to the development of clear cell morphology (54).
The structural and functional dimensions of the extracellular TSHR region were studied in detail by mutagenesis (55). The use of the crystal structure of the FSHR/FSH complex enabled extended and structurally supported insights into extracellular hormone binding, signal transduction and organization of the TSHR (56). A study combining sequence and structure in the molecular modeling of genetic markers for thyroid cancer was recently published with the aim of predicting whether specific missense variants in the protein kinase region have pathological significance (57). Other studies reported the prediction of the phenotypic severity of uncertain gene variants in the proto-oncogene using a computational method (58).
Increased insight into the genetics and molecular biology of cancer has resulted in the identification of a growing number of molecular markers for anticancer drug discovery and development. A number of recent successful drugs have partially or totally emerged from a structure-based research approach, and several advances, including crystallography and bioinformatics, are behind these successes (59).
Although the methods of molecular modeling of genetic markers for thyroid cancer by bioinformatics have the advantages of manageable dimensionality, ease of testing, and clinical import, they are ultimately hampered by the complexity of the interactions among these markers. However, the computational modeling tools currently being developed are of high impact and will be able to function as a bridge between tumor molecular biology and the individualized treatment of cancer patients.
There is growing evidence giving prominence to computational modeling in the field of biomedicine, especially as applied to the in silico analysis of cancer dynamics. Bioinformatics and molecular modeling tools have proved powerful in predicting plausible cytotoxic T cell epitopes as well as other epitopes, cutting down time and cost in cancer treatment (60). Thus, in the era of modern medicine, molecular modeling may allow the discovery of new molecular targets useful for the design of novel cancer drugs. This is because advanced computer models allow the simulation of complex biological processes, providing hypotheses and supporting experimental design.

CONCLUSION
The structure and function of proteins are determined by their amino acid sequences. The 3-D structure prediction remains a huge challenge, and there is still a great demand for high-resolution structure prediction methods. Thus, with determination of protein structure and construction of a 3-D model, it will be possible to identify the location of binding sites on proteins of fundamental importance for a wide range of applications, including molecular docking, de novo drug design, structure identification, and comparison of functional sites.
This study shows the importance of understanding the molecular structure of the markers for thyroid cancer through bioinformatics, and consequently, the development of more effective new molecules as alternative tools for thyroid cancer treatment.