Gene 3 (SAM Methyltransferase)


    Putative S-adenosyl-L-methionine-dependent methyltransferase 

    Analysis of the sequence using the prediction with the masked sequence

    Comparison of the proteins predicted by FGENESH (Query) and GeneMark (Sbjct):

    GeneMark (1047 bp, 348 aa)        FGENESH (1359 bp, 452 aa)
    ID    Strand  Start  End          ID    Strand  Start  End
    5     +       22626  25964        3     +       22626  25964
    5_1   +       22626  22905        3_1   +       22626  22904
    5_2   +       23005  23066        3_2   +       23007  23066
    5_3   +       23155  23298        3_3   +       23155  23298
    5_4   +       23515  23568        3_4   +       23515  23625
    5_5   +       23709  23840        3_5   +       23709  23903
                                      3_6   +       24827  25015
    5_6   +       25505  25613        3_7   +       25505  25612
    5_7   +       25699  25964        3_8   +       25701  25964
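
    The GeneMark lengths quoted in the table header can be cross-checked against the exon coordinates. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the last codon of the coding sequence is a stop codon. The function names are illustrative and are not part of either prediction program.

```python
# Exon coordinates (start, end) of the GeneMark model, copied from the table above.
# Assumption: coordinates are 1-based and inclusive.
GENEMARK_EXONS = [
    (22626, 22905),
    (23005, 23066),
    (23155, 23298),
    (23515, 23568),
    (23709, 23840),
    (25505, 25613),
    (25699, 25964),
]


def cds_length(exons):
    """Total coding length in bp: sum of (end - start + 1) over all exons."""
    return sum(end - start + 1 for start, end in exons)


def protein_length(cds_bp):
    """Number of residues, assuming the final codon is a stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1


genemark_bp = cds_length(GENEMARK_EXONS)
genemark_aa = protein_length(genemark_bp)
print(genemark_bp, "bp,", genemark_aa, "aa")  # prints "1047 bp, 348 aa"
```

    The same two functions can be applied to any other list of exon coordinates; a length that is not a multiple of 3 raises an error rather than being silently truncated.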

     

    Relative to the FGENESH prediction, the protein predicted by GeneMark is missing around 19 residues in exon 5 and the whole of exon 6.
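
    One way to find FGENESH exons that have no counterpart at all in the GeneMark model is to test each exon for overlap against the other prediction. The sketch below uses the coordinates listed above. The helper names are hypothetical, and the overlap test assumes 1-based inclusive coordinates on the same strand.

```python
# Exon coordinates copied from the comparison table (1-based, inclusive, + strand).
GENEMARK = {
    "5_1": (22626, 22905),
    "5_2": (23005, 23066),
    "5_3": (23155, 23298),
    "5_4": (23515, 23568),
    "5_5": (23709, 23840),
    "5_6": (25505, 25613),
    "5_7": (25699, 25964),
}

FGENESH = {
    "3_1": (22626, 22904),
    "3_2": (23007, 23066),
    "3_3": (23155, 23298),
    "3_4": (23515, 23625),
    "3_5": (23709, 23903),
    "3_6": (24827, 25015),
    "3_7": (25505, 25612),
    "3_8": (25701, 25964),
}


def overlaps(a, b):
    """True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def unmatched(query, subject):
    """Names of query exons that overlap no exon in the subject model."""
    return [
        name
        for name, span in query.items()
        if not any(overlaps(span, other) for other in subject.values())
    ]


missing = unmatched(FGENESH, GENEMARK)
print(missing)  # prints ['3_6']
```

    This only flags exons that are absent altogether. Exons that are present in both models but end at different positions still count as matched, so differences in exon boundaries need a separate comparison of the start and end values.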

    Gene3_GENESHvsGenemark.jpg

    Gene3_GENESHvsGenemark_Blastp.jpg

     

    Alignment: Predicted Protein FGENESH

    Gene3_GENESH_Blastp.jpg

    Putative conserved domains detected:

    Gene3_GENESH_conserved domains.jpg

     

    Alignment: Predicted Protein GeneMark

    Gene3_GenMark_Blastp.jpg

    Putative conserved domains detected:

    Gene3_GenMark_conserved domains.jpg

     

    The protein with the best E-value (0) was the one predicted by FGENESH. However, a gap is observed in the region 266-313 of the query sequence. For that reason, we decided to run the analysis with the unmasked sequence, in order to include any exons that may have been masked.

    Analysis of the sequence using the prediction with the unmasked sequence

    The location of each exon predicted by the two programs, GeneMark and FGENESH, was identified. ESTs support the following gene model (here for details):

     

    Gene 3 Exon. Position
    ATGCTTCGTGGCTCCGCCGCCCTCC Exon 1. 22626 - 22905
    TCCGCCGCCTCGCCCCTCGTATCTCGGGCGGTGGCTGCGGAAACACGACT  
    CACCGGCGCCTTCTTCCTCCCATCGCCCCCTTCCTCCGCGCCCGCTTCTT  
    GTCTGCCCCGACTACCTCTTCACCGCCTTCTTCGTCTTCAGCTGCCTGCC  
    ACGAGGAGGATGCGGAGGACAGAGACCTCCCCGAGATCTCCAATGGTGAT  
    GCAGGTGCCCGGCTCAGCATCTCCGTCGACCGCTCCGGGCTCTACAACCCCACAG  
       
    AGCACTCGCACGAGCCGTCGTCGGAATCCGAGCTCGTCAAGCACATCAAGAGCATCATAAAG Exon 2. 23005 - 23066
       
    TTTCGGAGTGGGCCAATCAGCATAGCTGAATACATGGAGGAGGTGC Exon 3. 23155 - 23298
    TGACGAACCCACAGTCAGGATTCTACATCAACCGTGACGTGTTTGGGGAA  
    TCAGGGGATTTCATCACCTCGCCAGAGGTCAGCCAGATGTTTGGAGAG….  
       
    ATGATTGGTGTATGGGCAATGTGCCTCTGGGAGCAAATGGGAAAGCCAGCGAAGGTG  Exon 4. 23515 - 23625
    AATCTGATTGAGCTTGGCCCAGGACGAGGAACACTCTTGGCTGATTTACTTAGG  
       
    GGATCGGCCAAGTTTGCTAACTTCACCAAGGCACTGAGCATT  Exon 5. 23709 - 23904
    AACTTAGTTGAGTGCAGCCCTACATTGCAAAAAATCCAGTATAATACTCT  
    GAAATGTGAAGATGAACATGTTGGTGATGGCAAGAGAACAGTTAGCAAGA  
    TTTGTGGAGCCCCTGTTTGTTGGCATGCCTCCCTCGAACAGGTTCCCTCAGGAT  
       
    CACCAACCATAATCATTGCACATGAATTCTATGATGCCTTACCAATCCATCAGTTTCAG  Exon 6. 24496 - 24554
       
    AAAGCATCACGTGGATGGTGCGAAAAGATGGTGGATATTGCAGAAGATTCTTT Exon 7. 24671 - 24723
       
    CAGGTTTCGGTTTGTTTTGTCTCCCCATCCCACAGCCTCTTTAATCTACCTCGCCAAGCGT Exon 8. 24790 - 25015
    TGTGGTTGGGCTAGCTCCGAGGAGCTTGAGAAGATTGAGCACATTGAAGT  
    CTGCCCCAAAGCAATGGAGCTTACCGAACAAATTGCAGATAGAATTAGTT  
    CGGACGGCGGAGGCGCTCTTATTATTGACTATGGAAAAAATGGAATAGTG  
    TCCGATAGTCTCCAG  
       
    GCCATCCGCAAACACAAGTTTGTGGACATATTAGACGATCCTGGCT Exon 9. 25505 - 25613
    CCGCTGATTTGAGTGCCTACGTTGATTTTGCTTCGATCAAGCGCTCCGCC  
    GAGGAAGCTTCAG  
     
    ACGATATCTCGGTCCATGGGCCAATGACGCAGTCCCAGTTCCTGGGTTCTCT  Exon 10. 25699 - 25964
    CGGCATCAACTTCAGGGTAGAAGCTCTCCTGCAGAACTGCACCGAGGAGC  
    AGGCGGAATCATTGCGGACGGGCTACTGGCGACTGGTTGGAGACGGGGAA  
    GCCCCGTTCTGGGAAGGCCCCGAGGACCAGGCGGCACCTGTTGGAATGGG  
    CACCAGGTACTTGGCCATGGCCATTGTCAACAAGAAGCAGGGCACGCCCA  
    TTCCGTTCGAGTGA  
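
    For exons whose sequence is listed in full, the sequence length can be checked against the stated coordinates. The following is a minimal sketch, assuming 1-based inclusive positions. Only exons 2, 6 and 7 are used because each fits on a single line of the table; the names in the code are illustrative.

```python
# (start, end, sequence) for three exons, copied from the gene model table.
# Assumption: positions are 1-based and inclusive.
EXONS = {
    "Exon 2": (
        23005,
        23066,
        "AGCACTCGCACGAGCCGTCGTCGGAATCCGAGCTCGTCAAGCACATCAAGAGCATCATAAAG",
    ),
    "Exon 6": (
        24496,
        24554,
        "CACCAACCATAATCATTGCACATGAATTCTATGATGCCTTACCAATCCATCAGTTTCAG",
    ),
    "Exon 7": (
        24671,
        24723,
        "AAAGCATCACGTGGATGGTGCGAAAAGATGGTGGATATTGCAGAAGATTCTTT",
    ),
}


def span_length(start, end):
    """Length in bp of an inclusive coordinate range."""
    return end - start + 1


def check_exons(exons):
    """Map each exon name to (length from coordinates, length of sequence)."""
    return {
        name: (span_length(start, end), len(seq))
        for name, (start, end, seq) in exons.items()
    }


report = check_exons(EXONS)
for name, (expected, observed) in report.items():
    status = "ok" if expected == observed else "MISMATCH"
    print(f"{name}: expected {expected} bp, found {observed} bp ({status})")
```

    A mismatch would point to a copy error in either the coordinates or the sequence, which is worth ruling out before translating the gene model.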

     

     

    Alignment of the predicted proteins

    The predicted proteins were aligned using BLASTp (here for details).

    GeneMark

    Coverage of the proteins in the database was not 100%. The region between amino acids 245 and 376 of the best-matching database protein is missing from the GeneMark gene model. This corresponds to the exons that were predicted by FGENESH but not by GeneMark.

    FGENESH

    An identity of 99% was observed between the protein predicted from the FGENESH gene model and a protein from the database (uncharacterized protein LOC100273691).

    The FGENESH gene model therefore appears to be more accurate than the GeneMark one.

    Interpro_Gen3.jpg
