
Gene 4 (Carboxy lyase)


    Carboxy-lyase

     

    GeneMark (696 bp, 231 aa)           FGENESH (609 bp, 202 aa)
    Exon       Start    End             Exon    Start    End
    6     -    29854    35618           4       30033    34733
    6_1   -    29854    29970
    6_2   -    30045    30188           4_1     30033    30188
    6_3   -    30657    30727           4_2     30657    30725
    6_4   -    30825    30924           4_3     30826    30924
    6_5   -    32127    32194           4_4     31834    31863
    6_6   -    32338    32377
    6_7   -    32469    32558           4_5     32463    32558
    6_8   -    33147    33204           4_6     34581    34733
    6_9   -    35611    35618

     

    The protein predicted by FGENESH lacks exons 1, 6 and 9 of the GeneMark prediction. There are also discrepancies between the two programs at exons 5 and 8 of the GeneMark prediction.
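    The comparison above can be reproduced directly from the tabulated coordinates. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive; the function names (`cds_length`, `overlaps`, `unmatched`) are illustrative and are not part of GeneMark or FGENESH.

```python
# Exon coordinates copied from the table above.
# Assumption: positions are 1-based and inclusive.
GENEMARK = {
    "6_1": (29854, 29970),
    "6_2": (30045, 30188),
    "6_3": (30657, 30727),
    "6_4": (30825, 30924),
    "6_5": (32127, 32194),
    "6_6": (32338, 32377),
    "6_7": (32469, 32558),
    "6_8": (33147, 33204),
    "6_9": (35611, 35618),
}
FGENESH = {
    "4_1": (30033, 30188),
    "4_2": (30657, 30725),
    "4_3": (30826, 30924),
    "4_4": (31834, 31863),
    "4_5": (32463, 32558),
    "4_6": (34581, 34733),
}


def cds_length(exons):
    """Sum the lengths of all exons (inclusive coordinates)."""
    return sum(end - start + 1 for start, end in exons.values())


def overlaps(a, b):
    """Return True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def unmatched(reference, other):
    """List exons in `reference` that overlap no exon in `other`."""
    return [
        name
        for name, span in reference.items()
        if not any(overlaps(span, o) for o in other.values())
    ]


print("GeneMark CDS length:", cds_length(GENEMARK))
print("FGENESH CDS length:", cds_length(FGENESH))
print("GeneMark exons without FGENESH overlap:", unmatched(GENEMARK, FGENESH))
```

    With the coordinates as tabulated, the GeneMark exons sum to 696 bp, matching the stated length. The FGENESH exons sum to 603 bp rather than the stated 609 bp, so one of the FGENESH coordinates or the stated length may be worth rechecking. The GeneMark exons with no overlapping FGENESH exon are 6_1, 6_5, 6_6, 6_8 and 6_9, which is consistent with the missing exons (1, 6, 9) and the discrepancies (5, 8) noted above.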

    Alignment: Predicted Protein FGENESH

    Gene4_GENESH_Blastp.jpg

    Putative conserved domains detected:

    Gene4_GENESH_conserved domains.jpg 

    Alignment: Predicted Protein GeneMark 

    Gene4_GenMark_Blastp.jpg 

    Putative conserved domains detected: 

    Gene4_GenMark_conserved domains.jpg

     

    The protein predicted by GeneMark showed the best E-value (4e-129) and has greater coverage than the one predicted by FGENESH. Exon 1 (GeneMark) showed homology with an uncharacterized protein; however, some upstream sequence still appears to be missing, since the alignment with the database protein starts at its residue 49. The analysis should be repeated using the predicted protein (unmasked sequence) at the predicted region (Gene 5, see section 4.1).

     

    Analysis with the unmasked sequence

    The location of each exon predicted by the two programs, GeneMark and FGENESH, was identified. ESTs support the following gene model (here for details):

    Gene model Gene 4.jpg

     

    GeneMark and FGENESH predictions

    Protein 4 comparison.jpg

    Final predicted sequence

    MAMDAAVAEKTGGGGAAGVAAGAQVGLNGGGVGGGERRSRFHRICVYCGSAKGRKPSYQD

    AAVELGKELVERGIDLVYGGGSIGLMGLVSHAVHDGGRHVIGVIPRSLMPREVTGEPVGEVRA

    VSGMHERKAEMARFADAFIALPGGYGTLEEVLEVITWAQLGIHRKPVGLVNVDGFYDPLLSFI

    DMAVNEGFIKEDARRIVVSAPTAKELVLKLEEYVPEYEVGLVWEDQMPAAELESPGSPPPRL

    WTSAALL

     

    Protein Function

    The final gene model was analyzed by Blastp. The matches are strong, with E-values ranging from 2e-160 to 1e-89; they are identified as carboxy-lyases and are found throughout monocotyledons (Zea, Sorghum and Oryza) and eudicotyledons (e.g., Arabidopsis, Populus).

    Typical alignment:

    >ref|NP_001130253.1| uncharacterized protein LOC100191347 [Zea mays]
    Length=255

    GENE ID: 100191347 LOC100191347 | uncharacterized LOC100191347 [Zea mays]
    (10 or fewer PubMed links)

    Score =   507 bits (1305),  Expect = 0.0, Method: Compositional matrix adjust.
    Identities = 255/255 (100%), Positives = 255/255 (100%), Gaps = 0/255 (0%)

    Query  1    MAMDAAVAEKTGGGGAAGVAAGAQVGLNGGGVGGGERRSRFHRICVYCGSAKGRKPSYQD  60
                MAMDAAVAEKTGGGGAAGVAAGAQVGLNGGGVGGGERRSRFHRICVYCGSAKGRKPSYQD
    Sbjct  1    MAMDAAVAEKTGGGGAAGVAAGAQVGLNGGGVGGGERRSRFHRICVYCGSAKGRKPSYQD  60

    Query  61   AAVELGKELVERGIDLVYGGGSIGLMGLVSHAVHDGGRHVIGVIPRSLMPREVTGEPVGE  120
                AAVELGKELVERGIDLVYGGGSIGLMGLVSHAVHDGGRHVIGVIPRSLMPREVTGEPVGE
    Sbjct  61   AAVELGKELVERGIDLVYGGGSIGLMGLVSHAVHDGGRHVIGVIPRSLMPREVTGEPVGE  120

    Query  121  VRAVSGMHERKAEMARFADAFIALPGGYGTLEEVLEVITWAQLGIHRKPVGLVNVDGFYD  180
                VRAVSGMHERKAEMARFADAFIALPGGYGTLEEVLEVITWAQLGIHRKPVGLVNVDGFYD
    Sbjct  121  VRAVSGMHERKAEMARFADAFIALPGGYGTLEEVLEVITWAQLGIHRKPVGLVNVDGFYD  180

    Query  181  PLLSFIDMAVNEGFIKEDARRIVVSAPTAKELVLKLEEYVPEYEVGLVWEDQMPAAELES  240
                PLLSFIDMAVNEGFIKEDARRIVVSAPTAKELVLKLEEYVPEYEVGLVWEDQMPAAELES
    Sbjct  181  PLLSFIDMAVNEGFIKEDARRIVVSAPTAKELVLKLEEYVPEYEVGLVWEDQMPAAELES  240

    Query  241  PGSPPPRLWTSAALL  255
                PGSPPPRLWTSAALL
    Sbjct  241  PGSPPPRLWTSAALL  255


     

    This enzyme possibly belongs to a family of lysine decarboxylases whose members share a highly conserved motif, PGGXGTXXE (PGGYGTLEE, residues 145 to 153 of the final predicted sequence above). This motif is likely functionally important.
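    The position of the motif can be checked with a simple pattern search. The following is a minimal sketch that treats each X in PGGXGTXXE as any single residue; the helper name `find_motif` is illustrative, and the sequence is the final predicted sequence given above.

```python
import re

# Final predicted sequence, copied from the section above (255 residues).
FINAL_PROTEIN = (
    "MAMDAAVAEKTGGGGAAGVAAGAQVGLNGGGVGGGERRSRFHRICVYCGSAKGRKPSYQD"
    "AAVELGKELVERGIDLVYGGGSIGLMGLVSHAVHDGGRHVIGVIPRSLMPREVTGEPVGEVRA"
    "VSGMHERKAEMARFADAFIALPGGYGTLEEVLEVITWAQLGIHRKPVGLVNVDGFYDPLLSFI"
    "DMAVNEGFIKEDARRIVVSAPTAKELVLKLEEYVPEYEVGLVWEDQMPAAELESPGSPPPRL"
    "WTSAALL"
)

# PGGXGTXXE, with X read as "any single residue" (an assumption about
# how the motif is meant to be interpreted).
MOTIF = re.compile(r"PGG.GT..E")


def find_motif(protein, pattern=MOTIF):
    """Return (start, end, match) tuples using 1-based inclusive positions."""
    return [(m.start() + 1, m.end(), m.group()) for m in pattern.finditer(protein)]


for start, end, hit in find_motif(FINAL_PROTEIN):
    print(f"{hit} at residues {start}-{end}")
```

    On this sequence the search finds a single occurrence, PGGYGTLEE, at residues 145 to 153.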

    http://www.ncbi.nlm.nih.gov/Structur...XU016&mode=all

    Interpro_Gen3.png
