
Most Promising Genes


    We used the 27 genes that GeneMark predicted in the unmasked sequence as queries for BLASTX searches.
    The searches revealed the following possible genes in Sequence 6. Mobile elements have been excluded from this list.

    Each gene model listed here begins with a start codon and ends with a stop codon. For each model we identified the splice sites, removed the introns, and joined the exons before running the BLASTX search.
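
The checks described above (start codon, stop codon, GT..AG introns, exons joined into one query) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `check_gene_model` is hypothetical, coordinates are taken to be 1-based and inclusive on the positive strand, and the short sequence is made up for the example rather than taken from Sequence 6.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def check_gene_model(seq, exons):
    """Join exons and run basic sanity checks on a positive-strand gene model.

    seq   -- genomic sequence as a string
    exons -- list of (start, end) pairs, 1-based inclusive, in genomic order
    Returns the joined coding sequence and a dict of boolean checks.
    """
    seq = seq.upper()
    # Join the exons: this is the sequence one would submit to BLASTX.
    cds = "".join(seq[start - 1:end] for start, end in exons)

    # Each intron runs from the base after one exon to the base before the next.
    introns = [seq[end1:start2 - 1]
               for (_, end1), (start2, _) in zip(exons, exons[1:])]

    report = {
        "starts_with_ATG": cds.startswith("ATG"),
        "ends_with_stop": cds[-3:] in STOP_CODONS,
        "length_multiple_of_3": len(cds) % 3 == 0,
        "introns_GT_AG": all(i.startswith("GT") and i.endswith("AG")
                             for i in introns),
    }
    return cds, report

# Made-up toy sequence: 2 flanking bases, exon, intron, exon, 2 flanking bases.
toy_seq = "CC" + "ATGGCC" + "GTAAGTTTCAG" + "GATTGA" + "TT"
toy_exons = [(3, 8), (20, 25)]
toy_cds, toy_report = check_gene_model(toy_seq, toy_exons)
print(toy_cds)
print(toy_report)
```

A model that fails any of these checks is not necessarily wrong (non-canonical splice sites exist), but it is worth a second look before it is used as a BLASTX query.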

    GENE 1: (Not considered in final model)

    3 exons Positive strand (6958-9457)

    -exon1: 6958-7977

    -exon2: 8564-8653

    -exon3: 9389-9457

    Evidence:
    It is predicted to be a CACTA Tnp transposase (Zea mays) with E value 1e-141, max identity 66% and query coverage 92%. Accession Number: gb/ABF67953.1/ Conserved domains: Transposase-associated superfamily and Transposase_21. The first exon starts with a start codon and ends before the splice site GT. The second exon starts after AG and ends before GT. The third exon ends at a stop codon. The GC content is 46.9%. This gene was predicted by both GeneMark and FGENESH, and the two programs gave exactly the same exon 2. Moreover, the sequence also resembles a transposon protein with a similar conserved domain in Oryza sativa (ABA93762.1). However, RepeatMasker masked the gene as repetitive sequence. This suggests that genes can be mistaken for repeat sequences, so some genes may go unpredicted if we depend solely on RepeatMasker to exclude all repeated sequence.
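
One way to quantify how much of a gene model RepeatMasker has masked is to intersect the exon coordinates with the masked intervals. The sketch below uses the Gene 1 exons listed above; the repeat intervals are hypothetical placeholders, since the actual RepeatMasker coordinates are not given on this page, and the helper names are made up for illustration. It assumes 1-based inclusive coordinates and non-overlapping repeat intervals.

```python
def overlap_bp(a, b):
    """Number of bases shared by two 1-based inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

def masked_fraction(exons, repeats):
    """Fraction of exonic bases that fall inside masked intervals.

    Assumes the repeat intervals do not overlap one another.
    """
    total = sum(end - start + 1 for start, end in exons)
    masked = sum(overlap_bp(exon, rep) for exon in exons for rep in repeats)
    return masked / total

# Gene 1 exons as listed on this page.
gene1_exons = [(6958, 7977), (8564, 8653), (9389, 9457)]

# HYPOTHETICAL masked intervals, for illustration only.
example_repeats = [(6900, 8000), (9000, 9500)]

print(f"{masked_fraction(gene1_exons, example_repeats):.1%} of exonic bases masked")
```

A high masked fraction combined with a strong BLASTX hit, as seen for Gene 1, is the situation in which a gene model would be lost if masked regions were discarded outright.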

    GENE 2: (Link to final model- Gene I)

    3 exons Positive strand (12115-13479)

    -exon1: 12115-12514

    -exon2: 12882-13078

    -exon3: 13162-13479

    Evidence:

    It is predicted to be a MYB-related transcription factor with E value 5e-107, max identity 69% and query coverage 99%. Accession Number: gb/AFU81575.1/ Conserved domain: SANT superfamily, myb_SHAQKYF. The first exon starts with a start codon and ends before the splice site GT. The second exon starts after AG and ends before GT. The third exon ends at a stop codon. No repeat region was seen in the sequence. It has a high GC content of 64.26%. GeneMark and FGENESH predicted exactly the same gene, with all three exons matching.
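
The exon coordinates themselves allow a quick arithmetic check: the joined exons should have a length divisible by three, and the gene span should run from the first exon start to the last exon end. The following sketch applies this to the Gene 2 coordinates listed above; the function name `summarize_model` is hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
def summarize_model(span, exons):
    """Basic arithmetic checks on a gene model given as exon coordinates."""
    exon_lengths = [end - start + 1 for start, end in exons]
    intron_lengths = [start2 - end1 - 1
                      for (_, end1), (start2, _) in zip(exons, exons[1:])]
    cds_length = sum(exon_lengths)
    return {
        "exon_lengths": exon_lengths,
        "intron_lengths": intron_lengths,
        "cds_length": cds_length,
        "in_frame": cds_length % 3 == 0,
        # The stated span should match the outermost exon coordinates.
        "span_matches_exons": span == (exons[0][0], exons[-1][1]),
        # Exons should be in order and must not touch or overlap.
        "exons_ordered": all(length > 0 for length in intron_lengths),
    }

gene2 = summarize_model(
    span=(12115, 13479),
    exons=[(12115, 12514), (12882, 13078), (13162, 13479)],
)
print(gene2)
# exon_lengths -> [400, 197, 318]; cds_length -> 915 (305 codons, stop included)
```

The same check is a cheap way to catch transcription errors in a coordinate list, because a mistyped digit usually breaks either the reading frame or the span.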

     

    GENE 3: (Not considered in final model)

    7 exons Positive strand (27711-32065)

    -exon1: 27711-27859

    -exon2: 27959-28154

    -exon3: 28555-28606

    -exon4: 28944-29296

    -exon5: 31160-31271

    -exon6: 31712-31911

    -exon7: 31979-32065

    Evidence:

    It is predicted to be an uncharacterized protein (Zea mays) with E value 2e-97, max identity 86% and query coverage 100%. Accession Number: ref/NP_001141693.1/ Conserved domains: C3HC Zn finger-like superfamily. The GC content is 49.5%. No repeat regions are seen in this sequence. GeneMark and FGENESH predicted exactly the same exon 2, and their predictions for exons 1, 3 and 4 were close.

     

    GENE 6: (Link to final model - Gene II)

    -6 exons, positive strand (48105 - 50575)

    -exon 1: 48105 - 48147

    -exon 2: 48284 - 48320

    -exon 3: 48434 - 48650

    -exon 4: 48758 - 48919

    -exon 5: 50183 - 50302

    -exon 6: 50423 - 50575

    Evidence:

    -As it is currently annotated, the first exon begins with a start codon and the final exon ends in a stop codon. All of the splice sites between exons look correct.

    -Based on the results of BLASTx, it is predicted that this gene is a seven-transmembrane domain protein. More specifically it is believed to be a sugar efflux transporter protein based on its conserved domains. The top result was "seven-transmembrane domain protein 1 (Zea mays)" (Accession: NP_001150719.1) (Query coverage: 99%; E Value: 5e-135; Max ident: 99%). It had the following conserved domains:  PQ-loop superfamily and MtN3_slv.

    -RepeatMasker found no repetitive sequences in the annotated gene 6 region.

    -A gene was predicted in this region by both GeneMark and FGENESH. The beginning of the gene was annotated differently by the two programs. Exons 4, 5, and 6 were annotated exactly the same.

    -The GC content of this gene region is 58%, which is higher than the 46% GC content of our entire sequence.
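
For reference, the GC percentage of a region is simply the count of G and C bases divided by the count of unambiguous bases. The sketch below shows the calculation for a whole string and for sliding windows, which is roughly what a %GC plot displays. The function names and the toy sequence are illustrative assumptions, not part of the original analysis.

```python
def gc_percent(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / counted

def windowed_gc(seq, window, step):
    """GC percentage in sliding windows; returns (1-based start, %GC) pairs."""
    return [(i + 1, gc_percent(seq[i:i + window]))
            for i in range(0, len(seq) - window + 1, step)]

toy = "GGCCATGCAT"  # made-up sequence for illustration
print(gc_percent(toy))
print(windowed_gc(toy, window=4, step=2))
```

To compare a gene with its background as done above, one would call `gc_percent` on the gene region and on the full sequence and compare the two values.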

     

    GENE 13: (Link to final model- Gene III)

    -2 exons, positive strand (90198 - 93372)

    -exon 1: 90198 - 92108

    -exon 2: 93322 - 93372

    Evidence:

    -As it is currently annotated, the first exon begins with a start codon and the final exon ends in a stop codon. The splice site between the two exons looks correct.

    -Based on the results of BLASTx, it is predicted that this gene is starch branching enzyme III. The top result was "starch branching enzyme III (Zea mays)"  (Accession: NP_001108121.1) (Query coverage: 98%; E value: 0.0; Max ident: 99%). It had the following conserved domains: AmyAc_family superfamily and E_set superfamily.

    -The BLASTx results also included branching enzymes in other plant species. One of the top results was "starch branching enzyme III (Triticum aestivum) (Accession: AFH58741.1) (Query coverage: 86%; E value: 0.0; Max ident: 88%). There was also a "starch branching enzyme" in Ipomoea batatas (Accession: BAB40334.1) (Query coverage: 56%; E value: 9e-97; Max ident: 50%).

    -RepeatMasker found no repetitive sequences in the annotated gene 13 region.

    -A gene was predicted in this region by both GeneMark and FGENESH. The two annotations differed somewhat, but they overlapped substantially.
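
Agreement between two gene predictors can be expressed both as the set of exons they share exactly and as the number of exonic bases they have in common. The sketch below uses the GeneMark exons for gene 13 listed above; the FGENESH coordinates are hypothetical, invented only to show the calculation, because the actual FGENESH prediction is not reproduced on this page.

```python
def covered_bases(exons):
    """Set of genomic positions covered by a list of 1-based inclusive exons."""
    positions = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions

def compare_predictions(exons_a, exons_b):
    """Exact exon matches and base-level overlap between two gene models."""
    bases_a, bases_b = covered_bases(exons_a), covered_bases(exons_b)
    shared = len(bases_a & bases_b)
    return {
        "identical_exons": sorted(set(exons_a) & set(exons_b)),
        "shared_bp": shared,
        "fraction_of_a": shared / len(bases_a),
        "fraction_of_b": shared / len(bases_b),
    }

# GeneMark gene 13 exons as listed on this page.
genemark_13 = [(90198, 92108), (93322, 93372)]

# HYPOTHETICAL FGENESH exons, for illustration only.
fgenesh_example = [(90300, 92108), (93322, 93372)]

result = compare_predictions(genemark_13, fgenesh_example)
print(result)
```

Reporting the overlap as a fraction of each model separately makes it clear whether one prediction is contained in the other or whether both extend beyond the shared region.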

     

    GENE 16: (Link to final model - Gene IV)

    -5 exons, positive strand (107366 - 111989)

    -exon 1: 107366 - 107487

    -exon 2: 107628 - 107893

    -exon 3: 110482 - 111032

    -exon 4: 111128 - 111547

    -exon 5: 111642 - 111989

     

    Evidence:

    -As it is currently annotated, the first exon begins with a start codon and the final exon ends in a stop codon. The splice sites between the exons all look correct (GT..AG).

    -Based on the results of BLASTx, this gene probably encodes a protein related to an amino acid permease. When we queried BLASTx with all five exons joined (introns removed), the top result was "uncharacterized protein LOC100193963 [Zea mays]" (Accession: NP_001132503.1) (Query coverage: 99%; E value: 0.0; Max ident: 84%). It had the following conserved domains: Spore permease superfamily, amino acid permease (GABA permease) and others. All of the putative conserved domains are shown in the attached figure "Putative conserved domains Gene 16.png".

    -A similar protein appeared when we ran BLASTx against other plant species: a putative amino acid permease from Triticum aestivum (Accession: ACD46668.1; E value: 4e-117) matched positions 110556..111056, 111125..111544 and 111636..111982.

    -GC content: The %GC plot from Artemis Release 14.0.0 shows that all the exons of gene 16 lie in areas where the GC content is between 45% and 75% (see the attached figure "Gene 16 GC content.png").

    -RepeatMasker found no repetitive sequences in the region of GeneMark's predicted gene 16, which supports the idea that this could be an actual gene.

    -A gene was predicted in this region by both GeneMark and FGENESH (gene 21 for FGENESH). This agreement is further evidence in support of GeneMark's gene 16.

     

    GENE 23: (Not considered in final model)

    -6 exons, positive strand (132695 - 134068)

    -exon 1: 132695 - 133035

    -exon 2: 133179 - 133243

    -exon 3: 133305 - 133458

    -exon 4: 133544 - 133600

    -exon 5: 133844 - 133929

    -exon 6: 134031 - 134068

     

    Evidence:

    -As it is currently annotated, the first exon begins with a start codon and the final exon ends in a stop codon. The splice sites between the exons all look correct (GT..AG).

    -Based on the results of BLASTx, this gene may encode a putative growth-regulating factor 1, but the evidence is weak: only exon 3 shows similarity, and the other exons do not resemble any known protein. When we queried BLASTx with exon 3 alone, the best match was "putative growth-regulating factor 1 [Zea mays]" (Accession: AAT08013.1) (Query coverage: 55%; E value: 2e-7; Max ident: 89%). When we queried with all six exons joined (introns removed), the top result was still "putative growth-regulating factor 1 [Zea mays]" (Accession: AAT08013.1), but with weaker values (Query coverage: 13%; E value: 8e-6; Max ident: 77%), suggesting that the other exons are probably not coding regions. No putative conserved domain was detected.

    -GC content: The %GC plot from Artemis Release 14.0.0 shows that all the exons of gene 23 (especially exon 3) lie in areas where the GC content is between 45% and 75% (see the attached figure "GC content gene 23 exon 3.png").

    -RepeatMasker found no repetitive sequences in the region of GeneMark's predicted gene 23, which also overlaps with predicted gene 25 of FGENESH.

     

    REFERENCES:

    Haberer G. et al. (2005). Structure and architecture of the maize genome. Plant Physiology 139: 1612-1624.

    Rutherford K., Parkhill J., Crook J., Horsnell T., Rice P., Rajandream M.A. and Barrell B. (2000). Artemis: sequence visualization and annotation. Bioinformatics 16(10): 944-945.
