Group Members

    Christopher Dugard (cdugard)

    Yichun Qian (qian24)

    Norman Best (nbbest)

     

    Our preliminary candidate genes can be found on the "Candidate Genes- Group 5" page.

    Our final predicted genes can be found on the "Final Predictions - Group 5" page.

    Our group presentation can be found here, or on the Final Predictions page.

    Preliminary Results

     

    So I found another FGENESH program to predict my genes and, along with BLAST, was able to get at least some predicted genes for my portion of the sequence.  You can see my chunk of the genome here: genomics_predictedgenes.docx


    If you want to see the fgenesh file, it's here: fgenesh_first50kbp.pdf


    We can meet up sometime before Thursday to make sure all of our portions are at the same point.  I'm not sure how much further we are supposed to have this before then...


    cdugard 22 Oct 2012

     

     

    Sequence length is 150,000 bp

    nbbest 01 October 2012
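
A length like this can be double-checked by counting the bases in the FASTA file directly. The sketch below is only an illustration in Python: it is not the group's nucleotidenumber.pl script (whose contents are not shown on this page), and the function name `count_bases` is made up for the example.

```python
from collections import Counter
from typing import Iterable


def count_bases(fasta_lines: Iterable[str]) -> Counter:
    """Count each base in a FASTA record, skipping header and blank lines."""
    counts = Counter()
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # ignore blank lines and ">" header lines
        counts.update(line.upper())  # treat lowercase (soft-masked) bases the same
    return counts


if __name__ == "__main__":
    # Tiny made-up record for illustration only. For the real data, pass an
    # open file handle (e.g. open("seq5.fa.txt")) instead of this list.
    demo = [">demo", "ACGTAC", "ggtn"]
    counts = count_bases(demo)
    print("length:", sum(counts.values()))
    print("composition:", dict(counts))
```

The total of all counts is the sequence length; the per-base counts also give the composition, including any N characters.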

     

    Here is a saved search strategy for BLAST.  Just download this, go to NCBI BLAST, click "Saved Strategies" and upload this file.  Then click BLAST and it will take you to the same results I'm currently looking at.  The actual alignment results are too big to save. It's just a simple Megablast against Zea mays, but it should be pretty helpful.

    BLAST Search Strategy (Our sequence vs. Zea mays Megablast)

    cdugard 25 Sep 2012

     

    Megablast nucleotide blast

    Megablast search vs. Zea mays (1st page)/excluding Zea mays (2nd page)

    • The megablast against Zea mays showed the most hits on chromosome 5.  However, there were also hits on chromosomes 7, 8, 9, and 10.  The first hit in the list was on chromosome 8.
    • The megablast excluding Zea mays showed high homology to Zea luxurians (38% query coverage), Zea diploperennis (29%), and Gossypium raimondii (39%).

    nbbest 01 October 2012
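
For anyone wanting to reproduce a query coverage figure by hand, here is a rough Python sketch. It assumes coverage means the percentage of query bases that fall inside at least one alignment, so overlapping alignments are merged before counting. The function name and the example coordinates are hypothetical, and NCBI's own calculation may differ in details.

```python
from typing import List, Tuple


def query_coverage(hsps: List[Tuple[int, int]], query_len: int) -> float:
    """Percent of the query covered by at least one alignment.

    hsps holds 1-based, inclusive (start, end) query coordinates.
    Start and end may be given in either order.
    """
    covered = 0
    cur_start, cur_end = None, None
    # Sort by start so overlapping or adjacent intervals can be merged in one pass.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in hsps):
        if cur_end is None or start > cur_end + 1:
            if cur_end is not None:
                covered += cur_end - cur_start + 1  # close the previous block
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)  # extend the current block
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return 100.0 * covered / query_len


if __name__ == "__main__":
    # Made-up alignment coordinates, not taken from our BLAST results.
    print(query_coverage([(1, 100), (50, 150), (301, 400)], 1000))
```

In the made-up example, the first two alignments overlap and merge into 1-150, so 250 of 1000 bases are covered.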

    Gene Predictions

    FGENESH HMM-based gene structure prediction through Softberry (multiple genes, both strands)

    • I used the organism setting Monocot plants (Corn, Rice, Wheat, Barley).  I chose it because I had BLASTed our sequence and the best E-value hits were to maize.
    • 26 predicted genes, 122 exons.

    GeneMark HMM-based gene structure prediction against species Zea mays.

    • 48% GC content
    • 40 predicted genes, 137 exons

    nbbest 01 October 2012
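
The GC content can be sanity-checked independently of the gene predictor. A minimal Python sketch follows, assuming GC content is the share of G and C among the unambiguous bases (A, C, G, T); how GeneMark treats ambiguous bases such as N is not stated here, so small differences are possible.

```python
def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T); N and others are ignored."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


if __name__ == "__main__":
    # Made-up sequence for illustration; replace with the real sequence string.
    print(gc_content("GGCCAATTNN"))
```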

     

    BLASTX against the Swiss-Prot database, excluding maize.  Many of the hits came up as transposons (retrovirus-related, copia protein).

    • Main gene candidates include:
      • ABC transporter B Family member

        Multidrug resistance protein

        Bile salt export pump

        ATP-binding cassette sub-family B member

        Lipid A export ATP-binding/permease protein

          *all of the above hit the same locus

      • Serine/threonine-protein kinase

      • Uncharacterized Mitochondrial Protein (2 different loci)

    nbbest 10 October 2012

     

    To determine the genomic function of this sequence, I used the genes predicted by FGENESH as queries and ran BLAST on NCBI. The database I used is "Mapped DNA sequences from all listed plants". Most hits fell on the 2 genes named as:

    • "Zea mays alcohol dehydrogenase 1 (adh1) gene, adh1-F allele, complete cds" (Accession # AF123535.1)
    • "Zea mays 22 kDa alpha zein gene cluster, complete sequence" (Accession # 090447.2).

    There is a lot of overlap between the hits to these 2 genes. Other hits fell on maize cDNA sequences described as ESTs. The attached document contains a typical example of the BLAST results.

    Just to play around, I picked some random sequences from the repeat prediction results that have certain properties, such as being AT-rich. I used these sequences as queries and ran the same BLAST program. The results are pretty much the same as above. The attached document contains an example result.

    qian24  02 October 2012
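
Pulling a predicted gene out of the sequence to use as a BLAST query can be scripted. The Python sketch below assumes 1-based, inclusive coordinates (like the 51836-59247 range in the attached BLAST result file name) and reverse-complements the region when the prediction is on the minus strand. The function names are hypothetical, and the strand of any particular prediction has to be taken from the FGENESH output.

```python
# Complement table for DNA, keeping case and passing N through unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def extract_region(seq: str, start: int, end: int, strand: str = "+") -> str:
    """Return seq[start..end] (1-based, inclusive); reverse-complement if strand is '-'."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates out of range")
    region = seq[start - 1:end]  # convert to Python's 0-based, half-open slice
    if strand == "-":
        region = region.translate(_COMPLEMENT)[::-1]
    return region


def to_fasta(name: str, seq: str, width: int = 60) -> str:
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up sequence and coordinates for illustration only.
    genome = "AACCGGTTAACCGGTT"
    print(to_fasta("demo_3_10_minus", extract_region(genome, 3, 10, "-")), end="")
```

The resulting FASTA text can be pasted straight into the NCBI BLAST query box.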

     

    Repeats predictions

    I used the online RepeatMasker server to predict potential repeats such as transposons. The organism templates I tried were Arabidopsis, Wheat, Maize and Rice. According to the results, this sequence probably comes from a plant closely related to maize, since the Maize template masked by far the most bases. The uploaded documents contain detailed information on the repeats.

    Bases masked:

    Arabidopsis: 8277bp (5.52%)

    Wheat: 20546bp (13.7%)

    Maize: 87777bp (58.52%)

    Rice: 26130bp (17.42%)

    qian24   01 October 2012
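
The percentages above are just the masked base counts divided by the 150,000 bp sequence length. A short Python check of that arithmetic, using only the numbers reported on this page (the variable names are made up for the example):

```python
SEQ_LEN = 150_000  # sequence length reported on this page, in bp

# Bases masked by RepeatMasker with each organism template (from the list above).
masked_bp = {
    "Arabidopsis": 8277,
    "Wheat": 20546,
    "Maize": 87777,
    "Rice": 26130,
}


def percent_masked(bp: int, total: int = SEQ_LEN) -> float:
    """Masked bases as a percentage of the sequence, rounded to 2 decimals."""
    return round(100.0 * bp / total, 2)


percentages = {name: percent_masked(bp) for name, bp in masked_bp.items()}
best = max(percentages, key=percentages.get)  # template that masks the most

if __name__ == "__main__":
    for name, pct in percentages.items():
        print(f"{name}: {pct}%")
    print("Most bases masked with template:", best)
```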

     

    Dotplot shows repeat-rich regions.

    • First dotplot uses a word length of 14.
    • Second dotplot uses a word length of 20. Highlighted red regions indicate repeat-rich areas.

    nbbest 10 October 2012
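
To illustrate what the word length does, here is a minimal Python sketch of a word-match dotplot: a dot is placed wherever a word of the given length matches exactly between the two sequences. This is an assumption about how such plots are generally built, not a description of the program used for the plots above, and it only looks at forward-strand matches.

```python
from collections import defaultdict
from typing import List, Tuple


def word_matches(seq_x: str, seq_y: str, word: int) -> List[Tuple[int, int]]:
    """Return 0-based (x, y) positions where a word of length `word` matches exactly."""
    # Index every word in seq_y by its start position.
    index = defaultdict(list)
    for j in range(len(seq_y) - word + 1):
        index[seq_y[j:j + word]].append(j)
    # Look up each word of seq_x in that index; each match is one dot.
    dots = []
    for i in range(len(seq_x) - word + 1):
        for j in index.get(seq_x[i:i + word], ()):
            dots.append((i, j))
    return dots


if __name__ == "__main__":
    # Made-up sequence with a short repeat; a self-comparison puts repeats
    # off the main diagonal.
    seq = "ACGTACGT"
    for w in (4, 5):
        off_diag = [d for d in word_matches(seq, seq, w) if d[0] != d[1]]
        print(f"word length {w}: off-diagonal dots = {off_diag}")
```

Longer words require longer exact matches, so they remove short chance matches and leave only the stronger repeats, which is the difference expected between the 14 and 20 word-length plots.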

     

    Transcription factor prediction

    Transposon prediction

    • This one is a little hard to interpret, and it only predicts helitrons, but it should still be useful.  It separates the predictions by how confident they are, and it looks like there are a couple of high-quality ones in our sequence.

    cdugard 10 Oct 2012

    Files (21), listed as name (size; time, date; attached by), followed by the description where one was given:

    • 645GUJD201R_search_strategy.asn (151 kB; 16:54, 25 Sep 2012; cdugard)
    • Agry600_Group 5_Presentation.pdf (418.5 kB; 10:12, 6 Dec 2012; nbbest)
    • blastn-megablast.pdf (60.12 kB; 20:38, 1 Oct 2012; nbbest)
    • Blastx.pdf (75.5 kB; 00:06, 10 Oct 2012; nbbest)
    • dotplot.pdf (26.3 kB; 00:20, 10 Oct 2012; nbbest)
    • fgenesh_first50kbp.pdf (192.23 kB; 22:17, 22 Oct 2012; cdugard)
    • genemark_seq5.pdf (535.66 kB; 21:04, 1 Oct 2012; nbbest)
    • genomics_predictedgenes.docx (213.81 kB; 22:17, 22 Oct 2012; cdugard)
    • helitron_transposon_prediction.rtf (147.54 kB; 18:49, 10 Oct 2012; cdugard)
    • NCBI Blast_prediction 51836-59247.pdf (142.68 kB; 19:43, 2 Oct 2012; qian24): BLAST result with the "gene" predicted by FGENESH
    • NCBI Blast_prediction random sequences.pdf (151.65 kB; 19:56, 2 Oct 2012; qian24)
    • nucleotidenumber.pl (939 bytes; 21:18, 1 Oct 2012; nbbest)
    • RepeatMasker Results Arabidopsis.pdf (488.33 kB; 00:19, 2 Oct 2012; qian24): RepeatMasker Arabidopsis Result Summary
    • RepeatMasker Results maize output (2).pdf (1825.89 kB; 00:56, 2 Oct 2012; qian24): Detailed information of repeats (Maize as template)
    • RepeatMasker Results maize.pdf (492.36 kB; 00:19, 2 Oct 2012; qian24): RepeatMasker Maize Result Summary
    • RepeatMasker Results Rice.pdf (491.01 kB; 00:19, 2 Oct 2012; qian24): RepeatMasker Rice Result Summary
    • RepeatMasker Results website link.docx (10.24 kB; 00:57, 2 Oct 2012; qian24)
    • RepeatMasker Results wheat.pdf (495.71 kB; 00:19, 2 Oct 2012; qian24): RepeatMasker Wheat Result Summary
    • seq5.fa.txt (152.36 kB; 16:50, 17 Sep 2012; gribskov)
    • seq5fgenesh.pdf (439.98 kB; 21:52, 19 Sep 2012; nbbest)
    • transcriptionfactor_blast.pdf (95.12 kB; 18:49, 10 Oct 2012; cdugard)