
    The 400,000 base pair sequence, seq201, is a recently released section of the maize genome on the short arm of chromosome 10. Our analysis began by determining the GC content of the sequence, which is 46%.

    gcplot_seq201.1.png

    GC content across the full length of seq201. GC content appears to be higher in certain regions, such as around positions 10,000; 190,000; and 210,000.
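As a rough illustration of how an overall GC figure and a GC plot like the one above can be computed, here is a minimal Python sketch. The function names, window size, and toy sequence are assumptions made for this example; the page does not say which tool or window size produced the original plot.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def windowed_gc(seq: str, window: int = 1000, step: int = 1000):
    """Return (1-based window start, GC fraction) pairs along the sequence."""
    last_start = max(len(seq) - window + 1, 1)
    return [
        (start + 1, gc_fraction(seq[start:start + window]))
        for start in range(0, last_start, step)
    ]


# Hypothetical 40 bp toy sequence: a GC-rich half followed by an AT-rich half.
toy = "GGCC" * 5 + "ATAT" * 5
print(gc_fraction(toy))                       # overall GC fraction
print(windowed_gc(toy, window=20, step=20))   # per-window GC fractions
```

Applied to the full seq201 text file, the overall value should come out near the 46% reported above, and peaks in the windowed values would correspond to the GC-rich regions noted in the caption.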

    We also ran the sequence through RepeatMasker to determine what percentage of it consists of retroelements and transposons, and found that about 83% does. See the supplemental data for the output.
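A quick sanity check on the masking step is to measure how much of the masked sequence was actually replaced. The sketch below assumes the masked file uses N (or X) for masked bases; the function name is hypothetical. The resulting fraction covers everything RepeatMasker masked, including simple and low-complexity repeats, so it need not match the retroelement and transposon percentage exactly. The per-class breakdown comes from RepeatMasker's own summary output.

```python
def masked_fraction(seq: str, mask_chars: str = "NX") -> float:
    """Fraction of bases replaced by a masking character.

    Assumes hard masking (repeats replaced by N or X); soft-masked
    (lowercase) output would need a different test.
    """
    seq = "".join(seq.split())  # drop line breaks and other whitespace
    if not seq:
        return 0.0
    masked = sum(1 for base in seq if base in mask_chars)
    return masked / len(seq)


# Hypothetical toy example: 8 of 16 bases are masked.
print(masked_fraction("ACGTNNNNNN\nNNACGT"))
```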

    FGENESH predicted 3 genes in the masked sequence, all on the positive strand. Their predicted protein sequences are:

    >FGENESH:   1   2 exon (s) 160825  - 162012   175 aa, chain +

    MEVVETTEATPSALEIIPVGATEAATAQLEKSEPKSSRTKQQPKLQSLAAMAGLSKTSTV

    PAATPRKERRIASVLDVVLKPSKVSTPASTKVSEDKIEELGEVAAASASPSRAEGGPSKT

    RSVEQVKENLLEKLEQKEKEADAKHIADLEYALSIQVGLRRSKEAGLERKLDEIT

    >FGENESH:   2  12 exon (s) 212181  - 221125   833 aa, chain +

    MASGSSRPQIDQFFPAKKRRPSSRKGESPRPGTQHESPGGAKGSLEGYLVRSPCNRAVAA

    ASVVPVGSPRGGDPGARRSLSAAMDVDVGSSASVAAAAGDGADFDLRRSTADFLSHSCSA

    ILPLRGDCDYSEQCQMKQKRSASQSFLVPCDNDYVKKQRVAHCGGFEALKELDDNVASMK

    QCINHHGGSEAVEESVEGEHVSCVGFSALQRCSFTPSTAQIKVGFSLAPVETLKSVSRNS

    LTSPGEEFWNAAIELADGISAQADKVRGRPEFDAAEDKSSCAVAAFFKTLPRSGRDELDC

    QNTVGSNDTHHIEKLSNKVESLAVNCQHIISSPLPVKQLDFFHEDDIKVSGSKFETKGSY

    ETCNAQADRVPLKGSGRLEKENLMDPVDVLKRSAINLHTDSAAMRCQGVFKSTTEGKVHP

    TQEAERDSNLIRRDLNQPTHNENKSLAAYSNNRKTWIDSKSKVASQEVEASTPTSSIPLK

    DHSKISSWLPPDLCAIYMKKGISELYPWQASPRLTLYRLKDKRGLVALCRLTALITMIKG

    EKERGEERQLWSMTGARGISCLIVPKVECLLVEGVLGKRNLVYCASTSAGKSFVAEVLML

    RQILSTGNMAILVLPYVSICAEKAEHLEQLLEPLGRHVRSFYGNQGGGSLPKDTAVAVCT

    IEKANSLVNKLLEDGRLSELGVIVIDELHMVGDQHRGYLLELMLTKLRYAAGEGTLQSCS

    GEVSSSSSGKIASHGLQIIGMSATMPNIAAVADWLQPDQASPVTTDLRRATTLVVPTSFS

    SMFGCFSALALGQLWLGIGMWRKCNSNCVASLHGHQGWVLLVRQTSAKPATMT

    >FGENESH:   3  18 exon (s) 230091  - 256674  1330 aa, chain +

    MNVVRVLPKVADHGGKDPDHIVELCNEVVLQGHSVLLFCSSRKGCESTARHVAKFLKLPS

    VGTTDVSSEFSDAAAAIEALRRCPAGLDPILGETLPFGVAYHHAGLTVEEREIVESCYRK

    GLVRVLTATSTLAAGVNLPARRVIFRQPKIGRDFIDGTRYRQMAGRAGRTGIDTKGESIL

    VCRPEEVKRVTGIIRSNCPPLESCLSEDKNGMTHAIMEVVAGGIVQTANDIHRYVRCTLL

    NSTRPFDDVVKSAQDSLRWLCHKSFLEWNHETKIYSATPLGRAAFGSSLNPEESLVAYLS

    FAPSLFYCIEKGPLAELVVLDDLSRAREGFVLASDLHLVYLVTPINVDLEPDWELYYEKF

    MQLSSLEQSVGNRVGVIEPFLMHMAHGASIPFRARPERNTGLSNKSAQAAGNTLVSEHTI

    RVSKRFYVALMLSRLAQEIPIADVCETFKVARGMIQALQENAGRFASMVSAFCQRLGWND

    LEGLVSKFQNRVSFGVRAEIAELTSIPFVKGSRARALYKSGLRTPVAIAEASIPEIAKAL

    FESSAWSGQGDSGLRRMQFGVAKKIKNGARKIVLEEAEAARVAAFSAFKSLGVEVPQFTA

    PPLAAIEEYPTHDTIDQAKLNKLASGTHSRDDKNMNTCSDYATPRASTYSLRKDTSPAPS

    IQMKENAGLSKNVQITRQGASSSLSTEIADGSSRDVAEKGPVHAYNFPGGFDCFLDQWST

    ASNFFFDVHFIKRSMKPSLNLFEVFGLAVCWEKSPIYYCNFPKDLVTTGNNNIKEMWGNF

    QRRWEKIADIMKQKSVQKMTWNLKIQIQALKSPYISCQRLERFHLDHKMLDKVEVLNNSY

    MLLSPVSVYNGLDICLLAWVLWPDEESRAVPNLEKFVKRRLHSQEAAAANRDGRWRNQMH

    KAAHNGCCRRAAQTRALYTVLKKLLTQNLSVLVETIEAPLVNVLADMELWGIGADMDACL

    RARHIITKKLKELEKEAYRLAGKNFSLNATADIADILYTHLKLPVPKGCEKGKLHPSTDK

    QSLDHLRYIIHGNWLQTSTATGRLSMEEPNLQCVEHVVEFDTGKSDKDYSSISVDDHHKI

    NARDFFIPTKENWLFITADYSQIELRLMAHFSKDHMLVELLKKPDGDVFTMIASRWAEKE

    EALISSKERENTKRFIYGILYGMGANSLAEQLQCSTEEAAQKIQSFRRFFPGVSSWLHEV

    VLSCRQKGFVETLLGRRRVLTKITAGNSKEKAKAQRQAVNSICQGSAADIIKVAMIRVHS

    IITNRTSAADSTDEVTRKFSELGGQCHLILQVHDELVLEVDPCMVAQAGSTFAHKDKGGK

    NLGFTRTIPS
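The FGENESH headers above carry the gene number, exon count, coordinates, protein length, and strand. The following sketch shows one way to parse such records and check that each protein matches its declared length. The regular expression and function name are assumptions based only on the header layout shown on this page, not on any official FGENESH format description.

```python
import re

# Pattern written against the header layout shown above (an assumption).
HEADER = re.compile(
    r">FGENESH:\s*(\d+)\s+(\d+)\s+exon\s*\(s\)\s+(\d+)\s*-\s*(\d+)"
    r"\s+(\d+)\s+aa,\s*chain\s*([+-])"
)


def parse_fgenesh(text: str):
    """Parse FGENESH protein records into a list of dictionaries."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            gene, exons, start, end, aa = (int(g) for g in match.groups()[:5])
            records.append({
                "gene": gene, "exons": exons, "start": start, "end": end,
                "declared_aa": aa, "strand": match.group(6), "protein": "",
            })
        elif records:
            # Sequence lines are appended to the most recent record.
            records[-1]["protein"] += line
    return records


# Gene 1 as listed above.
EXAMPLE = """
>FGENESH:   1   2 exon (s) 160825  - 162012   175 aa, chain +
MEVVETTEATPSALEIIPVGATEAATAQLEKSEPKSSRTKQQPKLQSLAAMAGLSKTSTV
PAATPRKERRIASVLDVVLKPSKVSTPASTKVSEDKIEELGEVAAASASPSRAEGGPSKT
RSVEQVKENLLEKLEQKEKEADAKHIADLEYALSIQVGLRRSKEAGLERKLDEIT
"""

for rec in parse_fgenesh(EXAMPLE):
    ok = len(rec["protein"]) == rec["declared_aa"]
    print(rec["gene"], rec["start"], rec["end"], rec["strand"], "length ok:", ok)
```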

     

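    Genscan predicted 14 genes from the masked sequence. The columns are gene.exon number, feature type (Prom = promoter, Init = initial exon, Intr = internal exon, Term = terminal exon, PlyA = poly-A signal), strand, begin, end, length, frame, phase, initiation/acceptor score, donor/termination score, coding-region score, probability, and exon score: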

    1.01 Intr +  25222  25323  102  1  0   20   28   170 0.001   6.25

     1.02 Term +  35701  36094  394  1  1   44   45   212 0.501  11.64

     1.03 PlyA +  57769  57774    6                              0.77

     

     2.00 Prom +  74097  74136   40                              1.22

     2.01 Init +  75894  76003  110  2  2   94   77    52 0.687   7.37

     2.02 Term +  94973  95129  157  2  1   55   46   118 0.013   4.92

     2.03 PlyA +  95271  95276    6                               0.77

     

     3.04 PlyA -  96350  96345    6                               2.27

     3.03 Term -  96365  96357    9  2  0   23   45     0 0.207 -10.28

     3.02 Intr -  96897  96797  101  1  2   67   14   212 0.747  14.28

     3.01 Init -  97259  97220   40  2  1   76   46    71 0.830   3.26

     3.00 Prom -  97983  97944   40                               4.82

     

     4.03 PlyA - 101959 101954    6                               2.27

     4.02 Term - 103101 102941  161  1  2   95   46    39 0.136   1.74

     4.01 Init - 106579 106420  160  1  1   64   62    86 0.132   6.79

     4.00 Prom - 111968 111929   40                               3.92

     

     5.00 Prom + 137239 137278   40                              -0.58

     5.01 Init + 168883 168943   61  0  1   51   95    47 0.731   3.81

     5.02 Intr + 169023 169234  212  1  2  -25   39   252 0.501  11.05

     5.03 Term + 173771 173827   57  1  0   16   45    62 0.494  -4.99

     5.04 PlyA + 182208 182213    6                               2.27

     

     6.05 PlyA - 182933 182928    6                               2.27

     6.04 Term - 183334 183215  120  1  0   71   45   122 0.003   7.67

     6.03 Intr - 202188 202050  139  2  1   71   27   199 0.002  15.54

     6.02 Intr - 202473 202304  170  0  2  -71   52   238 0.002   6.21

     6.01 Init - 206541 206488   54  0  0   55   49    54 0.053   0.06

     6.00 Prom - 211346 211307   40                               1.62

     

     7.00 Prom + 212085 212124   40                               4.72

     7.01 Init + 212329 212364   36  0  0  116   98    -4 0.997   5.66

     7.02 Intr + 212447 212766  320  1  2   55   99   399 0.984  36.52

     7.03 Intr + 212861 213014  154  2  1   57   53    83 0.935   4.84

     7.04 Term + 213106 213180   75  0  0   91   46    17 0.901  -1.64

     7.05 PlyA + 213634 213639    6                               2.27

     

     8.03 PlyA - 215887 215882    6                               2.27

     8.02 Term - 216961 216955    7  0  1  111   35     0 0.062  -3.34

     8.01 Init - 219955 219948    8  1  2   80   92     0 0.378   2.37

     8.00 Prom - 222236 222197   40                               3.02

     

     9.03 PlyA - 222481 222476    6                               0.77

     9.02 Term - 223454 223229  226  1  1  -30   46   233 0.272   7.18

     9.01 Init - 231008 230935   74  2  2  104   66   -25 0.374   0.28

     9.00 Prom - 233666 233627   40                               2.52

     

    10.00 Prom + 234367 234406   40                               4.62

    10.01 Init + 234465 234507   43  2  1   40   81    -6 0.414  -3.00

    10.02 Term + 236652 236716   65  1  2   34   46   100 0.754   1.11

    10.03 PlyA + 248107 248112    6                               2.27

     

    11.08 PlyA - 250909 250904    6                               2.27

    11.07 Term - 251119 251097   23  2  2   56   46    -2 0.124  -6.72

    11.06 Intr - 253978 253761  218  0  2   11   88   243 0.782  18.20

    11.05 Intr - 262954 262864   91  2  1   37   16   217 0.668  11.14

    11.04 Intr - 263096 262989  108  0  0   38   32   285 0.955  20.95

    11.03 Intr - 263261 263151  111  0  0  -23   61   241 0.918  13.46

    11.02 Intr - 265373 265162  212  1  2    6    3   270 0.002  12.27

    11.01 Init - 265394 265384   11  2  2  111   -5    23 0.001  -3.56

    11.00 Prom - 366751 366712   40                               7.32

     

    12.00 Prom + 367384 367423   40                              -0.58

    12.01 Init + 368707 368820  114  0  0   79   84   155 0.264  17.33

    12.02 Term + 378120 378302  183  2  0  -43   45   125 0.141  -3.99

    12.03 PlyA + 378402 378407    6                              -2.42

     

    13.00 Prom + 378753 378792   40                               0.32

    13.01 Init + 385559 385709  151  1  1   70   62   126 0.922  11.57

    13.02 Term + 386028 386248  221  1  2  -40   45   353 0.853  19.00

    13.03 PlyA + 386671 386676    6                               2.27

     

    14.03 PlyA - 386715 386710    6                               2.27

    14.02 Term - 387339 387317   23  1  2   58   45    -5 0.133  -6.92

    14.01 Init - 395539 395362  178  1  1   87   23   144 0.172  10.83

    14.00 Prom - 399312 399273   40                              -1.08

     

    Gene 1 from FGENESH (160825-162012) does not overlap any exon predicted by Genscan.

    Gene 2 from FGENESH (212181-221125) is split across genes 7 and 8 in Genscan's predictions, although Genscan places gene 8 on the negative strand.

    Gene 3 from FGENESH (230091-256674) encompasses gene 10 in Genscan's predictions but extends well before and beyond it.
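The comparison between the two predictors can be checked with a simple interval-overlap test. The sketch below uses the FGENESH coordinates and the Genscan coding spans (lowest to highest coordinate of each gene's Init/Intr/Term exons) transcribed from the listings above; promoter and poly-A features are left out. The function name and the choice to compare coding spans rather than individual exons are assumptions made for illustration.

```python
# (start, end, strand) per gene, transcribed from the listings above.
FGENESH = {
    1: (160825, 162012, "+"),
    2: (212181, 221125, "+"),
    3: (230091, 256674, "+"),
}

# Genscan coding spans: lowest to highest coordinate of Init/Intr/Term exons.
GENSCAN = {
    1: (25222, 36094, "+"),    2: (75894, 95129, "+"),
    3: (96357, 97259, "-"),    4: (102941, 106579, "-"),
    5: (168883, 173827, "+"),  6: (183215, 206541, "-"),
    7: (212329, 213180, "+"),  8: (216955, 219955, "-"),
    9: (223229, 231008, "-"),  10: (234465, 236716, "+"),
    11: (251097, 265394, "-"), 12: (368707, 378302, "+"),
    13: (385559, 386248, "+"), 14: (387317, 395539, "-"),
}


def overlapping(query, genes, same_strand=False):
    """Return the ids of genes whose span overlaps the query span."""
    q_start, q_end, q_strand = query
    hits = []
    for gene_id, (start, end, strand) in sorted(genes.items()):
        if same_strand and strand != q_strand:
            continue
        if start <= q_end and end >= q_start:  # closed-interval overlap
            hits.append(gene_id)
    return hits


for gene_id, span in FGENESH.items():
    print(gene_id,
          "any strand:", overlapping(span, GENSCAN),
          "same strand:", overlapping(span, GENSCAN, same_strand=True))
```

Restricting the comparison to the same strand matters here because every FGENESH prediction is on the positive strand, while several of the overlapping Genscan genes are on the negative strand.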

     

    Because FGENESH predicted only 3 genes in the masked sequence, which seemed low, I also ran the whole unmasked sequence through the gene prediction software. This turned up 3 genes of interest, which I annotated on their respective pages:

    201-1

    201-2

    201-3


    Files (10)

    File | Description | Size | Date | Attached by
    chaos_graph.1.png | Chaos game graph | 42.92 kB | 15:22, 14 Sep 2010 | nhood
    fgenesh gene info.xls | FGENESH protein info | 27.5 kB | 15:32, 25 Oct 2010 | nhood
    fgenesh masked.rtf | FGENESH predicted proteins on masked sequence | 13.22 kB | 15:30, 25 Oct 2010 | nhood
    gcplot_seq201.1.png | GC plot of sequence 201 | 8.71 kB | 11:41, 14 Sep 2010 | nhood
    genescan masked.rtf | Genscan predicted proteins of masked sequence | 13.96 kB | 15:32, 25 Oct 2010 | nhood
    genscan gene info.xls | Genscan predicted protein info | 28 kB | 15:32, 25 Oct 2010 | nhood
    isochore_xygraph.1.png | Isochore graph | 6.13 kB | 15:22, 14 Sep 2010 | nhood
    seq 201.txt | Plain text file of our sequence | 403.67 kB | 15:30, 25 Oct 2010 | nhood
    seq201 400001.doc | No description | 558 kB | 14:44, 14 Sep 2010 | nhood
    seq201 masked.rtf | Sequence masked with RepeatMasker | 406.51 kB | 15:30, 25 Oct 2010 | nhood