
Gene 7-1

    Synopsis

    [Figure: 7-1_est.png, EST search with gene region]


    This appears to be a member of the RENi family (see Protein Function, below).  There is strong support for the downstream region of this gene, roughly corresponding to the last half of exon 6 and exon 7, which matches SUMO with good but not outstanding E-values (~1e-09).  ESTs support the following gene model:

          Begin           End             Feature
          43209           43361           5' UTR
          43363           43370           exon 1
          43832           43925           exon 2
          43994           44061           exon 3
          44147           44286           exon 4
          45831           46044           exon 5
          46961           47123           exon 6
          47124           47476           3' UTR

    Many ESTs cross multiple exons, suggesting that this is a single gene.  Note that the hit around 46400 is reproducibly about a 90% match and matches no other exon, indicating that it is a heterologous match.  BLAST searches over a wider region, bases 42000 to 49000, show no further EST matches, indicating that this is the complete gene.  ESTs show no support for an intron at 47108-47155 as in the FGENESH model, nor is there any BLASTX match to the region of the last FGENESH-predicted exon.  There are three clusters of 5' ends in the ESTs: 43209-43211 (5 ESTs), 43216-43217 (6 ESTs), and 43220-43222 (7 ESTs).  The start of the gene model reflects the most 5' of these clusters; the other two may be alternative start sites.  The end of the gene model reflects the end of the single longest EST on the coding strand.

    Gene Model Details

    Gene    Genemark begin  Genemark end  FGENESH begin  FGENESH end  BLAST
    7-1     43362           47128         43362          47425        Hypothetical gene conserved across plants.  Based on alignments to proteins in other species, exon 6 may not be real.  Otherwise appears to be a full-length gene.
    7-1.1   43362           43370         43362          43370        EST: >10, 43210-43371
    7-1.2   43832           43925         43832          43925        EST: >10, 43831-43925
    7-1.3   43994           44061         43994          44061        EST: >10, 43993-44062
    7-1.4   44147           44286         44147          44286        EST: >10, 44146-44288
    7-1.5   45831           46044         45831          46044        EST: >10, 45829-46045
    7-1.6   46813           46857         46401          46496        FGENESH and Genemark disagree.  Poor EST support for an exon in this region.  Some ESTs show a 90% match to a sequence at 46363-46448.  These do not match other exons, so this is most likely a match to a different gene.
    7-1.7   46961           47128         46961          47425        EST: >10, 46953-47476, 100% match beginning at 46959.  ESTs are also seen on the opposite strand covering the region beginning at 46953 (overlapping the last exon).
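As a quick consistency check on the coordinates, the summed exon lengths can be compared with the length of the final predicted protein (230 residues plus a stop codon). The sketch below is only an illustration: it assumes the intervals are 1-based and inclusive, that the final model uses the Genemark boundaries for exons 7-1.1 to 7-1.5 and 7-1.7 (omitting the doubtful 7-1.6), and that the last exon includes the stop codon. The variable and function names are hypothetical. Under those assumptions the Gene Model Details boundaries give 693 nt, exactly 231 codons, whereas the boundaries listed in the Synopsis (exon 1 beginning at 43363, last exon ending at 47123) give 687 nt, so those two endpoints may be worth rechecking.

```python
# Exon coordinates (assumed 1-based, inclusive) from the Genemark columns of
# the Gene Model Details table, omitting the doubtful exon 7-1.6.
DETAIL_EXONS = [
    (43362, 43370),
    (43832, 43925),
    (43994, 44061),
    (44147, 44286),
    (45831, 46044),
    (46961, 47128),
]

# Exon coordinates as listed in the Synopsis gene model.
SYNOPSIS_EXONS = [
    (43363, 43370),
    (43832, 43925),
    (43994, 44061),
    (44147, 44286),
    (45831, 46044),
    (46961, 47123),
]

# Residues in the final predicted sequence, not counting the '*' stop symbol.
PROTEIN_LENGTH = 230


def coding_length(exons):
    """Total number of bases covered by inclusive (begin, end) intervals."""
    return sum(end - begin + 1 for begin, end in exons)


# One codon per residue plus one stop codon.
expected = 3 * (PROTEIN_LENGTH + 1)

for name, exons in (("details", DETAIL_EXONS), ("synopsis", SYNOPSIS_EXONS)):
    total = coding_length(exons)
    print(f"{name}: {total} nt, expected {expected} nt, "
          f"difference {total - expected}")
```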
               

    Final predicted sequence

    Possible SUMO interacting motifs in blue, SUMO-like domain (SLD) in green.

    MTTAGEVDSAAGGEDLEPLFDYKRVQPRMTFCFDDSDLEVADIFKYCNKRPKVHTSTEEE
    GKPDEEVAAAKVVVLDEEDWLQPPPPKAAFRATAEEDSAFRELRLKKQEWAKFAESAEDI
    LQKLDEITNKEVGPKEPPEQIILDEESEPQVEKAREKIVISIQDKDGQQQMRVYKDEKFD
    KLLKVYAKKAKLNPSDLSFVFDGEKINPSSTPQDLDLEDEDMIEVRRKQS*

    Genemark and FGENESH predictions.  

    Differing portions shown in red.  Note that neither the FGENESH nor the Genemark model looks completely correct.

    >GeneMark.hmm|gene 6|245_aa
    MTTAGEVDSAAGGEDLEPLFDYKRVQPRMTFCFDDSDLEVADIFKYCNKRPKVHTSTEEE
    GKPDEEVAAAKVVVLDEEDWLQPPPPKAAFRATAEEDSAFRELRLKKQEWAKFAESAEDI
    LQKLDEITNKEVGPKEPPEQIILDEESEPQVEKAREKIVISIQDKDGQQQMRVYKLDSMK
    KSKCKAYVGLDEKFDKLLKVYAKKAKLNPSDLSFVFDGEKINPSSTPQDLDLEDEDMIEV
    RRKQS
    >FGENESH:   1   8 exon (s)  43362  -  47425   345 aa, chain +
    MTTAGEVDSAAGGEDLEPLFDYKRVQPRMTFCFDDSDLEVADIFKYCNKRPKVHTSTEEE
    GKPDEEVAAAKVVVLDEEDWLQPPPPKAAFRATAEEDSAFRELRLKKQEWAKFAESAEDI
    LQKLDEITNKEVGPKEPPEQIILDEESEPQVEKAREKIVISIQDKDGQQQMRVYKFKNEL
    AGDKYSRTDVVITGAYATLFRQFRLSDDEKFDKLLKVYAKKAKLNPSDLSFVFDGEKINP
    SSTPQDLDLEDEDMIETNSAFSLSSFASNKGLFVYISYNKDVEAILDICVKPSRYIAAVP
    CTWGFGWQKPSAEMPEQPLLLDVSIQDASQLAMPTEVTKDGANCS
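Because the color highlighting does not survive in plain text, the first point at which each prediction departs from the final sequence can be recovered programmatically. The following is a minimal sketch, not the method used to build the gene model: it finds the common prefix, then looks for the place where the final sequence resumes in the prediction. The function names and the 10-residue anchor length are assumptions chosen for illustration. It reports only the first divergence; the FGENESH model also differs at the C-terminus, which this sketch does not examine.

```python
FINAL = (
    "MTTAGEVDSAAGGEDLEPLFDYKRVQPRMTFCFDDSDLEVADIFKYCNKRPKVHTSTEEE"
    "GKPDEEVAAAKVVVLDEEDWLQPPPPKAAFRATAEEDSAFRELRLKKQEWAKFAESAEDI"
    "LQKLDEITNKEVGPKEPPEQIILDEESEPQVEKAREKIVISIQDKDGQQQMRVYKDEKFD"
    "KLLKVYAKKAKLNPSDLSFVFDGEKINPSSTPQDLDLEDEDMIEVRRKQS"
)
GENEMARK = (
    "MTTAGEVDSAAGGEDLEPLFDYKRVQPRMTFCFDDSDLEVADIFKYCNKRPKVHTSTEEE"
    "GKPDEEVAAAKVVVLDEEDWLQPPPPKAAFRATAEEDSAFRELRLKKQEWAKFAESAEDI"
    "LQKLDEITNKEVGPKEPPEQIILDEESEPQVEKAREKIVISIQDKDGQQQMRVYKLDSMK"
    "KSKCKAYVGLDEKFDKLLKVYAKKAKLNPSDLSFVFDGEKINPSSTPQDLDLEDEDMIEV"
    "RRKQS"
)
FGENESH = (
    "MTTAGEVDSAAGGEDLEPLFDYKRVQPRMTFCFDDSDLEVADIFKYCNKRPKVHTSTEEE"
    "GKPDEEVAAAKVVVLDEEDWLQPPPPKAAFRATAEEDSAFRELRLKKQEWAKFAESAEDI"
    "LQKLDEITNKEVGPKEPPEQIILDEESEPQVEKAREKIVISIQDKDGQQQMRVYKFKNEL"
    "AGDKYSRTDVVITGAYATLFRQFRLSDDEKFDKLLKVYAKKAKLNPSDLSFVFDGEKINP"
    "SSTPQDLDLEDEDMIETNSAFSLSSFASNKGLFVYISYNKDVEAILDICVKPSRYIAAVP"
    "CTWGFGWQKPSAEMPEQPLLLDVSIQDASQLAMPTEVTKDGANCS"
)

# Exon 7-1.6 as predicted by each program (from the Gene Model Details table).
EXON_7_1_6 = {"Genemark": (46813, 46857), "FGENESH": (46401, 46496)}


def common_prefix_length(a, b):
    """Number of leading residues shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def extra_segment(reference, candidate, anchor_len=10):
    """Return (prefix_length, segment) for the first stretch of candidate
    that is absent from reference, located by finding where the next
    anchor_len residues of reference reappear in candidate."""
    start = common_prefix_length(reference, candidate)
    anchor = reference[start:start + anchor_len]
    resume = candidate.find(anchor, start)
    if resume == -1:
        raise ValueError("reference does not resume in candidate")
    return start, candidate[start:resume]


for name, predicted in (("Genemark", GENEMARK), ("FGENESH", FGENESH)):
    shared, segment = extra_segment(FINAL, predicted)
    begin, end = EXON_7_1_6[name]
    print(f"{name}: {len(segment)} extra residues after residue {shared} "
          f"({3 * len(segment)} nt; predicted exon 7-1.6 spans "
          f"{end - begin + 1} nt): {segment}")
```

In both cases the extra segment begins after residue 175, and its length in nucleotides equals the length of that program's predicted exon 7-1.6, which is consistent with the final model differing from each prediction at this point simply by omitting that exon.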
    

    Protein Function

    BlastP

    Using the final gene model, matches are seen throughout the land plants, including Physcomitrella, monocots, and dicots.  This strongly supports the presence of a protein in which a SUMO-like domain is fused to an upstream domain.  This overall organization is reminiscent of RAD60 or NIP45, which have been termed RENi proteins (Rad60-Esc2p-Nip45).  This family of proteins possesses both SUMO-like domains and SUMO-binding motifs; in RAD60 these domains apparently function in self-association [1].  While the family was originally identified in animals, plant proteins belonging to this family were identified in Arabidopsis, Zea, and Oryza [2].

    Typical alignment

    >dbj|BAK07370.1|  predicted protein [Hordeum vulgare subsp. vulgare]
    Length=235
    
     Score =   224 bits (570),  Expect = 9e-70, Method: Compositional matrix adjust.
     Identities = 125/226 (55%), Positives = 157/226 (69%), Gaps = 10/226 (4%)
    
    Query  9    SAAGGEDLEPLFDYKRVQPRMTFCFDDSDLEVADIFKYCNKRPKVH--TSTEEEGKPDEE  66
                + A  E+LEPLFDY RVQP + FCFDDSDLE +DIF +CNKRPKV    + +E GK +E+
    Sbjct  12   AGADSEELEPLFDYSRVQPTIDFCFDDSDLEKSDIFVHCNKRPKVADDANADEGGKGNEK  71
    
    Query  67   ----VAAAKVVVLDEEDWLQPPPPKAAFRATAEEDSAFRELRLKKQEWAKFAESAEDILQ  122
                    +  A VV LD+EDWL PPP K    A   +D    E RLKKQE  K  +   D  Q
    Sbjct  72   GDTGIKKAMVVNLDDEDWLAPPPLKPVLSADVCKDKTMHE-RLKKQEVQKLVD---DEFQ  127
    
    Query  123  KLDEITNKEVGPKEPPEQIILDEESEPQVEKAREKIVISIQDKDGQQQMRVYKDEKFDKL  182
                K+ E   K++  K+PPE I+LDE +E + +K++EKI I  Q+KD +QQ RV  DEKFDKL
    Sbjct  128  KVVENVKKDMLAKKPPEPIVLDEPTETETKKSKEKICIMFQEKDARQQFRVSMDEKFDKL  187
    
    Query  183  LKVYAKKAKLNPSDLSFVFDGEKINPSSTPQDLDLEDEDMIEVRRK  228
                 KVYAKK +L+PSDL F+FDG+KIN +ST QDL+LE+ DMIEVRRK
    Sbjct  188  FKVYAKKVQLSPSDLIFIFDGDKINSASTLQDLELENGDMIEVRRK  233
    

     

    N-terminal domain

    Searching with just residues 1-130 shows matches primarily to hypothetical proteins in plants.  The matches are strong, with E-values in the 1e-80 to 1e-15 range.  The only annotated sequence is from Medicago, which is labeled "NFATC2-interacting protein".  NFATC2-interacting protein ("nuclear factor of activated T-cells, cytoplasmic 2-interacting protein", also known as NIP45) is an animal transcription factor-interacting protein that contains a ubiquitin-like domain.

    SUMO domain

    Residues approximately 130 to 230 are a strong match to small ubiquitin-like modifier (SUMO) proteins.
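The two domain searches above use sub-ranges of the final predicted protein. As a rough sketch of how such query sequences could be prepared, the snippet below slices the final sequence by 1-based inclusive residue numbers and formats each piece as FASTA. The helper names and FASTA headers are hypothetical, and the boundaries (1-130 and 130-230) are the approximate ones quoted in the text rather than precise domain limits.

```python
FINAL = (
    "MTTAGEVDSAAGGEDLEPLFDYKRVQPRMTFCFDDSDLEVADIFKYCNKRPKVHTSTEEE"
    "GKPDEEVAAAKVVVLDEEDWLQPPPPKAAFRATAEEDSAFRELRLKKQEWAKFAESAEDI"
    "LQKLDEITNKEVGPKEPPEQIILDEESEPQVEKAREKIVISIQDKDGQQQMRVYKDEKFD"
    "KLLKVYAKKAKLNPSDLSFVFDGEKINPSSTPQDLDLEDEDMIEVRRKQS"
)


def subsequence(seq, first, last):
    """Return residues first..last, numbered from 1, both ends included."""
    return seq[first - 1:last]


def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record wrapped at `width` residues."""
    lines = [f">{name}"]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines)


# Approximate boundaries taken from the text; residue 130 appears in both.
n_term = subsequence(FINAL, 1, 130)
sumo_like = subsequence(FINAL, 130, 230)

print(to_fasta("7-1_N-terminal_1-130", n_term))
print(to_fasta("7-1_SUMO-like_130-230", sumo_like))
```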

    References

    1. Raffa GD, Wohlschlegel J, Yates JR 3rd, Boddy MN. SUMO-binding motifs mediate the Rad60-dependent response to replicative stress and self-association. J Biol Chem 281:27973-27981, 2006.
    2. Novatchkova M, Bachmair A, Eisenhaber B, Eisenhaber F. Proteins with two SUMO-like domains in chromatin-associated complexes: the RENi (Rad60-Esc2-NIP45) family. BMC Bioinformatics 6:22, 2005.

    Files (3)

    File                    Description                                    Size        Date                 Attached by
    7-1_est.png             EST search with gene region                    51.01 kB    08:48, 23 Oct 2012   gribskov
    7-1_est_Alignment.txt   EST blast search                               175.97 kB   12:54, 23 Oct 2012   gribskov
    7-1f_msa.png            Multiple alignment of FGENESH model (BlastP)   388.55 kB   09:48, 20 Oct 2012   gribskov