
==SPRINT==> PRINTS View




Identifier
SSRCOGNITION
Accession
PR00887
No. of Motifs
8
Creation Date
09-JUN-1998  (UPDATE 18-JUN-1999)
Title
Structure-specific recognition protein signature
Database References

INTERPRO; IPR000969
Literature References
1. BRUHN, S.L., PIL, P.M., ESSIGMANN, J.M., HOUSMAN, D.E. AND LIPPARD, S.J.
Isolation and characterization of human cDNA clones encoding a high mobility
group box protein that recognizes structural distortions to DNA caused by
binding of the anticancer agent cisplatin. 
PROC.NATL.ACAD.SCI.U.S.A. 89 2307-2311 (1992). 
 
2. SHIRAKATA, M., HUPPI, K., USUDA, S., OKAZAKI, K., YOSHIDA, K.
AND SAKANO, H.
HMG1-related DNA-binding protein isolated with V-(D)-J recombination 
signal probes. 
MOL.CELL.BIOL. 11 4528-4536 (1991). 
 
3. HSU, T., KING, D.L., LABONNE, C. AND KAFATOS, F.C.
A Drosophila single-strand DNA/RNA-binding factor contains a high-mobility-
group box and is enriched in the nucleolus. 
PROC.NATL.ACAD.SCI.U.S.A. 90 6488-6492 (1993).
 
4. BRUHN, S.L., HOUSMAN, D.E. AND LIPPARD, S.J.
Isolation and characterization of cDNA clones encoding the Drosophila
homolog of the HMG-box SSRP family that recognizes specific DNA structures. 
NUCLEIC ACIDS RES. 21 1643-1646 (1993). 

Documentation
Human cDNA encoding a structure-specific recognition protein, SSRP1, has
been characterised [1]; the protein binds specifically to DNA modified with
the anti-cancer drug cisplatin. An 81 kDa protein is predicted, containing
several highly-charged domains and a stretch of 75 residues that share 47%
identity with a portion of the high mobility group (HMG) protein HMG1 [1].
This HMG box probably constitutes the structure recognition element for
cisplatin-modified DNA; the likely recognition target is the local duplex
unwinding and bending that occurs on formation of intra-strand
cross-links [1].
 
SSRP1 is the human homologue of a recently-identified mouse protein that
binds to recombination signal sequences [2]. These sequences have been
postulated to form stem-loop structures, further implicating local bends
and unwinding in DNA as a recognition target for HMG-box proteins.
 
A Drosophila melanogaster cDNA encoding an HMG-box-containing protein has
also been isolated [3,4]. This protein shares 50% sequence identity with
human SSRP1. In vitro binding studies using Drosophila SSRP showed that
the protein binds to single-stranded DNA and RNA, with highest affinity
for nucleotides G and U. 
 
Comparison of the predicted amino acid sequences among SSRP family members
reveals 48% identity, with structural conservation in the C-terminus of the
HMG box, as well as domains of highly charged residues [4]. The most highly
conserved regions lie in the poorly understood N-terminus, suggesting that
this portion of the protein is critical for its function [4].
 
SSRCOGNITION is an 8-element fingerprint that provides a signature for
structure-specific DNA recognition proteins. The fingerprint was derived
from an initial alignment of 6 sequences: the motifs were drawn from 
conserved sections spanning the N-terminal half of the alignment, focusing
on those regions that characterise the structure recognition proteins but
distinguish them from the HMG-like family. Two iterations on OWL30.2 were
required to reach convergence, at which point a true set comprising 12 
sequences was identified. Two partial matches were also found: BTU84139
is a bovine SSRP protein fragment that lacks the portion of sequence 
bearing the first two motifs, and YMG9_YEAST is a yeast hypothetical
protein that makes strong matches with motifs 1, 3, 5, 6 and 8.
 
An update on SPTR37_9f identified a true set of 8 sequences, and 1
partial match.
Summary Information
8 codes involving 8 elements
0 codes involving 7 elements
0 codes involving 6 elements
1 codes involving 5 elements
0 codes involving 4 elements
0 codes involving 3 elements
0 codes involving 2 elements
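Composite Feature Index
   8|  8  8  8  8  8  8  8  8
   7|  0  0  0  0  0  0  0  0
   6|  0  0  0  0  0  0  0  0
   5|  1  0  1  0  1  1  0  1
   4|  0  0  0  0  0  0  0  0
   3|  0  0  0  0  0  0  0  0
   2|  0  0  0  0  0  0  0  0
  --+------------------------
    |  1  2  3  4  5  6  7  8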
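The Summary Information and Composite Feature Index can be reproduced from the list of motifs each sequence matches. The following is a minimal Python sketch, not the PRINTS implementation: the function names are hypothetical, and the per-sequence motif sets are taken from this entry (the eight true positives match all eight motifs; YMG9_YEAST matches motifs 1, 3, 5, 6 and 8).

```python
from collections import Counter

N_MOTIFS = 8

# Motifs matched by each sequence, as stated in this entry.
FULL = frozenset(range(1, N_MOTIFS + 1))
matches = {name: FULL for name in (
    "O01683", "O04235", "SSRP_ARATH", "SSRP_CAEEL",
    "SSRP_DROME", "SSRP_HUMAN", "SSRP_MOUSE", "SSTP_CATRO",
)}
matches["YMG9_YEAST"] = frozenset({1, 3, 5, 6, 8})


def summary(matches):
    """Count sequences ("codes") by the number of motifs they match."""
    return Counter(len(motifs) for motifs in matches.values())


def composite_feature_index(matches, n_motifs=N_MOTIFS):
    """Row k, column j: sequences matching exactly k motifs, motif j among them."""
    cfi = {k: [0] * n_motifs for k in range(2, n_motifs + 1)}
    for motifs in matches.values():
        k = len(motifs)
        if k in cfi:
            for j in motifs:
                cfi[k][j - 1] += 1
    return cfi


cfi = composite_feature_index(matches)
for k in sorted(cfi, reverse=True):
    print(f"{k:>4}|" + "".join(f"{v:>3}" for v in cfi[k]))
```

Row 5 of the result records which motifs the single partial match contributes to, which is how the index shows at a glance that motifs 2, 4 and 7 are the ones missing from YMG9_YEAST.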
True Positives
O01683        O04235        SSRP_ARATH    SSRP_CAEEL
SSRP_DROME    SSRP_HUMAN    SSRP_MOUSE    SSTP_CATRO
True Positive Partials
Codes involving 5 elements
YMG9_YEAST
Sequence Titles
O01683 SIMILAR TO SINGLE-STRAND RECOGNITION PROTEIN - CAENORHABDITIS ELEGANS.
O04235 TRANSCRIPTION FACTOR - VICIA FABA (BROAD BEAN).
SSRP_ARATH STRUCTURE-SPECIFIC RECOGNITION PROTEIN 1 HOMOLOG (HMG PROTEIN) - ARABIDOPSIS THALIANA (MOUSE-EAR CRESS).
SSRP_CAEEL PROBABLE STRUCTURE-SPECIFIC RECOGNITION PROTEIN 1 (SSRP1) (RECOMBINATION SIGNAL SEQUENCE RECOGNITION PROTEIN) - CAENORHABDITIS ELEGANS.
SSRP_DROME SINGLE-STRAND RECOGNITION PROTEIN (SSRP) (CHORION-FACTOR 5) - DROSOPHILA MELANOGASTER (FRUIT FLY).
SSRP_HUMAN STRUCTURE-SPECIFIC RECOGNITION PROTEIN 1 (SSRP1) (RECOMBINATION SIGNAL SEQUENCE RECOGNITION PROTEIN) (T160) - HOMO SAPIENS (HUMAN).
SSRP_MOUSE STRUCTURE-SPECIFIC RECOGNITION PROTEIN 1 (SSRP1) (RECOMBINATION SIGNAL SEQUENCE RECOGNITION PROTEIN) (T160) - MUS MUSCULUS (MOUSE).
SSTP_CATRO STRUCTURE-SPECIFIC RECOGNITION PROTEIN 1 HOMOLOG (HMG PROTEIN) - CATHARANTHUS ROSEUS (ROSY PERIWINKLE) (MADAGASCAR PERIWINKLE).

YMG9_YEAST HYPOTHETICAL 63.0 KD PROTEIN IN DAK1-ORC1 INTERGENIC REGION - SACCHAROMYCES CEREVISIAE (BAKER'S YEAST).
Scan History
OWL30_2    2  100  NSINGLE    
SPTR37_9f 2 50 NSINGLE
Initial Motifs
Motif 1 width=17
Element Seqn Id St Int Rpt
TPRGRYDIRIYPTFLHL SSRP_MOUSE 209 209 -
TPRGRYDIRIYPTFLHL SSRP_HUMAN 209 209 -
TPRGRYNVELHLSFLRL SSTP_CATRO 219 219 -
TPRGRYDIKVYPTSIAL SSRP_CAEEL 210 210 -
TPRGRYDIRIYPTFLHL SSRP_RAT 61 61 -
TPRGRYNVELHLSFLRL SSRP_ARATH 219 219 -

Motif 2 width=17
Element Seqn Id St Int Rpt
DYKIPIKSINRLFLVPH SSRP_CAEEL 232 5 -
DYKIPYTTVLRLFLLPH SSRP_RAT 83 5 -
DYKIPYTTVLRLFLLPH SSRP_HUMAN 231 5 -
DYKIPYTTVLRLFLLPH SSRP_MOUSE 231 5 -
DFKIQYSSVVRLFLLPK SSRP_ARATH 241 5 -
DFKIQYSSVVRLFLLPK SSTP_CATRO 241 5 -

Motif 3 width=17
Element Seqn Id St Int Rpt
FFVISLDPPIKQGQTRY SSRP_MOUSE 254 6 -
FFVISLDPPIKQGQTRY SSRP_HUMAN 254 6 -
FVVISLDPPIRKGQTMY SSRP_ARATH 264 6 -
FVVVTLDPPIRKGQTLY SSTP_CATRO 264 6 -
YFVLSLNPPIRQGQTRY SSRP_CAEEL 255 6 -
FFVISLDPPIKQGQTRY SSRP_RAT 106 6 -

Motif 4 width=14
Element Seqn Id St Int Rpt
KALVNRKITVPGNF SSRP_MOUSE 319 48 -
KALVNRKITVPGNF SSRP_RAT 171 48 -
KALVNRKITVPGNF SSRP_HUMAN 319 48 -
RWLSGAKITKPGKF SSRP_ARATH 329 48 -
RGLSGAKVTRPGKF SSTP_CATRO 329 48 -
KSICNLKITVPGRF SSRP_CAEEL 319 47 -

Motif 5 width=19
Element Seqn Id St Int Rpt
KAEDGVLYPLEKGFFFLPK SSRP_ARATH 356 13 -
KASSGLLYPLERGFIYVHK SSRP_RAT 198 13 -
RQNPGLLYPMEKGFLFIHK SSRP_CAEEL 346 13 -
KAEDGVLYPLEKSFFFLPK SSTP_CATRO 356 13 -
KASSGLLYPLERGFIYVHK SSRP_HUMAN 346 13 -
KASSGLLYPLERGFIYVHK SSRP_MOUSE 346 13 -

Motif 6 width=18
Element Seqn Id St Int Rpt
KPPTLILHEEIDYVEFER SSTP_CATRO 374 -1 -
KPPVHIRFDEISFVNFAR SSRP_HUMAN 364 -1 -
KPPVHIRFDEISFVNFAR SSRP_MOUSE 364 -1 -
KPPTLILHDEIDYVEFER SSRP_ARATH 374 -1 -
KPAMYIRFEEISSCHFAR SSRP_CAEEL 364 -1 -
KPPVHIRFDEISFVNFAR SSRP_RAT 216 -1 -

Motif 7 width=17
Element Seqn Id St Int Rpt
RSFDSEIETKQGTQYTF SSRP_RAT 239 5 -
RSFDFEIETKQGTQYTF SSRP_HUMAN 387 5 -
RSFDFEIETKQGTQYTF SSRP_MOUSE 387 5 -
HYFDLLIRLKTDHEHLF SSRP_ARATH 400 8 -
HYFDLLIRLKTEQEHLF SSTP_CATRO 400 8 -
RTFDFEIDLKYGGPLTF SSRP_CAEEL 389 7 -

Motif 8 width=19
Element Seqn Id St Int Rpt
FRNIQRNEYHNLYTFISSK SSRP_ARATH 416 -1 -
FSSIEREEYGKLFDFVNAK SSRP_MOUSE 403 -1 -
FNAMEKEENNKLFDYLNKK SSRP_CAEEL 405 -1 -
FRNIQRNEYHNLFDFISSK SSTP_CATRO 416 -1 -
FSSIEREEYGKLFDFVNAK SSRP_HUMAN 403 -1 -
FSSIEREEYGKLFDFVNAK SSRP_RAT 255 -1 -
Final Motifs
Motif 1 width=17
Element Seqn Id St Int Rpt
TPRGRYDIRIYPTFLHL SSRP_HUMAN 209 209 -
TPRGRYDIRIYPTFLHL SSRP_MOUSE 209 209 -
TPRGRYNVELHLSFLRL SSRP_ARATH 219 219 -
TPRGRYNVELHLSFLRL SSTP_CATRO 219 219 -
TPRRRYDIKIFSTFFQL SSRP_DROME 209 209 -
TPRGRYSVELHLSFLRL O04235 219 219 -
TPRGRYDIKVYPTSIAL SSRP_CAEEL 210 210 -
TPRGRYDIKVYPTSIAL O01683 210 210 -

Motif 2 width=17
Element Seqn Id St Int Rpt
DYKIPYTTVLRLFLLPH SSRP_HUMAN 231 5 -
DYKIPYTTVLRLFLLPH SSRP_MOUSE 231 5 -
DFKIQYSSVVRLFLLPK SSRP_ARATH 241 5 -
DFKIQYSSVVRLFLLPK SSTP_CATRO 241 5 -
DYKIPMDSVLRLFMLPH SSRP_DROME 231 5 -
DFKIQYSSVVRLFLLPK O04235 241 5 -
DYKIPIKSINRLFLVPH SSRP_CAEEL 232 5 -
DYKIPVKTINRLFLVPH O01683 232 5 -

Motif 3 width=17
Element Seqn Id St Int Rpt
FFVISLDPPIKQGQTRY SSRP_HUMAN 254 6 -
FFVISLDPPIKQGQTRY SSRP_MOUSE 254 6 -
FVVISLDPPIRKGQTMY SSRP_ARATH 264 6 -
FVVVTLDPPIRKGQTLY SSTP_CATRO 264 6 -
FFVLSLDPPIKQGQTRY SSRP_DROME 254 6 -
FVIISLDPPIRKGQTLY O04235 264 6 -
YFVLSLNPPIRQGQTRY SSRP_CAEEL 255 6 -
YFVLSLNPPIRQGQTHY O01683 255 6 -

Motif 4 width=14
Element Seqn Id St Int Rpt
KALVNRKITVPGNF SSRP_HUMAN 319 48 -
KALVNRKITVPGNF SSRP_MOUSE 319 48 -
RWLSGAKITKPGKF SSRP_ARATH 329 48 -
RGLSGAKVTRPGKF SSTP_CATRO 329 48 -
KVLIGRKITGPGNF SSRP_DROME 319 48 -
RGLSGGKVTKPGNF O04235 329 48 -
KSICNLKITVPGRF SSRP_CAEEL 319 47 -
KSICNLKVTVPGRF O01683 319 47 -

Motif 5 width=19
Element Seqn Id St Int Rpt
KASSGLLYPLERGFIYVHK SSRP_HUMAN 346 13 -
KASSGLLYPLERGFIYVHK SSRP_MOUSE 346 13 -
KAEDGVLYPLEKGFFFLPK SSRP_ARATH 356 13 -
KAEDGVLYPLEKSFFFLPK SSTP_CATRO 356 13 -
KAAAGYLYPLERGFIYIHK SSRP_DROME 346 13 -
KAEDGILYPLEKSFFFLPK O04235 356 13 -
RQNPGLLYPMEKGFLFIHK SSRP_CAEEL 346 13 -
RQNLGLLYPMEKGFLFIQK O01683 346 13 -

Motif 6 width=18
Element Seqn Id St Int Rpt
KPPVHIRFDEISFVNFAR SSRP_HUMAN 364 -1 -
KPPVHIRFDEISFVNFAR SSRP_MOUSE 364 -1 -
KPPTLILHDEIDYVEFER SSRP_ARATH 374 -1 -
KPPTLILHEEIDYVEFER SSTP_CATRO 374 -1 -
KPPLHIRFEEISSVNFAR SSRP_DROME 364 -1 -
KPPTLILHEEIDYVEFER O04235 374 -1 -
KPAMYIRFEEISSCHFAR SSRP_CAEEL 364 -1 -
KPVMYIRFEEISSCHFAR O01683 364 -1 -

Motif 7 width=17
Element Seqn Id St Int Rpt
RSFDFEIETKQGTQYTF SSRP_HUMAN 387 5 -
RSFDFEIETKQGTQYTF SSRP_MOUSE 387 5 -
HYFDLLIRLKTDHEHLF SSRP_ARATH 400 8 -
HYFDLLIRLKTEQEHLF SSTP_CATRO 400 8 -
RSFDFEVTLKNGTVHIF SSRP_DROME 387 5 -
HYFDLLIRLKSEQEHLF O04235 400 8 -
RTFDFEIDLKYGGPLTF SSRP_CAEEL 389 7 -
RTFDFEIDLKTGSSLTF O01683 389 7 -

Motif 8 width=19
Element Seqn Id St Int Rpt
FSSIEREEYGKLFDFVNAK SSRP_HUMAN 403 -1 -
FSSIEREEYGKLFDFVNAK SSRP_MOUSE 403 -1 -
FRNIQRNEYHNLYTFISSK SSRP_ARATH 416 -1 -
FRNIQRNEYHNLFDFISSK SSTP_CATRO 416 -1 -
FSSIEKEEYAKLFDYITQK SSRP_DROME 403 -1 -
FRNIQRNEYHNLYGFISSK O04235 416 -1 -
FNAMEKEENNKLFDYLNKK SSRP_CAEEL 405 -1 -
FSAMDKEENNKLFDYLNKK O01683 405 -1 -
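In the motif tables, the Int column appears to be the gap between the end of the previous motif and the start (St) of the current one, with the first motif's Int equal to its St. That reading is inferred from the values in this entry rather than taken from a format specification; a negative value would then mean the two motifs overlap. A short Python sketch under that assumption, using the SSRP_HUMAN and SSRP_ARATH rows of the Final Motifs tables (the function name is hypothetical):

```python
# Motif widths 1..8 and start positions, copied from the Final Motifs tables.
WIDTHS = [17, 17, 17, 14, 19, 18, 17, 19]
STARTS = {
    "SSRP_HUMAN": [209, 231, 254, 319, 346, 364, 387, 403],
    "SSRP_ARATH": [219, 241, 264, 329, 356, 374, 400, 416],
}


def intervals(starts, widths):
    """Assumed rule: Int = St - (previous St + previous width); first Int = St."""
    result = [starts[0]]
    for i in range(1, len(starts)):
        result.append(starts[i] - (starts[i - 1] + widths[i - 1]))
    return result


for seq_id, starts in STARTS.items():
    print(seq_id, intervals(starts, WIDTHS))
```

For both sequences the computed values agree with the Int column listed above, including the -1 entries for motifs 6 and 8, where each motif begins on the last residue of the one before it.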