PR00394

Identifier
RHSPROTEIN
Accession
PR00394
No. of Motifs
5
Creation Date
18-MAY-1995  (UPDATE 07-JUN-1999)
Title
RHS protein signature
Database References

INTERPRO; IPR001826
Literature References
1. FEULNER, G., GRAY, J.A., KIRSCHMAN, J.A., LEHNER, A.F., SADOSKY, A.B., 
VLAZNY, D.A., ZHANG, J., ZHAO, S. AND HILL, C.W.
Structure of the rhsA locus from Escherichia coli K-12 and comparison of
rhsA with other members of the rhs multigene family.
J.BACTERIOL. 172 446-456 (1990). 
 
2. HILL, C.W., SANDT, C.H. AND VLAZNY, D.A.
Rhs elements of Escherichia coli: a family of genetic composites each
encoding a large mosaic protein. 
MOL.MICROBIOL. 12 865-871 (1994). 

Documentation
RHS elements are genetic composites of Escherichia coli, each encoding a
large mosaic protein of non-essential function that is believed to play
an important role in the natural ecology of the cell. The protein
sequences comprise highly conserved N-terminal domains containing
multiple tandem 22-residue repeats, followed by divergent C-terminal
domains [1,2].
 
RHSPROTEIN is a 5-element fingerprint that provides a signature for RHS
precursor proteins. The fingerprint was derived from an initial alignment
of 3 sequences; the motifs span the highly conserved region towards the
C-terminal end of the alignment, following the tandem repeats. Two
iterations on OWL26.0 were required to reach convergence, at which
point a true set comprising 8 sequences was identified. A single partial
match was also found: an RHS protein fragment lacking the first 3 motifs.
 
An update on SPTR37_9f identified a true set of 13 sequences, and 1
partial match.
Summary Information
13 codes involving 5 elements
0 codes involving 4 elements
1 codes involving 3 elements
0 codes involving 2 elements
Composite Feature Index
5 | 13  13  13  13  13
4 |  0   0   0   0   0
3 |  0   0   1   1   1
2 |  0   0   0   0   0
--+-------------------
  |  1   2   3   4   5
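
The index above counts, for each motif (columns 1 to 5), how many sequences matching a given number of elements (rows 5 down to 2) include that motif. A minimal Python sketch of that tabulation follows. The function name and data layout are illustrative, not part of PRINTS. That P77759 matches motifs 3, 4 and 5 is inferred from the 3-element row rather than stated explicitly in the entry.

```python
N_MOTIFS = 5

# The 13 true positives listed in this entry match all five motifs.
TRUE_POSITIVES = [
    "O52661", "O52666", "O52668", "O52673", "P76701", "P77779", "Q46748",
    "RHSA_ECOLI", "RHSB_ECOLI", "RHSC_ECOLI", "RHSD_ECOLI", "RHSE_ECOLI",
    "YIBJ_ECOLI",
]
matches = {seq_id: {1, 2, 3, 4, 5} for seq_id in TRUE_POSITIVES}

# Assumption: the partial match hits motifs 3, 4 and 5, as read off the
# 3-element row of the index.
matches["P77759"] = {3, 4, 5}


def composite_feature_index(matches, n_motifs=N_MOTIFS):
    """Return {elements_matched: [count for motif 1..n_motifs]}."""
    index = {k: [0] * n_motifs for k in range(n_motifs, 1, -1)}
    for motifs in matches.values():
        k = len(motifs)
        if k in index:
            for m in motifs:
                index[k][m - 1] += 1
    return index


cfi = composite_feature_index(matches)
for k, row in cfi.items():
    print(k, *row)
```

Each printed row starts with the number of elements matched, followed by the per-motif counts, in the same top-to-bottom order as the index.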
True Positives
O52661 O52666 O52668 O52673
P76701 P77779 Q46748 RHSA_ECOLI
RHSB_ECOLI RHSC_ECOLI RHSD_ECOLI RHSE_ECOLI
YIBJ_ECOLI
True Positive Partials
Codes involving 3 elements
P77759
Sequence Titles
O52661 CORE PROTEIN - ESCHERICHIA COLI.
O52666 CORE PROTEIN - ESCHERICHIA COLI.
O52668 CORE PROTEIN - ESCHERICHIA COLI.
O52673 CORE PROTEIN - ESCHERICHIA COLI.
P76701 RHSB CORE PROTEIN WITH UNIQUE EXTENSION - ESCHERICHIA COLI.
P77779 FROM BASES 733355 TO 747118 (SECTION 64 OF 400) OF THE COMPLETE GENOME (SECTION 64 OF 400) - ESCHERICHIA COLI.
Q46748 RHS CORE PROTEIN WITH EXTENSION - ESCHERICHIA COLI.
RHSA_ECOLI RHSA PROTEIN PRECURSOR - ESCHERICHIA COLI.
RHSB_ECOLI RHSB PROTEIN PRECURSOR - ESCHERICHIA COLI.
RHSC_ECOLI RHSC PROTEIN PRECURSOR - ESCHERICHIA COLI.
RHSD_ECOLI RHSD PROTEIN PRECURSOR - ESCHERICHIA COLI.
RHSE_ECOLI RHSE PROTEIN - ESCHERICHIA COLI.
YIBJ_ECOLI HYPOTHETICAL 26.4 KD PROTEIN IN RHSA-MTLA INTERGENIC REGION - ESCHERICHIA COLI.

P77759 FROM BASES 522240 TO 533123 (SECTION 46 OF 400) OF THE COMPLETE GENOME (SECTION 46 OF 400) - ESCHERICHIA COLI.
Scan History
OWL26_0 2 50 NSINGLE
SPTR37_9f 2 15 NSINGLE
Initial Motifs
Motif 1 width=20
Element Seqn Id St Int Rpt
LDRLESEILADRVSEESRRW RHSA_ECOLI 1099 1099 -
LDRLESEILADRVSEESRRW RHSB_ECOLI 1099 1099 -
LDRLESEILADRVSEESRRW RHSC_ECOLI 1099 1099 -

Motif 2 width=21
Element Seqn Id St Int Rpt
YTPARKIHLYHCDHRGLPLAL RHSA_ECOLI 1137 18 -
YTPARKIHLYHCDHRGLPLAL RHSB_ECOLI 1137 18 -
YTPARKIHLYHCDHRGLPLAL RHSC_ECOLI 1137 18 -

Motif 3 width=16
Element Seqn Id St Int Rpt
AEYDEWGNLLNEENPH RHSA_ECOLI 1168 10 -
AEYDEWGNLLNEENPH RHSB_ECOLI 1168 10 -
AEYDEWGNLLNEENPH RHSC_ECOLI 1168 10 -

Motif 4 width=21
Element Seqn Id St Int Rpt
PGQQYDEESGLYYNRHRYYDP RHSA_ECOLI 1192 8 -
PGQQYDEESGLYYNRHRYYDP RHSB_ECOLI 1192 8 -
PGQQYDEESGLYYNRHRYYDP RHSC_ECOLI 1192 8 -

Motif 5 width=20
Element Seqn Id St Int Rpt
GRYITQDPIGLKGGWNFYQY RHSA_ECOLI 1215 2 -
GRYITQDPIGLKGGWNLYGY RHSB_ECOLI 1215 2 -
GRYITQDPIGLKGGWNFYQY RHSC_ECOLI 1215 2 -
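
The three seed sequences are identical across motifs 1 to 4 and differ only within motif 5. A short Python sketch that checks each element against its declared width and reports the variable columns is given below. The helper name is illustrative and the element strings are copied from the tables above.

```python
# Declared widths and aligned elements for RHSA_ECOLI, RHSB_ECOLI and
# RHSC_ECOLI (in that order), copied from the Initial Motifs tables.
INITIAL_MOTIFS = {
    1: (20, ["LDRLESEILADRVSEESRRW"] * 3),
    2: (21, ["YTPARKIHLYHCDHRGLPLAL"] * 3),
    3: (16, ["AEYDEWGNLLNEENPH"] * 3),
    4: (21, ["PGQQYDEESGLYYNRHRYYDP"] * 3),
    5: (20, ["GRYITQDPIGLKGGWNFYQY",
             "GRYITQDPIGLKGGWNLYGY",
             "GRYITQDPIGLKGGWNFYQY"]),
}


def variable_columns(elements, width):
    """Return 1-based column positions at which the elements differ."""
    if any(len(e) != width for e in elements):
        raise ValueError("element length does not match declared width")
    return [i for i, column in enumerate(zip(*elements), start=1)
            if len(set(column)) > 1]


variable = {n: variable_columns(elems, width)
            for n, (width, elems) in INITIAL_MOTIFS.items()}
for n, cols in variable.items():
    print(f"Motif {n}: variable columns {cols}")
```

Only motif 5 reports any variable columns, at positions 17 and 19, where RHSB_ECOLI has L and G in place of F and Q.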
Final Motifs
Motif 1 width=20
Element Seqn Id St Int Rpt
LDRLESEILADRVSEESRRW P76701 1099 1099 -
LDRLESEILADRVSEESRRW P77779 177 177 -
LDRLESEILADRVSEESRRW Q46748 1099 1099 -
LDRLESEILADRVSEESRRW RHSA_ECOLI 1099 1099 -
LDRLESEILADRVSEESRRW RHSB_ECOLI 1099 1099 -
LDRLESEILADRVSEESRRW RHSC_ECOLI 1099 1099 -
LDRLESEILADRVSEESRRW YIBJ_ECOLI 4 4 -
LDRLESEILADRVSEESRRW O52668 1099 1099 -
LDRLEEEIRADRVSSESRAW RHSD_ECOLI 1110 1110 -
LDRLEEEIRADRVSSESRAW O52661 1113 1113 -
LDRLEEEIRADRVSSESRAW RHSE_ECOLI 385 385 -
LGRLERELRAGAVSAESEAW O52673 1109 1109 -
LGRLERELRQGIVSEESQQW O52666 1112 1112 -

Motif 2 width=21
Element Seqn Id St Int Rpt
YTPARKIHLYHCDHRGLPLAL P76701 1137 18 -
YTPARKIHLYHCDHRGLPLAL P77779 215 18 -
YTPARKIHLYHCDHRGLPLAL Q46748 1137 18 -
YTPARKIHLYHCDHRGLPLAL RHSA_ECOLI 1137 18 -
YTPARKIHLYHCDHRGLPLAL RHSB_ECOLI 1137 18 -
YTPARKIHLYHCDHRGLPLAL RHSC_ECOLI 1137 18 -
YTPARKIHLYHCDHRGLPLAL YIBJ_ECOLI 42 18 -
YTPARKIHLYHCDHRGLPLAL O52668 1137 18 -
YTPARKAHLYHCDHRGLPLAL RHSD_ECOLI 1148 18 -
YTPARKVHLYHCDHRGLPLAL O52661 1151 18 -
YTPARKVHFYHCDHRGLPLAL RHSE_ECOLI 423 18 -
YIPERRLHLYHCDHRGLPQAL O52673 1147 18 -
YIPERKLHLYHCDHRGLPLAL O52666 1150 18 -

Motif 3 width=16
Element Seqn Id St Int Rpt
AEYDEWGNLLNEENPH P76701 1168 10 -
AEYDEWGNLLNEENPH P77779 246 10 -
AEYDEWGNLLNEENPH Q46748 1168 10 -
AEYDEWGNLLNEENPH RHSA_ECOLI 1168 10 -
AEYDEWGNLLNEENPH RHSB_ECOLI 1168 10 -
AEYDEWGNLLNEENPH RHSC_ECOLI 1168 10 -
AEYDEWGNLLNEENPH YIBJ_ECOLI 73 10 -
AEYDEWGNLLNEENPH O52668 1168 10 -
AEYDEWGNQLNEENPH RHSD_ECOLI 1179 10 -
GEYDEWGNLLNEENPH O52661 1182 10 -
GEYDEWGNQLNEENPH RHSE_ECOLI 454 10 -
GEYDEWGNQLNEENPH O52673 1178 10 -
GEYDEWGNLLGEESAQ O52666 1181 10 -

Motif 4 width=21
Element Seqn Id St Int Rpt
PGQQYDEESGLYYNRHRYYDP P76701 1192 8 -
PGQQYDEESGLYYNRHRYYDP P77779 270 8 -
PGQQYDEESGLYYNRHRYYDP Q46748 1192 8 -
PGQQYDEESGLYYNRHRYYDP RHSA_ECOLI 1192 8 -
PGQQYDEESGLYYNRHRYYDP RHSB_ECOLI 1192 8 -
PGQQYDEESGLYYNRHRYYDP RHSC_ECOLI 1192 8 -
PGQQYDEESGLYYNRHRYYDP YIBJ_ECOLI 97 8 -
PGQQYDEESGLYYNRHRYYDP O52668 1192 8 -
PGQQHDEESGLYYNRHRYYDP RHSD_ECOLI 1203 8 -
PGQQHDEESGLYYNRHRYYDP O52661 1206 8 -
PGQQHDEESGLYYNRHRHYDP RHSE_ECOLI 478 8 -
PGQQYDEESGLYYNRHRYYDP O52673 1202 8 -
PGQQYDEESGLYYNRNRYYDP O52666 1205 8 -

Motif 5 width=20
Element Seqn Id St Int Rpt
GRYITQDPIGLKGGWNLYGY P76701 1215 2 -
GRYITQDPIGLKGGWNFYQY P77779 293 2 -
GRYITQDPIGLKGGWNLYGY Q46748 1215 2 -
GRYITQDPIGLKGGWNFYQY RHSA_ECOLI 1215 2 -
GRYITQDPIGLKGGWNLYGY RHSB_ECOLI 1215 2 -
GRYITQDPIGLKGGWNFYQY RHSC_ECOLI 1215 2 -
GRYITQDPIGLKGGWNFYQY YIBJ_ECOLI 120 2 -
GRYITQDPIGLKGGWNLYTY O52668 1215 2 -
GRYITQDPMGLKGGWNLYQY RHSD_ECOLI 1226 2 -
GRYIIQDPIGGDGGWNLYQY O52661 1229 2 -
GRYITPDPIGLRGGWNMYQY RHSE_ECOLI 501 2 -
GRYITQDPIGLKGGINLYTY O52673 1225 2 -
GRYITQDPIGLRGEWNLYKY O52666 1228 2 -
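
The St and Int columns in the motif tables can be cross-checked against the motif widths. The Python sketch below assumes that St is the start position of the element and that Int is the number of residues between the end of the previous motif and the start of this one. The entry does not expand these column headings, so this reading is an assumption. For motif 1 the tables give Int equal to St, and the check mirrors that. The function name is illustrative.

```python
# Motif widths from the "Motif n width=..." headers.
WIDTHS = [20, 21, 16, 21, 20]

# (St, Int) for motifs 1..5, copied from the Final Motifs tables.
ROWS = {
    "RHSA_ECOLI": [(1099, 1099), (1137, 18), (1168, 10), (1192, 8), (1215, 2)],
    "RHSE_ECOLI": [(385, 385), (423, 18), (454, 10), (478, 8), (501, 2)],
    "YIBJ_ECOLI": [(4, 4), (42, 18), (73, 10), (97, 8), (120, 2)],
}


def inconsistent_motifs(rows, widths=WIDTHS):
    """Return motif numbers whose St disagrees with the assumed rule
    St(n) = St(n-1) + width(n-1) + Int(n)."""
    bad = []
    for n, (start, interval) in enumerate(rows, start=1):
        if n == 1:
            expected = interval  # tables list Int == St for the first motif
        else:
            expected = rows[n - 2][0] + widths[n - 2] + interval
        if start != expected:
            bad.append(n)
    return bad


report = {seq_id: inconsistent_motifs(rows) for seq_id, rows in ROWS.items()}
for seq_id, bad in report.items():
    print(seq_id, "consistent" if not bad else f"check motifs {bad}")
```

Under this reading the three sampled sequences are internally consistent. The identical Int values (18, 10, 8, 2) show that the spacing between motifs is preserved even though the absolute start positions differ.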