
==SPRINT==> PRINTS View





PR00917

Identifier
SRSVCYSPTASE
Accession
PR00917
No. of Motifs
9
Creation Date
29-JUN-1998  (UPDATE 07-JUN-1999)
Title
Small round structured virus (C37) cysteine protease family signature
Database References

INTERPRO; IPR001665
Literature References
1. RAWLINGS, N.D. AND BARRETT, A.J.
Families of cysteine peptidases.
METHODS ENZYMOL. 244 461-486 (1994).
 
2. BARRETT, A.J. AND RAWLINGS, N.D.
Families and clans of cysteine peptidases.
PERSPECTIVES DRUG DISCOVERY DESIGN 6 1-11 (1996).
 
3. RAWLINGS, N.D. AND BARRETT, A.J.
Family C37 - Clan PA - Processing peptidase
http://www.bi.bbsrc.ac.uk/merops/famcards/c37.htm
 
4. FEDERHEN, S., HOTTON, C., LEIPE, D. AND SOUSSOV, V.
Calicivirus - NCBI Taxonomy Browser
http://www3.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?id=11975&lvl=3
 
5. LIU, B., CLARKE, I.N. AND LAMBDEN, P.R.
Polyprotein processing in Southampton Virus: identification of 3C-like
protease cleavage sites by in vitro mutagenesis.
J.VIROL. 70(4) 2605-2610 (1996).
 
6. KOONIN, E.V. AND GORBALENYA, A.E.
An insect picornavirus may have genome organization similar to that of 
caliciviruses.
FEBS LETT. 297 81-86 (1992).

Documentation
Cysteine protease activity depends on a catalytic dyad of cysteine and
histidine; the order and spacing of these residues vary among the known
families. Nearly half of all cysteine proteases are found exclusively
in viruses [1]. Cysteine protease families have been grouped into five
clans (designated CA, CB, CC, CD and CE) on the basis of structural and
functional similarity. Families C1, C2 and C10, which belong to the CA clan,
have a Cys/His catalytic dyad, and are loosely termed papain-like. Families
in the CB clan have a His/Cys dyad, and contain enzymes from RNA viruses
distantly related to chymotrypsin. Enzymes in clan CC are also from RNA
viruses, but have a papain-like Cys/His active site. The remaining two
clans, CD and CE, contain only one family each [2]. Some families have not
yet been assigned to a clan.
 
Two additional clans (PA and PB) have been identified, which contain a
mixture of serine, cysteine and threonine proteases. Enzymes in clan PA
have a serine or cysteine nucleophile as part of the ordered catalytic
triad His, Asp, Ser (or Cys). Enzymes in clan PB have a serine, cysteine or
threonine active residue at the N-terminus of the mature protease [3].
 
Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis
[4]. The calicivirus genome contains two open reading frames, ORF1 and ORF2.
ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine
protease and RNA polymerase activity [5]. The regions of the polyprotein in
which these activities lie are similar to proteins produced by the
picornaviruses [6]. ORF2 encodes a structural capsid protein. Two different
families of caliciviruses can be distinguished on the basis of sequence
similarity, namely the Norwalk-like viruses or small round structured
viruses (SRSVs), and those classed as non-SRSVs.
 
Calicivirus proteases from the SRSV group, which are members of the PA
protease clan, constitute family C37 of the cysteine proteases (proteases 
from non-SRSVs belong to the C24 family). As mentioned above, the protease 
activity resides within a polyprotein. The enzyme cleaves the polyprotein 
at sites N-terminal to itself, liberating the polyprotein helicase. 
 
SRSVCYSPTASE is a 9-element fingerprint that provides a signature for the
cysteine protease (C37) of small round structured caliciviruses. The
fingerprint was derived from an initial alignment of 2 sequences: the motifs
were drawn from conserved regions spanning the full length of the
polyprotein protease, focusing on those regions that characterise members
of the C37 family but distinguish them from the C24 proteases. Motif 3
contains the active site His residue and motif 4 includes the catalytic Asp
(the Cys residue is located between motifs 7 and 8). Two iterations on
OWL30.1 were required to reach convergence, at which point a true set
comprising 4 sequences was identified.
 
An update on SPTR37_9f identified a true set of 3 sequences.
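
The statement that motif 3 carries the His residue and motif 4 the Asp can be checked against the final motifs listed in this entry. The following is a minimal sketch in Python, not part of the PRINTS tooling: the function names are hypothetical, and the code only reports columns that are identical in all three elements. It does not establish which conserved residue is the catalytic one.

```python
# Final-motif elements copied from this entry, in the order
# POLN_SOUV3, Q83883, POLN_LORDV.
MOTIF3 = [
    "ITTTHVIPTSAKEFFGEPLTSI",
    "ITTTHVVPTGVKEFFGEPLSSI",
    "ITSTHVIPQGAQEFFGVPVKQI",
]
MOTIF4 = [
    "IHRAGEFTLFRFSKKIRPDLTGM",
    "IHQAGEFTQFRFSKKMRPDLTGM",
    "IHKSGEFCRLRFPKPIRTDVTGM",
]


def conserved_columns(elements):
    """Return {column_index: residue} for columns identical in every element."""
    width = len(elements[0])
    if any(len(e) != width for e in elements):
        raise ValueError("all elements of a motif must have the same width")
    return {
        i: column[0]
        for i, column in enumerate(zip(*elements))
        if len(set(column)) == 1
    }


def conserved_positions(elements, residue):
    """Return 1-based motif positions where `residue` is fully conserved."""
    return [i + 1 for i, r in conserved_columns(elements).items() if r == residue]


if __name__ == "__main__":
    print("Conserved His in motif 3:", conserved_positions(MOTIF3, "H"))
    print("Conserved Asp in motif 4:", conserved_positions(MOTIF4, "D"))
```

Positions are counted from the first residue of each motif, so adding the St value of an element minus one converts them to polyprotein coordinates.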
Summary Information
3 codes involving 9 elements
0 codes involving 8 elements
0 codes involving 7 elements
0 codes involving 6 elements
0 codes involving 5 elements
0 codes involving 4 elements
0 codes involving 3 elements
0 codes involving 2 elements
Composite Feature Index
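   9|  3  3  3  3  3  3  3  3  3
   8|  0  0  0  0  0  0  0  0  0
   7|  0  0  0  0  0  0  0  0  0
   6|  0  0  0  0  0  0  0  0  0
   5|  0  0  0  0  0  0  0  0  0
   4|  0  0  0  0  0  0  0  0  0
   3|  0  0  0  0  0  0  0  0  0
   2|  0  0  0  0  0  0  0  0  0
  --+---------------------------
    |  1  2  3  4  5  6  7  8  9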
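
The summary counts and the composite feature index can be recomputed from a per-sequence list of matched motifs. The sketch below assumes the usual reading of the index: each row is the number of elements matched, each column is a motif number, and each cell counts the sequences in that row that match that motif. The `hits` mapping and the function name are hypothetical, and the mapping is filled in from the true positives of this entry, all of which match the 9 motifs.

```python
from collections import Counter

N_MOTIFS = 9

# Assumed input: for each sequence code, the set of motif numbers it matches.
# All three true positives in this entry match motifs 1 to 9.
hits = {
    "POLN_LORDV": set(range(1, N_MOTIFS + 1)),
    "POLN_SOUV3": set(range(1, N_MOTIFS + 1)),
    "Q83883": set(range(1, N_MOTIFS + 1)),
}


def summarise(hits, n_motifs):
    """Return (codes per element count, composite feature index rows)."""
    codes_by_count = Counter(len(motifs) for motifs in hits.values())
    cfi = {}
    for k in range(n_motifs, 1, -1):  # rows from n_motifs down to 2
        cfi[k] = [
            sum(1 for motifs in hits.values() if len(motifs) == k and j in motifs)
            for j in range(1, n_motifs + 1)
        ]
    return codes_by_count, cfi


if __name__ == "__main__":
    codes_by_count, cfi = summarise(hits, N_MOTIFS)
    for k in range(N_MOTIFS, 1, -1):
        print(f"{codes_by_count[k]} codes involving {k} elements")
    for k, row in cfi.items():
        print(f"{k:>4}|" + "".join(f"{n:>3}" for n in row))
```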
True Positives
POLN_LORDV    POLN_SOUV3    Q83883        
Sequence Titles
POLN_LORDV  NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYMERASE (EC 2.7.7.48); THIOL PROTEASE 3C (EC 3.4.22.-); HELICASE (2C LIKE PROTEIN)] - LORDSDALE VIRUS (HUMAN ENTERIC CALICIVIRUS). 
POLN_SOUV3 NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYMERASE (EC 2.7.7.48); THIOL PROTEASE (EC 3.4.22.-); HELICASE (2C LIKE PROTEIN)] - SOUTHAMPTON VIRUS (SEROTYPE 3).
Q83883 NONSTRUCTURAL POLYPROTEIN - NORWALK VIRUS.
Scan History
OWL30_2    1  20   NSINGLE    
SPTR37_9f 2 4 NSINGLE
Initial Motifs
Motif 1  width=20
Element Seqn Id St Int Rpt
WADDEREVDYNEKISFEAPP POLN_SOUV3 1083 1083 -
WADDDRSVDYNEKLDFEAPP POLN_LORDV 992 992 -

Motif 2 width=20
Element Seqn Id St Int Rpt
WSRVTKFGSGWGFWVSPTVF POLN_SOUV3 1105 2 -
WSRIVNFGSGWGFWVSPSLF POLN_LORDV 1014 2 -

Motif 3 width=22
Element Seqn Id St Int Rpt
ITTTHVIPTSAKEFFGEPLTSI POLN_SOUV3 1125 0 -
ITSTHVIPQGAQEFFGVPVKQI POLN_LORDV 1034 0 -

Motif 4 width=23
Element Seqn Id St Int Rpt
IHRAGEFTLFRFSKKIRPDLTGM POLN_SOUV3 1148 1 -
IHKSGEFCRLRFPKPIRTDVTGM POLN_LORDV 1057 1 -

Motif 5 width=22
Element Seqn Id St Int Rpt
LEEGCPEGTVCSVLIKRDSGEL POLN_SOUV3 1172 1 -
LEEGAPEGTVVTLLIKRSTGEL POLN_LORDV 1081 1 -

Motif 6 width=19
Element Seqn Id St Int Rpt
PLAVRMGAIASMRIQGRLV POLN_SOUV3 1195 1 -
PLAARMGTHATMKIQGRTV POLN_LORDV 1104 1 -

Motif 7 width=18
Element Seqn Id St Int Rpt
GQSGMLLTGANAKGMDLG POLN_SOUV3 1215 1 -
GQMGMLLTGSNAKSMDLG POLN_LORDV 1124 1 -

Motif 8 width=20
Element Seqn Id St Int Rpt
YKRANDWVVCGVHAAATKSG POLN_SOUV3 1244 11 -
YKRENDYVVIGVHTAAARGG POLN_LORDV 1153 11 -

Motif 9 width=20
Element Seqn Id St Int Rpt
NTVVCAVQASEGETTLEGGD POLN_SOUV3 1264 0 -
NTVICATQGSEGEATLEGGD POLN_LORDV 1173 0 -
Final Motifs
Motif 1  width=20
Element Seqn Id St Int Rpt
WADDEREVDYNEKISFEAPP POLN_SOUV3 1083 1083 -
WADDDREVDYNEKINFEAPP Q83883 1084 1084 -
WADDDRSVDYNEKLDFEAPP POLN_LORDV 992 992 -

Motif 2 width=20
Element Seqn Id St Int Rpt
WSRVTKFGSGWGFWVSPTVF POLN_SOUV3 1105 2 -
WSRVTKFGSGWGFWVSPTVF Q83883 1106 2 -
WSRIVNFGSGWGFWVSPSLF POLN_LORDV 1014 2 -

Motif 3 width=22
Element Seqn Id St Int Rpt
ITTTHVIPTSAKEFFGEPLTSI POLN_SOUV3 1125 0 -
ITTTHVVPTGVKEFFGEPLSSI Q83883 1126 0 -
ITSTHVIPQGAQEFFGVPVKQI POLN_LORDV 1034 0 -

Motif 4 width=23
Element Seqn Id St Int Rpt
IHRAGEFTLFRFSKKIRPDLTGM POLN_SOUV3 1148 1 -
IHQAGEFTQFRFSKKMRPDLTGM Q83883 1149 1 -
IHKSGEFCRLRFPKPIRTDVTGM POLN_LORDV 1057 1 -

Motif 5 width=22
Element Seqn Id St Int Rpt
LEEGCPEGTVCSVLIKRDSGEL POLN_SOUV3 1172 1 -
LEEGCPEGTVCSVLIKRDSGEL Q83883 1173 1 -
LEEGAPEGTVVTLLIKRSTGEL POLN_LORDV 1081 1 -

Motif 6 width=19
Element Seqn Id St Int Rpt
PLAVRMGAIASMRIQGRLV POLN_SOUV3 1195 1 -
PLAVRMGAIASMRIQGRLV Q83883 1196 1 -
PLAARMGTHATMKIQGRTV POLN_LORDV 1104 1 -

Motif 7 width=18
Element Seqn Id St Int Rpt
GQSGMLLTGANAKGMDLG POLN_SOUV3 1215 1 -
GQSGMLLTGANAKGMDLG Q83883 1216 1 -
GQMGMLLTGSNAKSMDLG POLN_LORDV 1124 1 -

Motif 8 width=20
Element Seqn Id St Int Rpt
YKRANDWVVCGVHAAATKSG POLN_SOUV3 1244 11 -
HKRGNDWVVCGVHAAATKSG Q83883 1245 11 -
YKRENDYVVIGVHTAAARGG POLN_LORDV 1153 11 -

Motif 9 width=20
Element Seqn Id St Int Rpt
NTVVCAVQASEGETTLEGGD POLN_SOUV3 1264 0 -
NTVVCAVQAGEGETALEGGD Q83883 1265 0 -
NTVICATQGSEGEATLEGGD POLN_LORDV 1173 0 -
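
The St and Int columns of the final motifs can be cross-checked against the motif widths. The sketch below assumes that Int is the number of residues between the end of the previous motif and the start of the current one, so that each start should equal the previous start plus the previous width plus Int. For motif 1 the entry repeats the start position in the Int column, so that row is not tested. The data are copied from the final motifs above; the function name is hypothetical.

```python
# Motif widths 1..9 as given in the "width=" headers of this entry.
WIDTHS = [20, 20, 22, 23, 22, 19, 18, 20, 20]

# (St, Int) pairs for motifs 1..9, copied from the final motifs.
ELEMENTS = {
    "POLN_SOUV3": [(1083, 1083), (1105, 2), (1125, 0), (1148, 1), (1172, 1),
                   (1195, 1), (1215, 1), (1244, 11), (1264, 0)],
    "Q83883": [(1084, 1084), (1106, 2), (1126, 0), (1149, 1), (1173, 1),
               (1196, 1), (1216, 1), (1245, 11), (1265, 0)],
    "POLN_LORDV": [(992, 992), (1014, 2), (1034, 0), (1057, 1), (1081, 1),
                   (1104, 1), (1124, 1), (1153, 11), (1173, 0)],
}


def inconsistent_motifs(rows, widths):
    """Return motif numbers whose St disagrees with the previous St, width and Int."""
    bad = []
    for n in range(1, len(rows)):
        prev_start, _ = rows[n - 1]
        start, interval = rows[n]
        if start != prev_start + widths[n - 1] + interval:
            bad.append(n + 1)  # report 1-based motif number
    return bad


if __name__ == "__main__":
    for code, rows in ELEMENTS.items():
        print(code, "inconsistent motifs:", inconsistent_motifs(rows, WIDTHS))
```

An empty list for a sequence means its nine elements tile the polyprotein in the order and spacing that the table reports.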