
==SPRINT==> PRINTS View





PR00916

Identifier
2CENDOPTASE
Accession
PR00916
No. of Motifs
4
Creation Date
29-JUN-1998  (UPDATE 06-JUN-1999)
Title
2C endopeptidase (C24) cysteine protease family signature
Database References

INTERPRO; IPR000317
Literature References
1. RAWLINGS, N.D. AND BARRETT, A.J.
Families of cysteine peptidases.
METHODS ENZYMOL. 244 461-486 (1994).
 
2. BARRETT, A.J. AND RAWLINGS, N.D.
Families and clans of cysteine peptidases
PERSPECTIVES DRUG DISCOVERY DESIGN 6 1-11 (1996).
 
3. RAWLINGS, N.D. AND BARRETT, A.J.
Family C24 - Clan PA - 3C endopeptidase 
http://www.bi.bbsrc.ac.uk/merops/famcards/c24.htm
 
4. FEDERHEN, S., HOTTON, C., LEIPE, D. AND SOUSSOV, V.
Calicivirus - NCBI Taxonomy Browser
http://www3.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?id=11975&lvl=3
 
5. WIRBLICH, C., THIEL, H. AND MEYERS, G.
Genetic map of the calicivirus rabbit hemorrhagic disease virus as deduced
from in vitro translation studies.
J.VIROL. 70(11) 7974-7983 (1996).

Documentation
Cysteine protease activity depends on a catalytic dyad of cysteine and
histidine; the order and spacing of these residues vary among the known
families. Nearly half of all cysteine proteases are found exclusively
in viruses [1]. Cysteine protease families have been grouped into five
clans (designated CA, CB, CC, CD and CE) on the basis of structural and
functional similarity. Families C1, C2 and C10, which belong to the CA clan,
have a Cys/His catalytic dyad, and are loosely termed papain-like. Families
in the CB clan have a His/Cys dyad, and contain enzymes from RNA viruses
distantly related to chymotrypsin. Enzymes in clan CC are also from RNA
viruses, but have a papain-like Cys/His active site. The remaining two
clans, CD and CE, contain only one family each [2]. Some families have not
yet been assigned to a clan.
 
Two additional clans (PA and PB) have been identified; these contain a
mixture of serine, cysteine and threonine proteases. Clan PA proteases have
a catalytically active serine or cysteine nucleophile as part of the
ordered triad His, Asp, Ser (or Cys). Clan PB proteases have a serine,
cysteine or threonine active residue at the N-terminus of the mature
protease [3].
 
Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis
[4]. The calicivirus genome contains two open reading frames, ORF1 and ORF2.
ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine
protease and RNA polymerase activity. The regions of the polyprotein in 
which these activities lie are similar to proteins produced by the
picornaviruses. ORF2 encodes a structural protein [5]. Two different families of
caliciviruses can be distinguished on the basis of sequence similarity, 
namely those classified as small round structured viruses (SRSVs) and those
classed as non-SRSVs. 
 
Calicivirus proteases from the non-SRSV group, which are members of the PA
protease clan, constitute family C24 of the cysteine proteases (proteases 
from SRSVs belong to the C37 family). As mentioned above, the protease 
activity resides within a polyprotein. The enzyme cleaves the polyprotein
at sites N-terminal to itself, liberating the polyprotein helicase.
 
2CENDOPTASE is a 4-element fingerprint that provides a signature for the 
cysteine protease (C24) of non-SRSV caliciviruses. The fingerprint was 
derived from an initial alignment of 4 sequences. The motifs were drawn
from conserved regions spanning the full length of the polyprotein protease,
focusing on those regions that characterise members of the C24 family but
distinguish them from the C37 proteases: motif 1 includes the active site
histidine residue, and motif 3 contains the catalytic cysteine. Two
iterations on OWL30.2 were required to reach convergence, at which point
a true set comprising 14 sequences was identified. 
 
An update on SPTR37_9f identified a true set of 12 sequences.
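
As a small illustration of the statement that motif 3 contains the catalytic cysteine, the sketch below locates the single Cys in each motif 3 element and converts it to a polyprotein coordinate. The rows are copied from the Initial Motifs table of this entry. The helper name cysteine_position is hypothetical, and the calculation assumes that the St column is the 1-based position of the first residue of the element.

```python
# Motif 3 rows from the Initial Motifs table: (element, sequence id, St).
MOTIF3 = [
    ("TTHGDCGLPLYD", "POLN_RHDV", 1207),
    ("THPGDCGLPYID", "POLN_FCVF9", 1188),
    ("THPGDCGLPYID", "POLN_FCVC6", 1190),
    ("TKKGDCGLPYFN", "POLN_MANCV", 1092),
]


def cysteine_position(element: str, start: int) -> int:
    """Return the sequence coordinate of the only Cys in a motif 3 element.

    Assumption: `start` is the 1-based coordinate of the element's first
    residue, so the Cys lies at start + (0-based offset within the element).
    """
    if element.count("C") != 1:
        raise ValueError("expected exactly one cysteine in the element")
    return start + element.index("C")


cys_positions = {
    seq_id: cysteine_position(element, start)
    for element, seq_id, start in MOTIF3
}

for seq_id, position in cys_positions.items():
    print(seq_id, position)
```

Under that assumption the cysteine falls at residue 1212 in POLN_RHDV. The same approach is not applied to motif 1 here, because its elements contain more than one histidine and the entry does not say which one is catalytic.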
Summary Information
12 codes involving  4 elements
0 codes involving 3 elements
0 codes involving 2 elements
Composite Feature Index
    4 | 12 12 12 12
    3 |  0  0  0  0
    2 |  0  0  0  0
   ---+------------
      |  1  2  3  4
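
The Composite Feature Index can be read as a small matrix: each row is a number of matched elements (4, 3, 2), each column is a motif number (1 to 4), and each cell counts the sequences that match that many elements and include that motif. The sketch below rebuilds the matrix from the true positives of this entry under that reading, which is an assumption about the layout rather than something the entry states. The function name composite_feature_index and the hits mapping are hypothetical.

```python
from collections import Counter

N_MOTIFS = 4

# The 12 true positives listed in this entry.
TRUE_POSITIVES = [
    "O92368", "POLN_FCVC6", "POLN_FCVF9", "POLN_MANCV",
    "POLN_RHDV", "Q66913", "Q66914", "Q86114",
    "Q86117", "Q86119", "Q89273", "Q96725",
]

# Motif number -> sequences with a matching element. In the Final Motifs
# tables every true positive appears under every motif.
hits = {motif: set(TRUE_POSITIVES) for motif in range(1, N_MOTIFS + 1)}


def composite_feature_index(hits, n_motifs):
    """Return {elements matched: [count for motif 1, ..., motif n]}."""
    # How many motifs does each sequence match?
    matched = Counter(seq for ids in hits.values() for seq in ids)
    table = {}
    for k in range(n_motifs, 1, -1):  # rows n, n-1, ..., 2
        table[k] = [
            sum(1 for seq in hits[motif] if matched[seq] == k)
            for motif in range(1, n_motifs + 1)
        ]
    return table


for k, row in composite_feature_index(hits, N_MOTIFS).items():
    print(k, row)
```

With all 12 sequences matching all 4 elements, the row for 4 elements is filled with 12 and the rows for 3 and 2 elements are zero, which agrees with the Summary Information.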
True Positives
O92368 POLN_FCVC6 POLN_FCVF9 POLN_MANCV
POLN_RHDV Q66913 Q66914 Q86114
Q86117 Q86119 Q89273 Q96725
Sequence Titles
O92368 NON-STRUCTURAL POLYPROTEIN - VESV-LIKE CALICIVIRUS.
POLN_FCVC6 NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYMERASE (EC 2.7.7.48); THIOL PROTEASE (EC 3.4.22.-); HELICASE (2C LIKE PROTEIN)] - FELINE CALICIVIRUS (STRAIN CFI/68 FIV) (FCV).
POLN_FCVF9 NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYMERASE (EC 2.7.7.48); THIOL PROTEASE (EC 3.4.22.-); HELICASE (2C LIKE PROTEIN)] - FELINE CALICIVIRUS (STRAIN F9) (FCV).
POLN_MANCV GENOME POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYMERASE (EC 2.7.7.48); THIOL PROTEASE 3C (EC 3.4.22.-); HELICASE (2C LIKE PROTEIN); COAT PROTEIN] - MANCHESTER VIRUS (HUMAN ENTERIC CALICIVIRUS).
POLN_RHDV NON-STRUCTURAL POLYPROTEIN [CONTAINS: RNA-DIRECTED RNA POLYMERASE (EC 2.7.7.48); THIOL PROTEASE P3C (EC 3.4.22.-); HELICASE (2C LIKE PROTEIN); COAT PROTEIN] - RABBIT HEMORRHAGIC DISEASE VIRUS (RHDV).
Q66913 NON-STRUCTURAL PROTEINS - FELINE CALICIVIRUS.
Q66914 POLYPROTEIN - FELINE CALICIVIRUS.
Q86114 POLYPROTEIN - RABBIT HEMORRHAGIC DISEASE VIRUS (RHDV).
Q86117 (SD) - RABBIT HEMORRHAGIC DISEASE VIRUS (RHDV).
Q86119 POLYPROTEIN - RABBIT HEMORRHAGIC DISEASE VIRUS (RHDV).
Q89273 POLYPROTEIN - RABBIT HEMORRHAGIC DISEASE VIRUS (RHDV).
Q96725 RNA - EUROPEAN BROWN HARE SYNDROME VIRUS.
Scan History
OWL30_2    2  50   NSINGLE    
SPTR37_9f 2 13 NSINGLE
Initial Motifs
Motif 1  width=18
Element Seqn Id St Int Rpt
GWMIHIGNGLYISNTHTA POLN_RHDV 1120 1120 -
GYCIHMGHGVYASVAHVV POLN_FCVF9 1095 1095 -
GYCVHMGHGVYASVAHVV POLN_FCVC6 1097 1097 -
GYGVHIGNGNVITVTHVA POLN_MANCV 997 997 -

Motif 2 width=17
Element Seqn Id St Int Rpt
AQIAEGTPVCDWKKSPI POLN_RHDV 1165 27 -
APFFSGKPTRDPWGSPV POLN_FCVF9 1145 32 -
APFFSGRPTRDPWGSPV POLN_FCVC6 1147 32 -
GPFSQLPHMQIGSGSPV POLN_MANCV 1039 24 -

Motif 3 width=12
Element Seqn Id St Int Rpt
TTHGDCGLPLYD POLN_RHDV 1207 25 -
THPGDCGLPYID POLN_FCVF9 1188 26 -
THPGDCGLPYID POLN_FCVC6 1190 26 -
TKKGDCGLPYFN POLN_MANCV 1092 36 -

Motif 4 width=11
Element Seqn Id St Int Rpt
SSGKIVAIHTG POLN_RHDV 1219 0 -
DNGRVTGLHTG POLN_FCVF9 1200 0 -
DNGRVTGLHTG POLN_FCVC6 1202 0 -
SNRQLVALHAG POLN_MANCV 1104 0 -
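
The Int column in the tables above appears to be the gap between an element and the end of the preceding element, with motif 1 simply repeating its St value. The sketch below recomputes Int from the St values and motif widths of the four initial sequences. The function name intervals is hypothetical, and the reading of Int is inferred from the numbers in the tables rather than stated by the entry.

```python
# Motif widths for motifs 1 to 4, from the "width=" headers.
WIDTHS = [18, 17, 12, 11]

# St values for motifs 1 to 4, from the Initial Motifs tables.
STARTS = {
    "POLN_RHDV": [1120, 1165, 1207, 1219],
    "POLN_FCVF9": [1095, 1145, 1188, 1200],
    "POLN_FCVC6": [1097, 1147, 1190, 1202],
    "POLN_MANCV": [997, 1039, 1092, 1104],
}


def intervals(starts, widths):
    """Recompute the Int column for one sequence.

    Motif 1 keeps its start value; every later motif gets the number of
    residues between the end of the previous element and its own start.
    """
    result = [starts[0]]
    for i in range(1, len(starts)):
        first_residue_after_previous = starts[i - 1] + widths[i - 1]
        result.append(starts[i] - first_residue_after_previous)
    return result


for seq_id, starts in STARTS.items():
    print(seq_id, intervals(starts, WIDTHS))
```

An interval of 0 for motif 4 means that it starts immediately after motif 3, so the two elements form one contiguous 23-residue stretch in each sequence.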
Final Motifs
Motif 1  width=18
Element Seqn Id St Int Rpt
GWMIHIGNGLYISNTHTA POLN_RHDV 1120 1120 -
GWMIHIGNGLYISNTHTA Q86117 1120 1120 -
GWMIHIGNGLYISNTHTA Q86119 1120 1120 -
GWMIHIGNGLYISNTHTA Q89273 1120 1120 -
GRMIHIGNGLYISNTHTA Q86114 1120 1120 -
GYCIHMGHGVYASVAHVV POLN_FCVF9 1095 1095 -
GWMIHIGNGMYLSNTHTA Q96725 1113 1113 -
GYCVHMGHGVYASVAHVV Q66913 1095 1095 -
GYCVHMGHGVYASVAHVV POLN_FCVC6 1097 1097 -
GYCVHMGHGVYATVAHVA Q66914 1095 1095 -
GYAIHIGHGVYISLKHVV O92368 1208 1208 -
GYGVHIGNGNVITVTHVA POLN_MANCV 997 997 -

Motif 2 width=17
Element Seqn Id St Int Rpt
AQIAEGTPVCDWKKSPI POLN_RHDV 1165 27 -
AQIAEGTPVCDWKKSPI Q86117 1165 27 -
AQIAEGTPVCDWKKSPI Q86119 1165 27 -
AQIAEGTPVCDWKKSPI Q89273 1165 27 -
AQIAEGTPVCDWKKSPI Q86114 1165 27 -
APFFSGKPTRDPWGSPV POLN_FCVF9 1145 32 -
AQIAEGTPVRDWKRASI Q96725 1158 27 -
APFFSGKPTRDPWGSPV Q66913 1145 32 -
APFFSGRPTRDPWGSPV POLN_FCVC6 1147 32 -
APFFPGKPTRDPWGSPV Q66914 1145 32 -
VPVGTSKPIKDPWGNPV O92368 1258 32 -
GPFSQLPHMQIGSGSPV POLN_MANCV 1039 24 -

Motif 3 width=12
Element Seqn Id St Int Rpt
TTHGDCGLPLYD POLN_RHDV 1207 25 -
TTHGDCGLPLYD Q86117 1207 25 -
TTHGDCGLPLYD Q86119 1207 25 -
TTHGDCGLPLYD Q89273 1207 25 -
TTHGDCGLPLYD Q86114 1207 25 -
THPGDCGLPYID POLN_FCVF9 1188 26 -
TTHGDCGLPLFD Q96725 1200 25 -
THPGDCGLPYID Q66913 1188 26 -
THPGDCGLPYID POLN_FCVC6 1190 26 -
THPGDCGLPYID Q66914 1188 26 -
TRQGDCGLPYVD O92368 1301 26 -
TKKGDCGLPYFN POLN_MANCV 1092 36 -

Motif 4 width=11
Element Seqn Id St Int Rpt
SSGKIVAIHTG POLN_RHDV 1219 0 -
SSGKIVAIHTG Q86117 1219 0 -
SSGKIVAIHTG Q86119 1219 0 -
SSGKIVAIHTG Q89273 1219 0 -
SSGKIVAIHTG Q86114 1219 0 -
DNGRVTGLHTG POLN_FCVF9 1200 0 -
EAGKVVAIHTG Q96725 1212 0 -
DNGRVTGLHTG Q66913 1200 0 -
DNGRVTGLHTG POLN_FCVC6 1202 0 -
DNGRVTGLHTG Q66914 1200 0 -
DHGVVVGLHAG O92368 1313 0 -
SNRQLVALHAG POLN_MANCV 1104 0 -
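
To see which positions are strictly conserved across the final true set, the sketch below compares the 12 motif 3 elements column by column and prints a pattern in which invariant residues are kept and variable positions are shown as "-". The function name invariant_pattern is hypothetical, and this is only a simple identity check, not the scoring method used to build the fingerprint.

```python
# Final motif 3 elements, in table order (one per true positive).
MOTIF3_FINAL = [
    "TTHGDCGLPLYD",  # POLN_RHDV
    "TTHGDCGLPLYD",  # Q86117
    "TTHGDCGLPLYD",  # Q86119
    "TTHGDCGLPLYD",  # Q89273
    "TTHGDCGLPLYD",  # Q86114
    "THPGDCGLPYID",  # POLN_FCVF9
    "TTHGDCGLPLFD",  # Q96725
    "THPGDCGLPYID",  # Q66913
    "THPGDCGLPYID",  # POLN_FCVC6
    "THPGDCGLPYID",  # Q66914
    "TRQGDCGLPYVD",  # O92368
    "TKKGDCGLPYFN",  # POLN_MANCV
]


def invariant_pattern(elements):
    """Keep residues shared by every element; mark other columns with '-'."""
    if len({len(element) for element in elements}) != 1:
        raise ValueError("all elements must have the same width")
    pattern = []
    for column in zip(*elements):
        residues = set(column)
        pattern.append(column[0] if len(residues) == 1 else "-")
    return "".join(pattern)


print(invariant_pattern(MOTIF3_FINAL))  # T--GDCGLP---
```

The invariant GDCGLP core includes the cysteine that the documentation identifies as catalytic.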