Identifier
ALYTICPTASE
Accession
PR00861
No. of Motifs
5
Creation Date
26-APR-1998  (UPDATE 06-JUN-1999)
Title
Alpha-lytic endopeptidase serine protease (S2A) signature
Database References

PROSITE; PS00134 TRYPSIN_HIS; PS00135 TRYPSIN_SER
BLOCKS; BL00134
INTERPRO; IPR001316
PDB; 2SFA; 1SGC;
SCOP; 2SFA; 1SGC;
CATH; 2SFA; 1SGC;
Literature References
1. RAWLINGS, N.D. AND BARRETT, A.J.
Families of serine peptidases.
METHODS ENZYMOL. 244 19-61 (1994).
 
2. RAWLINGS, N.D. AND BARRETT, A.J.
Evolutionary families of peptidases.
BIOCHEM.J. 290 205-218 (1993).
 
3. BAIROCH, A. AND RAWLINGS, N.
Classification of peptidase families and index of peptidase entries in
SWISS-PROT.
http://expasy.hcuge.ch/cgi-bin/lists?peptidas.txt

Documentation
Proteolytic enzymes that use serine in their catalytic machinery are 
widespread and numerous, being found in viruses, bacteria and eukaryotes
[1]. They encompass a range of peptidase activity, including exopeptidase,
endopeptidase, oligopeptidase and omega-peptidase. More than 20 serine
protease families (denoted S1 - S27) have been identified, which have been
grouped into 6 clans (SA, SB, SC, SE, SF and SG) on the basis of structural
and functional similarities [1]. Structures from four clans have been
examined (SA, SB, SC and SE): these appear to be unrelated, suggesting at 
least four evolutionary origins of serine peptidase, and possibly many more
[1]. Since that examination, structures for members of the other two clans
(SF and SG) have been determined [3].
 
Notwithstanding their different evolutionary origins, there are similarities
in the reaction mechanisms of several peptidases. The chymotrypsin, subtilisin
and carboxypeptidase C clans have a catalytic triad of serine, aspartate and
histidine in common: serine acts as a nucleophile, aspartate as an
electrophile, and histidine as a base [1]. The geometric orientations of
the catalytic residues are similar between families, despite different 
protein folds [1]. The linear arrangements of the catalytic residues
commonly reflect clan relationships. For example, the catalytic triad in 
the chymotrypsin clan (SA) is ordered HDS, but is ordered DHS in the
subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [1,2].
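
The ordering described above can be read directly from the sequence positions of the three catalytic residues. The following is a minimal Python sketch: the function name `triad_order` is hypothetical, and the residue numbers are the conventional numberings for chymotrypsin, subtilisin and carboxypeptidase Y, assumed here for illustration rather than taken from this entry.

```python
def triad_order(positions):
    """Return the catalytic residues sorted by sequence position, e.g. 'HDS'.

    positions maps a one-letter residue code to its position in the sequence.
    """
    return "".join(sorted(positions, key=positions.get))


# Illustrative residue numbers (assumed, not taken from this entry).
examples = {
    "SA": {"H": 57, "D": 102, "S": 195},   # chymotrypsin
    "SB": {"D": 32, "H": 64, "S": 221},    # subtilisin
    "SC": {"S": 146, "D": 338, "H": 397},  # carboxypeptidase Y
}

for clan, positions in examples.items():
    print(clan, triad_order(positions))
```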
 
Alpha-lytic endopeptidases belong to the chymotrypsin (SA) clan, within
which they have been assigned to subfamily A of the S2 family (S2A) [2,3].
Since the original classification, the S2 family has been split into three
subfamilies [3] on the basis of sequence similarity [1,2]. S2 proteases
are only known to be expressed in bacteria, from which they are secreted to
act externally [1]. These proteases have endopeptidase activity, which is
specific for basic, hydrophobic or alanine P1 residues. The alpha-lytic 
endopeptidase family also contains members that can cleave glutamyl
bonds [1]. 
 
ALYTICPTASE is a 5-element fingerprint that provides a signature for the
alpha-lytic endopeptidase (S2A) family of serine proteases. The fingerprint
was derived from an initial alignment of 13 sequences: the motifs were drawn
from conserved regions around the active site residues - motif 1 spans the 
C-terminus of the fourth beta-strand, helix 1 and the N-terminus of the 
fifth strand (it includes the region encoded by the PROSITE pattern 
TRYPSIN_HIS (PS00134), which contains the catalytic histidine); motif 2 
spans strands 10, 11 and the N-terminus of strand 12; motif 3 encodes the 
remainder of strand 12; motif 4 spans strands 13 and 14, and includes the 
region encoded by the PROSITE pattern TRYPSIN_SER (PS00135), which contains
the catalytic serine; and motif 5 encodes strand 15. Two iterations on 
OWL30.0 were required to reach convergence, at which point a true set
comprising 22 sequences was identified. 
 
An update on SPTR37_9f identified a true set of 18 sequences.
Summary Information
18 codes involving  5 elements
0 codes involving 4 elements
0 codes involving 3 elements
0 codes involving 2 elements
Composite Feature Index
    5 | 18 18 18 18 18
    4 |  0  0  0  0  0
    3 |  0  0  0  0  0
    2 |  0  0  0  0  0
   ---+---------------
      |  1  2  3  4  5
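
The index above can be reproduced from a per-sequence record of which motifs were matched. The sketch below assumes that each cell counts the sequences matching exactly k elements that include motif m; this is an interpretation consistent with the table, not a statement of how PRINTS computes it. The function name and the input layout are hypothetical.

```python
def composite_feature_index(matches, n_motifs):
    """Count, for each k (elements matched) and motif m, the matching sequences.

    matches maps a sequence identifier to the set of motif numbers it matched.
    """
    # Rows run from n_motifs down to 2, as in the entry.
    table = {k: [0] * n_motifs for k in range(n_motifs, 1, -1)}
    for motifs in matches.values():
        k = len(motifs)
        if k in table:
            for m in motifs:
                table[k][m - 1] += 1
    return table


# The 18 true positives of this entry, each matching all five motifs.
true_positives = [
    "GLUP_STRGR", "O32439", "O54109", "O86984", "PRLA_LYSEN", "PRTA_STRGR",
    "PRTB_STRGR", "PRTC_STRGR", "PRTD_STRGR", "Q53756", "Q54211", "Q54392",
    "Q54395", "Q55352", "Q55353", "SFA1_STRFR", "SFA2_STRFR", "SP1_RARFA",
]
matches = {seq_id: {1, 2, 3, 4, 5} for seq_id in true_positives}

cfi = composite_feature_index(matches, 5)
for k, row in cfi.items():
    print(k, "|", " ".join(f"{n:2d}" for n in row))
```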
True Positives
GLUP_STRGR O32439 O54109 O86984
PRLA_LYSEN PRTA_STRGR PRTB_STRGR PRTC_STRGR
PRTD_STRGR Q53756 Q54211 Q54392
Q54395 Q55352 Q55353 SFA1_STRFR
SFA2_STRFR SP1_RARFA
Sequence Titles
GLUP_STRGR  GLUTAMYL ENDOPEPTIDASE II (EC 3.4.21.82) (GLUTAMIC ACID-SPECIFIC PROTEASE) (GLUSGP) (PROTEASE E) (SGPE) - STREPTOMYCES GRISEUS. 
O32439 SERINE PROTEASE PRECURSOR - STREPTOMYCES ALBOGRISEOLUS.
O54109 PUTATIVE SECRETED PROTEASE - STREPTOMYCES COELICOLOR.
O86984 ALKALINE SERINE PROTEASE PRECURSOR - THERMOMONOSPORA FUSCA.
PRLA_LYSEN ALPHA-LYTIC PROTEASE PRECURSOR (EC 3.4.21.12) (ALPHA-LYTIC ENDOPEPTIDASE) - LYSOBACTER ENZYMOGENES.
PRTA_STRGR STREPTOGRISIN A PRECURSOR (EC 3.4.21.80) (PROTEASE A) (SGPA) (PRONASE ENZYME A) - STREPTOMYCES GRISEUS.
PRTB_STRGR STREPTOGRISIN B PRECURSOR (EC 3.4.21.81) (PROTEASE B) (SGPB) (PRONASE ENZYME B) - STREPTOMYCES GRISEUS.
PRTC_STRGR SERINE PROTEASE C PRECURSOR (EC 3.4.21.-) (SGPC) - STREPTOMYCES GRISEUS.
PRTD_STRGR SERINE PROTEASE D PRECURSOR (EC 3.4.21.-) (SGPD) - STREPTOMYCES GRISEUS.
Q53756 SAM-P20 SERINE PROTEASE PRECURSOR - STREPTOMYCES ALBOGRISEOLUS.
Q54211 GLUTAMIC ACID SPECIFIC PROTEASE PRECURSOR - STREPTOMYCES GRISEUS.
Q54392 PROTEASE - STREPTOMYCES LIVIDANS.
Q54395 SALO PRECURSOR - STREPTOMYCES LIVIDANS.
Q55352 ALKALINE SERINE PROTEASE I - STREPTOMYCES SP.
Q55353 ALKALINE SERINE PROTEASE II - STREPTOMYCES SP.
SFA1_STRFR SERINE PROTEASE 1 PRECURSOR (EC 3.4.21.-) (GLUTAMIC ACID-SPECIFIC PROTEASE) (SFASE-1) - STREPTOMYCES FRADIAE.
SFA2_STRFR SERINE PROTEASE 2 (EC 3.4.21.-) (SFASE-2) - STREPTOMYCES FRADIAE.
SP1_RARFA SERINE PROTEASE I PRECURSOR (EC 3.4.21.-) (RPI) - RAROBACTER FAECITABIDUS.
Scan History
OWL30_0    2  40   NSINGLE    
SPTR37_9f 2 21 NSINGLE
Initial Motifs
Motif 1  width=15
Element Seqn Id St Int Rpt
YFLTAGHCTNLSSTW SFA1_STRFR 197 197 -
GFVTAGHCGTVNATA PRLA_LYSEN 229 229 -
GFLTAGHCAVEGKGH SP1_RARFA 232 232 -
GFATAGHCGTVGTST PC2053 224 224 -
AFLTRGHCGGGATMW STMSALO 185 185 -
GFVTAGHCGSVGNAT S34672 228 228 -
YMMTAGHCAEDSSYW SC10A516 250 250 -
YALTAGHCTEIASTW SFA2_STRFR 29 29 -
YFLTAGHCTDGATTW PRTB_STRGR 141 141 -
YFLTAGHCTESVTSW PRTD_STRGR 231 231 -
HALTAGHCTNISASW PRTA_STRGR 143 143 -
GFATAGHCGRVGTTT PRTC_STRGR 232 232 -
YFVTAGHCTNISANW GLUP_STRGR 27 27 -

Motif 2 width=21
Element Seqn Id St Int Rpt
VRGSTEAAVGAAVCRSGRTTG PRLA_LYSEN 287 43 -
IASAADAVVGQAIKKSGSTTK SFA1_STRFR 258 46 -
ISSAANAVVGQAIKKSGSTTK GLUP_STRGR 88 46 -
ITGAGNAYVGQTVQRSGSTTG SFA2_STRFR 91 47 -
VSGSTEAAVGASICRSGSTTG PC2053 281 42 -
ISGAAEASVGQEVFRMGSTTG STMSALO 247 47 -
VTGSTQATVGSSICRSGSTTG S34672 287 44 -
ITNWDYDYVGQYVCKQGSTTG SC10A516 313 48 -
VAGSTASVVGASVCRSGSTTG PRTC_STRGR 292 45 -
ITTAGNAFVGQAVQRSGSTTG PRTA_STRGR 197 39 -
ITQAGDATVGQAVTRSGSTTQ PRTD_STRGR 292 46 -
ITSAANATVGMAVTRRGSTTG PRTB_STRGR 199 43 -
IKGSNEAAVGAHMCKSGRTTK SP1_RARFA 297 50 -

Motif 3 width=18
Element Seqn Id St Int Rpt
WHCGTIQQHNTSVTYPEG PC2053 302 0 -
LADGQVLGLDVTVNYPEG STMSALO 268 0 -
WRCGTIQQHNTSVTYPQG S34672 308 0 -
YTCGQITETNATVSYPGR SC10A516 334 0 -
LHSGRVTGLNATVNYGGG SFA2_STRFR 112 0 -
THSGSVTALNATVNYGGG PRTB_STRGR 220 0 -
VHDGEVTALDATVNYGNG PRTD_STRGR 313 0 -
LRSGSVTGLNATVNYGSS PRTA_STRGR 218 0 -
WHCGTIQQLNTSVTYPEG PRTC_STRGR 313 0 -
VTSGTVTAVNVTVNYGDG GLUP_STRGR 109 0 -
VTSGTVSAVNVTVNYSDG SFA1_STRFR 279 0 -
YQCGTITAKNVTANYAEG PRLA_LYSEN 308 0 -
WTCGYLLRKDVSVNYGNG SP1_RARFA 318 0 -

Motif 4 width=24
Element Seqn Id St Int Rpt
VYNMGRTTACSAGGDSGGAHFAGS GLUP_STRGR 128 1 -
VYGMVRTTACSAGGDSGGAHFAGS SFA1_STRFR 298 1 -
VRGLTQGNACMGRGDSGGSWITSA PRLA_LYSEN 327 1 -
IVTLNETSACALGGDSGGAYVWND SP1_RARFA 337 1 -
VSGLIQTNVCAEPGDSGGALFAGS SFA2_STRFR 132 2 -
VYGMIRTNVCAEPGDSGGPLYSGT PRTB_STRGR 240 2 -
VNGLIQTTVCAEPGDSGGALFAGD PRTD_STRGR 333 2 -
VYGMIQTNVCAEPGDSGGSLFAGS PRTA_STRGR 238 2 -
LTGMTWSTACDAPGDSGSGVYDGS SC10A516 353 1 -
ITGVTRTSACAQPGDSGGSFISGT S34672 327 1 -
VTGLIQTDVCAEPGDSGGSLFTRD STMSALO 287 1 -
ITGVTRTSVCAEPGDSGGSYISGS PC2053 321 1 -
ISGVTRTSVCAEPGDSGGSYISGS PRTC_STRGR 332 1 -

Motif 5 width=15
Element Seqn Id St Int Rpt
TALGLTSGGSGNCRT SFA2_STRFR 156 0 -
TALGLTSGGSGDCSS PRTD_STRGR 357 0 -
TALGLTSGGSGNCRT PRTA_STRGR 262 0 -
QAQGVTSGGSGNCSS PRTC_STRGR 356 0 -
VALGIHSGSSGCSGT GLUP_STRGR 152 0 -
VALGIHSGSSGCTGT SFA1_STRFR 322 0 -
QAQGVMSGGNVQSNG PRLA_LYSEN 352 1 -
QAQGITSGSNMDTNN SP1_RARFA 361 0 -
TAHGILSGGPNSGCG SC10A516 377 0 -
QAQGVTSGGSGNCSI S34672 351 0 -
LAIRLTSGGTRDCTS STMSALO 312 1 -
QAQGVTSGGSGNCTS PC2053 345 0 -
RAIGLTSGGSGNCSS PRTB_STRGR 264 0 -
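
In the tables above, the Int column appears to give the number of residues between the end of the previous motif and the start of the current one; for motif 1 it equals the start position. This reading is inferred from the listed values, not taken from PRINTS documentation. A short Python sketch that checks it for two of the initial sequences (the function name `intervals` is hypothetical):

```python
def intervals(starts, widths):
    """Derive the Int values from motif start positions and motif widths."""
    out = [starts[0]]  # motif 1: Int equals the start position
    for i in range(1, len(starts)):
        previous_end = starts[i - 1] + widths[i - 1]
        out.append(starts[i] - previous_end)
    return out


# Motif widths and St values copied from the Initial Motifs tables.
widths = [15, 21, 18, 24, 15]
starts = {
    "PRLA_LYSEN": [229, 287, 308, 327, 352],
    "SFA1_STRFR": [197, 258, 279, 298, 322],
}

for seq_id, st in starts.items():
    print(seq_id, intervals(st, widths))
```

The values printed for both sequences agree with the Int column of the five tables.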
Final Motifs
Motif 1  width=15
Element Seqn Id St Int Rpt
YALTAGHCTEIASTW SFA2_STRFR 29 29 -
YFLTAGHCTDGATTW PRTB_STRGR 141 141 -
YFLTAGHCTESVTSW PRTD_STRGR 231 231 -
YFLTAGHCTDGAGTW Q53756 143 143 -
HFLTAGHCTEGISTW O32439 123 123 -
YFLTAGHCTDGAGAW Q54392 143 143 -
HALTAGHCTNISASW PRTA_STRGR 143 143 -
GFATAGHCGRVGTTT PRTC_STRGR 232 232 -
GFATAGHCGTVGTST Q55353 224 224 -
YFVTAGHCTNISANW Q54211 195 195 -
YFVTAGHCTNISANW GLUP_STRGR 27 27 -
YFLTAGHCTNLSSTW SFA1_STRFR 197 197 -
GFVTAGHCGSVGNAT Q55352 228 228 -
GFATAGHCGSTGTRV O86984 211 211 -
AFLTRGHCGGGATMW Q54395 185 185 -
GFVTAGHCGTVNATA PRLA_LYSEN 229 229 -
GFLTAGHCAVEGKGH SP1_RARFA 232 232 -
YMMTAGHCAEDSSYW O54109 250 250 -

Motif 2 width=21
Element Seqn Id St Int Rpt
ITGAGNAYVGQTVQRSGSTTG SFA2_STRFR 91 47 -
ITSAANATVGMAVTRRGSTTG PRTB_STRGR 199 43 -
ITQAGDATVGQAVTRSGSTTQ PRTD_STRGR 292 46 -
ITRAATPSVGTTVIRDGSTTG Q53756 200 42 -
ISGAAEAAVGMQVTRSGSTTQ O32439 183 45 -
ITRAATPSVGTTVIRDGSTTG Q54392 199 41 -
ITTAGNAFVGQAVQRSGSTTG PRTA_STRGR 197 39 -
VAGSTASVVGASVCRSGSTTG PRTC_STRGR 292 45 -
VSGSTEAAVGASICRSGSTTG Q55353 281 42 -
ISSAANAVVGQAIKKSGSTTK Q54211 256 46 -
ISSAANAVVGQAIKKSGSTTK GLUP_STRGR 88 46 -
IASAADAVVGQAIKKSGSTTK SFA1_STRFR 258 46 -
VTGSTQATVGSSICRSGSTTG Q55352 287 44 -
VTGSQEAATGSSVCRSGATTG O86984 267 41 -
ISGAAEASVGQEVFRMGSTTG Q54395 247 47 -
VRGSTEAAVGAAVCRSGRTTG PRLA_LYSEN 287 43 -
IKGSNEAAVGAHMCKSGRTTK SP1_RARFA 297 50 -
ITNWDYDYVGQYVCKQGSTTG O54109 313 48 -

Motif 3 width=18
Element Seqn Id St Int Rpt
LHSGRVTGLNATVNYGGG SFA2_STRFR 112 0 -
THSGSVTALNATVNYGGG PRTB_STRGR 220 0 -
VHDGEVTALDATVNYGNG PRTD_STRGR 313 0 -
THSGRVTALNATVNYGGG Q53756 221 0 -
VHSGTVTGLDATVNYGNG O32439 204 0 -
THSGRVTALNATVNYGGG Q54392 220 0 -
LRSGSVTGLNATVNYGSS PRTA_STRGR 218 0 -
WHCGTIQQLNTSVTYPEG PRTC_STRGR 313 0 -
WHCGTIQQHNTSVTYPEG Q55353 302 0 -
VTSGTVTAVNVTVNYGDG Q54211 277 0 -
VTSGTVTAVNVTVNYGDG GLUP_STRGR 109 0 -
VTSGTVSAVNVTVNYSDG SFA1_STRFR 279 0 -
WRCGTIQQHNTSVTYPQG Q55352 308 0 -
WRCGTIQSKNQTVRYAEG O86984 288 0 -
LADGQVLGLDVTVNYPEG Q54395 268 0 -
YQCGTITAKNVTANYAEG PRLA_LYSEN 308 0 -
WTCGYLLRKDVSVNYGNG SP1_RARFA 318 0 -
YTCGQITETNATVSYPGR O54109 334 0 -

Motif 4 width=24
Element Seqn Id St Int Rpt
VSGLIQTNVCAEPGDSGGALFAGS SFA2_STRFR 132 2 -
VYGMIRTNVCAEPGDSGGPLYSGT PRTB_STRGR 240 2 -
VNGLIQTTVCAEPGDSGGALFAGD PRTD_STRGR 333 2 -
VSGLIQTTVCAEPGDSGGPLYGSN Q53756 241 2 -
VNGLIQTDVCAEPGDSGGSLFSGD O32439 224 2 -
VGGLIQTTVCAEPGDSGGSLYGSN Q54392 240 2 -
VYGMIQTNVCAEPGDSGGSLFAGS PRTA_STRGR 238 2 -
ISGVTRTSVCAEPGDSGGSYISGS PRTC_STRGR 332 1 -
ITGVTRTSVCAEPGDSGGSYISGS Q55353 321 1 -
VYNMVRTTACSAGGDSGGAHFAGS Q54211 296 1 -
VYNMGRTTACSAGGDSGGAHFAGS GLUP_STRGR 128 1 -
VYGMVRTTACSAGGDSGGAHFAGS SFA1_STRFR 298 1 -
ITGVTRTSACAQPGDSGGSFISGT Q55352 327 1 -
VTGLTRTTACAEGGDSGGPWLTGS O86984 307 1 -
VTGLIQTDVCAEPGDSGGSLFTRD Q54395 287 1 -
VRGLTQGNACMGRGDSGGSWITSA PRLA_LYSEN 327 1 -
IVTLNETSACALGGDSGGAYVWND SP1_RARFA 337 1 -
LTGMTWSTACDAPGDSGSGVYDGS O54109 353 1 -

Motif 5 width=15
Element Seqn Id St Int Rpt
TALGLTSGGSGNCRT SFA2_STRFR 156 0 -
RAIGLTSGGSGNCSS PRTB_STRGR 264 0 -
TALGLTSGGSGDCSS PRTD_STRGR 357 0 -
TAYGLTSGGSGNCSS Q53756 266 1 -
KAVGLTSGGSGDCTS O32439 248 0 -
TAYGLTSGGSGNCSS Q54392 265 1 -
TALGLTSGGSGNCRT PRTA_STRGR 262 0 -
QAQGVTSGGSGNCSS PRTC_STRGR 356 0 -
QAQGVTSGGSGNCTS Q55353 345 0 -
VALGIHSGSSGCSGT Q54211 320 0 -
VALGIHSGSSGCSGT GLUP_STRGR 152 0 -
VALGIHSGSSGCTGT SFA1_STRFR 322 0 -
QAQGVTSGGSGNCSI Q55352 351 0 -
QAQGVTSGGTGDCRS O86984 331 0 -
LAIRLTSGGTRDCTS Q54395 312 1 -
QAQGVMSGGNVQSNG PRLA_LYSEN 352 1 -
QAQGITSGSNMDTNN SP1_RARFA 361 0 -
TAHGILSGGPNSGCG O54109 377 0 -
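
As an illustration of how conserved the region around the catalytic histidine is, the following Python sketch lists the columns of final Motif 1 that are identical in all 18 elements. The helper name `invariant_columns` is hypothetical, and strict column identity is a simplification chosen for this example; it is not the scoring method used to build the fingerprint.

```python
# The 18 elements of final Motif 1, copied from the table above.
motif1 = [
    "YALTAGHCTEIASTW", "YFLTAGHCTDGATTW", "YFLTAGHCTESVTSW",
    "YFLTAGHCTDGAGTW", "HFLTAGHCTEGISTW", "YFLTAGHCTDGAGAW",
    "HALTAGHCTNISASW", "GFATAGHCGRVGTTT", "GFATAGHCGTVGTST",
    "YFVTAGHCTNISANW", "YFVTAGHCTNISANW", "YFLTAGHCTNLSSTW",
    "GFVTAGHCGSVGNAT", "GFATAGHCGSTGTRV", "AFLTRGHCGGGATMW",
    "GFVTAGHCGTVNATA", "GFLTAGHCAVEGKGH", "YMMTAGHCAEDSSYW",
]


def invariant_columns(elements):
    """Map each 0-based column index to its residue if all elements agree."""
    return {
        i: column[0]
        for i, column in enumerate(zip(*elements))
        if len(set(column)) == 1
    }


conserved = invariant_columns(motif1)
# Show invariant residues in place and every variable column as '.'.
pattern = "".join(conserved.get(i, ".") for i in range(len(motif1[0])))
print(pattern)
```

Only the threonine at column 4 and the Gly-His-Cys run at columns 6 to 8 (1-based) are invariant across the set; the histidine in that run is presumably the catalytic residue referred to in the documentation.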