
GLHYDRLASE2      Glycosyl hydrolase family 2 signature
 Type of fingerprint: COMPOUND with 5  elements
Links:
   PRINTS; PR00131 GLHYDRLASE1; PR00133 GLHYDRLASE3; PR00732 GLHYDRLASE4
   PRINTS; PR00733 GLHYDRLASE6; PR00734 GLHYDRLASE7; PR00735 GLHYDRLASE8
   PRINTS; PR00134 GLHYDRLASE10; PR00911 GLHYDRLASE11; PR00736 GLHYDRLASE15
   PRINTS; PR00737 GLHYDRLASE16; PR00738 GLHYDRLASE20; PR00739 GLHYDRLASE26
   PRINTS; PR00740 GLHYDRLASE27; PR00741 GLHYDRLASE29; PR00843 GLHYDRLASE30
   PRINTS; PR00742 GLHYDRLASE35; PR00743 GLHYDRLASE36; PR00744 GLHYDRLASE37
   PRINTS; PR00745 GLHYDRLASE39; PR00746 GLHYDRLASE41; PR00747 GLHYDRLASE47
   PRINTS; PR00844 GLHYDRLASE48; PR00845 GLHYDRLASE52; PR00846 GLHYDRLASE56
   PRINTS; PR00849 GLHYDRLASE58; PR00850 GLHYDRLASE59; PR00748 MELIBIASE
   PRINTS; PR00137 LYSOZYME; PR00684 T4LYSOZYME; PR00749 LYSOZYMEG
   PRINTS; PR00110 ALPHAAMYLASE; PR00750 BETAAMYLASE
   INTERPRO; IPR001649
   PROSITE; PS00719 GLYCOSYL_HYDROL_F2_1; PS00608 GLYCOSYL_HYDROLASE_F2_2

 Creation date 08-NOV-1994; UPDATE 07-JUN-1999

   1. HENRISSAT, B. AND BAIROCH, A.
   New families in the classification of glycosyl hydrolases based on amino
   acid sequence similarities.
   BIOCHEM.J. 293 781-788 (1993).

   2. HENRISSAT, B.
   A classification of glycosyl hydrolases based on amino acid sequence
   similarities.
   BIOCHEM.J. 280 309-316 (1991).

   3. SCHROEDER, C.J., ROBERT, C., LENZEN, G., MCKAY, L.L. AND MERCENIER, A.
   Analysis of the lacZ sequences from 2 Streptococcus thermophilus strains -
   Comparison with the Escherichia coli and Lactobacillus bulgaricus beta-
   galactosidase sequences.
   J.GEN.MICROBIOL. 137 369-380 (1991).

   4. GEBLER, J.C., AEBERSOLD, R. AND WITHERS, S.G.
   Glu-537, not Glu-461, is the nucleophile in the active site of (lacZ)
   beta-galactosidase from Escherichia coli.
   J.BIOL.CHEM. 267 11126-11130 (1992). 

   O-Glycosyl hydrolases (EC 3.2.1.-) are a widespread group of enzymes that
   hydrolyse the glycosidic bond between two or more carbohydrates, or between
   a carbohydrate and a non-carbohydrate moiety. A classification system for
   glycosyl hydrolases, based on sequence similarity, has led to the definition
   of up to 60 different families [1-3]
   (http://expasy.hcuge.ch/cgi-bin/lists?glycosid.txt).
   
   Family 2 includes various bacterial beta-galactosidases and beta-
   glucuronidases. Alignments of such sequences reveal a number of short
   conserved regions, one of which contains a glutamic acid residue that has
   been shown to be the general acid/base catalyst in the active site of
   E.coli lacZ beta-galactosidase [4].
  
   GLHYDRLASE2 is a 5-element fingerprint that provides a signature for
   family 2 glycosyl hydrolases. The fingerprint was derived from an initial
   alignment of 5 sequences: the motifs were drawn from conserved regions
   spanning virtually the full alignment length - motifs 3 and 4 include
   regions encoded by PROSITE patterns GLYCOSYL_HYDROL_F2_1 (PS00719) and
   GLYCOSYL_HYDROLASE_F2_2 (PS00608), the latter containing the catalytically
   active Glu. Two iterations on OWL24.0 were required to reach convergence,
   at which point a true set comprising 23 sequences was identified. Four
   partial matches were also found: BGAL_THETU fails to make a significant
   match with motif 5; RHMLACZ only matches motifs 2 and 3; BGLB_BACPO is a
   family 1 glycosyl hydrolase that matches motifs 3 and 5; and GUNA_XANCP is
   a family 5 glycosyl hydrolase that matches motifs 3 and 4.
  
   An update on SPTR37_9f identified a true set of 29 sequences, and 3
   partial matches.
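
   To illustrate how a motif derived from the initial alignment can be used
   to locate a matching region in another sequence, the following is a minimal
   sketch. It builds simple per-column residue counts from the five initial
   motif 4 sequences of this entry and slides them along a query. The scoring
   scheme, the function names and the flanking residues of the query are
   assumptions made for illustration; this is not the scanning method used to
   build the fingerprint.

```python
from collections import Counter

# Initial motif 4 set, copied from the INITIAL MOTIF SETS listing of this entry.
MOTIF4_INITIAL = [
    "DRNHPSVIIWSLGNES",  # BGAL_ECOLI
    "NRNHPCIIIWSLGNES",  # BGAL_KLEPN
    "DKNHPAVVMWSVANEP",  # BGLR_RAT
    "FKNHPSIIFWSLGNES",  # BGAL_LEULA
    "DKNHPSVVMWSIANEP",  # BGLR_ECOLI
]


def frequency_matrix(motif_seqs):
    """Return one Counter of residue counts per motif column."""
    width = len(motif_seqs[0])
    if any(len(seq) != width for seq in motif_seqs):
        raise ValueError("all motif sequences must have the same length")
    return [Counter(seq[i] for seq in motif_seqs) for i in range(width)]


def best_window(sequence, matrix):
    """Slide the matrix along the sequence; return (1-based start, score).

    The score of a window is the sum of the column counts of its residues
    (an assumed, deliberately simple scheme). Returns (0, -1) if the
    sequence is shorter than the motif.
    """
    width = len(matrix)
    best = (0, -1)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[res] for col, res in zip(matrix, window))
        if score > best[1]:
            best = (start + 1, score)
    return best


matrix = frequency_matrix(MOTIF4_INITIAL)

# Hypothetical query: the BGAL_BACME motif 4 sequence from the final set,
# embedded between made-up flanking residues.
query = "MSTAAGG" + "DKNHPSIIIWSLGNES" + "GGTAAKL"
start, score = best_window(query, matrix)
print(start, score)
```

   The best-scoring window starts where the BGAL_BACME motif was embedded,
   even though that sequence was not part of the initial set.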

  SUMMARY INFORMATION
     29 codes involving  5 elements
      2 codes involving  4 elements
      0 codes involving  3 elements
      1 codes involving  2 elements

   COMPOSITE FINGERPRINT INDEX
  
    5|  29   29   29   29   29  
    4|   2    2    2    2    0  
    3|   0    0    0    0    0  
    2|   0    1    1    0    0  
   --+--------------------------
     |   1    2    3    4    5  
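
   The index above can be read as follows: each row counts, for codes
   matching a given number of elements, how many of them match each motif.
   The sketch below rebuilds the table from per-sequence sets of matched
   motifs. The motif sets for BGAL_THETU and BGAL_RHIME follow the description
   in this entry; the set for BGAL_THEET is an assumption inferred from the
   index columns, and the function name is hypothetical.

```python
# Codes matching all five motifs, copied from the true positives list.
FULL = """
BGAL_BACME BGAL_ECOLI P97096 O09266 O85167 BGAL_KLEPN BGAL_ENTCL O09267
BGAL_LACLA O87523 BGAL_THEMA O85250 BGAL_STRTR BGAL_CLOAB BGAL_LACSK BGAL_LACDE
BGAL_LEULA BGAL_KLULA BGA2_ECOLI Q47170 BGAL_ARTSP BGAL_LACAC BGAL_STAXY BGAL_ACTPL
BGLR_CANFA BGLR_HUMAN BGLR_RAT BGLR_MOUSE BGLR_ECOLI
""".split()

matches = {code: {1, 2, 3, 4, 5} for code in FULL}
matches["BGAL_THETU"] = {1, 2, 3, 4}  # fails to match motif 5
matches["BGAL_THEET"] = {1, 2, 3, 4}  # assumption: inferred from the index
matches["BGAL_RHIME"] = {2, 3}        # matches motifs 2 and 3 only


def composite_index(match_sets, n_motifs):
    """Map each element count k (n_motifs down to 2) to per-motif counts."""
    index = {k: [0] * n_motifs for k in range(n_motifs, 1, -1)}
    for motifs in match_sets.values():
        k = len(motifs)
        if k in index:
            for m in motifs:
                index[k][m - 1] += 1
    return index


index = composite_index(matches, 5)
for k, row in index.items():
    print(f"{k:>4}|" + "".join(f"{n:>5}" for n in row))
```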

True positives..
 BGAL_BACME     BGAL_ECOLI     P97096         O09266         
 O85167         BGAL_KLEPN     BGAL_ENTCL     O09267         
 BGAL_LACLA     O87523         BGAL_THEMA     O85250         
 BGAL_STRTR     BGAL_CLOAB     BGAL_LACSK     BGAL_LACDE     
 BGAL_LEULA     BGAL_KLULA     BGA2_ECOLI     Q47170         
 BGAL_ARTSP     BGAL_LACAC     BGAL_STAXY     BGAL_ACTPL     
 BGLR_CANFA     BGLR_HUMAN     BGLR_RAT       BGLR_MOUSE     
 BGLR_ECOLI     
Subfamily:  Codes involving 4 elements
 Subfamily True positives..
 BGAL_THETU     BGAL_THEET     
Subfamily:  Codes involving 2 elements
 Subfamily True positives..
 BGAL_RHIME     


  PROTEIN TITLES
   BGAL_BACME       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - BACILLUS MEGATE
   BGAL_ECOLI       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - ESCHERICHIA COL
   P97096           CLONING VECTOR PLACZI, COMPLETE PLASMID SEQUENCE - CLONING V
   O09266           BETA-GALACTOSIDASE - UNIDENTIFIED.
   O85167           BETA-GALACTOSIDASE - BACILLUS MEGATERIUM.
   BGAL_KLEPN       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - KLEBSIELLA PNEU
   BGAL_ENTCL       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - ENTEROBACTER CL
   O09267           BETA-GALACTOSIDASE - UNIDENTIFIED.
   BGAL_LACLA       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - LACTOCOCCUS LAC
   O87523           BETA-GALACTOSIDASE - LACTOCOCCUS LACTIS.
   BGAL_THEMA       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - THERMOTOGA MARI
   O85250           BETA-GALACTOSIDASE - THERMOTOGA NEAPOLITANA.
   BGAL_STRTR       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - STREPTOCOCCUS T
   BGAL_CLOAB       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - CLOSTRIDIUM ACE
   BGAL_LACSK       BETA-GALACTOSIDASE LARGE SUBUNIT (EC 3.2.1.23) (LACTASE) - L
   BGAL_LACDE       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - LACTOBACILLUS D
   BGAL_LEULA       BETA-GALACTOSIDASE LARGE SUBUNIT (EC 3.2.1.23) (LACTASE) - L
   BGAL_KLULA       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - KLUYVEROMYCES L
   BGA2_ECOLI       EVOLVED BETA-GALACTOSIDASE ALPHA-SUBUNIT (EC 3.2.1.23) (LACT
   Q47170           EVOLVED BETA-GALACTOSIDASE - ESCHERICHIA COLI.
   BGAL_ARTSP       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - ARTHROBACTER SP
   BGAL_LACAC       BETA-GALACTOSIDASE LARGE SUBUNIT (EC 3.2.1.23) (LACTASE) - L
   BGAL_STAXY       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - STAPHYLOCOCCUS 
   BGAL_ACTPL       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - ACTINOBACILLUS 
   BGLR_CANFA       BETA-GLUCURONIDASE PRECURSOR (EC 3.2.1.31) - CANIS FAMILIARI
   BGLR_HUMAN       BETA-GLUCURONIDASE PRECURSOR (EC 3.2.1.31) (BETA-G1) - HOMO 
   BGLR_RAT         BETA-GLUCURONIDASE PRECURSOR (EC 3.2.1.31) - RATTUS NORVEGIC
   BGLR_MOUSE       BETA-GLUCURONIDASE PRECURSOR (EC 3.2.1.31) - MUS MUSCULUS (M
   BGLR_ECOLI       BETA-GLUCURONIDASE (EC 3.2.1.31) (GUS) (BETA-D-GLUCURONOSIDE
 
   BGAL_THETU       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - THERMOANAEROBAC
   BGAL_THEET       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - THERMOANAEROBAC
 
   BGAL_RHIME       BETA-GALACTOSIDASE (EC 3.2.1.23) (LACTASE) - RHIZOBIUM MELIL

  SCAN HISTORY
   OWL24_0    2  100  NSINGLE
   SPTR37_9f  2  106  NSINGLE

  INITIAL MOTIF SETS

GLHYDRLASE21  Length of motif = 16  Motif number = 1
Cellulase 2 motif I - 1
                 PCODE         ST   INT
VNSAFHLWCNGRWVGY BGAL_ECOLI   146   146
VNSAFHLWCNGVWVGY BGAL_KLEPN   152   152
AHYYAVVWVNGIHVVE BGLR_RAT     126   126
VEEALYVWLNGHFIGY BGAL_LEULA   154   154
VTHYGKVWVNNQEVME BGLR_ECOLI    91    91

GLHYDRLASE22  Length of motif = 15  Motif number = 2
Cellulase 2 motif II - 1
                PCODE         ST   INT
LNGKPLLIRGVNRHE BGAL_ECOLI   344   182
LNGKPLLIRGVNRHE BGAL_KLEPN   351   183
INGKPFYFQGVNKHE BGLR_RAT     334   192
VNNKRLVINGVNRHE BGAL_LEULA   341   171
INHKPFYFTGFGRHE BGLR_ECOLI   283   176

GLHYDRLASE23  Length of motif = 19  Motif number = 3
Cellulase 2 motif III - 1
                    PCODE         ST   INT
DILLMKQNNFNAVRCSHYP BGAL_ECOLI   375    16
DILLMKQNNFNAVRCSHYP BGAL_KLEPN   382    16
DFNLLRWLGANSFRTSHYP BGLR_RAT     365    16
DIQTMLANNINADRTCHYP BGAL_LEULA   372    16
DHALMDWIGANSYRTSHYP BGLR_ECOLI   314    16

GLHYDRLASE24  Length of motif = 16  Motif number = 4
Cellulase 2 motif IV - 1
                 PCODE         ST   INT
DRNHPSVIIWSLGNES BGAL_ECOLI   447    53
NRNHPCIIIWSLGNES BGAL_KLEPN   454    53
DKNHPAVVMWSVANEP BGLR_RAT     433    49
FKNHPSIIFWSLGNES BGAL_LEULA   452    61
DKNHPSVVMWSIANEP BGLR_ECOLI   399    66

GLHYDRLASE25  Length of motif = 16  Motif number = 5
Cellulase 2 motif V - 1
                 PCODE         ST   INT
RPLILCEYAHAMGNSL BGAL_ECOLI   531    68
RPLILCEYAHAMGNSL BGAL_KLEPN   538    68
KPIIQSEYGADAVSGL BGLR_RAT     530    81
KPFLNCEYMHDMGNSL BGAL_LEULA   528    60
QPIIITEYGVDTLAGL BGLR_ECOLI   497    82

  FINAL MOTIF SETS

GLHYDRLASE21  Length of motif = 16  Motif number = 1
Cellulase 2 motif I - 2
                 PCODE         ST   INT
VESAFYVWINGEFVGY BGAL_BACME   159   159
VNSAFHLWCNGRWVGY BGAL_ECOLI   146   146
VNSAFHLWCNGRWVGY P97096       168   168
VNSAFHLWCNGRWVGY O09266       153   153
VESAFYVWINGEFVGY O85167       159   159
VNSAFHLWCNGVWVGY BGAL_KLEPN   152   152
VNSAFHLWCNGQWIGY BGAL_ENTCL   148   148
VNLAFHLWCNVRWVGY O09267       158   158
VGSAFHFWLNGEYGGY BGAL_LACLA   134   134
VGSAFHFWLNGEYGGY O87523       132   132
VRSFFYLWVNGKKIGF BGAL_THEMA   136   136
VRSFFYLWVNGKRMGF O85250       136   136
VATSIFVWVNGNFVGY BGAL_STRTR   146   146
VETAFYVWVNGEFVGY BGAL_CLOAB   147   147
VEQAMYVWLNGQFIGY BGAL_LACSK   155   155
AATAIYVWLNGHFVGY BGAL_LACDE   151   151
VEEALYVWLNGHFIGY BGAL_LEULA   154   154
VDNCYELYVNGQYVGF BGAL_KLULA   132   132
VETYFEVYVNGQYVGF BGA2_ECOLI   135   135
VETYFEVYVNGQYVGF Q47170        68    68
VESRYKVWVNGVEIGV BGAL_ARTSP   143   143
AERAMYVWLNGHFIGY BGAL_LACAC   155   155
VDSAFYVWINNEFIGY BGAL_STAXY   136   136
VDSCLFVYVNKQFVGY BGAL_ACTPL   134   134
AHYYAIVWVNGVHVAE BGLR_CANFA   126   126
AHSYAIVWVNGVDTLE BGLR_HUMAN   126   126
AHYYAVVWVNGIHVVE BGLR_RAT     126   126
AHYYAVVWVNGIHVVE BGLR_MOUSE   126   126
VTHYGKVWVNNQEVME BGLR_ECOLI    91    91

GLHYDRLASE22  Length of motif = 15  Motif number = 2
Cellulase 2 motif II - 2
                PCODE         ST   INT
INGKRIVLRGVNRHE BGAL_BACME   356   181
LNGKPLLIRGVNRHE BGAL_ECOLI   344   182
LNGKPLLIRGVNRHE P97096       366   182
LNGKPLLIRGVNRHE O09266       351   182
INGKRIVLRGVNRHE O85167       356   181
LNGKPLLIRGVNRHE BGAL_KLEPN   351   183
LNGKPLLIRGVNRHE BGAL_ENTCL   346   182
LNGKPLLIRGVNRHE O09267       356   182
INGKALLVRGVNKHE BGAL_LACLA   314   164
INGKALLVRGVNKHE O87523       312   164
FNGKPLYIKGVNRHE BGAL_THEMA   322   170
FNGKPLYIKGVNRHE O85250       322   170
LNGKRIVFKGVNRHE BGAL_STRTR   333   171
LKWKRIIFKGVNRHE BGAL_CLOAB   334   171
LNGKRLVINGVNRHE BGAL_LACSK   340   169
LNGQRIVFKGANRHE BGAL_LACDE   338   171
VNNKRLVINGVNRHE BGAL_LEULA   341   171
VNGKDILFRGVNRHD BGAL_KLULA   342   194
INNRYVMLHGVNRHD BGA2_ECOLI   329   178
INNRYVMLHGVNRHD Q47170       262   178
VNGRKVIFHGVNRHE BGAL_ARTSP   314   155
LNGKCLIINGVNRHE BGAL_LACAC   343   172
INGQSIKIRGTNYHD BGAL_STAXY   311   159
FNQQPIKFKGVNRHD BGAL_ACTPL   308   158
INGKPFYFHGVNKHE BGLR_CANFA   337   195
INGKPFYFHGVNKHE BGLR_HUMAN   338   196
INGKPFYFQGVNKHE BGLR_RAT     334   192
INGKPFYFQGVNKHE BGLR_MOUSE   334   192
INHKPFYFTGFGRHE BGLR_ECOLI   283   176

GLHYDRLASE23  Length of motif = 19  Motif number = 3
Cellulase 2 motif III - 2
                    PCODE         ST   INT
DILLMKQHNINAVRTSHYP BGAL_BACME   388    17
DILLMKQNNFNAVRCSHYP BGAL_ECOLI   375    16
DILLMKQNNFNAVRCSHYP P97096       397    16
DILLMKQNNFNAVRCSHYP O09266       382    16
DILLMKQHNIKPVRTSHYP O85167       388    17
DILLMKQNNFNAVRCSHYP BGAL_KLEPN   382    16
DIETMKQHSFNAVRCSHYP BGAL_ENTCL   377    16
DILLMKQNNFNAVRCSHYP O09267       387    16
DIKLMKEHNFNAVRCSHYP BGAL_LACLA   345    16
DIKLMKEHNFNAVRCSHYP O87523       343    16
DIKLMKQHNINTVRTSHYP BGAL_THEMA   353    16
DIKLMKQHNINTVRTSHYP O85250       353    16
DIKVMKQHNINAVRTSHYP BGAL_STRTR   364    16
DIKFLKQHNINAVRTSHYP BGAL_CLOAB   365    16
DIACMQRNHINAVRTSHYP BGAL_LACSK   371    16
DIKTMKRSNINAVRCSHYP BGAL_LACDE   369    16
DIQTMLANNINADRTCHYP BGAL_LEULA   372    16
DLILMKKFNINAVRNSHYP BGAL_KLULA   373    16
DLQLMKQHNINSVRTAHYP BGA2_ECOLI   360    16
DLQLMKQHNINSVRTAHYP Q47170       293    16
DLALMKRFNVNAIRTSHYP BGAL_ARTSP   345    16
DIDTFKENNINAVRTCHYP BGAL_LACAC   374    16
DLELMKQGNFNAIRTAHYP BGAL_STAXY   342    16
DLQLMKQHNINAIRTAHYP BGAL_ACTPL   339    16
DFNLLRWLGANAFRTSHYP BGLR_CANFA   368    16
DFNLLRWLGANAFRTSHYP BGLR_HUMAN   369    16
DFNLLRWLGANSFRTSHYP BGLR_RAT     365    16
DFNLLRWLGANSFRTSHYP BGLR_MOUSE   365    16
DHALMDWIGANSYRTSHYP BGLR_ECOLI   314    16

GLHYDRLASE24  Length of motif = 16  Motif number = 4
Cellulase 2 motif IV - 2
                 PCODE         ST   INT
DKNHPSIIIWSLGNES BGAL_BACME   467    60
DRNHPSVIIWSLGNES BGAL_ECOLI   447    53
DRNHPSVIIWSLGNES P97096       469    53
DRNHPSVIIWSLGNES O09266       454    53
DKNHPSIIIWSLGNES O85167       467    60
NRNHPCIIIWSLGNES BGAL_KLEPN   454    53
DRNHPSIIIWSLGNES BGAL_ENTCL   449    53
DRNHPSVIIWSLGNES O09267       459    53
DRNHPSIIIWSLGNES BGAL_LACLA   417    53
DRNHPSIIIWSLGNES O87523       415    53
DKNHPSIIFWSLGNEA BGAL_THEMA   427    55
DKNHPSIIFWSLGNEA O85250       427    55
DKNHASVIIWSCGNES BGAL_STRTR   444    61
DKNHPSVLIWSCGNES BGAL_CLOAB   445    61
FKNHVSILFWSLGNES BGAL_LACSK   451    61
DKNHASILIWSLGNES BGAL_LACDE   450    62
FKNHPSIIFWSLGNES BGAL_LEULA   452    61
DVNHPSIIIWSLGNEA BGAL_KLULA   468    76
QKNHPSIIIWSLGNES BGA2_ECOLI   435    56
QKNHPSIIIWSLGNES Q47170       368    56
DKNHASIVMWSLGNES BGAL_ARTSP   420    56
FKNHTSILFWSLGNES BGAL_LACAC   454    61
LKNYSSIVSWSLGNES BGAL_STAXY   423    62
DKNRTSIIIWSLGNEA BGAL_ACTPL   441    83
DKNHPSVVMWSVANEP BGLR_CANFA   436    49
DKNHPAVVMWSVANEP BGLR_HUMAN   437    49
DKNHPAVVMWSVANEP BGLR_RAT     433    49
DKNHPAVVMWSVANEP BGLR_MOUSE   433    49
DKNHPSVVMWSIANEP BGLR_ECOLI   399    66

GLHYDRLASE25  Length of motif = 16  Motif number = 5
Cellulase 2 motif V - 2
                 PCODE         ST   INT
KPYILCEYSHAMGNSC BGAL_BACME   541    58
RPLILCEYAHAMGNSL BGAL_ECOLI   531    68
RPLILCEYAHAMGNSL P97096       553    68
RPLILCEYAHAMGNSL O09266       538    68
KPYILCEYSHAMGNSC O85167       541    58
RPLILCEYAHAMGNSL BGAL_KLEPN   538    68
RPLILCEYAHAMGNSF BGAL_ENTCL   533    68
RPLILCEYAHAMGNSL O09267       543    68
RPLILCEYAHDMGNSL BGAL_LACLA   502    69
RPLILCEYAHDMGNSL O87523       500    69
KPFIMCEYAHAMGNSV BGAL_THEMA   501    58
KPFIMCEYAHAMGNSV O85250       501    58
KPYISCEYMHTMGNSG BGAL_STRTR   540    80
KPYISCEYMHSMGNST BGAL_CLOAB   519    58
KPFILCEYMHDMGNSL BGAL_LACSK   527    60
KPFISVEYAHAMGNSV BGAL_LACDE   525    59
KPFLNCEYMHDMGNSL BGAL_LEULA   528    60
KPLILCEYGHAMGNGP BGAL_KLULA   545    61
KPRIICEYAHAMGNGP BGA2_ECOLI   506    55
KPRIICEYAHAMGNGP Q47170       439    55
RPFILCEYVHAMGNGP BGAL_ARTSP   507    71
KPFMECEYMHDMGNSD BGAL_LACAC   530    60
KPFILCEYAHAMGNSP BGAL_STAXY   504    65
KPFVLCEYSHAMGNSN BGAL_ACTPL   520    63
KPIIQSEYGAETIAGF BGLR_CANFA   533    81
KPIIQSEYGAETIAGF BGLR_HUMAN   534    81
KPIIQSEYGADAVSGL BGLR_RAT     530    81
KPIIQSEYGADAIPGI BGLR_MOUSE   530    81
QPIIITEYGVDTLAGL BGLR_ECOLI   498    83
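
   The ST and INT columns of the motif sets appear to be related: for the
   rows checked here, the first INT equals the first ST, and each later INT
   equals the number of residues between the end of the previous motif and
   the start of the current one. The sketch below checks this for two codes
   from the final motif sets and then locates the Glu residues in the
   BGAL_ECOLI motif 4 and motif 5 sequences. It assumes that ST is the
   1-based position of the first motif residue; the function name is
   hypothetical.

```python
# Motif lengths and (ST, INT) pairs copied from the FINAL MOTIF SETS listing.
MOTIF_LENGTHS = [16, 15, 19, 16, 16]
HITS = {
    "BGAL_ECOLI": [(146, 146), (344, 182), (375, 16), (447, 53), (531, 68)],
    "BGLR_ECOLI": [(91, 91), (283, 176), (314, 16), (399, 66), (498, 83)],
}


def intervals_from_starts(starts, lengths):
    """Recompute INT values from motif start positions and motif lengths.

    The first value is the start of motif 1 itself (as in the listing);
    each later value is start[i] - (start[i-1] + length[i-1]).
    """
    intervals = [starts[0]]
    for i in range(1, len(starts)):
        intervals.append(starts[i] - (starts[i - 1] + lengths[i - 1]))
    return intervals


consistent = {}
for code, pairs in HITS.items():
    starts = [st for st, _ in pairs]
    listed = [gap for _, gap in pairs]
    consistent[code] = intervals_from_starts(starts, MOTIF_LENGTHS) == listed
print(consistent)

# Positions of the Glu residues in the BGAL_ECOLI motifs, assuming ST is the
# 1-based position of the first residue of the motif.
motif4, motif5 = "DRNHPSVIIWSLGNES", "RPLILCEYAHAMGNSL"
glu_motif4 = 447 + motif4.index("E")
glu_motif5 = 531 + motif5.index("E")
print(glu_motif4, glu_motif5)
```

   Under that assumption the two positions coincide with the residue numbers
   named in the title of reference 4 (Glu-461 and Glu-537).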
