==SPRINT==> PRINTS View



PR00778

Identifier
HTHARSR
Accession
PR00778
No. of Motifs
4
Creation Date
24-AUG-1997  (UPDATE 27-JUN-1999)
Title
Bacterial regulatory protein ArsR family signature
Database References

PROSITE; PS00846 HTH_ARSR_FAMILY
BLOCKS; BL00846
PFAM; PF01022 HTH_ARSR_family
INTERPRO; IPR001845
Literature References
1. MORBY, A.P., TURNER, J.S., HUCKLE, J.W. AND ROBINSON, N.J.
smtB is a metal-dependent repressor of the cyanobacterial metallothionein
gene smtA - identification of a Zn-inhibited DNA-protein complex.
NUCLEIC ACIDS RES. 21 921-925 (1993).
 
2. BAIROCH A.
A possible mechanism for metal-ion induced DNA-protein dissociation in
a family of prokaryotic transcriptional regulators.
NUCLEIC ACIDS RES. 21 2515-2515 (1993).

Documentation
Bacterial transcription regulatory proteins that bind DNA via a helix-turn-
helix (HTH) motif can be grouped into families on the basis of sequence 
similarities. One such group, termed arsR, includes several proteins that
appear to dissociate from DNA in the presence of metal ions: arsR, which
functions as a transcriptional repressor of an arsenic resistance operon;
smtB from Synechococcus, which acts as a transcriptional repressor of the
smtA gene that codes for a metallothionein; cadC, a protein required for
cadmium-resistance; and hypothetical protein yqcJ from Bacillus subtilis.
 
The HTH motif is thought to be located in the central part of these
proteins [1]. The motif is characterised by a number of well-conserved
residues: at its N-terminal extremity is a cysteine residue; a second Cys
is found in arsR and cadC, but not in smtB; and at the C-terminus lie one
or two histidines. These residues may be involved in metal-binding (Zn in
smtB; metal-oxyanions such as arsenite, antimonite and arsenate for arsR;
and cadmium for cadC) [2]. It is believed that binding of a metal ion could
induce a conformational change that would prevent the protein from binding
DNA [2]. 
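
The conserved cysteines can be tallied directly from the motif elements listed later in this entry. The sketch below is illustrative only: it takes four Motif 2 elements copied from the motif tables and counts the residues in the third and fifth columns. Treating those two columns as the cysteines discussed in [2] is an assumption made here, not something the entry states, and `column_counts` is a hypothetical helper name.

```python
from collections import Counter

# A few Motif 2 elements copied from this entry (sequence id -> element).
motif2 = {
    "ARSR_ECOLI": "ELCVCDLCTALD",
    "CADF_STAAU": "ELCVCDVANIIE",
    "SMTB_SYNP7": "ELCVGDLAQAIG",
    "YQCJ_BACSU": "KTCVCDLTEIFE",
}

def column_counts(elements, column):
    """Count the residues found at a 0-based column across equal-width elements."""
    return Counter(seq[column] for seq in elements)

# Assumption: columns 3 and 5 of Motif 2 hold the cysteines of interest.
third = column_counts(motif2.values(), 2)
fifth = column_counts(motif2.values(), 4)
print(dict(third))
print(dict(fifth))
```

In this small sample the third column is Cys in every element, while the fifth column is Cys everywhere except SMTB_SYNP7, which agrees with the remark that the second Cys is absent from smtB.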
 
HTHARSR is a 4-element fingerprint that provides a signature for the
bacterial regulatory protein arsR family. The fingerprint was derived from
an initial alignment of 13 sequences: the motifs were drawn from short
conserved regions spanning the full alignment length - motifs 2 and 3
span the region encoded by PROSITE pattern HTH_ARSR_FAMILY (PS00846), which
includes the complete HTH motif. Two iterations on OWL29.4 were required
to reach convergence, at which point a true set comprising 21 sequences
was identified. Several partial matches were also found, all of which
appear to be related DNA-binding proteins that contain an HTH motif.
 
An update on SPTR37_9f identified a true set of 30 sequences, and 27
partial matches.
Summary Information
30 codes involving 4 elements
13 codes involving 3 elements
14 codes involving 2 elements
Composite Feature Index
   4 |  30   30   30   30
   3 |  13    0   13   13
   2 |   9    2   10    7
 ----+--------------------
     |   1    2    3    4
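
The Composite Feature Index can be cross-checked against the Summary Information. Reading the index as one row per number of matched elements (4, 3, 2) and one column per motif (1 to 4), each row should total the number of elements multiplied by the number of codes in that class. The sketch below is a minimal check under that reading; the dictionary layout and the name `consistent` are illustrative choices, not part of the entry.

```python
# Summary Information: number of elements matched -> number of codes.
codes = {4: 30, 3: 13, 2: 14}

# Composite Feature Index: number of elements matched -> hits for motifs 1..4.
cfi = {
    4: [30, 30, 30, 30],
    3: [13, 0, 13, 13],
    2: [9, 2, 10, 7],
}

def consistent(codes, cfi):
    """True when each row's motif hits total (elements matched) x (codes)."""
    return all(sum(cfi[n]) == n * codes[n] for n in codes)

print(consistent(codes, cfi))
```

Under this reading the rows total 120, 39 and 28, matching 4 x 30, 3 x 13 and 2 x 14. The zero in the three-element row indicates that none of the three-element partial matches includes motif 2.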
True Positives
ARR1_ECOLI    ARR2_ECOLI    ARSR_ECOLI    ARSR_STAAU    
ARSR_STAXY CADC_BACFI CADC_LISMO CADC_STAAU
CADF_STAAU HLYU_VIBCH O05840 O26985
O27823 O52029 O53838 O57801
O67394 O68020 O69711 P71941
P73808 P74986 P94887 P95774
P96677 Q53040 Q58721 SMTB_SYNP7
SMTB_SYNY3 YQCJ_BACSU
True Positive Partials
Codes involving 3 elements
MERR_STRLI NOLR_RHIME O28144 O31480
O31844 O32242 O53478 O53626
O53773 O54057 O85142 P71939
YW25_MYCTU
Codes involving 2 elements
O08446 O28576 O28971 O28998
O34464 O50591 O52026 O53921
O58828 O59372 P77295 P96683
Q52517 YF53_METJA
Sequence Titles
ARR1_ECOLI  ARSENICAL RESISTANCE OPERON REPRESSOR - ESCHERICHIA COLI. 
ARR2_ECOLI ARSENICAL RESISTANCE OPERON REPRESSOR - ESCHERICHIA COLI.
ARSR_ECOLI ARSENICAL RESISTANCE OPERON REPRESSOR - ESCHERICHIA COLI.
ARSR_STAAU ARSENICAL RESISTANCE OPERON REPRESSOR - STAPHYLOCOCCUS AUREUS.
ARSR_STAXY ARSENICAL RESISTANCE OPERON REPRESSOR - STAPHYLOCOCCUS XYLOSUS.
CADC_BACFI CADMIUM EFFLUX SYSTEM ACCESSORY PROTEIN - BACILLUS FIRMUS.
CADC_LISMO CADMIUM EFFLUX SYSTEM ACCESSORY PROTEIN - LISTERIA MONOCYTOGENES.
CADC_STAAU CADMIUM EFFLUX SYSTEM ACCESSORY PROTEIN - STAPHYLOCOCCUS AUREUS.
CADF_STAAU CADMIUM EFFLUX SYSTEM ACCESSORY PROTEIN HOMOLOG - STAPHYLOCOCCUS AUREUS.
HLYU_VIBCH TRANSCRIPTIONAL ACTIVATOR HLYU - VIBRIO CHOLERAE.
O05840 HYPOTHETICAL 14.4 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
O26985 TRANSCRIPTIONAL REGULATOR - METHANOBACTERIUM THERMOAUTOTROPHICUM.
O27823 TRANSCRIPTIONAL REGULATOR - METHANOBACTERIUM THERMOAUTOTROPHICUM.
O52029 ARSR PROTEIN - HALOBACTERIUM SP.
O53838 PUTATIVE TRANSCRIPTIONAL REGULATOR - MYCOBACTERIUM TUBERCULOSIS.
O57801 137AA LONG HYPOTHETICAL PROTEIN - PYROCOCCUS HORIKOSHII.
O67394 TRANSCRIPTIONAL REGULATOR (ARSR FAMILY) - AQUIFEX AEOLICUS.
O68020 ARSR - PSEUDOMONAS AERUGINOSA.
O69711 PUTATIVE REGULATORY PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
P71941 PUTATIVE TRANSCRIPTIONAL REGULATOR CY441.12 - MYCOBACTERIUM TUBERCULOSIS.
P73808 ARSENICAL RESISTANCE OPERON REPRESSOR - SYNECHOCYSTIS SP. (STRAIN PCC 6803).
P74986 ARSENITE INDUCIBLE REPRESSOR - YERSINIA ENTEROCOLITICA.
P94887 CADMIUM RESISTANCE REGULATORY PROTEIN - LACTOCOCCUS LACTIS.
P95774 CADX - STAPHYLOCOCCUS LUGDUNENSIS.
P96677 YDET PROTEIN - BACILLUS SUBTILIS.
Q53040 NITRILE HYDRATASE REGULATOR 2 - RHODOCOCCUS RHODOCHROUS.
Q58721 HYPOTHETICAL PROTEIN MJ1325 - METHANOCOCCUS JANNASCHII.
SMTB_SYNP7 TRANSCRIPTIONAL REPRESSOR SMTB - SYNECHOCOCCUS SP. (STRAIN PCC 7942) (ANACYSTIS NIDULANS R2).
SMTB_SYNY3 TRANSCRIPTIONAL REPRESSOR SMTB HOMOLOG - SYNECHOCYSTIS SP. (STRAIN PCC 6803).
YQCJ_BACSU HYPOTHETICAL 12.3 KD PROTEIN IN CWLA-CISA INTERGENIC REGION - BACILLUS SUBTILIS.

MERR_STRLI PROBABLE MERCURY RESISTANCE OPERON REPRESSOR - STREPTOMYCES LIVIDANS.
NOLR_RHIME NODULATION PROTEIN NOLR - RHIZOBIUM MELILOTI.
O28144 TRANSCRIPTIONAL REGULATORY PROTEIN, ARSR FAMILY - ARCHAEOGLOBUS FULGIDUS.
O31480 YCZG PROTEIN - BACILLUS SUBTILIS.
O31844 YOZA PROTEIN - BACILLUS SUBTILIS.
O32242 YVBA PROTEIN - BACILLUS SUBTILIS.
O53478 PUTATIVE REGULATOR - MYCOBACTERIUM TUBERCULOSIS.
O53626 TRANSCRIPTIONAL REGULATOR - MYCOBACTERIUM TUBERCULOSIS.
O53773 HYPOTHETICAL 46.4 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
O54057 BV. VICIAE NOLR GENE - RHIZOBIUM LEGUMINOSARUM.
O85142 REPRESSOR PROTEIN - STAPHYLOCOCCUS AUREUS.
P71939 PUTATIVE TRANSCRIPTIONAL REGULATOR CY441.10C - MYCOBACTERIUM TUBERCULOSIS.
YW25_MYCTU HYPOTHETICAL TRANSCRIPTIONAL REGULATOR CY39.25 - MYCOBACTERIUM TUBERCULOSIS.

O08446 HYPOTHETICAL 24.1 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
O28576 CONSERVED HYPOTHETICAL PROTEIN - ARCHAEOGLOBUS FULGIDUS.
O28971 HYPOTHETICAL 27.1 KD PROTEIN - ARCHAEOGLOBUS FULGIDUS.
O28998 TRANSCRIPTIONAL REGULATORY PROTEIN, ARSR FAMILY - ARCHAEOGLOBUS FULGIDUS.
O34464 YCEK - BACILLUS SUBTILIS.
O50591 ARSR - ACIDIPHILIUM MULTIVORUM.
O52026 SIMILAR TO BACILLUS SUBTILIS GP:GI-1881340 - HALOBACTERIUM SP.
O53921 HYPOTHETICAL 23.8 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
O58828 147AA LONG HYPOTHETICAL PROTEIN - PYROCOCCUS HORIKOSHII.
O59372 253AA LONG HYPOTHETICAL PROTEIN - PYROCOCCUS HORIKOSHII.
P77295 FROM BASES 2786072 TO 2796651 (SECTION 241 OF 400) OF THE COMPLETE GENOME (SECTION 241 OF 400) - ESCHERICHIA COLI.
P96683 YDFF PROTEIN - BACILLUS SUBTILIS.
Q52517 HYPOTHETICAL 12.1 KD PROTEIN - STREPTOMYCES COELICOLOR.
YF53_METJA HYPOTHETICAL PROTEIN MJ1553 - METHANOCOCCUS JANNASCHII.
Scan History
OWL29_4    2  100  NSINGLE    
SPTR37_9f 3 150 NSINGLE
Initial Motifs
Motif 1  width=16
Element Seqn Id St Int Rpt
FAVLADPNRLRLLSLL SMTB_SYNP7 40 40 -
FKALSDDTRVKIAYVL CADF_STAAU 35 35 -
FKNLSDETRLGIVLLL ARR1_ECOLI 10 10 -
FKILSDETRLGIVLLL ARR2_ECOLI 10 10 -
FKILADETRLGIVLLL ARSR_ECOLI 10 10 -
FKAFGDPTRLMILKLL D64465 11 11 -
FKILSDETRVKIVYAL LISCADTNP1 35 35 -
FQALSDPIRLQVLTLL S74901 13 13 -
FDALADPVRRAILTVL D67028 7 7 -
LRALAAPVRIAIVLQL MTCY2721 52 52 -
FKALADQKRLEIMYEL YQCJ_BACSU 16 16 -
LKILSDSSRLEILDLL ARSR_STAAU 10 10 -
LKVLSDPSRLEILDLL ARSR_STAXY 10 10 -

Motif 2 width=12
Element Seqn Id St Int Rpt
SMCVCKIIDELK D64465 31 4 -
ELCVCDLANIVE LISCADTNP1 55 4 -
EQCVCDLCDQLN S74901 32 3 -
ECSVNELVDQID D67028 27 4 -
QRCVHELVDALH MTCY2721 71 3 -
ELCVCDLCTALD ARSR_ECOLI 30 4 -
ELCVCDLCTALE ARR2_ECOLI 30 4 -
ELCVCDLCMALD ARR1_ECOLI 30 4 -
ELCVCDVANIIE CADF_STAAU 55 4 -
ELCVGDLAQAIG SMTB_SYNP7 59 3 -
ELCACDLLEHFQ ARSR_STAXY 29 3 -
ELCACDLLEHFQ ARSR_STAAU 29 3 -
KTCVCDLTEIFE YQCJ_BACSU 36 4 -

Motif 3 width=16
Element Seqn Id St Int Rpt
SQSKLSFHLKRLRDAE S74901 45 1 -
TVAATSHHLRFLKKQG LISCADTNP1 68 1 -
SQPKISRHLALLRESG ARSR_ECOLI 43 1 -
SQPKTSRHLAMLRESG ARR2_ECOLI 43 1 -
SQPKISRHLAMLRESG ARR1_ECOLI 43 1 -
STATASHHLRLLKNLG CADF_STAAU 68 1 -
SESAVSHQLRSLRNLR SMTB_SYNP7 72 1 -
SQPTLSHHMKSLVDNE ARSR_STAXY 42 1 -
SQPTLSHHMKSLVDNE ARSR_STAAU 42 1 -
TQSKLSYHLKILLDAN YQCJ_BACSU 49 1 -
PQPLVSQHLKILKAAG MTCY2721 84 1 -
GRTGVSNHLRILRHAG D67028 41 2 -
PQPTISHHLNILKKAG D64465 44 1 -

Motif 4 width=16
Element Seqn Id St Int Rpt
GLVTERKAGRFRFYSI D67028 56 -1 -
GVVTGERSGREVLYRL MTCY2721 99 -1 -
GLLLDRKQGKWVHYRL ARSR_ECOLI 58 -1 -
GLLLDRKQGKWVHYRL ARR2_ECOLI 58 -1 -
GILLDRKQGKWVHYRL ARR1_ECOLI 58 -1 -
GIAKYRKEGKLVYYSL CADF_STAAU 83 -1 -
RLVSYRKQGRHVYYQL SMTB_SYNP7 87 -1 -
ELVTTRKNGNKHMYQL ARSR_STAXY 57 -1 -
ELVTTRKDGNKHWYQL ARSR_STAAU 57 -1 -
NLITKETKGTWSYYDL YQCJ_BACSU 64 -1 -
ELVHTRQDGRWIYYRL S74901 60 -1 -
GIVKARKEGTWNFYYI D64465 59 -1 -
GIANYRKDGKLVYYSL LISCADTNP1 83 -1 -
Final Motifs
Motif 1  width=16
Element Seqn Id St Int Rpt
FKILADETRLGIVLLL ARSR_ECOLI 10 10 -
FKILSDETRLGIVLLL ARR2_ECOLI 10 10 -
FKNLSDETRLGIVLLL ARR1_ECOLI 10 10 -
FKILSDETRVKIVYAL CADC_LISMO 35 35 -
FKILSDETRLAIVMLL P74986 8 8 -
FKALSDDTRVKIAYVL CADF_STAAU 35 35 -
FKCLADETRVRATLLI O68020 8 8 -
LKALADPTRLLIIYLL O26985 42 42 -
FKILSDENRLKIVHAL P94887 35 35 -
FYALSEPKRLCMVKLL O67394 10 10 -
LKAIADENRAKITYAL CADC_STAAU 36 36 -
FSALADPSRLRLMSAL SMTB_SYNY3 50 50 -
FQALSDPIRLQVLTLL P73808 13 13 -
FKILSEPTRLKILMAL O27823 36 36 -
FAVLADPNRLRLLSLL SMTB_SYNP7 40 40 -
LKAIADENRAKITYAL CADC_BACFI 36 36 -
FKALADPVRLQLLSSV P71941 35 35 -
FKAFGDPTRLMILKLL Q58721 11 11 -
LKVLSDPSRLEILDLL ARSR_STAXY 10 10 -
LKILSDSSRLEILDLL ARSR_STAAU 10 10 -
LSALANETRYKIIRIL O52029 49 49 -
LKTLSDQTRLIMMRLF P96677 12 12 -
LKAMANERRLQILCML HLYU_VIBCH 25 25 -
FKALADQKRLEIMYEL YQCJ_BACSU 16 16 -
LQALATPSRLMILTQL O69711 27 27 -
FDALADPVRRAILTVL Q53040 7 7 -
LRALAAPVRIAIVLQL O05840 52 52 -
LEKICDEKKLKIILSL P95774 36 36 -
FRMLADATRVQVLWSL O53838 22 22 -
LKVVSNPIRYGIVKML O57801 50 50 -

Motif 2 width=12
Element Seqn Id St Int Rpt
ELCVCDLCTALD ARSR_ECOLI 30 4 -
ELCVCDLCTALE ARR2_ECOLI 30 4 -
ELCVCDLCMALD ARR1_ECOLI 30 4 -
ELCVCDLANIVE CADC_LISMO 55 4 -
EMCVCDLCGATS P74986 28 4 -
ELCVCDVANIIE CADF_STAAU 55 4 -
ELCVCELMCALA O68020 28 4 -
DLCVCEIMAALK O26985 61 3 -
ELCVCDIANIID P94887 55 4 -
ELCVCDFMRIFK O67394 30 4 -
ELCVCDIANILG CADC_STAAU 56 4 -
ELCVCDLAAAMK SMTB_SYNY3 69 3 -
EQCVCDLCDQLN P73808 32 3 -
SLCVCELASLLD O27823 55 3 -
ELCVGDLAQAIG SMTB_SYNP7 59 3 -
ESCVCDIANIIG CADC_BACFI 56 4 -
EACVCDISAGVE P71941 57 6 -
SMCVCKIIDELK Q58721 31 4 -
ELCACDLLEHFQ ARSR_STAXY 29 3 -
ELCACDLLEHFQ ARSR_STAAU 29 3 -
ELCVCEFSPLLD O52029 70 5 -
EYCVCQLVDMFE P96677 31 3 -
ELSVGELSSRLE HLYU_VIBCH 44 3 -
KTCVCDLTEIFE YQCJ_BACSU 36 4 -
PLPVTDLAEAIG O69711 46 3 -
ECSVNELVDQID Q53040 27 4 -
QRCVHELVDALH O05840 71 3 -
ELCVCDISLILK P95774 56 4 -
EMSVNELAEQVG O53838 41 3 -
WMCVCLIAKALD O57801 69 3 -

Motif 3 width=16
Element Seqn Id St Int Rpt
SQPKISRHLALLRESG ARSR_ECOLI 43 1 -
SQPKTSRHLAMLRESG ARR2_ECOLI 43 1 -
SQPKISRHLAMLRESG ARR1_ECOLI 43 1 -
TVAATSHHLRFLKKQG CADC_LISMO 68 1 -
SQPKISRHMAILREAE P74986 41 1 -
STATASHHLRLLKNLG CADF_STAAU 68 1 -
SQPKISRHLAQLRSAG O68020 41 1 -
PQPTISHHLNILRRAG O26985 74 1 -
SVATTSHHLNSLKKLG P94887 68 1 -
SQPKISFHLKVLREAG O67394 43 1 -
TIANASHHLRTLYKQG CADC_STAAU 69 1 -
SESAVSHQLRILRSQR SMTB_SYNY3 82 1 -
SQSKLSFHLKRLRDAE P73808 45 1 -
TQSAVSHQLRILRNAG O27823 68 1 -
SESAVSHQLRSLRNLR SMTB_SYNP7 72 1 -
TAANASHHLRTLHKQG CADC_BACFI 69 1 -
SQPTISHHLKVLRDAG P71941 70 1 -
PQPTISHHLNILKKAG Q58721 44 1 -
SQPTLSHHMKSLVDNE ARSR_STAXY 42 1 -
SQPTLSHHMKSLVDNE ARSR_STAAU 42 1 -
SDSAISHSLSQLTEAG O52029 83 1 -
SQPAISQHLRKLKNAG P96677 44 1 -
SQSALSQHLAWLRRDG HLYU_VIBCH 57 1 -
TQSKLSYHLKILLDAN YQCJ_BACSU 49 1 -
EQSAVSHQLRVLRNLG O69711 59 1 -
GRTGVSNHLRILRHAG Q53040 41 2 -
PQPLVSQHLKILKAAG O05840 84 1 -
SVASTSHHLRLLYKND P95774 69 1 -
PAPSVSQHLAKLRMAR O53838 54 1 -
DQTLVSHHIRILKEID O57801 82 1 -

Motif 4 width=16
Element Seqn Id St Int Rpt
GLLLDRKQGKWVHYRL ARSR_ECOLI 58 -1 -
GLLLDRKQGKWVHYRL ARR2_ECOLI 58 -1 -
GILLDRKQGKWVHYRL ARR1_ECOLI 58 -1 -
GIANYRKDGKLVYYSL CADC_LISMO 83 -1 -
ELVLDRREGKWVHYRL P74986 56 -1 -
GIAKYRKEGKLVYYSL CADF_STAAU 83 -1 -
GLLLDRRQGQWVYYRL O68020 56 -1 -
GFLKAEKRGVWVHYSL O26985 89 -1 -
GVVDSHKDGKLVYYFI P94887 83 -1 -
GLVTSQKRGKWNYYRL O67394 58 -1 -
GVVNFRKEGKLALYSL CADC_STAAU 84 -1 -
RLVKYRRVGRNVYYSL SMTB_SYNY3 97 -1 -
ELVHTRQDGRWIYYRL P73808 60 -1 -
GMVDYERDGKMARYYL O27823 83 -1 -
RLVSYRKQGRHVYYQL SMTB_SYNP7 87 -1 -
GIVRYRKEGKLAFYSL CADC_BACFI 84 -1 -
GLLTSRRRASWVYYAV P71941 85 -1 -
GIVKARKEGTWNFYYI Q58721 59 -1 -
ELVTTRKNGNKHMYQL ARSR_STAXY 57 -1 -
ELVTTRKDGNKHWYQL ARSR_STAAU 57 -1 -
GLVTRRKDGKWRKYQT O52029 98 -1 -
GFVNEDRRGQWRYYSI P96677 59 -1 -
GLVNTRKEAQTVFYTL HLYU_VIBCH 72 -1 -
NLITKETKGTWSYYDL YQCJ_BACSU 64 -1 -
GLVVGDRAGRSIVYSL O69711 74 -1 -
GLVTERKAGRFRFYSI Q53040 56 -1 -
GVVTGERSGREVLYRL O05840 99 -1 -
DVLDFYKKGKMAYYFI P95774 84 -1 -
RLVRTRRDGTTIFYRL O53838 69 -1 -
DLLEEKREGKLRFYRV O57801 97 -1 -
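
The St (start) and Int (interval) columns in the motif tables appear to be related through the motif widths. The sketch below assumes, based only on the values listed in this entry, that the interval for the first motif equals its start position and that each later interval is the start minus the position immediately after the previous motif. The function name `intervals` and the `WIDTHS` constant are illustrative.

```python
# Motif widths as given in the motif headers of this entry (motifs 1..4).
WIDTHS = [16, 12, 16, 16]

def intervals(starts, widths=WIDTHS):
    """Return the interval preceding each motif, given 1-based start positions.

    Assumed relation (inferred from the tables, not stated in the entry):
    interval = start - (previous start + previous width), with the first
    interval equal to the first start position.
    """
    result = []
    next_free = 0  # position just past the previous motif (0 before motif 1)
    for start, width in zip(starts, widths):
        result.append(start - next_free)
        next_free = start + width
    return result

# Start positions for ARSR_ECOLI and P71941 in the final motifs.
print(intervals([10, 30, 43, 58]))
print(intervals([35, 57, 70, 85]))
```

Under this relation, the interval of -1 listed for every Motif 4 element means that motifs 3 and 4 overlap by one residue.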