PRINTS BLAST Help

Introduction

PRINTS stores family-specific protein sequence information in the form of protein sequence motifs. Each entry, known as a fingerprint, consists of a group of conserved motifs used to characterise a protein family. For each fingerprint, there is also a list of family member ID names, which has been used to create a FASTA-formatted sequence file. PRINTS BLAST searches a query sequence against this FASTA compilation of PRINTS family-member sequences.

Input

Sequence input can be given as primary sequence database codes or accession numbers of sequences found in OWL, or as an actual protein or DNA sequence. OWL is a non-redundant composite database, created from SWISS-PROT, a translation of GenBank, PIR-1, PIR-2, PIR-3, PIR-4 and NRL-3D. Whitespace and numerical characters are automatically stripped from sequence input, so sequences copied with line breaks or position numbers can be pasted as they are. Results of previous searches can be retrieved using the search ID given at the top of the result output.
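The stripping of whitespace and numerical characters can be illustrated with a short Python sketch. This is not the server's own code, which is not described here; the function name `clean_sequence` and the sample residues are hypothetical and chosen only for illustration.

```python
import re


def clean_sequence(raw: str) -> str:
    """Return the input with all whitespace and digit characters removed.

    Hypothetical helper: it mimics the clean-up described above and is
    not taken from the PRINTS BLAST server.
    """
    # \s matches spaces, tabs and line breaks; \d matches digits 0-9.
    return re.sub(r"[\s\d]+", "", raw)


# Arbitrary example residues laid out with position numbers and gaps,
# as sequences often appear when copied from a database entry.
pasted = """
  1 MSEEIITPVY CTGVSAQVQK
 21 QRARELGLGR HENAIKYLGQ
"""

print(clean_sequence(pasted))
```

Only whitespace and digits are removed in this sketch; any other non-sequence characters would pass through unchanged.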

BLAST input parameters are described here.


Output

Hyperlinks have been added to sections of the standard BLAST output so that the results may be more easily examined. At the top of the output, an additional summary of detected fingerprints has been included. The three main parts of the BLAST output are:

i) fingerprint summary - this displays, in tabular form, up to 10 of the most frequently detected fingerprints in the BLAST summary. Hyperlinks to the fingerprint entry and GraphScan are included. GraphScan help is available here.


	
  PRINTS ID Code|Occur |  Picture  
  --------------+------+-----------
  CALPAIN       | 25   | GraphScan
  
ii) blast summary - displays the first 60 characters of the FASTA sequence description line, followed by the score of the highest scoring HSP (high-scoring segment pair), the lowest P-value (probability that the match is by chance) of a given HSP set, and finally the number of HSPs which give the lowest P-value. The description line consists of the PRINTS ID name (1), PRINTS accession number (2), motifs in the protein (3) / total motifs in fingerprint (4), sequence ID name (5), sequence accession number (6) and protein description (7).

Three hyperlinks are present in each description line: the PRINTS (1) and protein (5) ID codes are linked to the appropriate database entries, and the number of matching motifs (3/4) is linked to a graphical representation of the fingerprint showing both the given sequence and a verified family member. A labelled summary section is shown below:


                                                                       Smallest
                                                                         Sum
                                                                High  Probability
  Sequences producing High-scoring Segment Pairs:              Score  P(N)      N

            1   |   2   |3/4|   5    |    6   |        7
  PRINTS:CALPAIN PR00704 9/9 CAN1_HUMAN P07384 CALPAIN 1, L...  3762  0.0       1
  PRINTS:CALPAIN PR00704 9/9 AF021847 AF021847 Mus musculus...  2196  0.0       2
  PRINTS:CALPAIN PR00704 9/9 RNU53858 U53858 Rattus norvegi...  2191  0.0       2	
  
iii) blast alignments - includes the full sequence description line, alignments and statistical information of all HSPs for each sequence.
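For readers who want to process saved output, the description line layout from part ii) can be split into its seven labelled fields with a regular expression, and the hits tallied per fingerprint in the manner of the part i) summary. The following Python sketch assumes whitespace-separated fields exactly as in the sample lines above. The names `SummaryHit`, `parse_summary_line` and `fingerprint_summary` are hypothetical, and counting one occurrence per summary line is an assumption about how the "Occur" column is derived.

```python
import re
from collections import Counter
from typing import List, NamedTuple, Optional, Tuple


class SummaryHit(NamedTuple):
    prints_id: str       # (1) PRINTS ID name
    prints_acc: str      # (2) PRINTS accession number
    motifs_matched: int  # (3) motifs found in the protein
    motifs_total: int    # (4) total motifs in the fingerprint
    seq_id: str          # (5) sequence ID name
    seq_acc: str         # (6) sequence accession number
    description: str     # (7) protein description (may be truncated)
    score: int           # score of the highest scoring HSP
    p_value: float       # smallest sum probability P(N)
    n: int               # number of HSPs giving that probability


# Pattern modelled on the sample summary lines; an assumption, not a spec.
SUMMARY_LINE = re.compile(
    r"^\s*PRINTS:(?P<prints_id>\S+)\s+(?P<prints_acc>\S+)\s+"
    r"(?P<matched>\d+)/(?P<total>\d+)\s+"
    r"(?P<seq_id>\S+)\s+(?P<seq_acc>\S+)\s+"
    r"(?P<description>.*?)\s+"
    r"(?P<score>\d+)\s+(?P<p_value>\S+)\s+(?P<n>\d+)\s*$"
)


def parse_summary_line(line: str) -> Optional[SummaryHit]:
    """Parse one summary line; return None if it does not fit the layout."""
    m = SUMMARY_LINE.match(line)
    if m is None:
        return None
    return SummaryHit(
        m["prints_id"], m["prints_acc"], int(m["matched"]), int(m["total"]),
        m["seq_id"], m["seq_acc"], m["description"],
        int(m["score"]), float(m["p_value"]), int(m["n"]),
    )


def fingerprint_summary(hits: List[SummaryHit], limit: int = 10) -> List[Tuple[str, int]]:
    """Count hits per fingerprint and keep the most frequent ones."""
    return Counter(hit.prints_id for hit in hits).most_common(limit)


# The three sample lines from the labelled summary section.
lines = [
    "  PRINTS:CALPAIN PR00704 9/9 CAN1_HUMAN P07384 CALPAIN 1, L...  3762  0.0       1",
    "  PRINTS:CALPAIN PR00704 9/9 AF021847 AF021847 Mus musculus...  2196  0.0       2",
    "  PRINTS:CALPAIN PR00704 9/9 RNU53858 U53858 Rattus norvegi...  2191  0.0       2",
]
hits = [hit for hit in map(parse_summary_line, lines) if hit is not None]
print(fingerprint_summary(hits))
```

Lines that do not match the pattern, such as the column headings, are skipped because `parse_summary_line` returns `None` for them.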

More detailed information about BLAST, including descriptions of the BLAST search parameters, a BLAST manual and references for BLAST and its statistics, can be found on the NCBI BLAST web pages.

Mail any comments/suggestions/problems to will@bioinf.man.ac.uk