The following information has been received by the server:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________
reference predict_h32463 (Jun 28, 2001 04:59:37)
reference pred_h32463 (Jun 28, 2001 04:59:27)
PPhdr from: koebnik@genetik.uni-halle.de
PPhdr resp: MAIL
PPhdr orig: HTML
PPhdr want: ASCII
PPhdr password(###)
prediction of: - default prediction of: - PHDsec PHDacc PHDhtm ProSite SEG ProDom
return msf format
# default: single protein sequence description=TonB E.c.
MIMTSMTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEP
EPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRA
LSRNQPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGT
TEIQ
________________________________________________________________________________
Result of PROSITE search (Amos Bairoch):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
please quote: A Bairoch, P Bucher & K Hofmann: The PROSITE database,
its status in 1997. Nucl. Acids Res., 1997, 25, 217-221.
________________________________________________________________________________
--------------------------------------------------------
Pattern-ID: ASN_GLYCOSYLATION PS00001 PDOC00001
Pattern-DE: N-glycosylation site
Pattern: N[^P][ST][^P]
238 NGTT
Pattern-ID: PKC_PHOSPHO_SITE PS00005 PDOC00005
Pattern-DE: Protein kinase C phosphorylation site
Pattern: [ST].[RK]
147 TSK
200 SAK
Pattern-ID: CK2_PHOSPHO_SITE PS00006 PDOC00006
Pattern-DE: Casein kinase II phosphorylation site
Pattern: [ST].{2}[DE]
56 TPAD
128 SPFE
Pattern-ID: MYRISTYL PS00008 PDOC00008
Pattern-DE: N-myristoylation site
Pattern: G[^EDRKHPFYW].{2}[STAGCN][^P]
26 GAVVAG
228 GIVVNI
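
The PROSITE patterns listed above are written in a syntax that Python's `re` module accepts unchanged, so the hits can be re-derived locally. The following is a minimal sketch, not the server's own scanner: the names `SEQ`, `PATTERNS`, `scan` and `hits` are illustrative, and `re.finditer` returns non-overlapping matches only, which is an assumption about how the report was produced.

```python
import re

# Query sequence exactly as submitted (244 residues).
SEQ = (
    "MIMTSMTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEP"
    "EPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRA"
    "LSRNQPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGT"
    "TEIQ"
)

# Patterns copied from the report; they are already valid regular expressions.
PATTERNS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",
    "MYRISTYL": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
}


def scan(seq, pattern):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, seq)]


hits = {name: scan(SEQ, pat) for name, pat in PATTERNS.items()}
for name, found in hits.items():
    for pos, text in found:
        print(f"{name:18s} {pos:4d} {text}")
```

Positions are 1-based to match the numbering used in the report. A scanner that also reports overlapping matches could list additional sites.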
________________________________________________________________________________
Result of SEG low-complexity search (JC Wootton & S Federhen):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
please quote: J C Wootton & S Federhen: Analysis of compositionally
biased regions in sequence databases. Meth. in Enzymol.
1996, 266, 554-571.
NOTE 1: regions of low-complexity ('simple sequence' or 'compo-
sition biased regions') are marked by the letter 'x' in
the following output.
NOTE 2: The dynamic programming algorithm (MaxHom) does NOT take
the SEG information into account, nor do the PHD pre-
dictions!
!!! --> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! <-- !!!
!!! --> WE STRONGLY suggest that you resubmit the regions NOT marked by <-- !!!
!!! --> 'x' separately!! <-- !!!
!!! --> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! <-- !!!
________________________________________________________________________________
prot (#) ppOld, default: single protein sequence description=tonb e.c. /home/phd/server/work/predict_h32463 from: 1 to: 244
prot (#) ppOld, default: single protein sequence description=tonb e.c. /home/phd/server/work/predict_h32463
/home/phd/server/work/predict_h32463.segNormGcg Length: 244 11-Jul-99 Check: 2818 ..
1 MIMTSMTLDL PRRFPWPTLL SVCIHGAVVA GLLYTSVHQV IELPAPAQPI
51 SVTMVTPADL xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx
101 xxxxxxxxxx xxxxxxRDVK PVESRPASPF ENxxxxxxxx xxxxxxxxxx
151 xxxxxxGPRA LSRNQPQYPA RAQALRIEGQ VKVKFDVTPD GRVDNVQILS
201 AKPANMFERE VKNAMRRWRY EPGKPGSGIV VNILFKINGT TEIQ
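
Since the note above suggests resubmitting the regions NOT marked by 'x' separately, it can help to extract those regions automatically. This is a small sketch under the assumption that the masked sequence is pasted in as shown in the report; `MASKED_REPORT`, `unmasked_regions` and `regions` are hypothetical names.

```python
import re

# SEG output block as printed in the report (position labels and blanks included).
MASKED_REPORT = """
1 MIMTSMTLDL PRRFPWPTLL SVCIHGAVVA GLLYTSVHQV IELPAPAQPI
51 SVTMVTPADL xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx
101 xxxxxxxxxx xxxxxxRDVK PVESRPASPF ENxxxxxxxx xxxxxxxxxx
151 xxxxxxGPRA LSRNQPQYPA RAQALRIEGQ VKVKFDVTPD GRVDNVQILS
201 AKPANMFERE VKNAMRRWRY EPGKPGSGIV VNILFKINGT TEIQ
"""

# Drop position labels and whitespace to obtain one continuous string.
masked = re.sub(r"[\d\s]", "", MASKED_REPORT)


def unmasked_regions(masked_seq, mask_char="x"):
    """Return (1-based start, end, subsequence) for every stretch not masked."""
    pattern = f"[^{re.escape(mask_char)}]+"
    return [(m.start() + 1, m.end(), m.group())
            for m in re.finditer(pattern, masked_seq)]


regions = unmasked_regions(masked)
for start, end, sub in regions:
    print(f"{start:4d}-{end:<4d} {sub}")
```

Very short unmasked stretches may be of limited use as separate submissions; a minimum-length filter could be added if needed.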
________________________________________________________________________________
Result of ProDom domain search (Sonnhammer; Corpet, Gouzy, Kahn):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- please quote: ELL Sonnhammer & D Kahn, Prot. Sci., 1994, 3, 482-492
________________________________________________________________________________
--- ------------------------------------------------------------
--- Results from running BLAST against PRODOM domains
---
--- PLEASE quote:
--- F Corpet, J Gouzy, D Kahn (1998). The ProDom database
--- of protein domain families. Nucleic Ac Res 26:323-326.
---
--- BEGIN of BLASTP output
BLASTP 1.4.7 [16-Oct-94] [Build 12:52:03 Oct 30 1994]
Reference: Altschul, Stephen F., Warren Gish, Webb Miller, Eugene W. Myers,
and David J. Lipman (1990). Basic local alignment search tool. J. Mol. Biol.
215:403-10.
Query= prot (#) ppOld, default: single protein sequence description=tonb e.c.
/home/phd/server/work/predict_h32463
(244 letters)
Database: prodom_00_1
174,952 sequences; 19,895,393 total letters.
Searching..................................................done
Smallest
Sum
High Probability
Sequences producing High-scoring Segment Pairs: Score P(N) N
PD005186 p2000.1 (10) TONB(8) Q9ZHV8(1) Q9ZJP4(1) // PR... 263 2.8e-30 1
PD010309 p2000.1 (4) TONB(4) // TONB PROTEIN TRANSPORT ... 198 1.1e-20 1
PD015378 p2000.1 (2) TONB(2) // TONB PROTEIN TRANSPORT ... 177 9.5e-18 1
PD188847 p2000.1 (2) TONB(2) // TONB PROTEIN TRANSPORT ... 101 7.3e-10 2
PD193463 p2000.1 (1) TONB_ECOLI // TONB PROTEIN TRANSPOR... 116 5.2e-09 1
PD022611 p2000.1 (4) IF4G(2) O43177(1) O96046(1) // INI... 107 5.7e-09 3
PD000540 p2000.1 (1342) H1(18) O76786(11) TONB(10) // P... 122 1.1e-08 1
PD011390 p2000.1 (21) SRE1(2) O14686(2) O14687(2) // PR... 109 6.9e-07 1
PD181779 p2000.1 (1) P76017_ECOLI // FROM BASES 1243821 ... 61 1.3e-06 2
PD058122 p2000.1 (1) P72605_SYNY3 // HYPOTHETICAL 55.5 K... 58 1.6e-06 2
PD191000 p2000.1 (2) O84603(1) Q9Z7C3(1) // PROTEIN CT598 81 1.1e-05 2
PD039984 p2000.1 (3) O00204(1) O00205(1) O75814(1) // H... 67 1.4e-05 2
PD041502 p2000.1 (2) GNDS(2) // GUANINE NUCLEOTIDE DISS... 69 2.2e-05 3
PD155475 p2000.1 (2) SRE2(2) // STEROL REGULATORY ELEME... 89 3.6e-05 2
PD002948 p2000.1 (36) P93237(9) Q25434(6) Q40376(4) // ... 84 4.8e-05 1
PD041936 p2000.1 (2) DNJM(2) // DNAJ-LIKE PROTEIN MG200... 60 9.3e-05 3
PD029489 p2000.1 (1) G3PT_HUMAN // PUTATIVE GLYCERALDEHY... 87 0.00010 1
PD064632 p2000.1 (1) Q23446_CAEEL // SIMILARITY TO CCAAT... 76 0.00015 2
PD148762 p2000.1 (1) O01905_CAEEL // SIMILARITY TO HUMAN... 79 0.00015 2
PD140516 p2000.1 (1) O54201_STRCL // PBP2 PROTEIN PENICI... 85 0.00016 2
PD059755 p2000.1 (1) Q69565_VVVVV // HHV-6 U1102, VARIAN... 60 0.00025 2
PD061585 p2000.1 (1) O31240_ANASP // PKND 77 0.00032 2
PD145083 p2000.1 (1) Q23447_CAEEL // CODED FOR BY C. ELE... 71 0.00038 3
PD029337 p2000.1 (15) Q25388(4) LSTP(3) O01761(3) // RE... 87 0.00045 1
PD131059 p2000.1 (1) O54963_RAT // ZINC FINGER TRANSCRIP... 64 0.00045 2
PD124778 p2000.1 (4) Q9Z2B6(2) O43150(1) O97902(1) // A... 69 0.00053 2
PD218236 p2000.1 (1) O97007_LEIMA // L7610.4 PROTEIN 56 0.00064 3
>PD005186 p2000.1 (10) TONB(8) Q9ZHV8(1) Q9ZJP4(1) // PROTEIN TRANSPORT TONB
INNER MEMBRANE PERIPLASMIC TRANSMEMBRANE REPEAT PHAGE RECOGNITION
Length = 78
Score = 263 (119.2 bits), Expect = 2.8e-30, P = 2.8e-30
Identities = 50/76 (65%), Positives = 58/76 (76%)
Query: 168 YPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGS 227
YP AQA IEG+VKVKF + DGRV ++++L A P NMFEREVK AMR+WRYE G PG
Sbjct: 2 YPKMAQARGIEGEVKVKFTINADGRVTDIKVLKANPKNMFEREVKQAMRKWRYEAGVPGG 61
Query: 228 GIVVNILFKINGTTEI 243
IVV I FKINGTTE+
Sbjct: 62 DIVVTIKFKINGTTEL 77
>PD010309 p2000.1 (4) TONB(4) // TONB PROTEIN TRANSPORT INNER MEMBRANE
PERIPLASMIC TRANSMEMBRANE REPEAT PHAGE RECOGNITION
Length = 39
Score = 198 (89.7 bits), Expect = 1.1e-20, P = 1.1e-20
Identities = 38/39 (97%), Positives = 38/39 (97%)
Query: 6 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELP 44
MTLDLPRRFPWPTLLSV IHGAVVAGLLYTSVHQVIELP
Sbjct: 1 MTLDLPRRFPWPTLLSVAIHGAVVAGLLYTSVHQVIELP 39
>PD015378 p2000.1 (2) TONB(2) // TONB PROTEIN TRANSPORT INNER MEMBRANE
PERIPLASMIC TRANSMEMBRANE REPEAT PHAGE RECOGNITION
Length = 40
Score = 177 (80.2 bits), Expect = 9.5e-18, P = 9.5e-18
Identities = 36/40 (90%), Positives = 37/40 (92%)
Query: 128 SPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQ 167
SPFENTAPAR TSSTATAATSKP TSV +GPRALSRNQPQ
Sbjct: 1 SPFENTAPARPTSSTATAATSKPATSVPTGPRALSRNQPQ 40
>PD188847 p2000.1 (2) TONB(2) // TONB PROTEIN TRANSPORT INNER MEMBRANE
PERIPLASMIC TRANSMEMBRANE REPEAT
Length = 55
Score = 101 (45.8 bits), Expect = 7.3e-10, Sum P(2) = 7.3e-10
Identities = 21/37 (56%), Positives = 24/37 (64%)
Query: 10 LPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAP 46
L R WP SV IHGA++AGLLY SV Q+ E P P
Sbjct: 8 LNRWITWPFAFSVGIHGALIAGLLYASVEQMREQPEP 44
Score = 35 (15.9 bits), Expect = 7.3e-10, Sum P(2) = 7.3e-10
Identities = 6/9 (66%), Positives = 7/9 (77%)
Query: 75 EPEPEPEPI 83
+PEPE PI
Sbjct: 41 QPEPEDAPI 49
>PD193463 p2000.1 (1) TONB_ECOLI // TONB PROTEIN TRANSPORT INNER MEMBRANE
PERIPLASMIC TRANSMEMBRANE REPEAT PHAGE RECOGNITION COLICIN
Length = 23
Score = 116 (52.6 bits), Expect = 5.2e-09, P = 5.2e-09
Identities = 23/23 (100%), Positives = 23/23 (100%)
Query: 45 APAQPISVTMVTPADLEPPQAVQ 67
APAQPISVTMVTPADLEPPQAVQ
Sbjct: 1 APAQPISVTMVTPADLEPPQAVQ 23
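
The "Identities" figures in the hits above can be checked directly from the aligned Query and Sbjct strings. The sketch below assumes ungapped alignments of equal length, as in the hits shown, and assumes the percentage is truncated rather than rounded (this matches 21/37 being reported as 56%). The function name `identity` is illustrative.

```python
def identity(query, subject):
    """Return (identical positions, alignment length, truncated percent)."""
    if len(query) != len(subject):
        raise ValueError("aligned segments must have equal length")
    matches = sum(q == s for q, s in zip(query, subject))
    return matches, len(query), 100 * matches // len(query)


# PD010309 hit (query residues 6-44).
pd010309 = identity("MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELP",
                    "MTLDLPRRFPWPTLLSVAIHGAVVAGLLYTSVHQVIELP")

# First segment of the PD188847 hit (query residues 10-46).
pd188847 = identity("LPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAP",
                    "LNRWITWPFAFSVGIHGALIAGLLYASVEQMREQPEP")

print(pd010309, pd188847)
```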
...
Parameters:
E=0.001
B=500
V=500
-ctxfactor=1.00
Query ----- As Used ----- ----- Computed ----
Frame MatID Matrix name Lambda K H Lambda K H
+0 0 BLOSUM62 0.314 0.132 0.387 same same same
Query
Frame MatID Length Eff.Length E S W T X E2 S2
+0 0 244 244 0.0010 88 3 11 23 0.25 33
Statistics:
Query Expected Observed HSPs HSPs
Frame MatID High Score High Score Reportable Reported
+0 0 63 (28.5 bits) 263 (119.2 bits) 89 89
Query Neighborhd Word Excluded Failed Successful Overlaps
Frame MatID Words Hits Hits Extensions Extensions Excluded
+0 0 5023 11818274 2902988 8858087 57183 8670
Database: prodom_00_1
Release date: unknown
Posted date: 5:56 PM EDT Jun 21, 2000
# of letters in database: 19,895,393
# of sequences in database: 174,952
# of database sequences satisfying E: 27
No. of states in DFA: 547 (54 KB)
Total size of DFA: 105 KB (128 KB)
Time to generate neighborhood: 0.01u 0.00s 0.01t Real: 00:00:00
Time to search database: 16.55u 0.05s 16.60t Real: 00:00:17
Total cpu time: 16.58u 0.07s 16.65t Real: 00:00:17
--- END of BLASTP output
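
The bit values printed in parentheses next to each raw score appear consistent with scaling the raw score by Lambda / ln 2, using the Lambda = 0.314 listed under "Parameters". This is an observation from the numbers in this report rather than a statement about the BLAST source code; `raw_to_bits` and `REPORTED` are hypothetical names.

```python
import math

LAMBDA = 0.314  # value printed for BLOSUM62 in the Parameters table


def raw_to_bits(raw_score, lam=LAMBDA):
    """Convert a raw score to bits, assuming bits = lambda * S / ln 2."""
    return lam * raw_score / math.log(2)


# Raw score -> bit score pairs copied from the report.
REPORTED = {263: 119.2, 198: 89.7, 177: 80.2, 101: 45.8, 116: 52.6, 63: 28.5}

for raw, bits in REPORTED.items():
    print(f"S={raw:3d}  computed={raw_to_bits(raw):6.2f}  reported={bits}")
```

Small differences in the last digit are expected because Lambda is printed to three decimals only.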
--- ------------------------------------------------------------
---
--- Again: these results were obtained based on the domain data-
--- base collected by Daniel Kahn and his coworkers in Toulouse.
---
--- PLEASE quote:
--- F Corpet, J Gouzy, D Kahn (1998). The ProDom database
--- of protein domain families. Nucleic Ac Res 26:323-326.
---
--- The general WWW page is on:
---- ---------------------------------------
--- http://www.toulouse.inra.fr/prodom.html
---- ---------------------------------------
---
--- For WWW graphic interfaces to PRODOM, in particular for your
--- protein family, use the following links (each line is ONE
--- single link for your protein!!):
---
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD005186 ==> multiple alignment, consensus, PDB and PROSITE links of domain PD005186
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD005186 ==> graphical output of all proteins having domain PD005186
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD010309 ==> multiple alignment, consensus, PDB and PROSITE links of domain PD010309
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD010309 ==> graphical output of all proteins having domain PD010309
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD015378 ==> multiple alignment, consensus, PDB and PROSITE links of domain PD015378
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD015378 ==> graphical output of all proteins having domain PD015378
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD188847 ==> multiple alignment, consensus, PDB and PROSITE links of domain PD188847
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD188847 ==> graphical output of all proteins having domain PD188847
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD193463 ==> multiple alignment, consensus, PDB and PROSITE links of domain PD193463
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD193463 ==> graphical output of all proteins having domain PD193463
...
---
--- NOTE: if you want to use the link, make sure the entire line
--- is pasted as URL into your browser!
---
--- END of PRODOM
--- ------------------------------------------------------------
________________________________________________________________________________
The alignment that has been used as input to the network is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________
---
--- Version of database searched for alignment:
--- SWISS-PROT release 39.0 (5/00) with 85 249 proteins
---
--- ------------------------------------------------------------
--- MAXHOM multiple sequence alignment
--- ------------------------------------------------------------
---
--- MAXHOM ALIGNMENT HEADER: ABBREVIATIONS FOR SUMMARY
--- ID : identifier of aligned (homologous) protein
--- STRID : PDB identifier (only for known structures)
--- IDE : percentage of pairwise sequence identity
--- WSIM : percentage of weighted similarity
--- LALI : number of residues aligned
--- NGAP : number of insertions and deletions (indels)
--- LGAP : number of residues in all indels
--- LSEQ2 : length of aligned sequence
--- ACCNUM : SwissProt accession number
--- OMIM : OMIM (Online Mendelian Inheritance in Man) ID
--- NAME : one-line description of aligned protein
---
--- MAXHOM ALIGNMENT HEADER: SUMMARY
ID STRID IDE WSIM LALI NGAP LGAP LSEQ2 ACCNUM NAME
tonb_ecoli 100 100 239 0 0 239 P94739 TONB PROTEIN.
tonb_salty 87 88 238 3 5 242 P25945 TONB PROTEIN.
tonb_klepn 77 74 233 7 11 243 P45610 TONB PROTEIN.
tonb_entae 76 77 233 4 7 243 P46383 TONB PROTEIN.
tonb_serma 54 59 239 5 11 247 P26185 TONB PROTEIN.
tonb_yeren 46 52 241 5 15 255 Q05740 TONB PROTEIN.
tonb_pseae 37 40 225 5 32 342 Q51368 TONB PROTEIN.
tonb_haein 32 28 231 8 39 270 P42872 TONB PROTEIN.
tonb_haedu 30 31 224 7 57 279 O51810 TONB PROTEIN.
---
--- MAXHOM ALIGNMENT: IN MSF FORMAT
MSF of: /home/phd/server/work/predict_h32463.hsspFilter from: 1 to: 244
/home/phd/server/work/predict_h32463.msfRet MSF: 244 Type: P 28-Jun-01 05:00:2 Check: 7659 ..
Name: predict_h3240 Len: 244 Check: 3879 Weight: 1.00
Name: tonb_ecoli Len: 244 Check: 3364 Weight: 1.00
Name: tonb_salty Len: 244 Check: 1399 Weight: 1.00
Name: tonb_klepn Len: 244 Check: 8093 Weight: 1.00
Name: tonb_entae Len: 244 Check: 7194 Weight: 1.00
Name: tonb_serma Len: 244 Check: 4523 Weight: 1.00
Name: tonb_yeren Len: 244 Check: 5387 Weight: 1.00
Name: tonb_pseae Len: 244 Check: 2973 Weight: 1.00
Name: tonb_haein Len: 244 Check: 1105 Weight: 1.00
Name: tonb_haedu Len: 244 Check: 9742 Weight: 1.00
//
1 50
predict_h3240 MIMTSMTLDL PRRFPWPTLL SVCIHGAVVA GLLYTSVHQV IELPAPAQPI
tonb_ecoli .....MTLDL PRRFPWPTLL SVCIHGAVVA GLLYTSVHQV IELPAPAQPI
tonb_salty .....MTLDL PRRFPWPTLL SVGIHGAVVA GLLYTSVHQV IELPAPAQPI
tonb_klepn .....MTLDL PRRFPWPTLL SVAIHGAVVA GLLYTSVHQV IEQPSPTQPI
tonb_entae .....MTLDL PRRFPWPTLL SVAIHGAVVA GLLYTSVHQV IEKPSPSQPI
tonb_serma ..MPLKKMFL NRRISVPFVL SVGLHSALVA GLLYASVKEV VELPKPeaPI
tonb_yeren ..MQLNKFFL GRWLTWPLAF SVGIHGSVIA ALLYVSVEQm iQPEIEDAPI
tonb_pseae .......... .SRWWLSSGA AVAMHVAIIG ALVWVMPTPa lGHGELPKTM
tonb_haein .......MQQ TKRSLLGLLI SLIAHGIVIG FILWnsDSAN SAQGDISTSI
tonb_haedu .......... .KHSRIGLIS SVFIHIVLFA SFISLVEVSH SDLSDGDSPL
51 100
predict_h3240 SVTMVTPADL EPPQAVQPPP EPVVEPEPEP EPIPEPPKEA PVVIEKPKPK
tonb_ecoli SVTMVTPADL EPPQAVQPPP EPVVEPEPEP EPIPEPPKEA PVVIEKPKPK
tonb_salty TVTMVSPADL EPPQAVQPPP EPVvePEPEP EPIPEPPKEA PVVIekPKPK
tonb_klepn EITMVAPADL EPPPA.QPVV EPVvePEPEP EVVPEPPKEA VVIhpKPKPK
tonb_entae EITMVAPADL EPPQAAQPVV EPVvePEPEP EVVPEPPKEV PVVIHKPEPK
tonb_serma SVMMVNTAAM AEPPPPAPAE PEPpePEPEP EPIVEPPPKA IVKPEPVKPK
tonb_yeren AVTMVNIDTF AAPQPaePQA EPEPEPEPEP EPIDEAPPEP EVlpEPVKPK
tonb_pseae QVNFVQLEKK AEPTPQPPAA APEPTPPKIE EPKPEPPKPK PV..EKPKPK
tonb_haein SMELLQGMVL EEPAPeePEP EPEpePEPEK QEipEPKKIK EPEKEKPKPK
tonb_haedu SIELV.AALL EQPQVAVAPE EVTsePEPEP DAIPEPITK. ..PIEKPKEK
101 150
predict_h3240 PKPKPKPVKK VQEQPKRDVK PVESRPASPF ENTAPARLTS STATAATSKP
tonb_ecoli PKPKPKPVKK VQEQPKRDVK PVESRPASPF ENTAPARLTS STATAATSKP
tonb_salty PKPKPKPVKK VEEQPKREVK PAAPRPASPF ENSAPVRPTS STA.SATSKP
tonb_klepn PKPKPKPEKK V.EQPKREVK PaePRPASPF EntAPARTAP STSTAAAKPT
tonb_entae PKPKPKPKPK PevEPKREVK PAEPRPVSPF EntAPARTAP ST.TAATAKP
tonb_serma PKPKPKP..K VEKQVKPEPK KVEPREPSPF NNDSPAKPID KAPvpAAPVQ
tonb_yeren PKPVKKEVKK PEVKKPDVKK TVAPPDDKPF KSDEPALVST NAPvpKASVP
tonb_pseae PKPKPKPVEN AIPKAKPKPE PKpsQPSPSS AAPPPAPTVG QSTPGAQTAP
tonb_haein GKPKGKPKNK PKKEVKPQKK PINKELPKGD ENidKASTTs sNAQVAGSGT
tonb_haedu PKEKPKKPEK PKEKLKKEKP KEKAKQIEAL EKGPEAKQGI VAQagASSNE
151 200
predict_h3240 VTSVASGPRA LSRNQPQYPA RAQALRIEGQ VKVKFDVTPD GRVDNVQILS
tonb_ecoli VTSVASGPRA LSRNQPQYPA RAQALRIEGQ VKVKFDVTPD GRVDNVQILS
tonb_salty AVSVPTGPRA LSRNQPQYPA RAQALRIEGR VKVKFDVTSA GRVENVQILS
tonb_klepn VTA.PSGPRA ISRVQPSYPA RAQALRIEGT VRVKFDVSPD GRIDNLQILS
tonb_entae MTTAPSVPKA LKRGDPSYPQ RAQALRIEGD VRVKFDVTAD GRVENIQILS
tonb_serma GNSREVGPRP ISRANPLYPP RAQALQIEGN VRVQFDIDSD GRVSNVRILS
tonb_yeren GVSTSTGPKA LSKAKPTYPA RALALGVEGQ VKVQYDIDEN GRVTNVRILE
tonb_pseae SGSqdSDIKP LRMDPPVYPR MAQARGIEGR VKVLFTITSD GRIDDIQVLE
tonb_haein DTSEIAAYRS AIrsHKRYPT RAKIMRKQGK VSVSFNVGAD GSLSGAKVTK
tonb_haedu INAyaALQRA LQHrnNAYPA REKMMRKTGV VTLGFTISPS GKLIDVTVLN
201 244
predict_h3240 AKPANMFERE VKNAMRRWRY EPGKPGSGIV VNILFKINGT TEIQ
tonb_ecoli AKPANMFERE VKNAMRRWRY EPGKPGSGIV VNILFKINGT TEIQ
tonb_salty AQPANMFERE VKNAMRKWRY EAGKPGSGLV VNIIFRLNGT AQIE
tonb_klepn AQPANMFERE VKSAMRRWRY QQGRPGTGVT MTIKFRLNGV E...
tonb_entae AKPANMFERD VKTAMRKWRY EAGRPGTGLT MNIKFRLNG. ....
tonb_serma AEPRNMFERE VKQAMRKWRY EA.KEAKDRT VTIRFKLNGT TELN
tonb_yeren ATPRNTFERE VKQVMRKWRF EA.VAAKDYV TTVVFKIGGT TEMD
tonb_pseae SVPSRMFDRE VRQAMAKWRF EPRVSGGKIV amFFFKIE.. ....
tonb_haein SSGDESLDKA ALDAINVSRS VGTRPasSLS VQISFTLQ.. ....
tonb_haedu SSGNQNLDAA AVQAAEATKV APPPIGfnVT VPIKFSIQ.. ....
________________________________________________________________________________
Result of COILS prediction (Andrei Lupas):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A Lupas: Methods in Enzymology, 1996, 266, 513-525.
version 2.2: Rob B. Russell & Andrei N. Lupas, 1999
________________________________________________________________________________
no coiled-coil above probability 0.5
________________________________________________________________________________
PHD: Profile fed neural network systems from HeiDelberg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prediction of:
secondary structure, by PHDsec
solvent accessibility, by PHDacc
and helical transmembrane regions, by PHDhtm
Author:
Burkhard Rost
EMBL, 69012 Heidelberg, Germany
Internet: Rost@EMBL-Heidelberg.DE
All rights reserved.
The network systems are described in:
PHDsec: B Rost & C Sander: JMB, 1993, 232, 584-599.
B Rost & C Sander: Proteins, 1994, 19, 55-72.
PHDacc: B Rost & C Sander: Proteins, 1994, 20, 216-226.
PHDhtm: B Rost et al.: Prot. Science, 1995, 4, 521-533.
The resulting network (PHD) prediction is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________
Some statistics
~~~~~~~~~~~~~~~
Percentage of amino acids:
+--------------+--------+--------+--------+--------+--------+
| AA: | P | V | A | K | E |
| % of AA: | 16.4 | 10.7 | 8.6 | 7.4 | 7.4 |
+--------------+--------+--------+--------+--------+--------+
| AA: | T | S | R | L | Q |
| % of AA: | 6.1 | 5.7 | 5.7 | 5.3 | 4.9 |
+--------------+--------+--------+--------+--------+--------+
| AA: | I | G | N | M | D |
| % of AA: | 4.9 | 3.7 | 2.9 | 2.5 | 2.5 |
+--------------+--------+--------+--------+--------+--------+
| AA: | F | Y | W | H | C |
| % of AA: | 2.0 | 1.2 | 0.8 | 0.8 | 0.4 |
+--------------+--------+--------+--------+--------+--------+
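
The composition table above can be recomputed from the submitted sequence. This is a minimal sketch; `SEQ`, `composition` and `comp` are illustrative names, and rounding to one decimal is assumed to match the table.

```python
from collections import Counter

SEQ = (
    "MIMTSMTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEP"
    "EPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRA"
    "LSRNQPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGT"
    "TEIQ"
)


def composition(seq):
    """Return {residue: percent of sequence}, most frequent residue first."""
    counts = Counter(seq)
    return {aa: round(100 * n / len(seq), 1) for aa, n in counts.most_common()}


comp = composition(SEQ)
for aa, pct in comp.items():
    print(f"{aa}  {pct:5.1f}")
```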
Percentage of secondary structure predicted:
+--------------+--------+--------+--------+
| SecStr: | H | E | L |
| % Predicted: | 13.5 | 19.7 | 66.8 |
+--------------+--------+--------+--------+
According to the following classes:
all-alpha: %H>45 and %E< 5; all-beta : %H<5 and %E>45
alpha-beta : %H>30 and %E>20; mixed: rest,
this means that the predicted class is: mixed class
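
The class assignment rule quoted above translates into a few comparisons. The sketch below follows the thresholds exactly as printed; the function name `structural_class` is an illustrative choice.

```python
def structural_class(pct_h, pct_e):
    """Assign a structural class from predicted helix (%H) and strand (%E) content."""
    if pct_h > 45 and pct_e < 5:
        return "all-alpha"
    if pct_h < 5 and pct_e > 45:
        return "all-beta"
    if pct_h > 30 and pct_e > 20:
        return "alpha-beta"
    return "mixed"


# Values from the table above: H = 13.5 %, E = 19.7 %.
print(structural_class(13.5, 19.7))
```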
PHD output for your protein
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Thu Jun 28 05:00:32 2001
Jury on: 10 different architectures (version 5.94_317 ).
Note: differently trained architectures, i.e., different versions can
result in different predictions.
About the protein
~~~~~~~~~~~~~~~~~
HEADER /home/phd/server/work/predict_h32463.fas
COMPND
SOURCE
AUTHOR
SEQLENGTH 244
NCHAIN 1 chain(s) in predict_h32463 data set
NALIGN 9
(=number of aligned sequences in HSSP file)
Abbreviations: PHDsec
~~~~~~~~~~~~~~~~~~~~~
sequence:
AA : amino acid sequence
secondary structure:
HEL: H=helix, E=extended (sheet), blank=other (loop)
PHD: Profile network prediction HeiDelberg
Rel: Reliability index of prediction (0-9)
detail:
prH: 'probability' for assigning helix
prE: 'probability' for assigning strand
prL: 'probability' for assigning loop
note: the 'probabilities' are scaled to the interval 0-9, e.g.,
prH=5 means that the first output node is 0.5-0.6
subset:
SUB: a subset of the prediction, for all residues with an expected
average accuracy > 82% (tables in header)
note: for this subset the following symbols are used:
L: is loop (for which above " " is used)
".": means that no prediction is made for this residue, as the
reliability is: Rel < 5
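
The SUB row described above can be derived from the prediction row and the reliability row. The following sketch assumes the rule exactly as stated: no prediction ('.') where Rel < 5, and 'L' in place of a blank otherwise. The function name is hypothetical, and the blank/H layout of `pred` (residues 169-180) is an assumption inferred from the prH and prL detail rows.

```python
def reliable_subset(pred, rel, min_rel=5, blank_symbol="L"):
    """Build the SUB row: keep states with Rel >= min_rel, else print '.'."""
    if len(pred) != len(rel):
        raise ValueError("prediction and reliability rows differ in length")
    out = []
    for state, r in zip(pred, rel):
        if int(r) < min_rel:
            out.append(".")            # too unreliable: no prediction made
        elif state == " ":
            out.append(blank_symbol)   # blank means loop, written as 'L'
        else:
            out.append(state)
    return "".join(out)


# Residues 169-180: one loop position, seven helix positions, four loop positions.
pred = " " + "H" * 7 + " " * 4
rel = "168886523465"
sub = reliable_subset(pred, rel)
print(sub)
```

The same function covers the PHDacc subset if called with `min_rel=4` and `blank_symbol="I"`.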
Abbreviations: PHDacc
~~~~~~~~~~~~~~~~~~~~~
SS : secondary structure
HEL: H=helix, E=extended (sheet), blank=other (loop)
solvent accessibility:
3st: relative solvent accessibility (acc) in 3 states:
b = 0-9%, i = 9-36%, e = 36-100%.
PHD: Profile network prediction HeiDelberg
Rel: Reliability index of prediction (0-9)
O_3: observed relative acc. in 3 states: B, I, E
note: for convenience a blank is used for intermediate (i).
P_3: predicted relative accessibility in 3 states
10st:relative accessibility in 10 states:
= n corresponds to a relative acc. of n*n %
subset:
SUB: a subset of the prediction, for all residues with an expected
average correlation > 0.69 (tables in header)
note: for this subset the following symbols are used:
"I": is intermediate (for which above " " is used)
".": means that no prediction is made for this residue, as the
reliability is: Rel < 4
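
The 10-state and 3-state accessibility codes described above are related through the n*n rule. The sketch below converts a 10-state digit to a percentage and then to a 3-state symbol. How the boundary values are assigned is an assumption: 36% is treated as exposed, which agrees with the rows in this report, and 9% is treated as intermediate, which the report does not settle. Function names are illustrative.

```python
def acc_percent(n):
    """10-state code n corresponds to a relative accessibility of n*n percent."""
    return n * n


def three_state(n):
    """Map a 10-state code to b (buried), i (intermediate) or e (exposed)."""
    pct = acc_percent(n)
    if pct < 9:        # assumption: exactly 9 % counts as intermediate
        return "b"
    if pct < 36:       # assumption: exactly 36 % counts as exposed
        return "i"
    return "e"


# First 15 digits of the 10-state row for residues 1-60.
ten_state = "000600606076507"
states = "".join(three_state(int(c)) for c in ten_state)
print(states)
```

In the report itself the intermediate state is printed as a blank rather than as 'i'.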
Abbreviations: PHDhtm
~~~~~~~~~~~~~~~~~~~~~
secondary structure:
HL: T=helical transmembrane region, blank=other (loop)
PHD: Profile network prediction HeiDelberg
PHDF:filtered prediction, i.e., too long transmembrane segments
are split, too short ones are deleted
Rel: Reliability index of prediction (0-9)
detail:
prH: 'probability' for assigning helical transmembrane region
prL: 'probability' for assigning loop
note: the 'probabilities' are scaled to the interval 0-9, e.g.,
prH=5 means that the first output node is 0.5-0.6
subset:
SUB: a subset of the prediction, for all residues with an expected
average accuracy > 82% (tables in header)
note: for this subset the following symbols are used:
L: is loop (for which above " " is used)
".": means that no prediction is made for this residue, as the
reliability is: Rel < 5
protein: predict length 244
....,....1....,....2....,....3....,....4....,....5....,....6
AA |MIMTSMTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADL|
PHD sec |   EEEE           EEEEEHHHHHHHHHEEEEEEEEE        EEEEEE     |
Rel sec |964234167788687883378722345541112222221322799999679998427899|
detail:
prH sec |000000000000000000000124556664443332222100000000000000001000|
prE sec |023566511100211103678742221224445554444554100000279988631000|
prL sec |976332477888788885310132221100001112223245799999710001357889|
subset: SUB sec |LL.....LLLLLLLLLL..EEE....HH..............LLLLLLLEEEEE..LLLL|
accessibility
3st: P_3 acc |bbbebbebebee bebbbbbbbbbbbbbbbbbbbbbbeebbeeeeeeeebbbbbbbebeb|
10st: PHD acc |000600606076507000000000000000000000066007678787700000006070|
Rel acc |025034021111101010367857667999687113211422012321052928610110|
subset: SUB acc |..b..b.............bbbbbbbbbbbbbb......b.........b.b.bb.....|
....,....7....,....8....,....9....,....10...,....11...,....12
AA |EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK|
PHD sec |                                                            |
Rel sec |999999999999999999998999999999732368999999999989886768898989|
detail:
prH sec |000000000000000000000000000000000000000000000011012111001000|
prE sec |000000000000000000000000000000134321000000000000000000000000|
prL sec |999999999999999999998899999999865578999999999988887778898889|
subset: SUB sec |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL...LLLLLLLLLLLLLLLLLLLLLLLLLL|
accessibility
3st: P_3 acc |eeeeeeeeeeeeeeeeeeeeeebee eeeeebeeeeeeeeeeeeeeeeeeeeeeeeeeee|
10st: PHD acc |776789779977777978787707757877706797979797979777777787777777|
Rel acc |310211101111111111122003200440110102031313232332430443145423|
subset: SUB acc |...........................ee...................e..ee..eee..|
....,....13...,....14...,....15...,....16...,....17...,....18
AA |PVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQ|
PHD sec |                                                 HHHHHHH    |
Rel sec |999999999899998967898898899999988779986535899998168886523465|
detail:
prH sec |000000000000000000000100000000000000011100000001478887753100|
prE sec |000000000000000011000000000000000110000132000000000000000212|
prL sec |999999999889998977898888899999988789987657899998521001246677|
subset: SUB sec |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL.LLLLLLL.HHHHHH...LL|
accessibility
3st: P_3 acc |eeeeeeeeeeeeeeebeeeeeebeebeeeeeebeeebeebbeee eebbb bebbebebe|
10st: PHD acc |797977977779777077777707707777770777076007795770005060060706|
Rel acc |313123111031013421022101011132001011111321211320100626311242|
subset: SUB acc |...............b...................................b.b....b.|
....,....19...,....20...,....21...,....22...,....23...,....24
AA |VKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGT|
PHD sec |EEEEEEE      EEEEEEE   HHHHHHHHHHHHHHHHH        EEEEEEEEE   |
Rel sec |599998389995323697504981489999999999975618989981579999983694|
detail:
prH sec |000000000000000100000005688999999999877740000000000000000001|
prE sec |799998510002356787642000000000000000000000000014689999986202|
prL sec |200001389987533101246884200000000000012158988985210000013786|
subset: SUB sec |EEEEEE.LLLLL...EEEE..LL..HHHHHHHHHHHHHHH.LLLLLL.EEEEEEEE.LL.|
accessibility
3st: P_3 acc |bebebeb eeb bebbebbebebeebbeeebeebbeeb eebeeebeebbbbbbbebebb|
10st: PHD acc |060606059705060060070607700667067006705670777077000000060600|
Rel acc |729271510331203708611201104100823541120010223022237091915140|
subset: SUB acc |b.b.b.b........b.bb.......b...b..bb...............b.b.b.b.b.|
....,....25...,....26...,....27...,....28...,....29...,....30
AA |TEIQ|
PHD sec | E  |
Rel sec |2149|
detail:
prH sec |0000|
prE sec |3430|
prL sec |5469|
subset: SUB sec |...L|
accessibility
3st: P_3 acc |eebe|
10st: PHD acc |6609|
Rel acc |1111|
subset: SUB acc |....|
PHDhtm Helical transmembrane prediction
note: PHDacc and PHDsec are reliable for water-
soluble globular proteins, only. Thus,
please take the predictions above with
particular caution wherever transmembrane
helices are predicted by PHDhtm!
PHDhtm
---
--- PhdTopology REFINEMENT AND TOPOLOGY PREDICTION: SYMBOLS
--- AA : amino acid in one-letter code
--- PHD htm : HTM's predicted by the PHD neural network
--- system (T=HTM, ' '=not HTM)
--- Rel htm : Reliability index of prediction (0-9, 0 is low)
--- detail : Neural network output in detail
--- prH htm : 'Probability' for assigning a helical trans-
--- membrane region (HTM)
--- prL htm : 'Probability' for assigning a non-HTM region
--- note: 'Probabilities' are scaled to the interval
--- 0-9, e.g., prH=5 means that the first
--- output node is 0.5-0.6
--- subset : Subset of more reliable predictions
--- SUB htm : All residues for which the expected average
--- accuracy is > 82% (tables in header).
--- note: for this subset the following symbols are used:
--- L: is loop (for which above ' ' is used)
--- '.': means that no prediction is made for this
--- residue, as the reliability is: Rel < 5
--- other : predictions derived based on PHDhtm
--- PHDFhtm : filtered prediction, i.e., too long HTM's are
--- split, too short ones are deleted
--- PHDRhtm : refinement of neural network output
--- PHDThtm : topology prediction based on refined model
--- symbols used:
--- i: intra-cytoplasmic
--- T: transmembrane region
--- o: extra-cytoplasmic
---
--- PhdTopology REFINEMENT AND TOPOLOGY PREDICTION
....,....1....,....2....,....3....,....4....,....5....,....6
AA |MIMTSMTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADL|
PHD htm |                  TTTTTTTTTTTTTTTTT                         |
Rel htm |999999999999988863046778888888876520367788999999999999999999|
detail:
prH htm |000000000000000013578889999999988764311100000000000000000000|
prL htm |999999999999999986421110000000011235688899999999999999999999|
PHDRhtm |                  TTTTTTTTTTTTTTTTTT                        |
PHDThtm |iiiiiiiiiiiiiiiiiiTTTTTTTTTTTTTTTTTToooooooooooooooooooooooo|
subset: SUB htm |LLLLLLLLLLLLLLLLL..HHHHHHHHHHHHHHH...LLLLLLLLLLLLLLLLLLLLLLL|
....,....7....,....8....,....9....,....10...,....11...,....12
AA |EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK|
PHD htm |                                                            |
Rel htm |999999999999999999999999999999999999999999999999999999999999|
detail:
prH htm |000000000000000000000000000000000000000000000000000000000000|
prL htm |999999999999999999999999999999999999999999999999999999999999|
PHDRhtm |                                                            |
PHDThtm |oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo|
subset: SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL|
....,....13...,....14...,....15...,....16...,....17...,....18
AA |PVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQ|
PHD htm |                                                            |
Rel htm |999999999999999999999999999999999999999999999999999999999999|
detail:
prH htm |000000000000000000000000000000000000000000000000000000000000|
prL htm |999999999999999999999999999999999999999999999999999999999999|
PHDRhtm |                                                            |
PHDThtm |oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo|
subset: SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL|
....,....19...,....20...,....21...,....22...,....23...,....24
AA |VKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGT|
PHD htm |                                                            |
Rel htm |999999999999999999999999999999999999999999999999999989999999|
detail:
prH htm |000000000000000000000000000000000000000000000000000000000000|
prL htm |999999999999999999999999999999999999999999999999999999999999|
PHDRhtm |                                                            |
PHDThtm |oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo|
subset: SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL|
....,....25...,....26...,....27...,....28...,....29...,....30
AA |TEIQ|
PHD htm |    |
Rel htm |9999|
detail:
prH htm |0000|
prL htm |9999|
PHDRhtm |    |
PHDThtm |oooo|
subset: SUB htm |LLLL|
---
--- PhdTopology REFINEMENT AND TOPOLOGY PREDICTION END
---
________________________________________________________________________________
Result of ASP prediction (Malin Young, Kent Kirshenbaum, Stefan Highsmith)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kirshenbaum K, Young M and Highsmith S.
Prot. Sci.(1999) 8:1806-1815.
Young M, Kirshenbaum K, Dill KA and Highsmith S.
Prot. Sci.(1999) 8:1752-1764.
________________________________________________________________________________
Ambivalent Sequence Predictor (ASP v1.0) mmy
Parameters:
Window size : 5
Min mu dPr : 9
Z-score cutoff : -1.75
Mean dPr score=15.018, Standard deviation=2.970
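
As a worked example of the parameters above, the Z-score cutoff can be turned into an absolute dPr value using the reported mean and standard deviation. This is only one plausible reading of how the cutoff is applied (residues whose windowed dPr falls at or below the threshold being marked 'S'); the ASP publications should be consulted for the actual procedure. All names are hypothetical.

```python
MEAN_DPR = 15.018   # mean dPr score from the report
SD_DPR = 2.970      # standard deviation from the report
Z_CUTOFF = -1.75    # Z-score cutoff from the report


def z_score(dpr, mean=MEAN_DPR, sd=SD_DPR):
    """Standardise a dPr value against this sequence's own distribution."""
    return (dpr - mean) / sd


# dPr value corresponding to the cutoff: mean + z * sd.
dpr_threshold = MEAN_DPR + Z_CUTOFF * SD_DPR
print(f"dPr threshold = {dpr_threshold:.4f}")
```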
....,....1....,....2....,....3....,....4....,....5....,....6
AA |MIMTSMTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADL|
prH sec |000000000000000000000124556664443332222100000000000000001000|
prE sec |023566511100211103678742221224445554444554100000279988631000|
prL sec |976332477888788885310132221100001112223245799999710001357889|
ASP sec |......................SSSSSSSSSSSSSSSSSSS...................|
....,....7....,....8....,....9....,....10...,....11...,....12
AA |EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK|
prH sec |000000000000000000000000000000000000000000000011012111001000|
prE sec |000000000000000000000000000000134321000000000000000000000000|
prL sec |999999999999999999998899999999865578999999999988887778898889|
ASP sec |............................................................|
....,....13...,....14...,....15...,....16...,....17...,....18
AA |PVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQ|
prH sec |000000000000000000000100000000000000011100000001478887753100|
prE sec |000000000000000011000000000000000110000132000000000000000212|
prL sec |999999999889998977898888899999988789987657899998521001246677|
ASP sec |............................................................|
....,....19...,....20...,....21...,....22...,....23...,....24
AA |VKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGT|
prH sec |000000000000000100000005688999999999877740000000000000000001|
prE sec |799998510002356787642000000000000000000000000014689999986202|
prL sec |200001389987533101246884200000000000012158988985210000013786|
ASP sec |............................................................|
....,....25...,....26...,....27...,....28...,....29...,....30
AA |TEIQ|
prH sec |0000|
prE sec |3430|
prL sec |5469|
ASP sec |....|
Please note: ASP was designed to identify the location of conformational
switches in amino acid sequences. It is NOT designed to predict whether
a given sequence does or does not contain a switch. For best results,
ASP should be used on sequences of length >150 amino acids with >10
sequence homologues in the SWISS-PROT data bank.
ASP has been validated against a set of globular proteins and may not
be generally applicable. Please see Young et al., Prot. Sci. (1999)
8:1752-1764, for details and for how best to interpret this output.
We consider ASP to be experimental at this time and would appreciate
any feedback from our users.
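
Note on extracting the flagged regions: the `ASP sec` rows above mark residues with `S` and all other residues with `.`. A minimal sketch for turning one such row into residue ranges is given below. It assumes the text between the `|` bars has already been cut out, and that the caller supplies the number of the first residue in the block (1, 61, 121, ... for the 60-column blocks of this report). The function name `switch_segments` and the sample row are hypothetical.

```python
from itertools import groupby


def switch_segments(asp_row, first_residue=1):
    """Return inclusive (start, end) residue numbers for each run of 'S'."""
    segments = []
    position = first_residue
    for symbol, run in groupby(asp_row):
        length = len(list(run))
        if symbol == "S":
            segments.append((position, position + length - 1))
        position += length
    return segments


# Hypothetical short row, not taken from the report.
print(switch_segments("..SSS..S."))  # [(3, 5), (8, 8)]
```

For the second and later blocks, pass the matching offset, for example `switch_segments(row, first_residue=61)`.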
________________________________________________________________________________
-----------------------------------------------------------------------------
- PredictProtein (PP): News 2000 -
-----------------------------------------------------------------------------
- -
- PP home: -
New York http://cubic.bioc.columbia.edu/predictprotein/
- -
- PP mirrors: -
Australia Sydney http://molmod.angis.org.au/predictprotein/
Germany EMBL http://www.embl-heidelberg.de/predictprotein/
China CBI,Peking http://www.cbi.pku.edu.cn/predictprotein/
China Inst. Microbiol. http://micronet.im.ac.cn/predictprotein/
England EBI http://www.ebi.ac.uk/~rost/predictprotein/
India CDFD http://www.cdfd.org.in/~www/pp/predictprotein/
India Pune http://202.41.70.33/predictprotein/
Iran Tehran http://www.ibc.ut.ac.ir/predictprotein/
Israel Beer-Sheva http://www.cs.bgu.ac.il/~dfischer/predictprotein/
Italy Rome http://obelix.bio.uniroma2.it/www/predictprotein/
Mexico Cuernavaca UNAM http://www.ibt.unam.mx/paginas/lorenzo/predictprotein/
Netherlands CMBI http://www.cmbi.kun.nl/bioinf/predictprotein/
Russia Puschino http://mirror.protres.ru/predictprotein/
Singapore http://embl.bic.nus.edu.sg/predictprotein/
Spain CNB http://www.es.embnet.org/Services/MolBio/PredictProtein/
Switzerland Glaxo http://www.gwer.ch/tools/predictprotein/
USA San Diego SDSC http://www.sdsc.edu/predictprotein/
- -
- Tools to post-process PP results: -
- -
- Generate a PostScript (or GIF, or TIFF): -
ESPript (New York) http://cubic.bioc.columbia.edu/cgi/pp/nph-ESPript_exe.cgi
ESPript (Toulouse) http://www-pgm1.ipbs.fr:8080/ESPript
- -
-----------------------------------------------------------------------------
Latest update of content: June 28, 2001 Ralf Koebnik