The following information has been received by the server:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

reference predict_h32463 (Jun 28, 2001 04:59:37)
reference pred_h32463 (Jun 28, 2001 04:59:27)
PPhdr from: koebnik@genetik.uni-halle.de
PPhdr resp: MAIL
PPhdr orig: HTML
PPhdr want: ASCII
PPhdr password(###)
prediction of: - default
prediction of: - PHDsec PHDacc PHDhtm ProSite SEG ProDom
return msf format
# default: single protein sequence description=TonB E.c.
MIMTSMTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEP
EPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRA
LSRNQPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGT
TEIQ
________________________________________________________________________________

Result of PROSITE search (Amos Bairoch):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
please quote:
    A Bairoch, P Bucher & K Hofmann: The PROSITE database,
    its status in 1997. Nucl. Acids Res., 1997, 25, 217-221.
________________________________________________________________________________

--------------------------------------------------------

Pattern-ID: ASN_GLYCOSYLATION PS00001 PDOC00001
Pattern-DE: N-glycosylation site
Pattern:    N[^P][ST][^P]
   238      NGTT

Pattern-ID: PKC_PHOSPHO_SITE PS00005 PDOC00005
Pattern-DE: Protein kinase C phosphorylation site
Pattern:    [ST].[RK]
   147      TSK
   200      SAK

Pattern-ID: CK2_PHOSPHO_SITE PS00006 PDOC00006
Pattern-DE: Casein kinase II phosphorylation site
Pattern:    [ST].{2}[DE]
    56      TPAD
   128      SPFE

Pattern-ID: MYRISTYL PS00008 PDOC00008
Pattern-DE: N-myristoylation site
Pattern:    G[^EDRKHPFYW].{2}[STAGCN][^P]
    26      GAVVAG
   228      GIVVNI

________________________________________________________________________________

Result of SEG low-complexity search (JC Wootton & S Federhen):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
please quote:
    J C Wootton & S Federhen: Analysis of compositionally biased
    regions in sequence databases. Meth. in Enzymol. 1996, 266, 554-571.

NOTE 1: regions of low-complexity ('simple sequence' or 'composition
        biased regions') are marked by the letter 'x' in the
        following output.
NOTE 2: The dynamic programming algorithm (MaxHom) does NOT take
        the SEG information into account, nor do the PHD predictions!

!!! --> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! <-- !!!
!!! --> WE STRONGLY suggest that you resubmit the regions NOT marked by <-- !!!
!!! --> 'x' separately!!                                                <-- !!!
!!! --> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! <-- !!!
________________________________________________________________________________

prot (#) ppOld, default: single protein sequence description=tonb e.c. /home/phd/server/work/predict_h32463 from: 1 to: 244
prot (#) ppOld, default: single protein sequence description=tonb e.c. /home/phd/server/work/predict_h32463
/home/phd/server/work/predict_h32463.segNormGcg  Length: 244  11-Jul-99  Check: 2818  ..
1 MIMTSMTLDL PRRFPWPTLL SVCIHGAVVA GLLYTSVHQV IELPAPAQPI 51 SVTMVTPADL xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx 101 xxxxxxxxxx xxxxxxRDVK PVESRPASPF ENxxxxxxxx xxxxxxxxxx 151 xxxxxxGPRA LSRNQPQYPA RAQALRIEGQ VKVKFDVTPD GRVDNVQILS 201 AKPANMFERE VKNAMRRWRY EPGKPGSGIV VNILFKINGT TEIQ ________________________________________________________________________________ Result of ProDom domain search (Sonnhammer; Corpet, Gouzy, Kahn): ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - please quote: ELL Sonnhammer & D Kahn, Prot. Sci., 1994, 3, 482-492 ________________________________________________________________________________ --- ------------------------------------------------------------ --- Results from running BLAST against PRODOM domains --- --- PLEASE quote: --- F Corpet, J Gouzy, D Kahn (1998). The ProDom database --- of protein domain families. Nucleic Ac Res 26:323-326. --- --- BEGIN of BLASTP output BLASTP 1.4.7 [16-Oct-94] [Build 12:52:03 Oct 30 1994] Reference: Altschul, Stephen F., Warren Gish, Webb Miller, Eugene W. Myers, and David J. Lipman (1990). Basic local alignment search tool. J. Mol. Biol. 215:403-10. Query= prot (#) ppOld, default: single protein sequence description=tonb e.c. /home/phd/server/work/predict_h32463 (244 letters) Database: prodom_00_1 174,952 sequences; 19,895,393 total letters. Searching..................................................done Smallest Sum High Probability Sequences producing High-scoring Segment Pairs: Score P(N) N PD005186 p2000.1 (10) TONB(8) Q9ZHV8(1) Q9ZJP4(1) // PR... 263 2.8e-30 1 PD010309 p2000.1 (4) TONB(4) // TONB PROTEIN TRANSPORT ... 198 1.1e-20 1 PD015378 p2000.1 (2) TONB(2) // TONB PROTEIN TRANSPORT ... 177 9.5e-18 1 PD188847 p2000.1 (2) TONB(2) // TONB PROTEIN TRANSPORT ... 101 7.3e-10 2 PD193463 p2000.1 (1) TONB_ECOLI // TONB PROTEIN TRANSPOR... 116 5.2e-09 1 PD022611 p2000.1 (4) IF4G(2) O43177(1) O96046(1) // INI... 107 5.7e-09 3 PD000540 p2000.1 (1342) H1(18) O76786(11) TONB(10) // P... 
122 1.1e-08 1 PD011390 p2000.1 (21) SRE1(2) O14686(2) O14687(2) // PR... 109 6.9e-07 1 PD181779 p2000.1 (1) P76017_ECOLI // FROM BASES 1243821 ... 61 1.3e-06 2 PD058122 p2000.1 (1) P72605_SYNY3 // HYPOTHETICAL 55.5 K... 58 1.6e-06 2 PD191000 p2000.1 (2) O84603(1) Q9Z7C3(1) // PROTEIN CT598 81 1.1e-05 2 PD039984 p2000.1 (3) O00204(1) O00205(1) O75814(1) // H... 67 1.4e-05 2 PD041502 p2000.1 (2) GNDS(2) // GUANINE NUCLEOTIDE DISS... 69 2.2e-05 3 PD155475 p2000.1 (2) SRE2(2) // STEROL REGULATORY ELEME... 89 3.6e-05 2 PD002948 p2000.1 (36) P93237(9) Q25434(6) Q40376(4) // ... 84 4.8e-05 1 PD041936 p2000.1 (2) DNJM(2) // DNAJ-LIKE PROTEIN MG200... 60 9.3e-05 3 PD029489 p2000.1 (1) G3PT_HUMAN // PUTATIVE GLYCERALDEHY... 87 0.00010 1 PD064632 p2000.1 (1) Q23446_CAEEL // SIMILARITY TO CCAAT... 76 0.00015 2 PD148762 p2000.1 (1) O01905_CAEEL // SIMILARITY TO HUMAN... 79 0.00015 2 PD140516 p2000.1 (1) O54201_STRCL // PBP2 PROTEIN PENICI... 85 0.00016 2 PD059755 p2000.1 (1) Q69565_VVVVV // HHV-6 U1102, VARIAN... 60 0.00025 2 PD061585 p2000.1 (1) O31240_ANASP // PKND 77 0.00032 2 PD145083 p2000.1 (1) Q23447_CAEEL // CODED FOR BY C. ELE... 71 0.00038 3 PD029337 p2000.1 (15) Q25388(4) LSTP(3) O01761(3) // RE... 87 0.00045 1 PD131059 p2000.1 (1) O54963_RAT // ZINC FINGER TRANSCRIP... 64 0.00045 2 PD124778 p2000.1 (4) Q9Z2B6(2) O43150(1) O97902(1) // A... 
69 0.00053 2 PD218236 p2000.1 (1) O97007_LEIMA // L7610.4 PROTEIN 56 0.00064 3 >PD005186 p2000.1 (10) TONB(8) Q9ZHV8(1) Q9ZJP4(1) // PROTEIN TRANSPORT TONB INNER MEMBRANE PERIPLASMIC TRANSMEMBRANE REPEAT PHAGE RECOGNITION Length = 78 Score = 263 (119.2 bits), Expect = 2.8e-30, P = 2.8e-30 Identities = 50/76 (65%), Positives = 58/76 (76%) Query: 168 YPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGS 227 YP AQA IEG+VKVKF + DGRV ++++L A P NMFEREVK AMR+WRYE G PG Sbjct: 2 YPKMAQARGIEGEVKVKFTINADGRVTDIKVLKANPKNMFEREVKQAMRKWRYEAGVPGG 61 Query: 228 GIVVNILFKINGTTEI 243 IVV I FKINGTTE+ Sbjct: 62 DIVVTIKFKINGTTEL 77 >PD010309 p2000.1 (4) TONB(4) // TONB PROTEIN TRANSPORT INNER MEMBRANE PERIPLASMIC TRANSMEMBRANE REPEAT PHAGE RECOGNITION Length = 39 Score = 198 (89.7 bits), Expect = 1.1e-20, P = 1.1e-20 Identities = 38/39 (97%), Positives = 38/39 (97%) Query: 6 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELP 44 MTLDLPRRFPWPTLLSV IHGAVVAGLLYTSVHQVIELP Sbjct: 1 MTLDLPRRFPWPTLLSVAIHGAVVAGLLYTSVHQVIELP 39 >PD015378 p2000.1 (2) TONB(2) // TONB PROTEIN TRANSPORT INNER MEMBRANE PERIPLASMIC TRANSMEMBRANE REPEAT PHAGE RECOGNITION Length = 40 Score = 177 (80.2 bits), Expect = 9.5e-18, P = 9.5e-18 Identities = 36/40 (90%), Positives = 37/40 (92%) Query: 128 SPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQ 167 SPFENTAPAR TSSTATAATSKP TSV +GPRALSRNQPQ Sbjct: 1 SPFENTAPARPTSSTATAATSKPATSVPTGPRALSRNQPQ 40 >PD188847 p2000.1 (2) TONB(2) // TONB PROTEIN TRANSPORT INNER MEMBRANE PERIPLASMIC TRANSMEMBRANE REPEAT Length = 55 Score = 101 (45.8 bits), Expect = 7.3e-10, Sum P(2) = 7.3e-10 Identities = 21/37 (56%), Positives = 24/37 (64%) Query: 10 LPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAP 46 L R WP SV IHGA++AGLLY SV Q+ E P P Sbjct: 8 LNRWITWPFAFSVGIHGALIAGLLYASVEQMREQPEP 44 Score = 35 (15.9 bits), Expect = 7.3e-10, Sum P(2) = 7.3e-10 Identities = 6/9 (66%), Positives = 7/9 (77%) Query: 75 EPEPEPEPI 83 +PEPE PI Sbjct: 41 QPEPEDAPI 49 >PD193463 p2000.1 (1) TONB_ECOLI // TONB PROTEIN TRANSPORT INNER MEMBRANE 
PERIPLASMIC TRANSMEMBRANE REPEAT PHAGE RECOGNITION COLICIN Length = 23 Score = 116 (52.6 bits), Expect = 5.2e-09, P = 5.2e-09 Identities = 23/23 (100%), Positives = 23/23 (100%) Query: 45 APAQPISVTMVTPADLEPPQAVQ 67 APAQPISVTMVTPADLEPPQAVQ Sbjct: 1 APAQPISVTMVTPADLEPPQAVQ 23 ... Parameters: E=0.001 B=500 V=500 -ctxfactor=1.00 Query ----- As Used ----- ----- Computed ---- Frame MatID Matrix name Lambda K H Lambda K H +0 0 BLOSUM62 0.314 0.132 0.387 same same same Query Frame MatID Length Eff.Length E S W T X E2 S2 +0 0 244 244 0.0010 88 3 11 23 0.25 33 Statistics: Query Expected Observed HSPs HSPs Frame MatID High Score High Score Reportable Reported +0 0 63 (28.5 bits) 263 (119.2 bits) 89 89 Query Neighborhd Word Excluded Failed Successful Overlaps Frame MatID Words Hits Hits Extensions Extensions Excluded +0 0 5023 11818274 2902988 8858087 57183 8670 Database: prodom_00_1 Release date: unknown Posted date: 5:56 PM EDT Jun 21, 2000 # of letters in database: 19,895,393 # of sequences in database: 174,952 # of database sequences satisfying E: 27 No. of states in DFA: 547 (54 KB) Total size of DFA: 105 KB (128 KB) Time to generate neighborhood: 0.01u 0.00s 0.01t Real: 00:00:00 Time to search database: 16.55u 0.05s 16.60t Real: 00:00:17 Total cpu time: 16.58u 0.07s 16.65t Real: 00:00:17 --- END of BLASTP output --- ------------------------------------------------------------ --- --- Again: these results were obtained based on the domain data- --- base collected by Daniel Kahn and his coworkers in Toulouse. --- --- PLEASE quote: --- F Corpet, J Gouzy, D Kahn (1998). The ProDom database --- of protein domain families. Nucleic Ac Res 26:323-326. 
---
--- The general WWW page is on:
---- ---------------------------------------
---  http://www.toulouse.inra.fr/prodom.html
---- ---------------------------------------
---
--- For WWW graphic interfaces to PRODOM, in particular for your
--- protein family, follow the following links (each line is ONE
--- single link for your protein!!):
---
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD005186
    ==> multiple alignment, consensus, PDB and PROSITE links of domain PD005186
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD005186
    ==> graphical output of all proteins having domain PD005186
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD010309
    ==> multiple alignment, consensus, PDB and PROSITE links of domain PD010309
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD010309
    ==> graphical output of all proteins having domain PD010309
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD015378
    ==> multiple alignment, consensus, PDB and PROSITE links of domain PD015378
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD015378
    ==> graphical output of all proteins having domain PD015378
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD188847
    ==> multiple alignment, consensus, PDB and PROSITE links of domain PD188847
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD188847
    ==> graphical output of all proteins having domain PD188847
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom1=PD193463
    ==> multiple alignment, consensus, PDB and PROSITE links of domain PD193463
http://www.toulouse.inra.fr/prodom/cgi-bin/ReqProdomII.pl?id_dom2=PD193463
    ==> graphical output of all proteins having domain PD193463
...
---
--- NOTE: if you want to use the link, make sure the entire line
---       is pasted as URL into your browser!
---
--- END of PRODOM
--- ------------------------------------------------------------
________________________________________________________________________________

The alignment that has been used as input to the network is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________

---
--- Version of database searched for alignment:
--- SWISS-PROT release 39.0 (5/00) with 85 249 proteins
---
--- ------------------------------------------------------------
--- MAXHOM multiple sequence alignment
--- ------------------------------------------------------------
---
--- MAXHOM ALIGNMENT HEADER: ABBREVIATIONS FOR SUMMARY
--- ID     : identifier of aligned (homologous) protein
--- STRID  : PDB identifier (only for known structures)
--- IDE    : percentage of pairwise sequence identity
--- WSIM   : percentage of weighted similarity
--- LALI   : number of residues aligned
--- NGAP   : number of insertions and deletions (indels)
--- LGAP   : number of residues in all indels
--- LSEQ2  : length of aligned sequence
--- ACCNUM : SwissProt accession number
--- OMIM   : OMIM (Online Mendelian Inheritance in Man) ID
--- NAME   : one-line description of aligned protein
---
--- MAXHOM ALIGNMENT HEADER: SUMMARY
ID          STRID  IDE  WSIM  LALI  NGAP  LGAP  LSEQ2  ACCNUM  NAME
tonb_ecoli         100   100   239     0     0    239  P94739  TONB PROTEIN.
tonb_salty          87    88   238     3     5    242  P25945  TONB PROTEIN.
tonb_klepn          77    74   233     7    11    243  P45610  TONB PROTEIN.
tonb_entae          76    77   233     4     7    243  P46383  TONB PROTEIN.
tonb_serma          54    59   239     5    11    247  P26185  TONB PROTEIN.
tonb_yeren          46    52   241     5    15    255  Q05740  TONB PROTEIN.
tonb_pseae          37    40   225     5    32    342  Q51368  TONB PROTEIN.
tonb_haein          32    28   231     8    39    270  P42872  TONB PROTEIN.
tonb_haedu          30    31   224     7    57    279  O51810  TONB PROTEIN.
---
--- MAXHOM ALIGNMENT: IN MSF FORMAT
MSF of: /home/phd/server/work/predict_h32463.hsspFilter from: 1 to: 244
/home/phd/server/work/predict_h32463.msfRet  MSF: 244  Type: P  28-Jun-01 05:00:2  Check: 7659  ..
Name: predict_h3240 Len: 244 Check: 3879 Weight: 1.00 Name: tonb_ecoli Len: 244 Check: 3364 Weight: 1.00 Name: tonb_salty Len: 244 Check: 1399 Weight: 1.00 Name: tonb_klepn Len: 244 Check: 8093 Weight: 1.00 Name: tonb_entae Len: 244 Check: 7194 Weight: 1.00 Name: tonb_serma Len: 244 Check: 4523 Weight: 1.00 Name: tonb_yeren Len: 244 Check: 5387 Weight: 1.00 Name: tonb_pseae Len: 244 Check: 2973 Weight: 1.00 Name: tonb_haein Len: 244 Check: 1105 Weight: 1.00 Name: tonb_haedu Len: 244 Check: 9742 Weight: 1.00 // 1 50 predict_h3240 MIMTSMTLDL PRRFPWPTLL SVCIHGAVVA GLLYTSVHQV IELPAPAQPI tonb_ecoli .....MTLDL PRRFPWPTLL SVCIHGAVVA GLLYTSVHQV IELPAPAQPI tonb_salty .....MTLDL PRRFPWPTLL SVGIHGAVVA GLLYTSVHQV IELPAPAQPI tonb_klepn .....MTLDL PRRFPWPTLL SVAIHGAVVA GLLYTSVHQV IEQPSPTQPI tonb_entae .....MTLDL PRRFPWPTLL SVAIHGAVVA GLLYTSVHQV IEKPSPSQPI tonb_serma ..MPLKKMFL NRRISVPFVL SVGLHSALVA GLLYASVKEV VELPKPeaPI tonb_yeren ..MQLNKFFL GRWLTWPLAF SVGIHGSVIA ALLYVSVEQm iQPEIEDAPI tonb_pseae .......... .SRWWLSSGA AVAMHVAIIG ALVWVMPTPa lGHGELPKTM tonb_haein .......MQQ TKRSLLGLLI SLIAHGIVIG FILWnsDSAN SAQGDISTSI tonb_haedu .......... .KHSRIGLIS SVFIHIVLFA SFISLVEVSH SDLSDGDSPL 51 100 predict_h3240 SVTMVTPADL EPPQAVQPPP EPVVEPEPEP EPIPEPPKEA PVVIEKPKPK tonb_ecoli SVTMVTPADL EPPQAVQPPP EPVVEPEPEP EPIPEPPKEA PVVIEKPKPK tonb_salty TVTMVSPADL EPPQAVQPPP EPVvePEPEP EPIPEPPKEA PVVIekPKPK tonb_klepn EITMVAPADL EPPPA.QPVV EPVvePEPEP EVVPEPPKEA VVIhpKPKPK tonb_entae EITMVAPADL EPPQAAQPVV EPVvePEPEP EVVPEPPKEV PVVIHKPEPK tonb_serma SVMMVNTAAM AEPPPPAPAE PEPpePEPEP EPIVEPPPKA IVKPEPVKPK tonb_yeren AVTMVNIDTF AAPQPaePQA EPEPEPEPEP EPIDEAPPEP EVlpEPVKPK tonb_pseae QVNFVQLEKK AEPTPQPPAA APEPTPPKIE EPKPEPPKPK PV..EKPKPK tonb_haein SMELLQGMVL EEPAPeePEP EPEpePEPEK QEipEPKKIK EPEKEKPKPK tonb_haedu SIELV.AALL EQPQVAVAPE EVTsePEPEP DAIPEPITK. 
..PIEKPKEK 101 150 predict_h3240 PKPKPKPVKK VQEQPKRDVK PVESRPASPF ENTAPARLTS STATAATSKP tonb_ecoli PKPKPKPVKK VQEQPKRDVK PVESRPASPF ENTAPARLTS STATAATSKP tonb_salty PKPKPKPVKK VEEQPKREVK PAAPRPASPF ENSAPVRPTS STA.SATSKP tonb_klepn PKPKPKPEKK V.EQPKREVK PaePRPASPF EntAPARTAP STSTAAAKPT tonb_entae PKPKPKPKPK PevEPKREVK PAEPRPVSPF EntAPARTAP ST.TAATAKP tonb_serma PKPKPKP..K VEKQVKPEPK KVEPREPSPF NNDSPAKPID KAPvpAAPVQ tonb_yeren PKPVKKEVKK PEVKKPDVKK TVAPPDDKPF KSDEPALVST NAPvpKASVP tonb_pseae PKPKPKPVEN AIPKAKPKPE PKpsQPSPSS AAPPPAPTVG QSTPGAQTAP tonb_haein GKPKGKPKNK PKKEVKPQKK PINKELPKGD ENidKASTTs sNAQVAGSGT tonb_haedu PKEKPKKPEK PKEKLKKEKP KEKAKQIEAL EKGPEAKQGI VAQagASSNE 151 200 predict_h3240 VTSVASGPRA LSRNQPQYPA RAQALRIEGQ VKVKFDVTPD GRVDNVQILS tonb_ecoli VTSVASGPRA LSRNQPQYPA RAQALRIEGQ VKVKFDVTPD GRVDNVQILS tonb_salty AVSVPTGPRA LSRNQPQYPA RAQALRIEGR VKVKFDVTSA GRVENVQILS tonb_klepn VTA.PSGPRA ISRVQPSYPA RAQALRIEGT VRVKFDVSPD GRIDNLQILS tonb_entae MTTAPSVPKA LKRGDPSYPQ RAQALRIEGD VRVKFDVTAD GRVENIQILS tonb_serma GNSREVGPRP ISRANPLYPP RAQALQIEGN VRVQFDIDSD GRVSNVRILS tonb_yeren GVSTSTGPKA LSKAKPTYPA RALALGVEGQ VKVQYDIDEN GRVTNVRILE tonb_pseae SGSqdSDIKP LRMDPPVYPR MAQARGIEGR VKVLFTITSD GRIDDIQVLE tonb_haein DTSEIAAYRS AIrsHKRYPT RAKIMRKQGK VSVSFNVGAD GSLSGAKVTK tonb_haedu INAyaALQRA LQHrnNAYPA REKMMRKTGV VTLGFTISPS GKLIDVTVLN 201 244 predict_h3240 AKPANMFERE VKNAMRRWRY EPGKPGSGIV VNILFKINGT TEIQ tonb_ecoli AKPANMFERE VKNAMRRWRY EPGKPGSGIV VNILFKINGT TEIQ tonb_salty AQPANMFERE VKNAMRKWRY EAGKPGSGLV VNIIFRLNGT AQIE tonb_klepn AQPANMFERE VKSAMRRWRY QQGRPGTGVT MTIKFRLNGV E... tonb_entae AKPANMFERD VKTAMRKWRY EAGRPGTGLT MNIKFRLNG. .... tonb_serma AEPRNMFERE VKQAMRKWRY EA.KEAKDRT VTIRFKLNGT TELN tonb_yeren ATPRNTFERE VKQVMRKWRF EA.VAAKDYV TTVVFKIGGT TEMD tonb_pseae SVPSRMFDRE VRQAMAKWRF EPRVSGGKIV amFFFKIE.. .... tonb_haein SSGDESLDKA ALDAINVSRS VGTRPasSLS VQISFTLQ.. .... tonb_haedu SSGNQNLDAA AVQAAEATKV APPPIGfnVT VPIKFSIQ.. .... 
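The IDE column of the MAXHOM summary is defined as the percentage of pairwise sequence identity. As a rough illustration only, the sketch below counts identical residues over the columns in which both rows carry a residue, using the first 50-column block of the MSF alignment above for the query and tonb_salty. The function name pairwise_identity is hypothetical, and the handling of gaps ('.') and of lowercase letters (folded to uppercase here) is an assumption; this is not MAXHOM's own procedure and is not expected to reproduce the IDE values in the summary table.

```python
def pairwise_identity(row_a: str, row_b: str, gap: str = ".") -> tuple[int, int]:
    """Return (identical, aligned) counts for two equal-length alignment rows.

    Assumptions (not taken from the MAXHOM documentation):
    - blanks are only column-group separators and are dropped,
    - lowercase letters are treated like their uppercase residues,
    - a column counts as aligned only if neither row has a gap there.
    """
    a = row_a.replace(" ", "").upper()
    b = row_b.replace(" ", "").upper()
    if len(a) != len(b):
        raise ValueError("alignment rows must have the same length")

    identical = 0
    aligned = 0
    for res_a, res_b in zip(a, b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        aligned += 1
        if res_a == res_b:
            identical += 1
    return identical, aligned


# Columns 1-50 of the MSF block, copied from the alignment above.
query = "MIMTSMTLDL PRRFPWPTLL SVCIHGAVVA GLLYTSVHQV IELPAPAQPI"
salty = ".....MTLDL PRRFPWPTLL SVGIHGAVVA GLLYTSVHQV IELPAPAQPI"

identical, aligned = pairwise_identity(query, salty)
print(f"{identical}/{aligned} identical = {100.0 * identical / aligned:.1f}%")
```

For this block the two rows differ only at column 23 (C in the query, G in tonb_salty), so the sketch reports 44 identical residues out of 45 aligned columns.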
________________________________________________________________________________

Result of COILS prediction (Andrei Lupas):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A Lupas: Methods in Enzymology, 1996, 266, 513-525.
version 2.2: Rob B. Russell & Andrei N. Lupas, 1999
________________________________________________________________________________

no coiled-coil above probability 0.5
________________________________________________________________________________

PHD: Profile fed neural network systems from HeiDelberg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prediction of:
    secondary structure,               by PHDsec
    solvent accessibility,             by PHDacc
    and helical transmembrane regions, by PHDhtm

Author:
    Burkhard Rost
    EMBL, 69012 Heidelberg, Germany
    Internet: Rost@EMBL-Heidelberg.DE

All rights reserved.

The network systems are described in:

PHDsec: B Rost & C Sander: JMB, 1993, 232, 584-599.
        B Rost & C Sander: Proteins, 1994, 19, 55-72.
PHDacc: B Rost & C Sander: Proteins, 1994, 20, 216-226.
PHDhtm: B Rost et al.: Prot. Science, 1995, 4, 521-533.

The resulting network (PHD) prediction is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
________________________________________________________________________________
Some statistics
~~~~~~~~~~~~~~~

Percentage of amino acids:
+--------------+--------+--------+--------+--------+--------+
| AA:          |    P   |    V   |    A   |    K   |    E   |
| % of AA:     |  16.4  |  10.7  |   8.6  |   7.4  |   7.4  |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    T   |    S   |    R   |    L   |    Q   |
| % of AA:     |   6.1  |   5.7  |   5.7  |   5.3  |   4.9  |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    I   |    G   |    N   |    M   |    D   |
| % of AA:     |   4.9  |   3.7  |   2.9  |   2.5  |   2.5  |
+--------------+--------+--------+--------+--------+--------+
| AA:          |    F   |    Y   |    W   |    H   |    C   |
| % of AA:     |   2.0  |   1.2  |   0.8  |   0.8  |   0.4  |
+--------------+--------+--------+--------+--------+--------+

Percentage of secondary structure predicted:
+--------------+--------+--------+--------+
| SecStr:      |    H   |    E   |    L   |
| % Predicted: |  13.5  |  19.7  |  66.8  |
+--------------+--------+--------+--------+

According to the following classes:
    all-alpha:  %H>45 and %E<5;    all-beta: %H<5 and %E>45
    alpha-beta: %H>30 and %E>20;   mixed:    rest,
this means that the predicted class is: mixed class

PHD output for your protein
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Thu Jun 28 05:00:32 2001
Jury on: 10 different architectures (version 5.94_317 ).
Note: differently trained architectures, i.e., different versions can
result in different predictions.
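The class assignment quoted in the statistics above is a set of threshold tests on the predicted helix (%H) and strand (%E) percentages. A minimal sketch of that rule follows, assuming the thresholds are applied exactly as printed in the report; the function name predicted_class is hypothetical and is not part of PHD.

```python
def predicted_class(pct_h: float, pct_e: float) -> str:
    """Assign a structural class from predicted helix/strand percentages.

    Thresholds are copied from the report text; how the real program
    evaluates them internally is an assumption.
    """
    if pct_h > 45 and pct_e < 5:
        return "all-alpha"
    if pct_h < 5 and pct_e > 45:
        return "all-beta"
    if pct_h > 30 and pct_e > 20:
        return "alpha-beta"
    return "mixed"


# Values from the "Percentage of secondary structure predicted" table.
print(predicted_class(13.5, 19.7))  # prints "mixed"
```

With %H = 13.5 and %E = 19.7 none of the first three conditions holds, which is why the report lists the protein as mixed class.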
About the protein
~~~~~~~~~~~~~~~~~
HEADER     /home/phd/server/work/predict_h32463.fas
COMPND
SOURCE
AUTHOR
SEQLENGTH  244
NCHAIN     1 chain(s) in predict_h32463 data set
NALIGN     9 (=number of aligned sequences in HSSP file)

Abbreviations: PHDsec
~~~~~~~~~~~~~~~~~~~~~
sequence:
    AA : amino acid sequence
secondary structure:
    HEL: H=helix, E=extended (sheet), blank=other (loop)
    PHD: Profile network prediction HeiDelberg
    Rel: Reliability index of prediction (0-9)
detail:
    prH: 'probability' for assigning helix
    prE: 'probability' for assigning strand
    prL: 'probability' for assigning loop
         note: the 'probabilities' are scaled to the interval 0-9,
         e.g., prH=5 means that the first output node is 0.5-0.6
subset:
    SUB: a subset of the prediction, for all residues with an
         expected average accuracy > 82% (tables in header)
         note: for this subset the following symbols are used:
         L:   is loop (for which above " " is used)
         ".": means that no prediction is made for this residue,
              as the reliability is: Rel < 5

Abbreviations: PHDacc
~~~~~~~~~~~~~~~~~~~~~
SS : secondary structure
    HEL: H=helix, E=extended (sheet), blank=other (loop)
solvent accessibility:
    3st: relative solvent accessibility (acc) in 3 states:
         b = 0-9%, i = 9-36%, e = 36-100%.
    PHD: Profile network prediction HeiDelberg
    Rel: Reliability index of prediction (0-9)
    O_3: observed relative acc. in 3 states: B, I, E
         note: for convenience a blank is used for intermediate (i).
    P_3: predicted relative accessibility in 3 states
    10st: relative accessibility in 10 states: = n corresponds to a relative acc.
of n*n % subset: SUB: a subset of the prediction, for all residues with an expected average correlation > 0.69 (tables in header) note: for this subset the following symbols are used: "I": is intermediate (for which above " " is used) ".": means that no prediction is made for this residue, as the reliability is: Rel < 4 Abbreviations: PHDhtm ~~~~~~~~~~~~~~~~~~~~~ secondary structure: HL: T=helical transmembrane region, blank=other (loop) PHD: Profile network prediction HeiDelberg PHDF:filtered prediction, i.e., too long transmembrane segments are split, too short ones are deleted Rel: Reliability index of prediction (0-9) detail: prH: 'probability' for assigning helical transmembrane region prL: 'probability' for assigning loop note: the 'probabilites' are scaled to the interval 0-9, e.g., prH=5 means, that the first output node is 0.5-0.6 subset: SUB: a subset of the prediction, for all residues with an expected average accuracy > 82% (tables in header) note: for this subset the following symbols are used: L: is loop (for which above " " is used) ".": means that no prediction is made for this residue, as the reliability is: Rel < 5 protein: predict length 244 ....,....1....,....2....,....3....,....4....,....5....,....6 AA |MIMTSMTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADL| PHD sec | EEEE EEEEEHHHHHHHHHEEEEEEEEE EEEEEE | Rel sec |964234167788687883378722345541112222221322799999679998427899| detail: prH sec |000000000000000000000124556664443332222100000000000000001000| prE sec |023566511100211103678742221224445554444554100000279988631000| prL sec |976332477888788885310132221100001112223245799999710001357889| subset: SUB sec |LL.....LLLLLLLLLL..EEE....HH..............LLLLLLLEEEEE..LLLL| accessibility 3st: P_3 acc |bbbebbebebee bebbbbbbbbbbbbbbbbbbbbbbeebbeeeeeeeebbbbbbbebeb| 10st: PHD acc |000600606076507000000000000000000000066007678787700000006070| Rel acc |025034021111101010367857667999687113211422012321052928610110| subset: SUB acc 
|..b..b.............bbbbbbbbbbbbbb......b.........b.b.bb.....| ....,....7....,....8....,....9....,....10...,....11...,....12 AA |EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK| PHD sec | | Rel sec |999999999999999999998999999999732368999999999989886768898989| detail: prH sec |000000000000000000000000000000000000000000000011012111001000| prE sec |000000000000000000000000000000134321000000000000000000000000| prL sec |999999999999999999998899999999865578999999999988887778898889| subset: SUB sec |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL...LLLLLLLLLLLLLLLLLLLLLLLLLL| accessibility 3st: P_3 acc |eeeeeeeeeeeeeeeeeeeeeebee eeeeebeeeeeeeeeeeeeeeeeeeeeeeeeeee| 10st: PHD acc |776789779977777978787707757877706797979797979777777787777777| Rel acc |310211101111111111122003200440110102031313232332430443145423| subset: SUB acc |...........................ee...................e..ee..eee..| ....,....13...,....14...,....15...,....16...,....17...,....18 AA |PVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQ| PHD sec | HHHHHHH | Rel sec |999999999899998967898898899999988779986535899998168886523465| detail: prH sec |000000000000000000000100000000000000011100000001478887753100| prE sec |000000000000000011000000000000000110000132000000000000000212| prL sec |999999999889998977898888899999988789987657899998521001246677| subset: SUB sec |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL.LLLLLLL.HHHHHH...LL| accessibility 3st: P_3 acc |eeeeeeeeeeeeeeebeeeeeebeebeeeeeebeeebeebbeee eebbb bebbebebe| 10st: PHD acc |797977977779777077777707707777770777076007795770005060060706| Rel acc |313123111031013421022101011132001011111321211320100626311242| subset: SUB acc |...............b...................................b.b....b.| ....,....19...,....20...,....21...,....22...,....23...,....24 AA |VKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGT| PHD sec |EEEEEEE EEEEEEE HHHHHHHHHHHHHHHHH EEEEEEEEE | Rel sec |599998389995323697504981489999999999975618989981579999983694| detail: prH sec 
|000000000000000100000005688999999999877740000000000000000001| prE sec |799998510002356787642000000000000000000000000014689999986202| prL sec |200001389987533101246884200000000000012158988985210000013786| subset: SUB sec |EEEEEE.LLLLL...EEEE..LL..HHHHHHHHHHHHHHH.LLLLLL.EEEEEEEE.LL.| accessibility 3st: P_3 acc |bebebeb eeb bebbebbebebeebbeeebeebbeeb eebeeebeebbbbbbbebebb| 10st: PHD acc |060606059705060060070607700667067006705670777077000000060600| Rel acc |729271510331203708611201104100823541120010223022237091915140| subset: SUB acc |b.b.b.b........b.bb.......b...b..bb...............b.b.b.b.b.| ....,....25...,....26...,....27...,....28...,....29...,....30 AA |TEIQ| PHD sec | E | Rel sec |2149| detail: prH sec |0000| prE sec |3430| prL sec |5469| subset: SUB sec |...L| accessibility 3st: P_3 acc |eebe| 10st: PHD acc |6609| Rel acc |1111| subset: SUB acc |....| PHDhtm Helical transmembrane prediction note: PHDacc and PHDsec are reliable for water- soluble globular proteins, only. Thus, please take the predictions above with particular caution wherever transmembrane helices are predicted by PHDhtm! PHDhtm --- --- PhdTopology REFINEMENT AND TOPOLOGY PREDICTION: SYMBOLS --- AA : amino acid in one-letter code --- PHD htm : HTM's predicted by the PHD neural network --- system (T=HTM, ' '=not HTM) --- Rel htm : Reliability index of prediction (0-9, 0 is low) --- detail : Neural network output in detail --- prH htm : 'Probability' for assigning a helical trans- --- membrane region (HTM) --- prL htm : 'Probability' for assigning a non-HTM region --- note: 'Probabilites' are scaled to the interval --- 0-9, e.g., prH=5 means, that the first --- output node is 0.5-0.6 --- subset : Subset of more reliable predictions --- SUB htm : All residues for which the expected average --- accuracy is > 82% (tables in header). 
--- note: for this subset the following symbols are used: --- L: is loop (for which above ' ' is used) --- '.': means that no prediction is made for this, --- residue as the reliability is: Rel < 5 --- other : predictions derived based on PHDhtm --- PHDFhtm : filtered prediction, i.e., too long HTM's are --- split, too short ones are deleted --- PHDRhtm : refinement of neural network output --- PHDThtm : topology prediction based on refined model --- symbols used: --- i: intra-cytoplasmic --- T: transmembrane region --- o: extra-cytoplasmic --- --- PhdTopology REFINEMENT AND TOPOLOGY PREDICTION ....,....1....,....2....,....3....,....4....,....5....,....6 AA |MIMTSMTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADL| PHD htm | TTTTTTTTTTTTTTTTT | Rel htm |999999999999988863046778888888876520367788999999999999999999| detail: prH htm |000000000000000013578889999999988764311100000000000000000000| prL htm |999999999999999986421110000000011235688899999999999999999999| PHDRhtm | TTTTTTTTTTTTTTTTTT | PHDThtm |iiiiiiiiiiiiiiiiiiTTTTTTTTTTTTTTTTTToooooooooooooooooooooooo| subset: SUB htm |LLLLLLLLLLLLLLLLL..HHHHHHHHHHHHHHH...LLLLLLLLLLLLLLLLLLLLLLL| ....,....7....,....8....,....9....,....10...,....11...,....12 AA |EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK| PHD htm | | Rel htm |999999999999999999999999999999999999999999999999999999999999| detail: prH htm |000000000000000000000000000000000000000000000000000000000000| prL htm |999999999999999999999999999999999999999999999999999999999999| PHDRhtm | | PHDThtm |oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo| subset: SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL| ....,....13...,....14...,....15...,....16...,....17...,....18 AA |PVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQ| PHD htm | | Rel htm |999999999999999999999999999999999999999999999999999999999999| detail: prH htm |000000000000000000000000000000000000000000000000000000000000| prL htm 
|999999999999999999999999999999999999999999999999999999999999| PHDRhtm | | PHDThtm |oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo| subset: SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL| ....,....19...,....20...,....21...,....22...,....23...,....24 AA |VKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGT| PHD htm | | Rel htm |999999999999999999999999999999999999999999999999999989999999| detail: prH htm |000000000000000000000000000000000000000000000000000000000000| prL htm |999999999999999999999999999999999999999999999999999999999999| PHDRhtm | | PHDThtm |oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo| subset: SUB htm |LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL| ....,....25...,....26...,....27...,....28...,....29...,....30 AA |TEIQ| PHD htm | | Rel htm |9999| detail: prH htm |0000| prL htm |9999| PHDRhtm | | PHDThtm |oooo| subset: SUB htm |LLLL| --- --- PhdTopology REFINEMENT AND TOPOLOGY PREDICTION END --- ________________________________________________________________________________ Result of ASP prediction(Malin Young, Kent Kirshenbaum, Stefan Highsmith) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Kirshenbaum K, Young M and Highsmith S. Prot. Sci.(1999) 8:1806-1815. Young M, Kirshenbaum K, Dill KA and Highsmith S. Prot. Sci.(1999) 8:1752-1764. 
________________________________________________________________________________

Ambivalent Sequence Predictor (ASP v1.0) mmy

Parameters:
  Window size    : 5
  Min mu dPr     : 9
  Z-score cutoff : -1.75

Mean dPr score=15.018, Standard deviation=2.970

                  ....,....1....,....2....,....3....,....4....,....5....,....6
         AA      |MIMTSMTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADL|
         prH sec |000000000000000000000124556664443332222100000000000000001000|
         prE sec |023566511100211103678742221224445554444554100000279988631000|
         prL sec |976332477888788885310132221100001112223245799999710001357889|
         ASP sec |......................SSSSSSSSSSSSSSSSSSS...................|

                  ....,....7....,....8....,....9....,....10...,....11...,....12
         AA      |EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK|
         prH sec |000000000000000000000000000000000000000000000011012111001000|
         prE sec |000000000000000000000000000000134321000000000000000000000000|
         prL sec |999999999999999999998899999999865578999999999988887778898889|
         ASP sec |............................................................|

                  ....,....13...,....14...,....15...,....16...,....17...,....18
         AA      |PVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQ|
         prH sec |000000000000000000000100000000000000011100000001478887753100|
         prE sec |000000000000000011000000000000000110000132000000000000000212|
         prL sec |999999999889998977898888899999988789987657899998521001246677|
         ASP sec |............................................................|

                  ....,....19...,....20...,....21...,....22...,....23...,....24
         AA      |VKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGT|
         prH sec |000000000000000100000005688999999999877740000000000000000001|
         prE sec |799998510002356787642000000000000000000000000014689999986202|
         prL sec |200001389987533101246884200000000000012158988985210000013786|
         ASP sec |............................................................|

                  ....,....25...,....26...,....27...,....28...,....29...,....30
         AA      |TEIQ|
         prH sec |0000|
         prE sec |3430|
         prL sec |5469|
         ASP sec |....|

Please note: ASP was designed to identify the location of conformational
switches in amino acid sequences. It is NOT designed to predict whether
a given sequence does or does not contain a switch. For best results,
ASP should be used on sequences of length >150 amino acids with >10
sequence homologues in the SWISS-PROT data bank. ASP has been validated
against a set of globular proteins and may not be generally applicable.
Please see Young et al., Protein Science 8(9):1752-64, 1999, for details
and for how best to interpret this output. We consider ASP to be
experimental at this time, and would appreciate any feedback from our
users.

________________________________________________________________________________
________________________________________________________________________________

-----------------------------------------------------------------------------
-                    PredictProtein (PP): News 2000                         -
-----------------------------------------------------------------------------
-
-
- PP home:
-
  New York                        http://cubic.bioc.columbia.edu/predictprotein/
-
-
- PP mirrors:
-
  Australia    Sydney             http://molmod.angis.org.au/predictprotein/
  Germany      EMBL               http://www.embl-heidelberg.de/predictprotein/
  China        CBI,Peking         http://www.cbi.pku.edu.cn/predictprotein/
  China        Inst. Microbiol.   http://micronet.im.ac.cn/predictprotein/
  England      EBI                http://www.ebi.ac.uk/~rost/predictprotein/
  India        CDFD               http://www.cdfd.org.in/~www/pp/predictprotein/
  India        Pune               http://202.41.70.33/predictprotein/
  Iran         Tehran             http://www.ibc.ut.ac.ir/predictprotein/
  Israel       Beer-Sheva         http://www.cs.bgu.ac.il/~dfischer/predictprotein/
  Italy        Rome               http://obelix.bio.uniroma2.it/www/predictprotein/
  Mexico       Cuernavaca UNAM    http://www.ibt.unam.mx/paginas/lorenzo/predictprotein/
  Netherlands  CMBI               http://www.cmbi.kun.nl/bioinf/predictprotein/
  Russia       Puschino           http://mirror.protres.ru/predictprotein/
  Singapore                       http://embl.bic.nus.edu.sg/predictprotein/
  Spain        CNB                http://www.es.embnet.org/Services/MolBio/PredictProtein/
  Switzerland  Glaxo              http://www.gwer.ch/tools/predictprotein/
  USA          San Diego SDSC     http://www.sdsc.edu/predictprotein/
-
-
- Tools to post-process PP results:
-
-
-
- Generate a PostScript (or GIF, or TIFF):
-
  ESPript (New York)              http://cubic.bioc.columbia.edu/cgi/pp/nph-ESPript_exe.cgi
  ESPript (Toulouse)              http://www-pgm1.ipbs.fr:8080/ESPript
-
-
-----------------------------------------------------------------------------
Latest update of content: June 28, 2001 Ralf Koebnik