Published on December 8, 2023 (author: 宋杨万里)
Cytochrome c sequence retrieval and analysis

1 Log in to the NCBI website and search for cytochrome c protein sequences. The cytochrome c sequences of 14 species, including human, rat, yeast and Drosophila, were selected and tabulated as follows:

NO.  ACC NO       Organism
1    AAA28437     fruit fly
2    AAA21711     Rattus norvegicus
3    (not given)  Homo sapiens
4    P00006       Bos taurus
5    CAA25046     Gallus gallus
6    S11172       yeast
7    AAC80552     Tigriopus californicus
8    CCSF         starfish
9    CCCA         common carp
10   CCHOZ        common zebra
11   AAL67777     Actinobacillus lignieresii
12   CCHOD        donkey
13   AAB86817     Pichia stipitis
14   CAA25899     Mus musculus

Protein sequences (numbered as in the table):

No. 1 (fruit fly)
1 mgvpagdvek gkklfvqrca qchtveaggk
hkvgpnlhgl igrktgqaag faytdankak
gitwnedtlf eylenpkkyi pgtkmifagl
kkpnergdli aylksatk

No. 2 (Rattus norvegicus)
1 mgdvekgkki fvqkcaqcht vekggkhktg
pnlhglfgrk tgqaagfsyt danknkgitw
gedtlmeyle npkkyipgtk mifagikkkg
eradliaylk katne

No. 3 (Homo sapiens)
1 mgdvekgkki fimkcsqcht vekggkhktg
pnlhglfgrk tgqapgysyt aanknkgiiw
gedtlmeyle npkkyipgtk mifvgikkke
eradliaylk katne

No. 4 (Bos taurus)
1 gdvekgkkif vqkcaqchtv ekggkhktgp
nlhglfgrkt gqapgfsytd anknkgitwg
eetlmeylen pkkyipgtkm ifagikkkge
redliaylkk atne

No. 5 (Gallus gallus)
1 mgdiekgkki fvqkcsqcht vekggkhktg
pnlhglfgrk tgqaegfsyt danknkgitw
gedtlmeyle npkkyipgtk mifagikkks
ervdliaylk datsk

No. 6 (yeast)
1 mpyapgdekk gaslfktrca qchtvekgga
nkvgpnlhgv fgrktgqaeg fsyteanrdk
gitwdeetlf aylenpkkyi pgtkmafagf
kkpadrnnvi tylkkatse

No. 7 (Tigriopus californicus)
1 mgdidkgkki fvqkctqcht ieaggkhkvg
pnlhgmygrq tgkaagysyt dankskgvtw
neetldiylt npkkyipgtk mvfaglkkkg
dredliaylk sasss

No. 8 (starfish)
1 gqvekgkkif vqrcaqchtv ekagkhktgp
nlngilgrkt gqaagfsytd anrnkgitwk
netlfeylen pkkyipgtkm vfaglkkqke
rqdliaylea atk

No. 9 (common carp)
1 gdvekgkkvf vqkcaqchtv zbggkhkvgp
nlwglfgrkt gqapgfsytb abkskgivwb
zztlmeylzb pkkyipgtkm ifagikkkge
radliaylks ats

No. 10 (common zebra)
1 gdvekgkkif vqkcaqchtv ekggkhktgp
nlhglfgrkt gqapgfsytd anknkgitwk
eetlmeylen pkkyipgtkm ifagikkkte
redliaylkk atne

No. 11 (Actinobacillus lignieresii)
1 mtkllqkiaf ilplvfslva xaemvdtfqf
qnetdrvrav alakslrcpq cqnqnlvesn
attayklrle vyemvnqgkt deeiikimte
rfghfvnykp pfna

No. 12 (donkey)
1 gdvekgkkif vqkcaqchtv ekggkhktgp
nlhglfgrkt gqapgfsytd anknkgitwk
eetlmeylen pkkyipgtkm ifagikkkte
redliaylkk atne

No. 13 (Pichia stipitis)
1 mpapfekgse kkgatlfktr clqchtveeg
gphkvgpnlh gimgrksgqa vgysytdank
kkgvewseqt msdylenpkk yipgtkmafg
glkkpkdrnd lvtylasatk

No. 14 (Mus musculus)
1 mgdvekgkki fvqkcaqcht vekggkhktg
pnlhglfgrk tgqaagfsyt danknkgitw
gedtlmeyle npkkyipgtk mifagikkkg
eradliaylk katne
2 Save the retrieved sequences as a plain-text file in FASTA format.
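The conversion in step 2 could be sketched as follows. This is only an illustration under stated assumptions: the helper names, the header text after ">" and the output file name cytc.fasta are hypothetical, and only the first two records of the table are shown. The GenBank-style listings are cleaned by dropping position numbers and spaces before being written as FASTA records.

```python
# Two of the 14 records, pasted in the GenBank layout used in the table.
# Header strings are an arbitrary choice (accession plus organism).
RECORDS = {
    "AAA28437 fruit fly": """
        1 mgvpagdvek gkklfvqrca qchtveaggk
        hkvgpnlhgl igrktgqaag faytdankak
        gitwnedtlf eylenpkkyi pgtkmifagl
        kkpnergdli aylksatk""",
    "AAA21711 Rattus norvegicus": """
        1 mgdvekgkki fvqkcaqcht vekggkhktg
        pnlhglfgrk tgqaagfsyt danknkgitw
        gedtlmeyle npkkyipgtk mifagikkkg
        eradliaylk katne""",
}


def clean(raw):
    """Keep only letters (drops position numbers and whitespace), upper-cased."""
    return "".join(ch for ch in raw if ch.isalpha()).upper()


def to_fasta(records, width=60):
    """Return the records as FASTA text with sequence lines of at most `width`."""
    lines = []
    for header, raw in records.items():
        seq = clean(raw)
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "cytc.fasta" is a hypothetical file name.
    with open("cytc.fasta", "w") as handle:
        handle.write(to_fasta(RECORDS))
```

The remaining twelve records would be added to the dictionary in the same way before running the script.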
3 Take the second sequence (AAA21711) as a representative and predict the primary, secondary and tertiary structure of the protein.
a. The primary structure was analyzed with /tools/; the results are as follows:
User-provided sequence:
    1          11         21         31         41         51
    |          |          |          |          |          |
  1 MGDVEKGKKI FIMKCSQCHT VEKGGKHKTG PNLHGLFGRK TGQAPGYSYT AANKNKGIIW 60
 61 GEDTLMEYLE NPKKYIPGTK MIFVGIKKKE ERADLIAYLK KATNE
Number of amino acids: 105
Molecular weight: 11748.7
Theoretical pI: 9.59
Amino acid composition:
Ala (A) 6 5.7%
Arg (R) 2 1.9%
Asn (N) 5 4.8%
Asp (D) 3 2.9%
Cys (C) 2 1.9%
Gln (Q) 2 1.9%
Glu (E) 8 7.6%
Gly (G) 13 12.4%
His (H) 3 2.9%
Ile (I) 8 7.6%
Leu (L) 6 5.7%
Lys (K) 18 17.1%
Met (M) 4 3.8%
Phe (F) 3 2.9%
Pro (P) 4 3.8%
Ser (S) 2 1.9%
Thr (T) 7 6.7%
Trp (W) 1 1.0%
Tyr (Y) 5 4.8%
Val (V) 3 2.9%
Asx (B) 0 0.0%
Glx (Z) 0 0.0%
Xaa (X) 0 0.0%
Total number of negatively charged residues (Asp + Glu): 11
Total number of positively charged residues (Arg + Lys): 20
Atomic composition:
Carbon C 526
Hydrogen H 845
Nitrogen N 143
Oxygen O 149
Sulfur S 6
Formula: C526H845N143O149S6
Total number of atoms: 1669
Extinction coefficients:
Conditions: 6.0 M guanidium hydrochloride, 0.02 M phosphate buffer, pH 6.5
Extinction coefficients are in units of M^-1 cm^-1.
The first table lists values computed assuming ALL Cys residues appear as half cystines, whereas the second table assumes that NONE do.

                    276 nm   278 nm   279 nm   280 nm   282 nm
Ext. coefficient     12795    12727    12505    12210    11720
Abs 0.1% (=1 g/l)    1.089    1.083    1.064    1.039    0.998

                    276 nm   278 nm   279 nm   280 nm   282 nm
Ext. coefficient     12650    12600    12385    12090    11600
Abs 0.1% (=1 g/l)    1.077    1.072    1.054    1.029    0.987
Estimated half-life:
The N-terminal of the sequence considered is M (Met).
The estimated half-life is: 30 hours (mammalian reticulocytes, in vitro).
                            >20 hours (yeast, in vivo).
                            >10 hours (Escherichia coli, in vivo).
Instability index:
The instability index (II) is computed to be 11.38
This classifies the protein as stable.
Aliphatic index: 66.00
Grand average of hydropathicity (GRAVY): -0.706
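Several of the figures in this report follow directly from the residue counts, and the sketch below recomputes them for the submitted sequence. It is not the tool's own code. The aliphatic index uses the usual weights 2.9 (Val) and 3.9 (Ile + Leu) on mole percentages. The per-residue 280 nm values (Tyr 1280, Trp 5690, cystine 120) are an assumption, chosen because they reproduce the 280 nm column above. The molecular weight is copied from the report rather than computed.

```python
from collections import Counter

# Sequence exactly as submitted in the report above.
SEQ = (
    "MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW"
    "GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE"
)
MOL_WEIGHT = 11748.7  # copied from the report, not recomputed here

counts = Counter(SEQ)
length = len(SEQ)

# Charged residues.
negative = counts["D"] + counts["E"]
positive = counts["R"] + counts["K"]


def mole_percent(residue):
    """Percentage of positions occupied by the given residue."""
    return 100.0 * counts[residue] / length


# Aliphatic index: Ala + 2.9 * Val + 3.9 * (Ile + Leu), in mole percent.
aliphatic_index = (
    mole_percent("A")
    + 2.9 * mole_percent("V")
    + 3.9 * (mole_percent("I") + mole_percent("L"))
)

# Assumed molar extinction values at 280 nm (M^-1 cm^-1).
EXT_TYR, EXT_TRP, EXT_CYSTINE = 1280, 5690, 120
ext_no_cystine = counts["Y"] * EXT_TYR + counts["W"] * EXT_TRP
ext_all_cystine = ext_no_cystine + (counts["C"] // 2) * EXT_CYSTINE

# Absorbance of a 1 g/l solution = extinction coefficient / molecular weight.
abs_all_cystine = ext_all_cystine / MOL_WEIGHT
abs_no_cystine = ext_no_cystine / MOL_WEIGHT

if __name__ == "__main__":
    print(length, negative, positive)
    print(round(aliphatic_index, 2))
    print(ext_all_cystine, round(abs_all_cystine, 3))
    print(ext_no_cystine, round(abs_no_cystine, 3))
```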
b. The secondary structure was predicted with:
/cgi-bin/npsa_?page=npsa_
The results are as follows:
GOR4 result for : UNK_162940
Abstract
GOR secondary structure prediction method version IV, J. Garnier, J.-F. Gibrat, B. Robson, Methods in Enzymology, R.F. Doolittle Ed., vol 266, 540-553, (1996)

        10        20        30        40        50        60        70
         |         |         |         |         |         |         |
MGDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLHGLFGRKTGQAAGFSYTDANKNKGITWGEDTLMEYLE
cccccccceeeeeecccceeeecccccccccceeeecccccccccceeeccccccccceecccchhhhhc
NPKKYIPGTKMIFAGIKKKGERADLIAYLKKATNE
ccccccccchhhhhhhhhhcchhhhhhhhhhceec

Sequence length : 105
GOR4 :
   Alpha helix     (Hh) :    25 is  23.81%
   310 helix       (Gg) :     0 is   0.00%
   Pi helix        (Ii) :     0 is   0.00%
   Beta bridge     (Bb) :     0 is   0.00%
   Extended strand (Ee) :    21 is  20.00%
   Beta turn       (Tt) :     0 is   0.00%
   Bend region     (Ss) :     0 is   0.00%
   Random coil     (Cc) :    59 is  56.19%
   Ambigous states (?)  :     0 is   0.00%
   Other states         :     0 is   0.00%
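The summary percentages in the GOR4 report are simply the share of each state letter in the per-residue prediction string. A minimal sketch of that tally, with the state string copied from the report and hypothetical variable and function names:

```python
from collections import Counter

# Per-residue states copied from the GOR4 output (h = helix, e = strand, c = coil).
STATES = (
    "cccccccceeeeeecccceeeecccccccccceeeecccccccccceeeccccccccceecccchhhhhc"
    "ccccccccchhhhhhhhhhcchhhhhhhhhhceec"
)

LABELS = {"h": "Alpha helix", "e": "Extended strand", "c": "Random coil"}


def summarize(states):
    """Return {label: (count, percent)} for the three predicted states."""
    tally = Counter(states)
    total = len(states)
    return {
        label: (tally[code], round(100.0 * tally[code] / total, 2))
        for code, label in LABELS.items()
    }


if __name__ == "__main__":
    for label, (count, percent) in summarize(STATES).items():
        print(f"{label:<16}: {count:3d} is {percent:6.2f}%")
```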
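c. The tertiary structure was predicted with /urbm/bioinfo/esypred/
The results are as follows: e-mail:*******************
4 Infer the phylogenetic tree with the PHYLIP software.
a. Open the file → click Load Sequence in the File drop-down menu → select the file in the pop-up window and click Open → click Do Complete Alignment in the Alignment drop-down menu → click ALIGN → click Save Sequence as in the File drop-down menu → choose PHYLIP under the Format option in the pop-up window → OK → the output file is obtained, as follows:
14 111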
fruit --MGVPAGDV EKGKKLFVQR CAQCHTVEAG GKHKVGPNLH GLIGRKTGQA
starfish -------GQV EKGKKIFVQR CAQCHTVEKA GKHKTGPNLN GILGRKTGQA
common -------GDV EKGKKIFVQK CAQCHTVEKG GKHKTGPNLH GLFGRKTGQA
donkey -------GDV EKGKKIFVQK CAQCHTVEKG GKHKTGPNLH GLFGRKTGQA
Bos -------GDV EKGKKIFVQK CAQCHTVEKG GKHKTGPNLH GLFGRKTGQA
Rattus ------MGDV EKGKKIFVQK CAQCHTVEKG GKHKTGPNLH GLFGRKTGQA
Mus ------MGDV EKGKKIFVQK CAQCHTVEKG GKHKTGPNLH GLFGRKTGQA
Homo ------MGDV EKGKKIFIMK CSQCHTVEKG GKHKTGPNLH GLFGRKTGQA
Gallus ------MGDI EKGKKIFVQK CSQCHTVEKG GKHKTGPNLH GLFGRKTGQA
carp -------GDV EKGKKVFVQK CAQCHTVZBG GKHKVGPNLW GLFGRKTGQA
Tigriopus ------MGDI DKGKKIFVQK CTQCHTIEAG GKHKVGPNLH GMYGRQTGKA
yeast --MPYAPGDE KKGASLFKTR CAQCHTVEKG GANKVGPNLH GVFGRKTGQA
Pichia MPAPFEKGSE KKGATLFKTR CLQCHTVEEG GPHKVGPNLH GIMGRKSGQA
Actinobaci -----MTKLL QKIAFILPLV FSLVAXAEMV DTFQFQNETD RVR--AVALA
AGFAYTDANK AKGITWNEDT LFEYLENPKK YIPGTKMIFA GLKKPNERGD
AGFSYTDANR NKGITWKNET LFEYLENPKK YIPGTKMVFA GLKKQKERQD
PGFSYTDANK NKGITWKEET LMEYLENPKK YIPGTKMIFA GIKKKTERED
PGFSYTDANK NKGITWKEET LMEYLENPKK YIPGTKMIFA GIKKKTERED
PGFSYTDANK NKGITWGEET LMEYLENPKK YIPGTKMIFA GIKKKGERED
AGFSYTDANK NKGITWGEDT LMEYLENPKK YIPGTKMIFA GIKKKGERAD
AGFSYTDANK NKGITWGEDT LMEYLENPKK YIPGTKMIFA GIKKKGERAD
PGYSYTAANK NKGIIWGEDT LMEYLENPKK YIPGTKMIFV GIKKKEERAD
EGFSYTDANK NKGITWGEDT LMEYLENPKK YIPGTKMIFA GIKKKSERVD
PGFSYTBABK SKGIVWBZZT LMEYLZBPKK YIPGTKMIFA GIKKKGE---
AGYSYTDANK SKGVTWNEET LDIYLTNPKK YIPGTKMVFA GLKKKGDRED
EGFSYTEANR DKGITWDEET LFAYLENPKK YIPGTKMAFA GFKKPADRNN
VGYSYTDANK KKGVEWSEQT MSDYLENPKK YIPGTKMAFG GLKKPKDRND
KSLRCPQCQN QNLVESNATT AYKLRLEVYE MVNQGKTDEE IIKIMTERFG
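LIAYLKSATK -
LIAYLEAATK -
LIAYLKKATN E
LIAYLKKATN E
LIAYLKKATN E
LIAYLKKATN E
LIAYLKKATN E
LIAYLKKATN E
LIAYLKDATS K
---------- -
LIAYLKSASS S
VITYLKKATS E
LVTYLASATK -
HFVNYKPPFN A
b. Go to the EXE folder and launch the SEQBOOT program. Enter the file name and press Enter, then enter R to change the parameter and set the number of replicates to 200. Enter Y to confirm the parameters. Enter the odd random seed 3. The program runs and produces an outfile file in the EXE folder.
c. Rename outfile to infile. Launch the protdist program. Enter M to change the parameter, enter D to select data sets, and enter 200. Enter Y to confirm the parameters. The program runs and produces an outfile in the EXE folder.
d. Rename this outfile to infile; to avoid a clash with the earlier infile, first rename the earlier file to infile1. To infer the tree from the distance matrix, launch the NEIGHBOR program in the EXE folder. Enter M to change the parameter, enter D to select data sets, and enter 200. Enter the odd random seed 3. Enter Y to confirm the parameters. The program runs and writes two results, outfile and outtree, to the EXE folder. outtree is a tree file that can be opened with software such as treeview. outfile is a report of the analysis, containing the tree and some further output, and can be opened directly in Notepad. Part of the results is shown below:
Consensus tree program, version 3.6b
Species in order: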
1. Homo
2. Mus
3. Rattus
4. Bos
5. common
6. donkey
7. carp
8. Tigriopus
9. yeast
10. Actinobaci
11. Pichia
12. Gallus
13. starfish
14. fruit
Sets included in the consensus tree
Set (species in order) How many times out of 200.00
....**.... .... 166.00
.**....... .... 161.00
.......*** *.** ** *.** 93.00
...******* **** ***.... .... 83.00
........** .... 80.00
.......*** **** ** *.*. 68.00
........** *... 61.00
...****... .... 57.00
Sets NOT included in consensus tree:
Set (species in order) How many times out of 200.00
......**** *.** 63.00
...******* *.** *. *... 57.00
......**** **** .* ..*. 45.00
.........* *... 37.00
.********* *.** 37.00
...*..*... .... 35.00
........** *..* 29.00
....****** *.** 26.00
.**....*** **** 24.00
.........* ..** 24.00
.......*** *..* 23.00
........** ..** 23.00
........** ..*. 23.00
........** ...* 22.00
......**.. .... 20.00
.........* ...* 19.00
....****** **** 16.00
.........* *.*. 16.00
.......*** *.*. 15.00
.......**. *... 14.00
.......*.. *... 13.00
........** **** ***... .... *.* ...* 11.00
.******... .... 10.00
.**...**** **** 10.00
..******** **** 10.00
.***...... .... 9.00
........*. *.*. 9.00
.......*.* ..** 8.00
.........* .*.. 8.00
...*.*.... .... 8.00
.***..**** **** 8.00
.*.******* **** 7.00
....**.*** *.** 7.00
.......... *.*. 7.00
.......*** *... 7.00
.......*.* .... 7.00
......**** *..* 6.00
.***..*... .... 6.00
....**.*** **** 6.00
.......**. *.** .* .**. .* *.** ** .*** **..... .... 5.00
.......*.* *.** 5.00
......***. *.** 5.00
.**.****** **** 5.00
.......*.* .*.* 4.00
.......*.* *.*. 4.00
........** .*.. 4.00
........*. *.** 4.00
.......*.. ...* 4.00
.......*.* *... 4.00
.....***.. .... 3.00
..**...... .... 3.00
.***..***. *.** 3.00
......*.** *.** 3.00
........** ***. 3.00
...*****.. *... 3.00
....****.. .... 3.00
......*... ...* 3.00
......*..* ...* 3.00
.**....... .*.. 3.00
...*..**** *.** **.. *... 3.00
.*.****... .... 3.00
.......*.* .*** 3.00
...***.*** *.** 3.00
.......*** .... 2.00
.********. *.** 2.00
......*..* .... 2.00
.......... ..** 2.00
.......*.* ..*. 2.00
......**.. .*.. 2.00
.......*.* .**. 2.00
.******... .*.. 2.00
...******. *.** 2.00
.*******.. .... 2.00
......*.*. *... 2.00
....*....* .*.. 2.00
...*.**... .... 2.00
....**...* .*.. 2.00
.*******.. *... **.*... .... 2.00
..******.. .... *. *..* 2.00
.*****.... .... 2.00
....*..*** *.** 2.00
....*.*... .... 2.00
.......**. *.*. 2.00
.**.....** **** 2.00
......*... .*.. 2.00
.........* .*** 2.00
...***.*.. .... 2.00
.......*** ...* 2.00
......*.*. *.*. 2.00
........** **.. 2.00
.........* .*.* 2.00
.*......** **** 2.00
.......... *.** 2.00
...*..***. *.** 2.00
.*.....*** **** 2.00
.........* ***. 2.00
.......*.* *..* 2.00
.**.****.. .... *.** .*** *.*. *.** ***.... ..*. 1.00
...****.** *.** 1.00
..*...*... .... 1.00
.**......* .... 1.00
.**...**.. .... 1.00
.*******.. ..*. 1.00
....*....* .**. 1.00
.......*.* **** 1.00
.***...*** **** 1.00
......**.* ...* 1.00
....*..*** **** 1.00
.****...** **** 1.00
.****..... .... 1.00
...****... ...* 1.00
.*.......* .... 1.00
....*****. **** 1.00
.*****.*** **** 1.00
........*. ...* 1.00
......*... *... 1.00
.*****.*** *.** **..** *.** 1.00.**.**...* .*.* 1.00
.***....** **** 1.00
..*.....** **** 1.00
......*.*. *..* 1.00
....**.... ..*. 1.00
...*..**.. .... 1.00
......*.** **** 1.00
.......*.* .*.. 1.00
....*...** .... 1.00
...*****.. ..*. 1.00
....**...* .**. 1.00
..*****... .... 1.00
.****..*** **** 1.00
...*.*.*** *.** 1.00
......*.** *.*. 1.00
...****.*. *.*. 1.00
....**..** **** 1.00
......**.* .*.* 1.00
....**...* .*.* 1.00
.....**... .... 1.00
....***.*. *.*. *.*. .... 1.00
....**..** .... 1.00
.*.....*.. .... 1.00
....***..* .*** 1.00
.******.** **** 1.00
.******... ...* 1.00
...***.*** **** 1.00
......***. ...* 1.00
.**....*.. .... 1.00
.**.*..... .... 1.00
......**.. ...* 1.00
.......**. *..* 1.00
.......*.. .*.. 1.00
...****..* .*** 1.00
.**...***. *.** 1.00
.*.......* .*.. 1.00
.**..***.. .... 1.00
.**......* .*.. 1.00
...*..**.. *.** 1.00
....**.*.* .*.* 1.00
.*.*...... .... 1.00
......***. **** ** **.* 1.00
.******..* .*** ****... .*.. 1.00
.......*.. *.*. 1.00
.********. **** 1.00
.*.....*.* .*.* 1.00
.**....*.* .*.* 1.00
.**....*.* .**. 1.00
..******.. ..*. 1.00
......***. *.*. 1.00
.*******.. **.. 1.00
..*....*** **** 1.00
......*.*. ...* 1.00
.***..**** *.** 1.00
.*****..** **** 1.00
...*...*** *.** 1.00
.**...*... .... 1.00
......*..* .*.. 1.00
...******. **** 1.00
.......*** .*.. 1.00
.**.....** .*** 1.00
.*.*..*... .... 1.00
Extended majority rule consensus tree
CONSENSUS TREE:
the numbers on the branches indicate the number
of times the partition of the species into the two sets
which are separated by that branch occurred
among the trees, out of 200.00 trees
+-------------Pichia
+-61.0-|
|
+------yeast
+-68.0-|
+-80.0-|
|
+------Actinobaci
+-93.0-| | |
+--------------------starfish
+133.0-| |
|
|
|
|
|
+---------------------------fruit
|
+-70.0-| | |
+----------------------------------Tigriopus | |
|
+-----------------------------------------Gallus +-84.0-|
|
+------common
|
+166.0-|
| |
+------donkey
| |
+------| +----------------------57.0-|
| |
|
+--------------------carp
| |
|
|
|
+-83.0-|
| |+-------------Bos | |
|
+------Rattus
| +------------------------------------------161.0-|
|
+------Mus |
+--------------------------------------------------------------Homo
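Each set string in the CONSENSE report marks, in the listed species order, which species fall on one side of a branch, and each branch label counts how many of the 200 bootstrap trees contain that partition. As a reading aid, the sketch below decodes a set string into species names and converts a count into a percentage. The function names are hypothetical and the code is not part of PHYLIP.

```python
# Species in the order printed by CONSENSE ("Species in order").
SPECIES = [
    "Homo", "Mus", "Rattus", "Bos", "common", "donkey", "carp",
    "Tigriopus", "yeast", "Actinobaci", "Pichia", "Gallus", "starfish", "fruit",
]
REPLICATES = 200  # number of bootstrap data sets used in this report


def decode(set_string):
    """Return the species marked with '*' in a set string such as '.**....... ....'."""
    marks = set_string.replace(" ", "")
    if len(marks) != len(SPECIES):
        raise ValueError("set string does not match the number of species")
    return [name for name, mark in zip(SPECIES, marks) if mark == "*"]


def support_percent(count, replicates=REPLICATES):
    """Convert a 'how many times out of N' count into a bootstrap percentage."""
    return 100.0 * count / replicates


if __name__ == "__main__":
    # The two best-supported groups from "Sets included in the consensus tree".
    for set_string, count in [("....**.... ....", 166), (".**....... ....", 161)]:
        print(decode(set_string), f"{support_percent(count):.1f}%")
```

Read this way, the common zebra and donkey group appears in 166 of 200 replicates (83%) and the Mus and Rattus group in 161 of 200 (80.5%).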
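e. Rename outtree to intree and launch the DRAWTREE program. Enter the file name font1 as the parameter. Enter Y to confirm the parameters. The program runs and the Tree Preview window appears.
f. Launch the DRAWGRAM program. Enter the file name font1 as the parameter. Enter Y to confirm the parameters. The program runs and the Tree Preview window appears.
g. Rename the outfile in the EXE folder to outfile1 so that it is not overwritten by the newly generated outfile. Launch the CONSENSE program. Enter Y to confirm the settings. New outfile and outtree files are generated in the EXE folder. Open outfile in Notepad. Rename the intree in the EXE folder to intree1 and rename outtree to intree. Launch the DRAWTREE program. Enter the file name font1 as the parameter. Enter Y to confirm the parameters. The program runs and the Tree Preview window appears.
h. Launch the DRAWGRAM program. Enter the file name font1 as the parameter. Enter Y to confirm the parameters. The program runs and the Tree Preview window appears.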
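The renaming and menu-typing in steps b to d and g can in principle be scripted, because the PHYLIP programs read their menu answers from standard input. The sketch below is one way to do this, not the procedure used above. It assumes the executables are on PATH under the names seqboot, protdist, neighbor and consense. The keystroke lists are transcribed from the steps above and may not match the menus of other PHYLIP versions. The function names and the saved file names (for example seqboot.outfile) are hypothetical.

```python
import os
import shutil
import subprocess

# Menu answers transcribed from steps b, c, d and g (assumption: the menus of
# the installed PHYLIP version ask the same questions in the same order).
MENU_ANSWERS = {
    "seqboot": ["R", "200", "Y", "3"],
    "protdist": ["M", "D", "200", "Y"],
    "neighbor": ["M", "D", "200", "3", "Y"],
    "consense": ["Y"],
}

# (program, file name it reads, result file handed to the next step)
STEPS = [
    ("seqboot", "infile", "outfile"),
    ("protdist", "infile", "outfile"),
    ("neighbor", "infile", "outtree"),
    ("consense", "intree", "outtree"),
]


def stdin_script(program):
    """Return the keystrokes for one program, one answer per line."""
    return "\n".join(MENU_ANSWERS[program]) + "\n"


def run_pipeline(alignment, workdir="."):
    """Run the four programs in order and return the path of the consensus tree."""
    current = os.path.abspath(alignment)
    for program, reads, keeps in STEPS:
        exe = shutil.which(program)
        if exe is None:
            raise FileNotFoundError(f"{program} not found on PATH")
        # Supplying the expected input name avoids the file-name prompt.
        target = os.path.abspath(os.path.join(workdir, reads))
        if current != target:
            shutil.copyfile(current, target)
        # Remove stale outputs so the program does not ask about overwriting.
        for stale in ("outfile", "outtree"):
            path = os.path.join(workdir, stale)
            if os.path.exists(path):
                os.remove(path)
        subprocess.run([exe], input=stdin_script(program), text=True,
                       cwd=workdir, check=True)
        # Keep each result under its own name instead of infile1/outfile1.
        saved = os.path.abspath(os.path.join(workdir, f"{program}.{keeps}"))
        os.replace(os.path.join(workdir, keeps), saved)
        current = saved
    return current
```

Saving every intermediate result under a program-specific name serves the same purpose as the manual infile1 and outfile1 renames: no output is overwritten by a later step.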