layout: post title: 拼接分子的简单脚本 categories:
在分子建模, 特别是聚合物建模过程中, 有时我们需要构建具有周期性重复结构的分子, 或者是具有不同聚合度的分子. 虽然有些可视化的建模软件支持这种功能, 但使用起来不一定方便, 所以我还是写了一个简单的bash脚本来做这件事情, 放在这里备份下, 也供需要的人参考.
脚本主要使用awk语言, 利用分子单体构建聚合物. 方法很简单, 先在分子单体中定义某个原子为"头原子", 另一原子为"尾原子", 拼接时将第二个分子单体平移至其头原子与第一个分子单体的尾原子重合的位置, 然后将这两个原子删除. 当然, 首端的分子单体会保留其头原子, 而末端的分子单体会保留其尾原子.
使用时, 在命令行中指定分子单体的文件名称, 头原子编号, 尾原子编号, 拼接的片段数目(聚合度)即可.
为方便读取和书写, 输入文件与输出文件均采用.xyz
格式: 第一行指定原子数目, 第二行是随意的标题行, 后续每行指定每个原子的元素符号与xyz坐标. 值得注意的是, 使用VMD查看分子结构时, 其中的原子编号是从0开始, 所以VMD提示编号为n的原子在从1开始的编号中为n+1号原子.
由于采用的是头原子与尾原子重合并删除的拼接方法, 所以默认情况下, 删除重合原子后, 其相邻原子间的距离可能过大, 导致可视化软件判断成键时不正常, 简单的解决方法是将头原子与尾原子所在键的长度适当缩小. 虽然这样拼接后所得聚合物首端头原子和末端尾原子的成键距离偏小, 但用于分子动力学模拟的初始构型足够了.
以下面的分子为例, 头原子为3号, 尾原子为11号, 为清楚起见原子类型已改为Cl, 且它们所在键长已经适当调整.
<table class="highlighttable"><th colspan="2" style="text-align:left">ht.xyz</th><tr><td><div class="linenodiv" style="background-color: #f0f0f0; padding-right: 10px"><pre style="line-height: 125%"> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26</pre></div></td><td class="code"><div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><span></span><span style="color: #666666">24</span> Tips C <span style="color: #666666">-2.25735638</span> <span style="color: #666666">-0.11936983</span> <span style="color: #666666">-0.24914958</span> C <span style="color: #666666">-0.73984978</span> <span style="color: #666666">-0.32271456</span> <span style="color: #666666">-0.08354569</span> Cl <span style="color: #666666">-2.37768477</span> <span style="color: #666666">0.47589194</span> <span style="color: #666666">-0.30646016</span> H <span style="color: #666666">-2.59326436</span> <span style="color: #666666">-0.63693340</span> <span style="color: #666666">-1.12333136</span> H <span style="color: #666666">-2.76475201</span> <span style="color: #666666">-0.50466899</span> <span style="color: #666666">0.61049816</span> C <span style="color: #666666">-0.43606989</span> <span style="color: #666666">-1.82550657</span> <span style="color: #666666">0.06114004</span> H <span style="color: #666666">-0.23245414</span> <span style="color: #666666">0.06258460</span> <span style="color: #666666">-0.94319342</span> H <span style="color: #666666">-0.40394179</span> <span style="color: #666666">0.19484900</span> <span style="color: #666666">0.79063609</span> C <span style="color: #666666">1.08143670</span> <span style="color: #666666">-2.02885130</span> <span style="color: #666666">0.22674393</span> H <span style="color: #666666">-0.94346553</span> <span style="color: #666666">-2.21080573</span> <span style="color: #666666">0.92078778</span> Cl <span style="color: #666666">1.21557328</span> <span style="color: #666666">-2.69242179</span> <span style="color: #666666">0.29063114</span> H <span style="color: #666666">1.58883234</span> <span style="color: #666666">-1.64355214</span> <span style="color: #666666">-0.63290380</span> H <span style="color: #666666">1.41734469</span> <span style="color: #666666">-1.51128773</span> <span style="color: #666666">1.10092571</span> C <span style="color: #666666">-0.91952625</span> <span style="color: #666666">-2.57041114</span> <span style="color: #666666">-1.19702812</span> C <span style="color: #666666">-1.34532335</span> <span style="color: #666666">-3.90221323</span> <span style="color: #666666">-1.10257193</span> C <span style="color: #666666">-0.93367443</span> <span style="color: #666666">-1.91647220</span> <span style="color: #666666">-2.43641735</span> C <span style="color: #666666">-1.78527002</span> <span style="color: #666666">-4.58007592</span> <span style="color: #666666">-2.24750470</span> H <span style="color: #666666">-1.33451760</span> <span style="color: #666666">-4.40151122</span> <span style="color: #666666">-0.15627137</span> C <span style="color: #666666">-1.37362358</span> <span style="color: #666666">-2.59433407</span> <span style="color: #666666">-3.58134966</span> H <span style="color: #666666">-0.60856921</span> <span style="color: #666666">-0.89961164</span> <span style="color: #666666">-2.50853669</span> C <span style="color: #666666">-1.79942144</span> <span style="color: #666666">-3.92613591</span> <span style="color: #666666">-3.48689333</span> H <span style="color: #666666">-2.11037260</span> <span style="color: #666666">-5.59693736</span> <span style="color: #666666">-2.17538586</span> H <span style="color: #666666">-1.38442919</span> <span style="color: #666666">-2.09503612</span> <span style="color: #666666">-4.52765025</span> H <span style="color: #666666">-2.13533239</span> <span style="color: #666666">-4.44369848</span> <span style="color: #666666">-4.36107455</span> </pre></div> </td></tr></table>
<figure><script>var Mol1=new ChemDoodle.TransformCanvas3D('Mol-1',642,396);Mol1.specs.shapes_color='#fff';Mol1.specs.backgroundColor='black';Mol1.specs.set3DRepresentation('Ball and Stick');Mol1.specs.projectionPerspective_3D=false;Mol1.specs.compass_display=true; /*//Mol1.specs.atoms_resolution_3D=15; //Mol1.specs.bonds_resolution_3D=15; //Mol1.specs.crystals_unitCellLineWidth=1.5;*/ Mol1.nextFrame=function(delta){var matrix=[];ChemDoodle.lib.mat4.identity(matrix);var change=delta*Math.PI/15000;ChemDoodle.lib.mat4.rotate(matrix,change,[1,0,0]);ChemDoodle.lib.mat4.rotate(matrix,change,[0,1,0]);ChemDoodle.lib.mat4.rotate(matrix,change,[0,0,1]);ChemDoodle.lib.mat4.multiply(this.rotationMatrix, matrix)}; Mol1.startAnimation=ChemDoodle._AnimatorCanvas.prototype.startAnimation;Mol1.stopAnimation=ChemDoodle._AnimatorCanvas.prototype.stopAnimation;Mol1.isRunning=ChemDoodle._AnimatorCanvas.prototype.isRunning;Mol1.dblclick=ChemDoodle.RotatorCanvas.prototype.dblclick;Mol1.timeout=5;Mol1.handle=null; var Fmol='24\nTips\nC -2.25735638 -0.11936983 -0.24914958\nC -0.73984978 -0.32271456 -0.08354569\nCl -2.37768477 0.47589194 -0.30646016\nH -2.59326436 -0.63693340 -1.12333136\nH -2.76475201 -0.50466899 0.61049816\nC -0.43606989 -1.82550657 0.06114004\nH -0.23245414 0.06258460 -0.94319342\nH -0.40394179 0.19484900 0.79063609\nC 1.08143670 -2.02885130 0.22674393\nH -0.94346553 -2.21080573 0.92078778\nCl 1.21557328 -2.69242179 0.29063114\nH 1.58883234 -1.64355214 -0.63290380\nH 1.41734469 -1.51128773 1.10092571\nC -0.91952625 -2.57041114 -1.19702812\nC -1.34532335 -3.90221323 -1.10257193\nC -0.93367443 -1.91647220 -2.43641735\nC -1.78527002 -4.58007592 -2.24750470\nH -1.33451760 -4.40151122 -0.15627137\nC -1.37362358 -2.59433407 -3.58134966\nH -0.60856921 -0.89961164 -2.50853669\nC -1.79942144 -3.92613591 -3.48689333\nH -2.11037260 -5.59693736 -2.17538586\nH -1.38442919 -2.09503612 -4.52765025\nH -2.13533239 -4.44369848 -4.36107455\n'; Mol1.loadMolecule(ChemDoodle.readXYZ(Fmol));Mol1.startAnimation();Mol1.stopAnimation();function setProj1(yesPers){Mol1.specs.projectionPerspective_3D=yesPers;Mol1.setupScene();Mol1.repaint()}function setModel1(model){Mol1.specs.set3DRepresentation(model);Mol1.setupScene();Mol1.repaint()}function setSpeed1(){Mol1.timeout=500-document.getElementById('spd1').value;Mol1.loadMolecule(ChemDoodle.readXYZ(Fmol));Mol1.startAnimation()}</script><br><span class='meta'>视图: <input type='radio' name='group2' onclick='setProj1(true)'>投影 <input type='radio' name='group2' onclick='setProj1(false)' checked=''>正交 速度: <input type='range' id='spd1' min='1' max='500' onchange='setSpeed1()'/><br>模型: <input type='radio' name='model' onclick='setModel1('Ball and Stick')' checked=''>球棍 <input type='radio' name='model' onclick='setModel1('van der Waals Spheres')'>范德华球 <input type='radio' name='model' onclick='setModel1('Stick')'>棍状 <input type='radio' name='model' onclick='setModel1('Wireframe')'>线框 <input type='radio' name='model' onclick='setModel1('Line')'>线型 <input type='checkbox' onclick='Mol1.specs.atoms_displayLabels_3D=this.checked;Mol1.repaint()'>名称<br>左键: 转动 滚轮: 缩放 双击: 自动旋转开关 Alt+左键: 移动</span><br><figurecaption>Fig.1</figurecaption></figure>执行下面的命令获得聚合度为10的分子
<div class="highlight" style="background:#f8f8f8"><pre style="line-height:125%"><span style="color:#B8860B">bash </span>commol.bsh ht.xyz 3 11 10 </pre></div>得到聚合物结构如下
<figure><script>var Mol2=new ChemDoodle.TransformCanvas3D('Mol-2',642,396);Mol2.specs.shapes_color='#fff';Mol2.specs.backgroundColor='black';Mol2.specs.set3DRepresentation('Ball and Stick');Mol2.specs.projectionPerspective_3D=false;Mol2.specs.compass_display=true; /*//Mol2.specs.atoms_resolution_3D=15; //Mol2.specs.bonds_resolution_3D=15; //Mol2.specs.crystals_unitCellLineWidth=1.5;*/ Mol2.nextFrame=function(delta){var matrix=[];ChemDoodle.lib.mat4.identity(matrix);var change=delta*Math.PI/15000;ChemDoodle.lib.mat4.rotate(matrix,change,[1,0,0]);ChemDoodle.lib.mat4.rotate(matrix,change,[0,1,0]);ChemDoodle.lib.mat4.rotate(matrix,change,[0,0,1]);ChemDoodle.lib.mat4.multiply(this.rotationMatrix, matrix)}; Mol2.startAnimation=ChemDoodle._AnimatorCanvas.prototype.startAnimation;Mol2.stopAnimation=ChemDoodle._AnimatorCanvas.prototype.stopAnimation;Mol2.isRunning=ChemDoodle._AnimatorCanvas.prototype.isRunning;Mol2.dblclick=ChemDoodle.RotatorCanvas.prototype.dblclick;Mol2.timeout=5;Mol2.handle=null; var Fmol='222\nTips 24 3 11\nC -2.257356 -0.119370 -0.249150\nC -0.739850 -0.322715 -0.083546\nCl -2.377685 0.475892 -0.306460\nH -2.593264 -0.636933 -1.123331\nH -2.764752 -0.504669 0.610498\nC -0.436070 -1.825507 0.061140\nH -0.232454 0.062585 -0.943193\nH -0.403942 0.194849 0.790636\nC 1.081437 -2.028851 0.226744\nH -0.943466 -2.210806 0.920788\nH 1.588832 -1.643552 -0.632904\nH 1.417345 -1.511288 1.100926\nC -0.919526 -2.570411 -1.197028\nC -1.345323 -3.902213 -1.102572\nC -0.933674 -1.916472 -2.436417\nC -1.785270 -4.580076 -2.247505\nH -1.334518 -4.401511 -0.156271\nC -1.373624 -2.594334 -3.581350\nH -0.608569 -0.899612 -2.508537\nC -1.799421 -3.926136 -3.486893\nH -2.110373 -5.596937 -2.175386\nH -1.384429 -2.095036 -4.527650\nH -2.135332 -4.443698 -4.361075\nC 1.335902 -3.287684 0.347942\nC 2.853408 -3.491028 0.513546\nH 0.999994 -3.805247 -0.526240\nH 0.828506 -3.672983 1.207589\nC 3.157188 -4.993820 0.658231\nH 3.360804 -3.105729 -0.346102\nH 3.189316 -2.973465 1.387727\nC 4.674695 -5.197165 0.823835\nH 2.649793 -5.379119 1.517879\nH 5.182090 -4.811866 -0.035813\nH 5.010603 -4.679601 1.698017\nC 2.673732 -5.738725 -0.599937\nC 2.247935 -7.070527 -0.505481\nC 2.659584 -5.084786 -1.839326\nC 1.807988 -7.748390 -1.650413\nH 2.258740 -7.569825 0.440820\nC 2.219634 -5.762648 -2.984258\nH 2.984689 -4.067925 -1.911445\nC 1.793837 -7.094450 -2.889802\nH 1.482885 -8.765251 -1.578295\nH 2.208829 -5.263350 -3.930559\nH 1.457926 -7.612012 -3.763983\nC 4.929160 -6.455997 0.945033\nC 6.446666 -6.659342 1.110637\nH 4.593252 -6.973561 0.070851\nH 4.421764 -6.841296 1.804681\nC 6.750446 -8.162134 1.255323\nH 6.954062 -6.274043 0.250989\nH 6.782574 -6.141778 1.984819\nC 8.267953 -8.365479 1.420927\nH 6.243051 -8.547433 2.114970\nH 8.775348 -7.980180 0.561279\nH 8.603861 -7.847915 2.295108\nC 6.266990 -8.907039 -0.002846\nC 5.841193 -10.238841 0.091611\nC 6.252842 -8.253100 -1.242235\nC 5.401246 -10.916703 -1.053322\nH 5.851999 -10.738139 1.037911\nC 5.812893 -8.930962 -2.387167\nH 6.577947 -7.236239 -1.314354\nC 5.387095 -10.262763 -2.292711\nH 5.076144 -11.933565 -0.981203\nH 5.802087 -8.431664 -3.333468\nH 5.051184 -10.780326 -3.166892\nC 8.522418 -9.624311 1.542124\nC 10.039924 -9.827656 1.707728\nH 8.186510 -10.141875 0.667943\nH 8.015022 -10.009610 2.401772\nC 10.343704 -11.330448 1.852414\nH 10.547320 -9.442357 0.848080\nH 10.375832 -9.310092 2.581910\nC 11.861211 -11.533792 2.018018\nH 9.836309 -11.715747 2.712062\nH 12.368606 -11.148493 1.158370\nH 12.197119 -11.016229 2.892200\nC 9.860248 -12.075352 0.594246\nC 9.434451 -13.407154 0.688702\nC 9.846100 -11.421413 -0.645143\nC 8.994504 -14.085017 -0.456231\nH 9.445257 -13.906452 1.635003\nC 9.406151 -12.099275 -1.790076\nH 10.171205 -10.404553 -0.717263\nC 8.980353 -13.431077 -1.695619\nH 8.669402 -15.101879 -0.384112\nH 9.395345 -11.599977 -2.736376\nH 8.644442 -13.948640 -2.569801\nC 12.115676 -12.792625 2.139216\nC 13.633182 -12.995969 2.304820\nH 11.779768 -13.310188 1.265034\nH 11.608280 -13.177924 2.998863\nC 13.936962 -14.498761 2.449505\nH 14.140578 -12.610670 1.445172\nH 13.969090 -12.478406 3.179001\nC 15.454469 -14.702106 2.615109\nH 13.429567 -14.884061 3.309153\nH 15.961865 -14.316807 1.755461\nH 15.790377 -14.184543 3.489291\nC 13.453506 -15.243666 1.191337\nC 13.027709 -16.575468 1.285793\nC 13.439358 -14.589727 -0.048052\nC 12.587762 -17.253331 0.140860\nH 13.038515 -17.074766 2.232094\nC 12.999409 -15.267589 -1.192984\nH 13.764463 -13.572867 -0.120171\nC 12.573611 -16.599391 -1.098528\nH 12.262660 -18.270192 0.212979\nH 12.988603 -14.768291 -2.139285\nH 12.237700 -17.116953 -1.972709\nC 15.708934 -15.960938 2.736307\nC 17.226440 -16.164283 2.901911\nH 15.373026 -16.478502 1.862125\nH 15.201538 -16.346238 3.595955\nC 17.530220 -17.667075 3.046597\nH 17.733836 -15.778984 2.042263\nH 17.562348 -15.646720 3.776093\nC 19.047727 -17.870420 3.212200\nH 17.022825 -18.052374 3.906244\nH 19.555123 -17.485121 2.352553\nH 19.383635 -17.352856 4.086382\nC 17.046764 -18.411980 1.788428\nC 16.620967 -19.743782 1.882885\nC 17.032616 -17.758041 0.549039\nC 16.181020 -20.421645 0.737952\nH 16.631773 -20.243080 2.829185\nC 16.592667 -18.435903 -0.595893\nH 17.357721 -16.741180 0.476920\nC 16.166869 -19.767705 -0.501437\nH 15.855918 -21.438506 0.810071\nH 16.581861 -17.936605 -1.542194\nH 15.830958 -20.285267 -1.375618\nC 19.302192 -19.129252 3.333398\nC 20.819699 -19.332597 3.499002\nH 18.966284 -19.646816 2.459216\nH 18.794796 -19.514551 4.193046\nC 21.123478 -20.835389 3.643688\nH 21.327094 -18.947298 2.639354\nH 21.155607 -18.815033 4.373184\nC 22.640985 -21.038734 3.809292\nH 20.616083 -21.220688 4.503336\nH 23.148381 -20.653435 2.949644\nH 22.976893 -20.521170 4.683474\nC 20.640022 -21.580294 2.385520\nC 20.214225 -22.912096 2.479976\nC 20.625874 -20.926355 1.146130\nC 19.774278 -23.589958 1.335043\nH 20.225031 -23.411394 3.426276\nC 20.185925 -21.604216 0.001198\nH 20.950979 -19.909494 1.074011\nC 19.760127 -22.936018 0.095654\nH 19.449176 -24.606820 1.407162\nH 20.175119 -21.104919 -0.945102\nH 19.424216 -23.453581 -0.778527\nC 22.895450 -22.297566 3.930490\nC 24.412957 -22.500911 4.096093\nH 22.559542 -22.815130 3.056308\nH 22.388054 -22.682865 4.790137\nC 24.716736 -24.003703 4.240779\nH 24.920352 -22.115612 3.236446\nH 24.748865 -21.983347 4.970275\nC 26.234243 -24.207047 4.406383\nH 24.209341 -24.389002 5.100427\nH 26.741639 -23.821748 3.546735\nH 26.570151 -23.689484 5.280565\nC 24.233280 -24.748607 2.982611\nC 23.807483 -26.080409 3.077067\nC 24.219132 -24.094668 1.743222\nC 23.367536 -26.758272 1.932134\nH 23.818289 -26.579707 4.023368\nC 23.779183 -24.772530 0.598289\nH 24.544237 -23.077808 1.671102\nC 23.353385 -26.104332 0.692746\nH 23.042434 -27.775133 2.004253\nH 23.768377 -24.273232 -0.348011\nH 23.017474 -26.621895 -0.181435\nC 26.488708 -25.465880 4.527581\nC 28.006215 -25.669224 4.693185\nH 26.152800 -25.983443 3.653399\nH 25.981312 -25.851179 5.387229\nC 28.309995 -27.172016 4.837870\nH 28.513610 -25.283925 3.833537\nH 28.342123 -25.151661 5.567366\nC 29.827501 -27.375361 5.003474\nH 27.802599 -27.557316 5.697518\nH 30.334897 -26.990062 4.143827\nH 30.163409 -26.857798 5.877656\nC 27.826538 -27.916921 3.579702\nC 27.400741 -29.248723 3.674158\nC 27.812390 -27.262982 2.340313\nC 26.960794 -29.926586 2.529226\nH 27.411547 -29.748021 4.620459\nC 27.372441 -27.940844 1.195381\nH 28.137495 -26.246121 2.268194\nC 26.946643 -29.272646 1.289837\nH 26.635692 -30.943447 2.601345\nH 27.361635 -27.441546 0.249080\nH 26.610732 -29.790208 0.415656\nC 30.081966 -28.634193 5.124672\nC 31.599473 -28.837538 5.290276\nH 29.746058 -29.151757 4.250490\nH 29.574570 -29.019493 5.984320\nC 31.903253 -30.340330 5.434962\nH 32.106868 -28.452239 4.430628\nH 31.935381 -28.319975 6.164458\nC 33.420759 -30.543675 5.600566\nH 31.395857 -30.725629 6.294609\nCl 33.554896 -31.207245 5.664453\nH 33.928155 -30.158376 4.740918\nH 33.756667 -30.026111 6.474747\nC 31.419796 -31.085235 4.176794\nC 30.993999 -32.417037 4.271250\nC 31.405648 -30.431296 2.937404\nC 30.554052 -33.094899 3.126317\nH 31.004805 -32.916335 5.217550\nC 30.965699 -31.109158 1.792472\nH 31.730753 -29.414435 2.865285\nC 30.539901 -32.440959 1.886928\nH 30.228950 -34.111761 3.198436\nH 30.954893 -30.609860 0.846171\nH 30.203990 -32.958522 1.012747\n'; Mol2.loadMolecule(ChemDoodle.readXYZ(Fmol));Mol2.startAnimation();Mol2.stopAnimation();function setProj2(yesPers){Mol2.specs.projectionPerspective_3D=yesPers;Mol2.setupScene();Mol2.repaint()}function setModel2(model){Mol2.specs.set3DRepresentation(model);Mol2.setupScene();Mol2.repaint()}function setSpeed2(){Mol2.timeout=500-document.getElementById('spd2').value;Mol2.loadMolecule(ChemDoodle.readXYZ(Fmol));Mol2.startAnimation()}</script><br><span class='meta'>视图: <input type='radio' name='group2' onclick='setProj2(true)'>投影 <input type='radio' name='group2' onclick='setProj2(false)' checked=''>正交 速度: <input type='range' id='spd2' min='1' max='500' onchange='setSpeed2()'/><br>模型: <input type='radio' name='model' onclick='setModel2('Ball and Stick')' checked=''>球棍 <input type='radio' name='model' onclick='setModel2('van der Waals Spheres')'>范德华球 <input type='radio' name='model' onclick='setModel2('Stick')'>棍状 <input type='radio' name='model' onclick='setModel2('Wireframe')'>线框 <input type='radio' name='model' onclick='setModel2('Line')'>线型 <input type='checkbox' onclick='Mol2.specs.atoms_displayLabels_3D=this.checked;Mol2.repaint()'>名称<br>左键: 转动 滚轮: 缩放 双击: 自动旋转开关 Alt+左键: 移动</span><br><figurecaption>Fig.2</figurecaption></figure>