以前简单说过使用GROMACS进行团簇分析的方法, 是利用脚本组合mdmat
和cluster
这两个工具完成的. 最近又需要进行团簇分析, 就把这两部分的代码整理了一下, 组合成一个新程序mdcluster
, 并将其编译为GROMACS的内置模块, 这样就可以直接gmx mdcluster
调用了, 更方便, 速度也更快了.
编写GROMACS分析程序涉及到的知识在前两篇文档中有说明, 但对于不熟悉C++的人来说, 还是很难看明白的. 所以我这里就具体地说下我的作法, 供参考.
在src\gromacs\gmxana\gmx_ana.h
中定义要添加的程序
<div class="highlight"><pre style="line-height:125%"><span></span><span style="color: #00BB00; font-weight: bold">int</span> <span style="color: #00A000">gmx_mdcluster</span>(<span style="color: #00BB00; font-weight: bold">int</span> argc, <span style="color: #00BB00; font-weight: bold">char</span> <span style="color: #666666">*</span>argv[]);</pre></div>
在src\programs\legacymodules.cpp
中的// Modules from gmx_ana.h.
部分注册模块, 这样才能在gmx
主程序中调用
<div class="highlight"><pre style="line-height:125%"><span></span>registerModule(manager, <span style="color: #666666">&</span>gmx_mdcluster, <span style="color: #BB4444">"mdcluster"</span>,
<span style="color: #BB4444">"md Cluster structures"</span>);</pre></div>
将src\gromacs\gmxana\gmx_mdmat.cpp
和src\gromacs\gmxana\gmx_cluster.cpp
两个文件合并, 命名为mdcluster.cpp
整理mdcluster.cpp
的头文件, 修改主函数的名称为mdcluster
试着进行编译. 遇到函数重定义的错误, 就修改函数名称. 直到编译通过, 执行gmx mdcluster
成功
根据需要编写需要的功能. 主要是弄明白mdmat
函数输出的距离矩阵, 并在cluster
函数中调用它.
根据需要, 添加其他需要的功能. 必要时可参考GROMACS自带的其他分析程序.
步骤说起来并不复杂, 但实际做起来还是有点难度的, 虽然不需要你熟悉C++, 但至少需要要像我一样, 学过基本的C, 否则很多语句会看不懂, 也就无从修改了.
另外, 我的做法比较脏快, 好在也能解决问题, 所以我也就不再考虑更优雅的作法了.
代码
测试GROMACS 2016.5和2018版本, 都正常通过.
使用方法
<div class="highlight"><pre style="line-height:125%"><span style="color:#A2F">gmx</span> mdcluster <span style="color:#666">-f</span> <span style="color:#666">-s</span> <span style="color:#666">-n</span></pre></div>选项
-f
: 默认traj.trr
, 要分析的轨迹文件
-s
: 默认topol.tpr
, 运行输入文件
-n
: 默认index.ndx
, 可选索引文件
-g
: 默认cluster.log
, 输出文件, 包含每个时刻的团簇信息
-num
: 默认clust-num.xvg
, 可选输出xvg文件, 每个时刻的团簇数目
-xyz
: 默认clust-xyz.pdb
, 可选输出xyz格式的坐标文件, 列出每个团簇的坐标. 之所以扩展名为pdb, 是因为GROMACS不支持指定xyz格式的输出文件, 只能使用pdb代替了.
- 其他涉及团簇分析的选项没有改变, 参考
cluster
的文档.
输出文件
log文件
按时间顺序列出团簇数目, 距离矩阵的简单信息, 以及每个团簇包含的残基的编号.
<div class="highlight"><pre style="line-height:125%"><span></span>Time: 7.4 Clusters: 3
Distance: 0.168967 to 1.00864 nm Average: 0.473927
Energy of the matrix: 2.95684
<span style="color: #008800; font-style: italic">#cluster | members(#res)</span>
<span style="color: #666666">1</span> | <span style="color: #666666">1</span> <span style="color: #666666">4</span> <span style="color: #666666">5</span> <span style="color: #666666">7</span> 9
<span style="color: #666666">2</span> | <span style="color: #666666">2</span> <span style="color: #666666">6</span> 10
<span style="color: #666666">3</span> | <span style="color: #666666">3</span> 8
Time: 7.6 Clusters: 4
Distance: 0.171863 to 1.03121 nm Average: 0.462375
Energy of the matrix: 2.8525
<span style="color: #008800; font-style: italic">#cluster | members(#res)</span>
<span style="color: #666666">1</span> | <span style="color: #666666">1</span> <span style="color: #666666">4</span> <span style="color: #666666">5</span> <span style="color: #666666">7</span> 9
<span style="color: #666666">2</span> | 2
<span style="color: #666666">3</span> | <span style="color: #666666">3</span> <span style="color: #666666">6</span> 10
<span style="color: #666666">4</span> | 8
Time: 7.8 Clusters: 6
Distance: 0.176229 to 1.07498 nm Average: 0.472193
Energy of the matrix: 3.18141
<span style="color: #008800; font-style: italic">#cluster | members(#res)</span>
<span style="color: #666666">1</span> | <span style="color: #666666">1</span> <span style="color: #666666">7</span> 9
<span style="color: #666666">2</span> | 2
<span style="color: #666666">3</span> | <span style="color: #666666">3</span> 8
<span style="color: #666666">4</span> | 4
<span style="color: #666666">5</span> | 5
<span style="color: #666666">6</span> | <span style="color: #666666">6</span> 10</pre></div>团簇数目
很简单的xvg文件, 无需多说
<div class="highlight"><pre style="line-height:125%"><span></span><span style="color: #008800; font-style: italic"># This file was created Fri Jul 6 09:24:17 2018</span>
<span style="color: #008800; font-style: italic"># Created by:</span>
<span style="color: #008800; font-style: italic"># :-) GROMACS - gmx mdcluster, 2016.5 (double precision) (-:</span>
<span style="color: #008800; font-style: italic">#</span>
<span style="color: #008800; font-style: italic"># Executable: C:\Users\Jicun\Downloads\GMX2016.5\bin\Release\gmx_d.exe</span>
<span style="color: #008800; font-style: italic"># Data prefix: C:\Program Files\Gromacs</span>
<span style="color: #008800; font-style: italic"># Working dir: C:\Users\Jicun\Downloads\GMX2016.5\bin\Release</span>
<span style="color: #008800; font-style: italic"># Command line:</span>
<span style="color: #008800; font-style: italic"># gmx_d mdcluster -g -no -cutoff .2 -xyz</span>
<span style="color: #008800; font-style: italic"># gmx mdcluster is part of G R O M A C S:</span>
<span style="color: #008800; font-style: italic">#</span>
<span style="color: #008800; font-style: italic"># Great Red Owns Many ACres of Sand</span>
<span style="color: #008800; font-style: italic">#</span>
@ title <span style="color: #BB4444">"Cluster Numbers"</span>
@ xaxis label <span style="color: #BB4444">"Time (ps)"</span>
@ yaxis label <span style="color: #BB4444">"Cluster Number"</span>
@TYPE xy
@g0 <span style="color: #AA22FF">type</span> bar
0.000000 9
0.200000 10
0.400000 10
0.600000 9
0.800000 8
1.000000 8
1.200000 9
1.400000 9
1.600000 9
1.800000 9</pre></div>团簇结构
列出每个时刻下每个团簇中所有原子的名称及其xyz坐标, 是常用的xyz格式, 标题行中还给出时刻, 团簇编号以及盒子大小的信息. 对团簇进行后处理的时候需要这些信息.
值得注意的是:
- 这些坐标是原始轨迹中的坐标, 没有经过PBC处理, 所以有些构型看起来好像不正确.
- 不要使用VMD来查看xyz文件, 因为VMD要求每个构型的原子数目是固定的, 但团簇分析结果显然无法满足这个条件, 所以VMD显示出来会有错误. 我一般用JMOL来查看这种文件.
<div class="highlight"><pre style="line-height:125%"><span></span>9
t: 20.000000 cl: <span style="color: #666666">1</span> box: 1.959580 0.000000 0.000000 0.000000 1.959580 0.000000 0.000000 0.000000 1.959580
OW 16.299245 17.166265 7.652279 <span style="color: #008800; font-style: italic">#res:1</span>
HW1 17.101772 16.680043 7.841409 <span style="color: #008800; font-style: italic">#res:1</span>
HW2 16.426502 17.503887 6.765685 <span style="color: #008800; font-style: italic">#res:1</span>
OW 13.016180 17.398965 3.690518 <span style="color: #008800; font-style: italic">#res:3</span>
HW1 13.727098 17.004212 3.185545 <span style="color: #008800; font-style: italic">#res:3</span>
HW2 13.171825 17.110682 4.589906 <span style="color: #008800; font-style: italic">#res:3</span>
OW 13.643707 16.886055 6.493071 <span style="color: #008800; font-style: italic">#res:6</span>
HW1 13.265058 16.279565 7.129487 <span style="color: #008800; font-style: italic">#res:6</span>
HW2 14.567202 16.945082 6.737825 <span style="color: #008800; font-style: italic">#res:6</span>
9
t: 20.000000 cl: <span style="color: #666666">2</span> box: 1.959580 0.000000 0.000000 0.000000 1.959580 0.000000 0.000000 0.000000 1.959580
OW 12.382308 0.610490 3.052188 <span style="color: #008800; font-style: italic">#res:2</span>
HW1 12.713196 -0.143198 3.540757 <span style="color: #008800; font-style: italic">#res:2</span>
HW2 11.454754 0.658140 3.283711 <span style="color: #008800; font-style: italic">#res:2</span>
OW 11.333401 1.919177 0.772347 <span style="color: #008800; font-style: italic">#res:7</span>
HW1 11.814307 1.534172 1.504967 <span style="color: #008800; font-style: italic">#res:7</span>
HW2 11.917793 2.596026 0.430886 <span style="color: #008800; font-style: italic">#res:7</span>
OW 14.576921 0.086829 1.378225 <span style="color: #008800; font-style: italic">#res:8</span>
HW1 13.888679 0.222418 2.029508 <span style="color: #008800; font-style: italic">#res:8</span>
HW2 14.172984 -0.473075 0.715220 <span style="color: #008800; font-style: italic">#res:8</span>
6
t: 20.000000 cl: <span style="color: #666666">3</span> box: 1.959580 0.000000 0.000000 0.000000 1.959580 0.000000 0.000000 0.000000 1.959580
OW 16.912391 17.797985 2.399200 <span style="color: #008800; font-style: italic">#res:4</span>
HW1 17.057798 17.103391 1.756840 <span style="color: #008800; font-style: italic">#res:4</span>
HW2 16.319146 18.408840 1.961993 <span style="color: #008800; font-style: italic">#res:4</span>
OW 16.936209 18.858645 4.993168 <span style="color: #008800; font-style: italic">#res:10</span>
HW1 16.868745 19.803851 4.858015 <span style="color: #008800; font-style: italic">#res:10</span>
HW2 16.985083 18.492818 4.109984 <span style="color: #008800; font-style: italic">#res:10</span>
3
t: 20.000000 cl: <span style="color: #666666">4</span> box: 1.959580 0.000000 0.000000 0.000000 1.959580 0.000000 0.000000 0.000000 1.959580
OW 13.452642 17.216688 0.500806 <span style="color: #008800; font-style: italic">#res:5</span>
HW1 12.790335 17.045542 1.170352 <span style="color: #008800; font-style: italic">#res:5</span>
HW2 12.953310 17.324260 -0.308717 <span style="color: #008800; font-style: italic">#res:5</span>
3
t: 20.000000 cl: <span style="color: #666666">5</span> box: 1.959580 0.000000 0.000000 0.000000 1.959580 0.000000 0.000000 0.000000 1.959580
OW 16.102296 1.618148 3.075337 <span style="color: #008800; font-style: italic">#res:9</span>
HW1 15.229665 1.230862 3.144304 <span style="color: #008800; font-style: italic">#res:9</span>
HW2 16.380787 1.424401 2.180275 <span style="color: #008800; font-style: italic">#res:9</span></pre></div>未完成
- 团簇分析时采用的截断距离只有一个, 这对复杂的体系可能不合适. 可扩展一下, 对不同的原子使用不同的截断值.
- 输出构型使用pdb格式可能更好, 虽然文件大些
- 可换用另外的团簇分析算法