近日遇到一个需求, 要将一些图片按一定要求放到Word文档中. 这些图片格式不统一, 有pdf, 有png, 有jpg. 宽高也各不同, 但要求图片横向放置, 每页一张. 当然, 如果手动操作, 就没有必要记录作法了.
将问题分解一下, 有下列步骤:
xpdf-tools
中的pdftopng
ImageMagick
前两步没有什么要说的.
Word支持的html文件与常规的有所不同, 具体哪些地方不同, 也没有必要仔细研究, 直接将一份简单的Word文件另存为html格式看看就可以知晓了. 我们的需求比较简单, 所以只需要保留其中重要的相关设置, 所得模板如下
<table class="highlighttable"><th colspan="2" style="text-align:left">html</th><tr><td><div class="linenodiv" style="background-color: #f0f0f0; padding-right: 10px"><pre style="line-height:125%"> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20</pre></div></td><td class="code"><div class="highlight"><pre style="line-height:125%"><span></span><<span style="color: #008000; font-weight: bold">style</span>> <span style="color: #AA22FF; font-weight: bold">@page</span> <span style="color: #008000; font-weight: bold">WordSection1</span> { <span style="color: #008000; font-weight: bold">size</span><span style="color: #AA22FF">:297mm</span> <span style="color: #008000; font-weight: bold">210mm</span><span style="color: #666666">;</span> <span style="color: #008000; font-weight: bold">margin</span><span style="color: #666666">:</span> <span style="color: #008000; font-weight: bold">12</span><span style="color: #0000FF">.7mm</span> <span style="color: #008000; font-weight: bold">12</span><span style="color: #0000FF">.7mm</span> <span style="color: #008000; font-weight: bold">12</span><span style="color: #0000FF">.7mm</span> <span style="color: #008000; font-weight: bold">12</span><span style="color: #0000FF">.7mm</span><span style="color: #666666">;</span> <span style="color: #008000; font-weight: bold">mso-page-orientation</span><span style="color: #AA22FF">:landscape</span><span style="color: #666666">;</span> <span style="color: #008000; font-weight: bold">mso-header-margin</span><span style="color: #AA22FF">:42</span><span style="color: #0000FF">.55pt</span><span style="color: #666666">;</span> <span style="color: #008000; font-weight: bold">mso-footer-margin</span><span style="color: #AA22FF">:49</span><span style="color: #0000FF">.6pt</span><span style="color: #666666">;</span> <span style="color: #008000; font-weight: bold">mso-paper-source</span><span style="color: #AA22FF">:0</span><span style="color: #666666">;</span> } <span style="color: #008000; font-weight: bold">div</span><span style="color: #0000FF">.WordSection1</span> {<span style="color: #AA22FF">page</span><span style="color: #666666">:</span>WordSection1;} </<span style="color: #008000; font-weight: bold">style</span>> <<span style="color: #008000; font-weight: bold">body</span>> <<span style="color: #008000; font-weight: bold">div</span> <span style="color: #BB4444">class</span><span style="color: #666666">=</span><span style="color: #BB4444">WordSection1</span> <span style="color: #BB4444">align</span><span style="color: #666666">=</span><span style="color: #BB4444">center</span> <span style="color: #BB4444">style</span><span style="color: #666666">=</span><span style="color: #BB4444">'text-align:center</span><span style="#FF0000">'</span>> <<span style="color: #008000; font-weight: bold">img</span> <span style="color: #BB4444">width</span><span style="color: #666666">=</span><span style="color: #BB4444">996</span> <span style="color: #BB4444">height</span><span style="color: #666666">=</span><span style="color: #BB4444">702</span> <span style="color: #BB4444">src</span><span style="color: #666666">=</span><span style="color: #BB4444">"1.png"</span>> <<span style="color: #008000; font-weight: bold">img</span> <span style="color: #BB4444">width</span><span style="color: #666666">=</span><span style="color: #BB4444">996</span> <span style="color: #BB4444">height</span><span style="color: #666666">=</span><span style="color: #BB4444">702</span> <span style="color: #BB4444">src</span><span style="color: #666666">=</span><span style="color: #BB4444">"2.jpg"</span>> </<span style="color: #008000; font-weight: bold">div</span>> </<span style="color: #008000; font-weight: bold">body</span>></pre></div> </td></tr></table>可以看到, Word可以识别html文件中自定义的页面设置, 其中可以设置页面大小, 页边距, 页面方向等. 这些设置浏览器未必支持. 需要注意的是, 其中的图片大小, 只能以像素为单位, 且默认根据96dpi计算.
使用Word打开这样的html文件, 可以看到图片, 但只是链接. 如果将Word文件转移到其他位置, 图片就无法显示了. 因此, 我们还需要将图片嵌入Word文件中. 对于Word2016, 依次
文件 | 信息 | 相关文档 | 编辑指向文件的链接
然后
全选 | 断开链接 | 将图片保存在文档中 | 确定
.
将上面的步骤总结为下面的脚本, 处理起大量的文件来就更容易了.
<table class="highlighttable"><th colspan="2" style="text-align:left">img2doc.bsh</th><tr><td><div class="linenodiv" style="background-color: #f0f0f0; padding-right: 10px"><pre style="line-height:125%"> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50</pre></div></td><td class="code"><div class="highlight"><pre style="line-height:125%"><span></span><span style="color: #008800; font-style: italic"># 定义程序路径</span> <span style="color: #B8860B">im</span><span style="color: #666666">=</span>/D/ImageMagick-7.1.1-19-Q16-HDRI <span style="color: #B8860B">xpdf</span><span style="color: #666666">=</span>/D/PDFtools/xpdf-tools-4.05/bin <span style="color: #008800; font-style: italic"># 先将pdf转换为相应的png文件</span> <span style="color: #AA22FF; font-weight: bold">for</span> pdf in *.pdf; <span style="color: #AA22FF; font-weight: bold">do</span> <span style="color: #B8860B">name</span><span style="color: #666666">=</span><span style="color: #BB6688; font-weight: bold">${</span><span style="color: #B8860B">pdf</span>%.pdf<span style="color: #BB6688; font-weight: bold">}</span> <span style="color: #B8860B">$xpdf</span>/pdftopng.exe <span style="color: #B8860B">$pdf</span> <span style="color: #B8860B">$name</span> mv <span style="color: #B8860B">$name</span>-000001.png <span style="color: #B8860B">$name</span>.png <span style="color: #AA22FF; font-weight: bold">done</span> <span style="color: #008800; font-style: italic"># 根据宽高旋转图片</span> mkdir -p backup <span style="color: #AA22FF; font-weight: bold">for</span> img in *.jpg *.png; <span style="color: #AA22FF; font-weight: bold">do</span> <span style="color: #AA22FF">eval</span> <span style="color: #AA22FF; font-weight: bold">$(</span><span style="color: #B8860B">$im</span>/identify.exe -ping -format <span style="color: #BB4444">'</span><span style="color: #B8860B">w</span><span style="color: #666666">=</span>%w; <span style="color: #B8860B">h</span><span style="color: #666666">=</span>%h;<span style="color: #BB4444">'</span> <span style="color: #B8860B">$img</span><span style="color: #AA22FF; font-weight: bold">)</span> <span style="color: #008800; font-style: italic"># 获取宽高</span> <span style="color: #666666">[[</span> <span style="color: #B8860B">$h</span> > <span style="color: #B8860B">$w</span> <span style="color: #666666">]]</span> <span style="color: #666666">&&</span> <span style="color: #666666">{</span> cp <span style="color: #B8860B">$img</span> ./backup/<span style="color: #B8860B">$img</span> <span style="color: #B8860B">$im</span>/convert.exe -rotate -90 ./backup/<span style="color: #B8860B">$img</span> <span style="color: #B8860B">$img</span> <span style="color: #666666">}</span> <span style="color: #AA22FF; font-weight: bold">done</span> <span style="color: #008800; font-style: italic"># 输出图片及其编号</span> rm -f _img <span style="color: #AA22FF; font-weight: bold">for</span> img in *.jpg *.png; <span style="color: #AA22FF; font-weight: bold">do</span> <span style="color: #B8860B">idx</span><span style="color: #666666">=</span><span style="color: #BB6688; font-weight: bold">${</span><span style="color: #B8860B">img</span>##*_<span style="color: #BB6688; font-weight: bold">}</span>; <span style="color: #B8860B">idx</span><span style="color: #666666">=</span><span style="color: #BB6688; font-weight: bold">${</span><span style="color: #B8860B">idx</span>%.*<span style="color: #BB6688; font-weight: bold">}</span> <span style="color: #AA22FF">echo</span> <span style="color: #B8860B">$idx</span> <span style="color: #B8860B">$img</span> >>_img <span style="color: #AA22FF; font-weight: bold">done</span> <span style="color: #008800; font-style: italic"># 排序图片并输出到html文件</span> sort -n _img | awk <span style="color: #BB4444">'</span> BEGIN <span style="color: #666666">{</span> print <span style="color: #BB4444">"<style>"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\n@page WordSection1 {"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\n size:297mm 210mm;"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\n margin: 12.7mm 12.7mm 12.7mm 12.7mm;"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\n mso-page-orientation:landscape;"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\n mso-header-margin:42.55pt;"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\n mso-footer-margin:49.6pt;"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\n mso-paper-source:0;}"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\ndiv.WordSection1 {page:WordSection1;}"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\n</style>"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\n"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\n<body>"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\n<div class=WordSection1 align=center style=\"text-align:center\">"</span> <span style="color: #BB6622; font-weight: bold">\</span> <span style="color: #BB4444">"\n"</span> <span style="color: #666666">}</span> <span style="color: #666666">{</span> print <span style="color: #BB4444">"<img width=996 height=702 src=\""</span><span style="color: #B8860B">$2</span><span style="color: #BB4444">"\">"</span> <span style="color: #666666">}</span> END <span style="color: #666666">{</span>print <span style="color: #BB4444">"</div></body>"</span><span style="color: #666666">}</span> <span style="color: #BB4444">'</span> >_.html</pre></div> </td></tr></table>