How can I batch-extract the text from a bunch of pptx files into a single txt file?
from pptx import Presentation
import os

# Raw string so the backslash in the Windows path is not treated as an escape
for dirpath, dirnames, filenames in os.walk(r'D:\新建文件夹 (5)'):
    for filename in filenames:
        prs = Presentation(os.path.join(dirpath, filename))
        # Collect the text of every run in this presentation
        text_runs = []
        for slide in prs.slides:
            for shape in slide.shapes:
                if not shape.has_text_frame:
                    continue
                for paragraph in shape.text_frame.paragraphs:
                    for run in paragraph.runs:
                        text_runs.append(run.text)
with open('2.txt', 'w', encoding='utf-8') as f:
    f.write('\n'.join(text_runs))
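This only seems to write the text from one file. What do I need to add? I saw a video tutorial someone posted a while back, but it cut off halfway through..
Can anyone help me out?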
with open('2.txt', 'a', encoding='utf-8') as f:
    f.write('\n'.join(text_runs))
Change the write mode to 'a' (append), then move these two lines inside the for loop and give it a try.
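# This goes inside the `for filename in filenames:` loop, after prs is created
text_runs = []
for slide in prs.slides:
    for shape in slide.shapes:
        if not shape.has_text_frame:
            continue
        for paragraph in shape.text_frame.paragraphs:
            for run in paragraph.runs:
                text_runs.append(run.text)
# 'a' appends, so each file's text is added after the previous one
with open('2.txt', 'a', encoding='utf-8') as f:
    f.write('\n'.join(text_runs))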
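For reference, here is one way the whole thing might look when put together. It is only a sketch: the function names `extract_pptx_text` and `dump_folder` are made up for illustration, and the `.pptx` / `~$` filter is my own assumption about what is in the folder. Instead of reopening 2.txt in 'a' mode for every file, it opens the output once with 'w' before the loops, which also avoids piling up duplicates if you run the script twice. It writes a trailing newline after each file too, so the last line of one presentation does not run into the first line of the next.

```python
import os


def extract_pptx_text(path):
    """Return the text of every run in one .pptx file, in slide order."""
    # python-pptx is imported here so the rest of the script loads without it
    from pptx import Presentation

    prs = Presentation(path)
    text_runs = []
    for slide in prs.slides:
        for shape in slide.shapes:
            if not shape.has_text_frame:
                continue
            for paragraph in shape.text_frame.paragraphs:
                for run in paragraph.runs:
                    text_runs.append(run.text)
    return text_runs


def dump_folder(root, out_path, extract=extract_pptx_text):
    """Write the text of every .pptx under root to out_path; return the file count."""
    count = 0
    # Open the output once, outside the loops, so no file overwrites another
    with open(out_path, 'w', encoding='utf-8') as f:
        for dirpath, dirnames, filenames in os.walk(root):
            for filename in sorted(filenames):
                # Assumption: skip anything that is not a .pptx, plus the
                # "~$" lock files PowerPoint leaves next to open documents
                if filename.startswith('~$') or not filename.lower().endswith('.pptx'):
                    continue
                text_runs = extract(os.path.join(dirpath, filename))
                # Trailing newline keeps consecutive files on separate lines
                f.write('\n'.join(text_runs) + '\n')
                count += 1
    return count
```

With that in place, something like `dump_folder(r'D:\新建文件夹 (5)', '2.txt')` should collect everything into one file. Note that this still only picks up text in ordinary text frames, same as the original loop; text inside tables or grouped shapes would need extra handling.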