Writing content with the python-docx library raises an error, and the message suggests an encoding problem. How can I fix this?

from pptx import Presentation
from docx import Document

transfer = ""
"""Read the content"""
prs = Presentation(r"../PPT/04.pptx")
# Iterate over the slides in the presentation
for i, slide in enumerate(prs.slides):
    # Only read the first 30 slides
    if i < 30:
        for shape in slide.shapes:
            # Check whether the shape has a text frame
            if shape.has_text_frame:
                text_frame = shape.text_frame
                text = text_frame.text
                transfer += text
                # print(text)
print(transfer)
"""Write the content"""
doc = Document(r"D:\Python Project\python练习\word.docx")
paragraph = doc.add_paragraph()  # Adds an empty paragraph; the text is appended to it below
# paragraph.add_run(transfer)
# paragraph.add_run(U"%s" % transfer)
paragraph.add_run(u"{}".format(transfer))
doc.save(r"D:\Python Project\python练习\word.docx")

Error output:

Traceback (most recent call last):
  File "D:/Python Project/python练习/自制PY/PPT提取内容到Word.py", line 23, in <module>
    paragraph.add_run(transfer)
  File "D:\Python 3\lib\site-packages\python_docx-0.8.11-py3.8.egg\docx\text\paragraph.py", line 37, in add_run
    run.text = text
  File "D:\Python 3\lib\site-packages\python_docx-0.8.11-py3.8.egg\docx\text\run.py", line 163, in text
    self._r.text = text
  File "D:\Python 3\lib\site-packages\python_docx-0.8.11-py3.8.egg\docx\oxml\text\run.py", line 104, in text
    _RunContentAppender.append_to_run_from_text(self, text)
  File "D:\Python 3\lib\site-packages\python_docx-0.8.11-py3.8.egg\docx\oxml\text\run.py", line 134, in append_to_run_from_text
    appender.add_text(text)
  File "D:\Python 3\lib\site-packages\python_docx-0.8.11-py3.8.egg\docx\oxml\text\run.py", line 142, in add_text
    self.add_char(char)
  File "D:\Python 3\lib\site-packages\python_docx-0.8.11-py3.8.egg\docx\oxml\text\run.py", line 157, in add_char
    self.flush()
  File "D:\Python 3\lib\site-packages\python_docx-0.8.11-py3.8.egg\docx\oxml\text\run.py", line 165, in flush
    self._r.add_t(text)
  File "D:\Python 3\lib\site-packages\python_docx-0.8.11-py3.8.egg\docx\oxml\text\run.py", line 41, in add_t
    t = self._add_t(text=text)
  File "D:\Python 3\lib\site-packages\python_docx-0.8.11-py3.8.egg\docx\oxml\xmlchemy.py", line 273, in _add_child
    setattr(child, key, value)
  File "src\lxml\etree.pyx", line 1024, in lxml.etree._Element.text.__set__
  File "src\lxml\apihelpers.pxi", line 747, in lxml.etree._setNodeText
  File "src\lxml\apihelpers.pxi", line 735, in lxml.etree._createTextNode
  File "src\lxml\apihelpers.pxi", line 1540, in lxml.etree._utf8
ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters
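
The last line of the traceback complains about NULL bytes or control characters, so it may help to check whether the extracted text contains any. Below is a minimal diagnostic sketch using only the standard library. The helper name `find_control_chars` is hypothetical, and the check only covers characters below U+0020 other than tab, line feed and carriage return.

```python
def find_control_chars(text):
    """Return (index, repr) pairs for C0 control characters in text.

    Tab, line feed and carriage return are skipped because XML allows them.
    """
    allowed = "\t\n\r"
    return [
        (i, repr(ch))
        for i, ch in enumerate(text)
        if ord(ch) < 0x20 and ch not in allowed
    ]


if __name__ == "__main__":
    # Hypothetical sample: a vertical tab and a NULL byte mixed into the text
    sample = "ab\x0bcd\x00"
    for index, char in find_control_chars(sample):
        print(index, char)
```

Calling `find_control_chars(transfer)` before the `add_run` call should show whether such characters are present in the text taken from the slides.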

Problem code:

# paragraph.add_run(transfer)
# paragraph.add_run(U"%s" % transfer)
paragraph.add_run(u"{}".format(transfer))

 

Try adding this as the very first line of the script:

# -*- coding: utf-8 -*-
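
If the coding line does not help, note that the ValueError itself points at NULL bytes or control characters rather than at the source file encoding. One likely source, as far as I know, is that python-pptx represents a soft line break inside a paragraph as a vertical tab ("\v") in `text_frame.text`, which XML text nodes do not accept. A minimal sketch of a cleanup step follows. The name `clean_for_xml` and the choice to turn "\v" into "\n" are assumptions for illustration, not part of either library.

```python
import re

# Characters that XML 1.0 does not allow in text: C0 controls other than
# tab, line feed and carriage return, plus surrogates and U+FFFE / U+FFFF.
_XML_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]")


def clean_for_xml(text, line_break="\n"):
    """Return text with XML-incompatible characters removed.

    Vertical tabs are first replaced by line_break (an assumption: they are
    treated as soft line breaks); everything else that is illegal is dropped.
    """
    text = text.replace("\v", line_break)
    return _XML_ILLEGAL.sub("", text)


if __name__ == "__main__":
    # Hypothetical sample text containing a vertical tab and a NULL byte
    print(repr(clean_for_xml("a\vb\x00c")))
```

In the script from the question this would be used as `paragraph.add_run(clean_for_xml(transfer))`. Tabs and newlines are left alone on purpose, since python-docx handles them itself when setting run text.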

For reference: python生成报告标签_u014210048的专栏-CSDN博客
