Is it possible to convert a doc file to HTML using PHP?

I am creating a website on which authors can create EPUB files. Users will upload their books in .doc format, and I need to create an EPUB file from each upload. A single doc file will contain multiple chapters, so I need to parse the doc file and split it into chapters. Authors will use Heading 1 for their chapter titles.

Is there any way in PHP to convert doc files to HTML and split the result into chapters at each Heading 1, so that I can create the EPUB file?

After some research, I found a Linux application, but I think it only converts doc to plain text, so I would not be able to split the chapters.

Please suggest a solution if you have one. Thanks in advance.

You can achieve this using the PHPDOCX API.

First, try generating XHTML from your Word document with the TransformDoc class (see its function reference).

Something like this:

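// Load the PHPDOCX transformation class (adjust the path to your install)
require_once '../../classes/TransformDoc.inc';

$document = new TransformDoc();
// Path to the Word document (note that this example uses a .docx file)
$document->setStrFile('../files/Text.docx');
// Convert the document to XHTML, then validate the generated markup
$document->generateXHTML();
$document->validatorXHTML();
// Output the resulting XHTML string
echo $document->getStrXHTML();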

Once you have the XHTML content, you can process it further, for example by splitting it into chapters or removing some of them.
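For the chapter split itself, here is a minimal sketch using PHP's built-in DOM extension rather than anything from PHPDOCX. The helper name splitIntoChapters() is hypothetical. The sketch assumes that Heading 1 paragraphs come out as <h1> elements sitting directly under <body>, which you should check against the XHTML your converter actually produces. Anything before the first <h1> (front matter, for instance) is dropped.

```php
<?php
// Hypothetical helper (not part of PHPDOCX): split an (X)HTML string into
// chapters, starting a new chapter at every top-level <h1>.
function splitIntoChapters(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings instead of printing them; real-world markup
    // from converters is rarely perfectly clean.
    libxml_use_internal_errors(true);
    // The XML declaration prefix is a common way to make loadHTML treat
    // the input as UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();

    $chapters = [];
    $current = null;
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $chapters;
    }

    foreach ($body->childNodes as $node) {
        $isHeading = $node->nodeType === XML_ELEMENT_NODE
            && strtolower($node->nodeName) === 'h1';
        if ($isHeading) {
            // Close the previous chapter, if any, and open a new one.
            if ($current !== null) {
                $chapters[] = $current;
            }
            $current = ['title' => trim($node->textContent), 'html' => ''];
            continue;
        }
        // Append everything after a heading to the chapter it belongs to.
        if ($current !== null) {
            $current['html'] .= $dom->saveHTML($node);
        }
    }
    if ($current !== null) {
        $chapters[] = $current;
    }
    return $chapters;
}
```

You could pass the string returned by getStrXHTML() to this function and then write each entry's title and html into its own EPUB content document. If the converter wraps the body content in a container element, the loop would need to walk that container's children instead.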

Complete documentation can be found here.