LDA Mallet raises CalledProcessError. How can I fix this?

Here is the code I ran:

import os
from gensim.models.wrappers import LdaMallet
os.environ['MALLET_HOME'] = 'C:\\Users\\Pying\\Downloads\\mallet\\mallet-2.0.8'
mallet_path = 'C:\\Users\\Pying\\Downloads\\mallet\\mallet-2.0.8\\\bin\\mallet'
ldamallet = gensim.models.wrappers.LdaMallet(mallet_path, corpus=corpus, num_topics=20, id2word=id2word)

The error is as follows:

---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
<ipython-input-33-fc011df5327b> in <module>
      4 os.environ['MALLET_HOME'] = 'C:\\Users\\Pying\\Downloads\\mallet\\mallet-2.0.8'
      5 mallet_path = 'C:\\Users\\Pying\\Downloads\\mallet\\mallet-2.0.8\\\bin\\mallet'
----> 6 ldamallet = gensim.models.wrappers.LdaMallet(mallet_path, corpus=corpus, num_topics=20, id2word=id2word)

D:\lib\site-packages\gensim\models\wrappers\ldamallet.py in __init__(self, mallet_path, corpus, num_topics, alpha, id2word, workers, prefix, optimize_interval, iterations, topic_threshold, random_seed)
    129         self.random_seed = random_seed
    130         if corpus is not None:
--> 131             self.train(corpus)
    132 
    133     def finferencer(self):

D:\lib\site-packages\gensim\models\wrappers\ldamallet.py in train(self, corpus)
    270 
    271         """
--> 272         self.convert_input(corpus, infer=False)
    273         cmd = self.mallet_path + ' train-topics --input %s --num-topics %s  --alpha %s --optimize-interval %s '\
    274             '--num-threads %s --output-state %s --output-doc-topics %s --output-topic-keys %s '\

D:\lib\site-packages\gensim\models\wrappers\ldamallet.py in convert_input(self, corpus, infer, serialize_corpus)
    259             cmd = cmd % (self.fcorpustxt(), self.fcorpusmallet())
    260         logger.info("converting temporary corpus to MALLET format with %s", cmd)
--> 261         check_output(args=cmd, shell=True)
    262 
    263     def train(self, corpus):

D:\lib\site-packages\gensim\utils.py in check_output(stdout, *popenargs, **kwargs)
   1930             error = subprocess.CalledProcessError(retcode, cmd)
   1931             error.output = output
-> 1932             raise error
   1933         return output
   1934     except KeyboardInterrupt:

CalledProcessError: Command 'C:\Users\Pying\Downloads\mallet\mallet-2.0.8in\mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input C:\Users\Pying\AppData\Local\Temp\a9b5c9_corpus.txt --output C:\Users\Pying\AppData\Local\Temp\a9b5c9_corpus.mallet' returned non-zero exit status 1.

Did you manage to solve this?

Check whether your Java environment is configured correctly. You may need to install the JDK.
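A quick way to run that check from the same Python session is sketched below. It only tests whether a `java` executable is visible on PATH and prints its version banner; the helper name `java_status` is made up for this illustration and is not part of gensim or MALLET.

```python
import shutil
import subprocess


def java_status():
    """Return (path, version_text); both are None when java is not on PATH."""
    java = shutil.which("java")
    if java is None:
        return None, None
    # `java -version` writes its banner to stderr rather than stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return java, (result.stderr or result.stdout).strip()


path, version = java_status()
if path is None:
    print("java was not found on PATH; install a JDK or fix PATH first")
else:
    print("java found at", path)
    print(version)
```

If this reports that Java is present, the Java setup is probably not what makes the command fail, and the MALLET path itself is the next thing to look at.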

Try searching for it on Baidu.

Hi, I'm running into the same problem. Is there a solution? I reconfigured the JDK and its environment variables, but that didn't help...
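Since reconfiguring the JDK did not help, one more thing worth checking is the path string in the original code. The command in the error message reads `mallet-2.0.8in\mallet`, not `mallet-2.0.8\bin\mallet`. In the question, `mallet_path` is written with three backslashes before `bin`; Python reads `\\` as one backslash and then `\b` as a backspace character, so the `bin` directory name is lost. The sketch below shows the effect and one way to write the path with raw strings. I can't confirm this is the only cause on your machine, and the directory is simply the one from the question.

```python
# The string exactly as written in the question: "\\" is one backslash,
# and the following "\b" is a backspace character (\x08), not "\" + "b".
as_written = 'C:\\Users\\Pying\\Downloads\\mallet\\mallet-2.0.8\\\bin\\mallet'
print(repr(as_written))        # the repr shows \x08 where "\b" was typed
print('\x08' in as_written)    # True

# Raw strings keep every backslash literally, so "\bin" survives.
mallet_home = r'C:\Users\Pying\Downloads\mallet\mallet-2.0.8'
mallet_path = mallet_home + r'\bin\mallet'
print(repr(mallet_path))
print('\x08' in mallet_path)   # False
```

With the path built this way, `os.environ['MALLET_HOME'] = mallet_home` and the `LdaMallet(mallet_path, ...)` call from the question can stay as they are.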