There's a piece of Python code I can't understand


text_corpus = [
    "Human machine interface for lab abc computer applications",
    "A survey of user opinion of computer system response time",
    "The EPS user interface management system"
]
stop_list = set('for a of the and to in'.split(' '))
texts = [[word for word in document.lower().split() if word not in stop_list] for document in text_corpus]

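It's mainly the last line. I assume for document in text_corpus runs first, that is, it takes one string at a time out of text_corpus (that string is document) and hands it to the expression in front. What I don't understand is how that long expression in front gets executed.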

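Once document has been obtained, it is first processed with document.lower().split().
Then the elements of document.lower().split() are looped over: an element that is not in stop_list is added to the new list, and one that is in stop_list is skipped.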
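
To make the inner step concrete, here is a minimal sketch that runs it on just the first string from the question. The names `words` and `kept` are only illustrative and do not appear in the original code.

```python
stop_list = set('for a of the and to in'.split(' '))
document = "Human machine interface for lab abc computer applications"

# Step 1: lowercase the string, then split it on whitespace.
words = document.lower().split()
print(words)
# ['human', 'machine', 'interface', 'for', 'lab', 'abc', 'computer', 'applications']

# Step 2: keep only the words that are not in stop_list.
kept = [word for word in words if word not in stop_list]
print(kept)
# ['human', 'machine', 'interface', 'lab', 'abc', 'computer', 'applications']
```

Only 'for' is dropped here, since it is the only stop word in this sentence.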

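[x for x in a] is a list comprehension.
What you have is clearly a nested list comprehension, which works like a double for loop.
It can be written as the following equivalent double for loop: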

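a = []
for document in text_corpus:
    b = []           # word list for the current document
    a.append(b)      # b is appended first, then filled in place below
    for word in document.lower().split():
        if word not in stop_list:
            b.append(word)
# a now holds the same result as texts

texts = []
for document in text_corpus:
    word_list = document.lower().split()   # lowercase, then split on whitespace
    tmp_texts = []
    for word in word_list:
        if word not in stop_list:          # skip stop words
            tmp_texts.append(word)
    texts.append(tmp_texts)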

text_corpus = [
    "Human machine interface for lab abc computer applications",
    "A survey of user opinion of computer system response time",
    "The EPS user interface management system"
]
stop_list = set('for a of the and to in'.split(' '))
texts = [[word for word in document.lower().split() if word not in stop_list] for document in text_corpus]

# The code on line 7 is equivalent to the for loops below
texts1 = []
for document in text_corpus:
    line = []
    for word in document.lower().split():
        if word not in stop_list:
            line.append(word)
    texts1.append(line)

print(texts == texts1)
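
For reference, this is a small self-contained sketch of what the comprehension from the question produces, printed one inner list per document. The loop variable `row` is only illustrative.

```python
text_corpus = [
    "Human machine interface for lab abc computer applications",
    "A survey of user opinion of computer system response time",
    "The EPS user interface management system"
]
stop_list = set('for a of the and to in'.split(' '))

# Outer part: one pass per document. Inner part: one filtered word list per pass.
texts = [[word for word in document.lower().split() if word not in stop_list]
         for document in text_corpus]

for row in texts:
    print(row)
# ['human', 'machine', 'interface', 'lab', 'abc', 'computer', 'applications']
# ['survey', 'user', 'opinion', 'computer', 'system', 'response', 'time']
# ['eps', 'user', 'interface', 'management', 'system']
```

Each inner list lines up with one string in text_corpus, which is why the result has exactly three elements.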