The make_index function simply builds an inverted index: it associates each language with the articles that mention it. It returns a list in which every element is a tuple.
from collections import namedtuple

WikipediaArticle = namedtuple("WikipediaArticle", ["title", "text"])

def make_index(langs, articles):
    result = []
    for lang in langs:
        # Create a generator over the articles whose text contains this lang
        article_gen = (article for article in articles if article.text.find(lang) >= 0)
        result.append((lang, article_gen))
    return result

if __name__ == '__main__':
    articles = [
        WikipediaArticle('1', "Groovy is pretty interesting, and so is Erlang"),
        WikipediaArticle('2', "Scala and Java run on the JVM"),
        WikipediaArticle('3', "Scala is not purely functional"),
        WikipediaArticle('4', "The cool kids like Haskell more than Java"),
        WikipediaArticle('5', "Java is for enterprise developers")
    ]
    langs = ["Scala", "Java", "Groovy", "Haskell", "Erlang"]
    for item in make_index(langs, articles):
        # Consume each generator outside make_index
        print(item[0], list(item[1]))
Running it produces this output:
Scala [WikipediaArticle(title='1', text='Groovy is pretty interesting, and so is Erlang')]
Java [WikipediaArticle(title='1', text='Groovy is pretty interesting, and so is Erlang')]
Groovy [WikipediaArticle(title='1', text='Groovy is pretty interesting, and so is Erlang')]
Haskell [WikipediaArticle(title='1', text='Groovy is pretty interesting, and so is Erlang')]
Erlang [WikipediaArticle(title='1', text='Groovy is pretty interesting, and so is Erlang')]
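This result is clearly wrong: every language gets only the first article. However, with a small change inside the function, I get the following:
# Change result.append((lang, article_gen)) to result.append((lang, list(article_gen)))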
('Scala', [WikipediaArticle(title='2', text='Scala and Java run on the JVM'), WikipediaArticle(title='3', text='Scala is not purely functional')])
('Java', [WikipediaArticle(title='2', text='Scala and Java run on the JVM'), WikipediaArticle(title='4', text='The cool kids like Haskell more than Java'), WikipediaArticle(title='5', text='Java is for enterprise developers')])
('Groovy', [WikipediaArticle(title='1', text='Groovy is pretty interesting, and so is Erlang')])
('Haskell', [WikipediaArticle(title='4', text='The cool kids like Haskell more than Java')])
('Erlang', [WikipediaArticle(title='1', text='Groovy is pretty interesting, and so is Erlang')])
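This seems very strange. Why does it happen? Is there any difference between consuming the generator inside the function and consuming it outside?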
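
For what it's worth, the same behaviour seems to reproduce in a much smaller sketch. In a generator expression only the outermost iterable is evaluated when the generator is created; other names, such as the loop variable, are looked up when the generator is actually consumed. The helper names below (lazy_pairs, bound_pairs, one) are hypothetical and not part of the code above; they only isolate the effect, and passing the value through a function parameter is just one possible way to bind it early.

```python
def lazy_pairs(langs):
    """Pair each lang with a generator that refers to the loop variable."""
    result = []
    for lang in langs:
        # `lang` is not copied here; it is looked up when the generator runs,
        # by which time the loop has already finished.
        gen = (lang for _ in range(1))
        result.append((lang, gen))
    return result


def bound_pairs(langs):
    """Same idea, but each generator gets its own binding via a parameter."""
    def one(current):
        # `current` belongs to this call only, so later iterations cannot change it.
        return (current for _ in range(1))
    return [(lang, one(lang)) for lang in langs]


names = ["Scala", "Java", "Erlang"]
lazy = [(name, list(gen)) for name, gen in lazy_pairs(names)]
bound = [(name, list(gen)) for name, gen in bound_pairs(names)]
print(lazy)   # every generator sees the last value of the loop variable
print(bound)  # every generator sees its own value
```

If this sketch reflects what is going on, then in the original program every generator is filtering on "Erlang" by the time it is consumed outside make_index, and article 1 is the only article that mentions Erlang. Calling list(article_gen) inside the loop consumes each generator while lang still holds the intended value.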