Posts by Lon*_*inh

NLTK: corpus-level BLEU vs sentence-level BLEU score

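I have imported nltk in Python to calculate BLEU scores on Ubuntu. I understand how the sentence-level BLEU score works, but I don't understand how the corpus-level BLEU score works.

Below is my code for the corpus-level BLEU score: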

import nltk

hypothesis = ['This', 'is', 'cat'] 
reference = ['This', 'is', 'a', 'cat']
BLEUscore = nltk.translate.bleu_score.corpus_bleu([reference], [hypothesis], weights = [1])
print(BLEUscore)


For some reason, the code above gives a BLEU score of 0. I was expecting a corpus-level BLEU score of at least 0.5.
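For context, here is a minimal pure-Python sketch of a unigram-only corpus-level BLEU. It assumes the input shape described in the NLTK documentation for `corpus_bleu`: the first argument holds one list of references per hypothesis, so a single reference is nested two levels deep. `corpus_bleu_unigram` is a hypothetical helper written for illustration, not an NLTK function and not NLTK's implementation.

```python
import math
from collections import Counter

def corpus_bleu_unigram(list_of_references, hypotheses):
    """Hypothetical helper: unigram-only corpus BLEU, for illustration only."""
    clipped_total = 0   # clipped unigram matches summed over the corpus
    hyp_len_total = 0   # total hypothesis length over the corpus
    ref_len_total = 0   # total closest-reference length over the corpus
    for references, hypothesis in zip(list_of_references, hypotheses):
        hyp_counts = Counter(hypothesis)
        # Highest count of each token across this hypothesis's references.
        max_ref_counts = Counter()
        for ref in references:
            for tok, cnt in Counter(ref).items():
                max_ref_counts[tok] = max(max_ref_counts[tok], cnt)
        clipped_total += sum(min(c, max_ref_counts[t]) for t, c in hyp_counts.items())
        hyp_len_total += len(hypothesis)
        # Reference length closest to the hypothesis length (ties: shorter one).
        ref_len_total += min(
            (len(ref) for ref in references),
            key=lambda n: (abs(n - len(hypothesis)), n),
        )
    if clipped_total == 0:
        return 0.0
    precision = clipped_total / hyp_len_total
    # Brevity penalty computed once, from the corpus-wide lengths.
    if hyp_len_total > ref_len_total:
        bp = 1.0
    else:
        bp = math.exp(1 - ref_len_total / hyp_len_total)
    return bp * precision

hypothesis = ['This', 'is', 'cat']
reference = ['This', 'is', 'a', 'cat']

# One nesting level too few: each token string is treated as a "reference",
# so the hypothesis words are compared against single characters.
flat = corpus_bleu_unigram([reference], [hypothesis])
# One list of references per hypothesis.
nested = corpus_bleu_unigram([[reference]], [hypothesis])
print(flat, round(nested, 4))  # 0.0 0.7165
```

In this sketch, the counts and lengths are summed over the whole corpus before dividing, rather than averaging per-sentence scores. The flat call returns 0 only because of the nesting, which may be worth checking in the nltk call as well.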

Here is my code for the sentence-level BLEU score:

import nltk

hypothesis = ['This', 'is', 'cat'] 
reference = ['This', 'is', 'a', 'cat']
BLEUscore = nltk.translate.bleu_score.sentence_bleu([reference], hypothesis, weights = [1])
print(BLEUscore)


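Here the sentence-level BLEU score is 0.71, which is what I expect given the brevity penalty and the missing word "a". However, I don't understand how the corpus-level BLEU score works.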
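The 0.71 figure can be reproduced by hand. The sketch below assumes unigram-only BLEU (matching `weights = [1]`) and the standard brevity penalty exp(1 - r/c) when the hypothesis length c does not exceed the reference length r. The variable names are illustrative only.

```python
import math
from collections import Counter

hypothesis = ['This', 'is', 'cat']
reference = ['This', 'is', 'a', 'cat']

# Clipped unigram precision: every hypothesis token appears in the reference.
hyp_counts = Counter(hypothesis)
ref_counts = Counter(reference)
clipped = sum(min(count, ref_counts[tok]) for tok, count in hyp_counts.items())
precision = clipped / len(hypothesis)  # 3 / 3 = 1.0

# Brevity penalty: the hypothesis (3 tokens) is shorter than the reference (4).
c, r = len(hypothesis), len(reference)
brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)  # exp(-1/3)

score = brevity_penalty * precision
print(round(score, 4))  # 0.7165
```

The precision is 1.0 because the missing "a" does not count against precision; the whole drop to about 0.71 comes from the brevity penalty.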

Any help would be appreciated.

python nlp machine-learning nltk bleu

Lon*_*inh


11 votes

2 answers

9182 views