I tried Apple's own example:
import NaturalLanguage
let text = "The American Red Cross was established in Washington, D.C., by Clara Barton."
let tagger = NLTagger(tagSchemes: [.nameType])
tagger.string = text
let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]
let tags: [NLTag] = [.personalName, .placeName, .organizationName]
tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .nameType, options: options) { tag, tokenRange in
    // Get the most likely tag, and print it if it's a named entity.
    if let tag = tag, tags.contains(tag) {
        print("\(text[tokenRange]): \(tag.rawValue)")
    }
    // Get multiple possible tags with their associated confidence scores.
    let (hypotheses, _) = tagger.tagHypotheses(at: tokenRange.lowerBound, unit: .word, scheme: .nameType, maximumCount: 1)
    print(hypotheses)
    return true
}
But it returns every name tag as Other. I also tried another example that tags a sentence with lexical classes, and it likewise tags every word as OtherWord:
let text = "The American Red Cross was established in Washington, D.C., by Clara Barton."
let tagger = NLTagger(tagSchemes: [.lexicalClass])
tagger.string = text
let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]
print("language", tagger.dominantLanguage)
tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange in
    // Print the lexical class assigned to each token.
    if let tag = tag {
        print("\(text[tokenRange]): \(tag.rawValue)")
    }
    return true
}
I also tried the answer to this question, which sets the language orthography explicitly, but it did not help:
//tagger.setOrthography(NSOrthography(dominantScript: "Latn", languageMap: ["Latn": ["en"]]), range: text.startIndex..<text.endIndex)
tagger.setOrthography(NSOrthography.defaultOrthography(forLanguage: "en-US"), range: text.startIndex..<text.endIndex)
Does anyone know why this happens?
By the way, my Xcode version is 14.3, the latest as of today.
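This appears to be a regression in Xcode 14.3. I downloaded Xcode 14.2, and there NLTagger tags correctly with both .nameType and .lexicalClass.
The regression in Xcode 14.3 also affects NLEmbedding. For example, the following code correctly finds word neighbors in Xcode 14.2, but in Xcode 14.3 the embedding comes back nil: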
if let embedding = NLEmbedding.wordEmbedding(for: .english) {
    print("found embedding")
    print("embeddings for family: \(embedding.neighbors(for: "family", maximumCount: 3))")
    print("embeddings for science: \(embedding.neighbors(for: "science", maximumCount: 3))")
} else {
    print("no embedding found")
}