Separator between keywords in PDF metadata

5 pdf spaces comma keyword separator

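I cannot find any "official" documentation on whether keywords and keyword phrases in PDF file metadata should be separated by commas or by spaces.

The following example shows the difference: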

  • keyword, keyword phrase, another keyword phrase
  • keyword keyword phrase another keyword phrase

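Are there any high-quality references?

The online resources I have found are of low quality. For example, an Adobe press web page says "keywords must be separated by commas or semicolons", but its example shows a semicolon before the first keyword and a semicolon between each pair of adjacent keywords. The example contains no keyword phrases.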

Mit*_*tch 6

The keywords metadata field is a single text field - not a list. You can choose whatever is visually pleasing to you. The search engine which operates on the keyword data may have other preferences, but I would imagine that either comma or semicolon would work with most modern search engines.

Reference: PDF 32000-1:2008 on page 550

ExifTool, for example, splits the value on commas; if it finds no comma, it splits on whitespace instead:

# separate tokens in comma or whitespace delimited lists
my @values = ($val =~ /,/) ? split /,+\s*/, $val : split ' ', $val;
foreach $val (@values) {
    $et->FoundTag($tagInfo, $val);
}
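To see what that rule means for keyword phrases, here is a rough Python port of the splitting logic in the snippet above. It is only a sketch: `split_keywords` is a hypothetical name, not part of ExifTool, and it reproduces just the two `split` calls shown, not the rest of ExifTool's processing.

```python
import re


def split_keywords(val: str) -> list[str]:
    """Approximate the quoted Perl snippet (hypothetical helper, not ExifTool API)."""
    if "," in val:
        # Same pattern as the Perl code: one or more commas, then optional whitespace.
        parts = re.split(r",+\s*", val)
        # Perl's split drops trailing empty fields; Python's re.split keeps them.
        while parts and parts[-1] == "":
            parts.pop()
        return parts
    # No comma at all: fall back to splitting on runs of whitespace.
    return val.split()


print(split_keywords("keyword, keyword phrase, another keyword phrase"))
print(split_keywords("keyword keyword phrase another keyword phrase"))
print(split_keywords("keyword; keyword phrase"))
```

Under this logic, multi-word phrases survive only in the comma-separated form. The space-separated string is broken into single words, and a semicolon-separated string with no comma is also split on whitespace, leaving the semicolon attached to the preceding word.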