I am trying to measure the similarity of company names, but I am having trouble matching names against their abbreviations. For example:
IBM
The International Business Machines Corporation
I have tried using fuzzywuzzy to measure the similarity:
>>> from fuzzywuzzy import fuzz
>>> fuzz.partial_ratio("IBM","The International Business Machines Corporation")
33
>>> fuzz.partial_ratio("General Electric","GE Company")
20
>>> fuzz.partial_ratio("LTCG Holdings Corp","Long Term Care Group Inc")
39
>>> fuzz.partial_ratio("Young Innovations Inc","YI LLC")
33
Do you know of any technique that would give a higher similarity score for abbreviations like these?
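To make the goal concrete, here is a rough sketch of one possible direction: reduce each name to its initials (ignoring stop words and legal suffixes) and check whether those initials appear as a word in the other name. This is only an illustration under assumptions, not a fuzzywuzzy feature. The helper names (`tokens`, `initials`, `is_abbreviation_match`) and the `STOP_WORDS` / `LEGAL_SUFFIXES` lists are hypothetical and would need tuning on real data.

```python
import re

# Hypothetical word lists for illustration only; real data will need more entries.
STOP_WORDS = {"the", "of", "and", "for"}
LEGAL_SUFFIXES = {"inc", "llc", "corp", "corporation", "company", "co", "ltd", "holdings"}


def tokens(name):
    """Return lowercase word tokens, dropping stop words and legal suffixes."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return [w for w in words if w not in STOP_WORDS and w not in LEGAL_SUFFIXES]


def initials(name):
    """Concatenate the first character of each remaining token."""
    return "".join(w[0] for w in tokens(name))


def is_abbreviation_match(a, b):
    """True if the initials of one name appear as a token in the other name."""
    tokens_a, tokens_b = tokens(a), tokens(b)
    initials_a, initials_b = initials(a), initials(b)
    if not initials_a or not initials_b:
        return False
    return initials_a in tokens_b or initials_b in tokens_a


pairs = [
    ("IBM", "The International Business Machines Corporation"),
    ("General Electric", "GE Company"),
    ("LTCG Holdings Corp", "Long Term Care Group Inc"),
    ("Young Innovations Inc", "YI LLC"),
]
for a, b in pairs:
    print(a, "|", b, "->", is_abbreviation_match(a, b))
```

A boolean like this could be combined with a fuzzy score (for example, treating a match as a strong signal), but it will miss abbreviations that are not plain initials and can produce false positives for very short names.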