Post by use*_*005

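How to extract location names, country names, city names, and tourist places using NLP or spaCy in Python

I am trying to extract location names, country/region names, city names, and tourist places from a .txt file using NLP or the spaCy library in Python.

I have tried the following: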

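import spacy
# 'en' is the spaCy 2.x shortcut; spaCy 3.x needs a full name such as 'en_core_web_sm'
en = spacy.load('en')

with open('subtitle.txt') as f:
    sents = en(f.read())
# Collects every entity, whatever its label
place = [ee for ee in sents.ents]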


This is the output I get:

[1, 
, three, London, 
, 
, 
, 
, first, 
, 
, 00:00:20,520, 
, 
, London, the

4
00:00:20,520, 00:00:26,130
, Buckingham Palace, 
,

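The cue number and timestamps in that output suggest that subtitle.txt is an SRT-style subtitle file, which is an assumption, since the file itself is not shown. If so, stripping the cue numbers and timing lines before running any NER step should cut down on the numeric and newline entities. A minimal sketch, where `strip_srt_markup` and `TIMING` are illustrative names and not part of spaCy or NLTK:

```python
import re

# Matches an SRT timing line such as "00:00:20,520 --> 00:00:26,130".
# Assumption: the file follows the usual SRT layout (cue number, timing line, text).
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}")


def strip_srt_markup(raw):
    """Drop cue numbers, timing lines and blank lines; join the remaining dialogue."""
    kept = []
    for line in raw.splitlines():
        line = line.strip()
        # Skip empty lines, bare cue numbers and timing lines.
        if not line or line.isdigit() or TIMING.match(line):
            continue
        kept.append(line)
    return " ".join(kept)
```

The cleaned string can then be passed to the NER pipeline instead of the raw file contents. Note that a dialogue line consisting only of digits would also be dropped by this sketch.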

I only want location names, country/region names, city names, and any places within a city.

I also tried using NLTK:
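One way to narrow the result to places is to filter the entities by label. spaCy's English models tag countries, cities and states as `GPE`, other locations as `LOC`, and buildings or similar facilities as `FAC`. The sketch below assumes the `en_core_web_sm` model is installed; `filter_places` and `places_in_file` are illustrative helper names, not spaCy functions.

```python
# Entity labels that usually cover places in spaCy's English models.
PLACE_LABELS = {"GPE", "LOC", "FAC"}


def filter_places(ents, labels=PLACE_LABELS):
    """Return unique, stripped texts of entities whose label is in `labels`."""
    places = []
    for ent in ents:
        text = ent.text.strip()
        # Skip whitespace-only entities and duplicates.
        if ent.label_ in labels and text and text not in places:
            places.append(text)
    return places


def places_in_file(path, model_name="en_core_web_sm"):
    """Run spaCy over a text file and keep only place-like entities.

    Assumption: the named model has been installed beforehand.
    """
    import spacy  # imported here so filter_places works without spaCy installed

    nlp = spacy.load(model_name)
    with open(path, encoding="utf-8") as f:
        doc = nlp(f.read())
    return filter_places(doc.ents)
```

Calling `places_in_file('subtitle.txt')` would then return a list of strings. What it actually contains depends on the model and the text, and tourist sites may come out as `FAC`, `LOC`, or occasionally `ORG`, so the label set may need adjusting.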

import nltk
nltk.download('maxent_ne_chunker')
nltk.download('words')
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('stopwords')

with open('subtitle.txt', 'r') as f:
    sample = f.read()


sentences = nltk.sent_tokenize(sample)
tokenized_sentences = [nltk.word_tokenize(sentence) for sentence in sentences]
tagged_sentences = [nltk.pos_tag(sentence) …

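The snippet above is cut off in the post and is left as it was. As a separate, hedged sketch of how an NLTK pipeline could be narrowed to places: `nltk.ne_chunk` returns a tree whose named-entity chunks carry labels such as `GPE`, `LOCATION` and `FACILITY`, so the place names can be collected by walking the subtrees. `places_from_tree` and `places_in_text` are illustrative names. The sketch assumes the resources downloaded in the question are available; recent NLTK releases may ask for differently named resources.

```python
# Chunk labels from nltk.ne_chunk that usually correspond to places.
PLACE_LABELS = {"GPE", "LOCATION", "FACILITY"}


def places_from_tree(tree, labels=PLACE_LABELS):
    """Collect the text of every chunk in an ne_chunk tree that has a place label."""
    places = []
    for subtree in tree.subtrees():
        if subtree.label() in labels:
            # leaves() yields (token, pos_tag) pairs; keep only the tokens.
            name = " ".join(token for token, _tag in subtree.leaves())
            if name not in places:
                places.append(name)
    return places


def places_in_text(text):
    """Tokenize, tag and chunk each sentence, then keep place-like chunks."""
    import nltk  # imported here so places_from_tree works without NLTK installed

    places = []
    for sentence in nltk.sent_tokenize(text):
        tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
        for name in places_from_tree(nltk.ne_chunk(tagged)):
            if name not in places:
                places.append(name)
    return places
```

With the file contents read into `sample` as in the question, `places_in_text(sample)` would return the place names as a list of strings. NLTK's default chunker leans heavily on capitalization, so results on subtitle text may be noisy.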

nlp machine-learning stanford-nlp python-3.x spacy

use*_*005

2018-10-09

6
votes

1
answer

9783
views