Can an algorithm detect sarcasm?

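I have been asked to write an algorithm to detect sarcasm, but I have run into a flaw (or what seems to be one) in the logic.

For example, suppose two people say:

A: I love Justin Bieber. Do you like him?

B: Yes. Of course. I absolutely love him.

B's reply may or may not be sarcastic, and the only way to tell seems to be knowing whether B is serious.

(I am not supposed to go this deep. We were given a bunch of phrases and simply told that if any of them appears in a sentence, the sentence is sarcastic. But I got curious.)

Is there any way to work around this? Or are computers completely stuck when it comes to sarcasm?

(I suppose it depends on the speaker's tone of voice, but my input is text.)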
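
The phrase-list approach described above can be sketched roughly as follows. This is only a minimal illustration: the phrases in `SARCASM_PHRASES` and the function name `looks_sarcastic` are hypothetical, since the actual list from the assignment is not shown.

```python
# Hypothetical marker phrases (assumption); the assignment's real list is not shown.
SARCASM_PHRASES = ["yeah, right", "oh great", "as if"]


def looks_sarcastic(sentence, phrases=SARCASM_PHRASES):
    """Return True if any marker phrase occurs in the sentence (case-insensitive)."""
    text = sentence.lower()
    return any(phrase in text for phrase in phrases)


# A sentence containing a listed phrase is flagged.
print(looks_sarcastic("Oh great, another Monday."))

# B's reply contains none of the phrases, so it is not flagged,
# whether or not B actually meant it sarcastically.
print(looks_sarcastic("Yes. Of course. I absolutely love him."))
```

The second call shows the flaw in question: plain substring matching has no access to the speaker's intent, so a reply like B's is classified the same way whether it is sincere or not.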

algorithm nlp

cjd*_*jds

36
Votes

2
Answers

9641
Views

Python NLP stop words

I am doing some research on NLP with Python and I have found something strange.

Consider the following negative tweets:

neg_tweets = [('I do not like this car', 'negative'),
              ('This view is horrible', 'negative'),
              ('I feel tired this morning', 'negative'),
              ('I am not looking forward to the concert', 'negative'),  # <--- the tweet in question
              ('He is my enemy', 'negative')]

I then process them by removing the stop words:

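from nltk.corpus import stopwords

clean_data = []
stop_words = set(stopwords.words("english"))

# pos_tweets is not shown; presumably it has the same (text, label) structure as neg_tweets
for (words, sentiment) in pos_tweets + neg_tweets:
    # the stop-word test runs before lower(), so a capitalized "I" is kept
    words_filtered = [e.lower() for e in words.split() if e not in stop_words]
    clean_data.append((words_filtered, sentiment))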

Part of the output is:

 (['i', 'looking', 'forward', 'concert'], 'negative')

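I am struggling to understand why the stop words include "not", since removing it changes the sentiment of the tweet.

My understanding is that stop words carry no value in terms of sentiment.

So my question is: why is "not" included in the stop word list?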
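
For reference, one possible way to keep negations is to subtract them from the stop word set before filtering. The sketch below is only an illustration under stated assumptions: `STOP_WORDS` is a tiny hand-written stand-in (not NLTK's actual list) so the snippet runs without downloading the corpus, and `NEGATIONS` and `remove_stop_words` are hypothetical names.

```python
# Stand-in stop list (assumption): a tiny subset for illustration, not NLTK's actual list.
STOP_WORDS = {"i", "am", "not", "to", "the", "do", "this", "is", "my", "he"}

# Hypothetical set of negation words that should survive filtering.
NEGATIONS = {"not", "no", "nor", "never"}


def remove_stop_words(text, stop_words):
    """Lowercase each token first, then drop the ones found in stop_words."""
    tokens = [word.lower() for word in text.split()]
    return [word for word in tokens if word not in stop_words]


# Set difference keeps every stop word except the negations.
sentiment_stop_words = STOP_WORDS - NEGATIONS

print(remove_stop_words("I am not looking forward to the concert", sentiment_stop_words))
# ['not', 'looking', 'forward', 'concert']
```

With NLTK itself, the same idea would be `set(stopwords.words("english")) - NEGATIONS`. Lowercasing before the membership test also removes the leading "i" that appears in the output above.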

python text analysis nltk

And*_*aly

5
Votes

1
Answers

1259
Views
