Posts by Gun*_*jan

Updating the feature names in scikit-learn's TfidfVectorizer

I am trying out this code:

from sklearn.feature_extraction.text import TfidfVectorizer
import numpy as np

train_data = ["football is the sport","gravity is the movie", "education is imporatant"]
vectorizer = TfidfVectorizer(sublinear_tf=True, max_df=0.5,
                             stop_words='english')

print "Applying first train data"
X_train = vectorizer.fit_transform(train_data)
print vectorizer.get_feature_names()

print "\n\nApplying second train data"
train_data = ["cricket", "Transformers is a film","AIMS is a college"]
X_train = vectorizer.transform(train_data)
print vectorizer.get_feature_names()

print "\n\nApplying fit transform onto second train data"
X_train = vectorizer.fit_transform(train_data)
print vectorizer.get_feature_names()

The output of this is:

Applying first train data
[u'education', u'football', u'gravity', u'imporatant', u'movie', u'sport'] …
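
For readers comparing the three calls in the snippet: in scikit-learn, `transform` reuses the vocabulary learned by the most recent `fit` or `fit_transform`, while `fit_transform` learns a new one. The sketch below illustrates only that contract with a toy, pure-Python stand-in. `ToyVectorizer` and `STOP_WORDS` are hypothetical names made up for this example, it counts raw terms rather than computing TF-IDF weights, and it is not how scikit-learn is implemented.

```python
# Toy stand-in for the fit/transform contract; NOT scikit-learn's implementation.
STOP_WORDS = {"is", "the", "a"}  # tiny hypothetical stop list for this example


class ToyVectorizer:
    """Learns a vocabulary in fit() and only reuses it in transform()."""

    def __init__(self):
        self.vocabulary = []

    def _tokens(self, doc):
        # Lowercase, split on whitespace, drop stop words.
        return [t for t in doc.lower().split() if t not in STOP_WORDS]

    def fit(self, docs):
        # Fitting discards any previous vocabulary and learns a new one.
        self.vocabulary = sorted({t for doc in docs for t in self._tokens(doc)})
        return self

    def transform(self, docs):
        # Transforming never changes the vocabulary; unseen tokens are dropped.
        index = {term: i for i, term in enumerate(self.vocabulary)}
        rows = []
        for doc in docs:
            row = [0] * len(self.vocabulary)
            for t in self._tokens(doc):
                if t in index:
                    row[index[t]] += 1
            rows.append(row)
        return rows

    def fit_transform(self, docs):
        return self.fit(docs).transform(docs)


first = ["football is the sport", "gravity is the movie", "education is imporatant"]
second = ["cricket", "Transformers is a film", "AIMS is a college"]

vec = ToyVectorizer()
vec.fit_transform(first)
vocab_after_fit = list(vec.vocabulary)

rows = vec.transform(second)           # vocabulary stays the same
vocab_after_transform = list(vec.vocabulary)

vec.fit_transform(second)              # vocabulary is replaced
vocab_after_refit = list(vec.vocabulary)

print(vocab_after_fit)
print(vocab_after_transform)
print(vocab_after_refit)
```

In scikit-learn itself, the learned mapping is exposed on the fitted vectorizer as the `vocabulary_` attribute, which can be inspected after each call in the same way.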

python nlp machine-learning scikit-learn

6 votes | 1 answer | 1751 views

Getting the base form of an English word

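I am trying to get the base English word for a word that has been modified from its base form. This question has been asked here before, but I did not see a proper answer, so I am asking it this way. I tried two stemmers and one lemmatizer from the NLTK package: the Porter stemmer, the Snowball stemmer, and the WordNet lemmatizer.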

I tried this code:

from nltk.stem.porter import PorterStemmer
from nltk.stem.snowball import SnowballStemmer
from nltk.stem.wordnet import WordNetLemmatizer

words = ['arrival','conclusion','ate']

for word in words:
    print "\n\nOriginal Word =>", word
    print "porter stemmer=>", PorterStemmer().stem(word)
    snowball_stemmer = SnowballStemmer("english")
    print "snowball stemmer=>", snowball_stemmer.stem(word)
    print "WordNet Lemmatizer=>", WordNetLemmatizer().lemmatize(word)

This is the output I got:

Original Word => arrival
porter stemmer=> arriv
snowball stemmer=> arriv
WordNet Lemmatizer=> arrival


Original Word => conclusion
porter stemmer=> conclus
snowball stemmer=> conclus
WordNet Lemmatizer=> conclusion


Original Word => ate
porter stemmer=> ate
snowball stemmer=> ate
WordNet …

python text-processing nlp stemming morphological-analysis

5 votes | 1 answer | 5356 views

Which Prolog implementation would help in my case

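I am working through Prolog and want to use it for natural language processing. I came across this paper on natural language processing using Prolog in the IBM Watson system, and I would like to try something similar to what the paper describes. Now I want to know which Prolog implementation to use. I have seen all of them listed on Priki and in the comparison on the wiki. Which of these implementations is suitable for NLP on Ubuntu? It should also integrate easily with Python and be fast. Has anyone worked with these implementations? Is SWI-Prolog a good choice?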

Any help is appreciated. Thanks :)

nlp prolog

4 votes | 2 answers | 371 views