Posts by imi*_*org

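Get the Celery worker's name from within a Celery task?

I'd like a Celery task to be able to get the name of the worker executing it, for logging purposes. I need to handle this from within the task rather than by querying the broker directly. Is there a way to do this? I'm using Celery with RabbitMQ, if that matters.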

python celery celery-task

imi*_*org

2014 05-26

10
votes

3
answers

5618
views

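Can you add to a CountVectorizer in scikit-learn?

I'd like to create a CountVectorizer in scikit-learn based on a corpus of text, and then later add more text to the CountVectorizer (adding to the original dictionary).

If I use transform(), it keeps the original vocabulary but does not add new words. If I use fit_transform(), it just regenerates the vocabulary from scratch. See below: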

In [2]: count_vect = CountVectorizer()

In [3]: count_vect.fit_transform(["This is a test"])
Out[3]: 
<1x3 sparse matrix of type '<type 'numpy.int64'>'
    with 3 stored elements in Compressed Sparse Row format>

In [4]: count_vect.vocabulary_  
Out[4]: {u'is': 0, u'test': 1, u'this': 2}

In [5]: count_vect.transform(["This not is a test"])
Out[5]: 
<1x3 sparse matrix of type '<type 'numpy.int64'>'
    with 3 stored elements in Compressed Sparse Row format>

In [6]: count_vect.vocabulary_
Out[6]: {u'is': 0, u'test': …

python nlp scikit-learn

imi*_*org

2016 02-13

3
votes

1
answer

1731
views

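Finding the items two lists do not have in common in Python

I have two very large lists, a and b, in Python 2.6 (say 50,000 strings each).

Here are two options. Which is faster, and why? Is there a better way?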

c = [i for i in a if i not in b]

or...

c = list(a)  # I need to preserve a for future use, so this makes a copy
for x in b:
    c.remove(x)

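For comparison, here is a minimal sketch of a third option, assuming the strings are hashable and that the goal is "every item of a that does not occur in b". Both options above test membership against a list, so each lookup scans the list; converting b to a set first makes each lookup roughly constant time on average. The helper name items_not_in is hypothetical and not part of the original question. Note also that the two original options are not equivalent: list.remove() drops only the first matching occurrence and raises ValueError when x is not in c.

```python
def items_not_in(a, b):
    """Return the items of a that do not occur in b.

    Keeps the order and duplicates of a, and leaves a itself untouched.
    Assumes the items are hashable (strings are).
    """
    exclude = set(b)  # built once; average O(1) membership tests afterwards
    return [item for item in a if item not in exclude]


# Small illustrative data, standing in for the 50,000-string lists.
a = ["apple", "banana", "cherry", "banana"]
b = ["banana", "date"]

c = items_not_in(a, b)
print(c)  # ['apple', 'cherry']
```

Because the result is built as a new list, no explicit copy of a is needed to preserve it for later use.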

python list-comprehension python-2.6

imi*_*org

lucky-day

1
vote

1
answer

1371
views

Tag statistics

python ×3

celery ×1

celery-task ×1

list-comprehension ×1

nlp ×1

python-2.6 ×1

scikit-learn ×1
