标签: scipy

scipy/sparse/sparsetools/csr_wrap.cxx: In function ‘void init_csr()’:

scipy/sparse/sparsetools/csr_wrap.cxx:73303:21: warning: variable ‘md’ set but not used [-Wunused-but-set-variable]

c++: internal compiler error: Killed (program cc1plus)

Please submit a full bug report,

with preprocessed source if appropriate.

See <file:///usr/share/doc/gcc-4.7/README.Bugs> for instructions.

----------------------------------------
Cleaning up...
Command /home/ubuntu/pnr/bin/python -c "import setuptools;__file__='/home/ubuntu/pnr/build/scipy/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-t8Drvd-record/install-record.txt --single-version-externally-managed --install-headers /home/ubuntu/pnr/include/site/python2.7 failed with error code -9 in /home/ubuntu/pnr/build/scipy

Run Code Online (Sandbox Code Playgroud)

和

Traceback (most recent call last):
  File "/home/ubuntu/pnr/bin/pip", line 9, …

Run Code Online (Sandbox Code Playgroud)

python linux amazon-ec2 virtualenv scipy

Sal*_*til

2013 10-26

42
推荐指数

3
解决办法

1万
查看次数

如何通过广播的密集1d数组元素地乘以scipy.sparse矩阵？

假设我有一个2d稀疏数组.在我的实际用例中,行数和列数都要大得多(比如20000和50000),因此在使用密集表示时它无法适应内存:

>>> import numpy as np
>>> import scipy.sparse as ssp

>>> a = ssp.lil_matrix((5, 3))
>>> a[1, 2] = -1
>>> a[4, 1] = 2
>>> a.todense()
matrix([[ 0.,  0.,  0.],
        [ 0.,  0., -1.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  2.,  0.]])

Run Code Online (Sandbox Code Playgroud)

现在假设我有一个密集的1d数组,其中包含大小为3的所有非零组件(在我的实际案例中为50000):

>>> d = np.ones(3) * 3
>>> d
array([ 3.,  3.,  3.])

Run Code Online (Sandbox Code Playgroud)

我想使用numpy的常用广播语义来计算a和d的元素乘法.然而,scipy中的稀疏矩阵属于np.matrix:'*'运算符被重载使其行为类似于矩阵乘法而不是逐元乘法:

>>> a * d
array([ 0., -3.,  0.,  0.,  6.])

Run Code Online (Sandbox Code Playgroud)

一种解决方案是将'a'切换到'*'运算符的数组语义,这将产生预期的结果:

>>> a.toarray() * d
array([[ 0.,  0., …

Run Code Online (Sandbox Code Playgroud)

python numpy scipy sparse-matrix

ogr*_*sel

2010 07-14

41
推荐指数

2
解决办法

1万
查看次数

通过scipy.sparse向量(或矩阵)迭代

我想知道用scipy.sparse迭代稀疏矩阵的非零项最好的方法是什么.例如,如果我执行以下操作:

from scipy.sparse import lil_matrix

x = lil_matrix( (20,1) )
x[13,0] = 1
x[15,0] = 2

c = 0
for i in x:
  print c, i
  c = c+1

Run Code Online (Sandbox Code Playgroud)

输出是

0 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13   (0, 0) 1.0
14 
15   (0, 0) 2.0
16 
17 
18 
19

Run Code Online (Sandbox Code Playgroud)

因此看起来迭代器正在触及每个元素,而不仅仅是非零条目.我已经看过API了

http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.lil_matrix.html

并搜索了一下,但我似乎无法找到一个有效的解决方案.

python scipy sparse-matrix

Ran*_*Guy

2015 06-07

41
推荐指数

3
解决办法

3万
查看次数

如何在scipy/matplotlib中绘制和注释层次聚类树形图

我使用的是dendrogram从scipy使用绘制层次聚类matplotlib如下:

mat = array([[1, 0.5, 0.9],
             [0.5, 1, -0.5],
             [0.9, -0.5, 1]])
plt.subplot(1,2,1)
plt.title("mat")
dist_mat = mat
linkage_matrix = linkage(dist_mat,
                         "single")
print "linkage2:"
print linkage(1-dist_mat, "single")
dendrogram(linkage_matrix,
           color_threshold=1,
           labels=["a", "b", "c"],
           show_leaf_counts=True)
plt.subplot(1,2,2)
plt.title("1 - mat")
dist_mat = 1 - mat
linkage_matrix = linkage(dist_mat,
                         "single")
dendrogram(linkage_matrix,
           color_threshold=1,
           labels=["a", "b", "c"],
           show_leaf_counts=True)

Run Code Online (Sandbox Code Playgroud)

我的问题是:第一,为什么mat和1-mat在这里给同一聚类？第二,如何使用树的每个分支来注释距离,dendrogram以便可以比较节点对之间的距离？

最后似乎show_leaf_counts忽略了标志,有没有办法打开它,以便显示每个类中的对象数量？谢谢. 在此输入图像描述

python numpy matplotlib dendrogram scipy

作者

2013 04-07

41
推荐指数

2
解决办法

4万
查看次数

从numpy python中的稀疏矩阵生成密集矩阵

我有一个Sqlite数据库,其中包含以下类型的架构:

termcount(doc_num, term , count)

Run Code Online (Sandbox Code Playgroud)

此表包含文档中各自计数的术语.喜欢

(doc1 , term1 ,12)
(doc1, term 22, 2)
.
.
(docn,term1 , 10)

Run Code Online (Sandbox Code Playgroud)

该矩阵可以被认为是稀疏矩阵,因为每个文档包含非常少的将具有非零值的项.

如何使用numpy从这个稀疏矩阵创建一个密集矩阵,因为我必须使用余弦相似度计算文档之间的相似性.

这个密集矩阵看起来像一个表格,其中docid作为第一列,所有条款都将列为第一行.剩余的单元格将包含计数.

python arrays numpy scipy sparse-matrix

作者

2013 08-13

41
推荐指数

2
解决办法

8万
查看次数

Pythonic从numpy数组列表中创建numpy数组的方法

我在循环中生成一维numpy数组的列表,然后将此列表转换为2d numpy数组.如果我提前知道项目的数量,我会预先分配一个2d numpy数组,但我没有,因此我将所有内容都放在列表中.

模拟如下:

>>> list_of_arrays = map(lambda x: x*ones(2), range(5))
>>> list_of_arrays
[array([ 0.,  0.]), array([ 1.,  1.]), array([ 2.,  2.]), array([ 3.,  3.]), array([ 4.,  4.])]
>>> arr = array(list_of_arrays)
>>> arr
array([[ 0.,  0.],
       [ 1.,  1.],
       [ 2.,  2.],
       [ 3.,  3.],
       [ 4.,  4.]])

Run Code Online (Sandbox Code Playgroud)

我的问题如下:

是否有更好的方法(性能方面)来完成收集顺序数字数据(在我的情况下是numpy数组)的任务,而不是将它们放在一个列表中然后从中创建一个numpy.array(我正在创建一个新的obj并复制数据)？在经过良好测试的模块中是否有"可扩展"矩阵数据结构？

我的2d矩阵的典型大小将介于100x10和5000x10浮点之间

编辑:在这个例子中我使用map,但在我的实际应用程序中,我有一个for循环

python arrays performance numpy scipy

Dra*_*ric

lucky-day

40
推荐指数

3
解决办法

4万
查看次数

使用Matplotlib绘制正态分布

请帮我绘制下面数据的正态分布:

数据:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

h = [186, 176, 158, 180, 186, 168, 168, 164, 178, 170, 189, 195, 172,
     187, 180, 186, 185, 168, 179, 178, 183, 179, 170, 175, 186, 159,
     161, 178, 175, 185, 175, 162, 173, 172, 177, 175, 172, 177, 180]

std = np.std(h) 
mean = np.mean(h)    
plt.plot(norm.pdf(h,mean,std))

Run Code Online (Sandbox Code Playgroud)

输出:

Standard Deriviation = 8.54065575872 
mean = 176.076923077

Run Code Online (Sandbox Code Playgroud)

情节不正确,我的代码出了什么问题？

python plot numpy matplotlib scipy

Ade*_*del

2013 11-16

40
推荐指数

2
解决办法

10万
查看次数

标签统计

python ×10

scipy ×10

numpy ×7

matplotlib ×4

sparse-matrix ×3

arrays ×2

amazon-ec2 ×1

blurry ×1

dendrogram ×1

linux ×1

mathematical-optimization ×1

output ×1

performance ×1

plot ×1

virtualenv ×1

visualization ×1

volume ×1

标签 统计

标签统计