Posts by ram*_*amu

Python: saving a dictionary with numpy.save

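I have a large dataset in memory (millions of rows), held in numpy arrays and dictionaries.

Once this data is built, I want to store it in files so that I can later load those files back into memory quickly, without rebuilding the data from scratch.

The np.save and np.load functions do the job smoothly for numpy arrays.
But I am running into a problem with dict objects.

See the sample below. d2 is the dictionary loaded from the file. See Out[28]: it has been loaded into d2 as a numpy array, not as a dict, so further dict operations such as get do not work.

Is there a way to load the data from the file as a dict (instead of as a numpy array)?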

In [25]: d1={'key1':[5,10], 'key2':[50,100]}

In [26]: np.save("d1.npy", d1)

In [27]: d2=np.load("d1.npy")

In [28]: d2
Out[28]: array({'key2': [50, 100], 'key1': [5, 10]}, dtype=object)

In [30]: d1.get('key1')  #original dict before saving into file
Out[30]: [5, 10]

In [31]: d2.get('key2')  #dictionary loaded from the file
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-31-23e02e45bf22> in <module>()
----> 1 d2.get('key2')

AttributeError: 'numpy.ndarray' object has no attribute 'get'
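
One possible way around the error above, offered as a minimal sketch rather than a definitive answer: np.save wraps the dict in a zero-dimensional object array, so calling .item() on the loaded array should hand back the original dict. The temporary directory and the name `loaded` are assumptions made for this illustration; allow_pickle=True is needed on newer NumPy releases, where loading object arrays is disabled by default.

```python
import os
import tempfile

import numpy as np

d1 = {'key1': [5, 10], 'key2': [50, 100]}

# Use a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "d1.npy")
    # The dict is stored as a 0-d object array (pickled inside the .npy file).
    np.save(path, d1)
    # Object arrays can only be loaded when pickling is explicitly allowed.
    loaded = np.load(path, allow_pickle=True)

# .item() unwraps the single element of the 0-d array: a plain dict again.
d2 = loaded.item()
print(type(d2).__name__, d2.get('key2'))
```

Because this route relies on pickle, it should only be used on files from a trusted source; for data that is purely dictionaries, the standard pickle module is an alternative that avoids the array wrapper altogether.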


python dictionary numpy

ram*_*amu

lucky-day

23
votes

2
answers

30k
views

Couchbase: how to maintain an array without duplicate elements?

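We have a Couchbase store that holds customer data.

Each customer has exactly one document in this bucket.
Daily transactions result in updates to this customer data.

Sample document. Let's focus on the purchased_product_ids array.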

{
  "customer_id" : 1000
  "purchased_product_ids" : [1, 2, 3, 4, 5 ] 
      # in reality this is a big array - hundreds of elements
  ... 
  ... many other elements ...
  ...
} 

Existing purchased_product_ids : 
    [1, 2, 3, 4, 5]

products purchased today : 
    [1, 2, 3, 6]  // 6 is a new entry, others existing already

Expected result after the update: 
    [1, 2, 3, 4, 5, 6]
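
The expected result above amounts to an order-preserving union of the two lists. As a rough client-side illustration only (this is plain Python, not a Couchbase Subdocument API call, and `merge_unique` is a hypothetical helper name), the intended semantics can be sketched as:

```python
def merge_unique(existing, new_items):
    """Return existing plus any new_items not already present, preserving order."""
    seen = set(existing)
    merged = list(existing)
    for item in new_items:
        if item not in seen:
            seen.add(item)
            merged.append(item)
    return merged


existing_ids = [1, 2, 3, 4, 5]
purchased_today = [1, 2, 3, 6]  # 6 is new, the others already exist
print(merge_unique(existing_ids, purchased_today))  # [1, 2, 3, 4, 5, 6]
```

Doing the merge this way requires the whole array on the client, which is exactly the data transfer the question is trying to avoid; the sketch only pins down what the server-side update should produce.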


I use the Subdocument API to avoid transferring large amounts of data between the server and the client. …

couchbase couchbase-java-api

ram*_*amu

lucky-day

5
votes

1
answer

574
views