Hi, I have a pandas df like the one below:
information record
name apple
size {'weight':{'gram':300,'oz':10.5},'description':{'height':10,'width':15}}
country America
partiesrelated [{'nameOfFarmer':'John Smith'},{'farmerID':'A0001'}]
I want to convert the df into another df like this:
information record
name apple
size_weight_gram 300
size_weight_oz 10.5
size_description_height 10
size_description_width 15
country America
partiesrelated_nameOfFarmer John Smith
partiesrelated_farmerID A0001
In this case, each dictionary is expanded into one row per leaf value, so the row size_weight_gram holds the value 300.
Code for the df:
import pandas as pd
df = pd.DataFrame({'information': ['name', 'size', 'country', 'partiesrelated'],
'record': ['apple', {'weight':{'gram':300,'oz':10.5},'description':{'height':10,'width':15}}, 'America', [{'nameOfFarmer':'John Smith'},{'farmerID':'A0001'}]]})
df = df.set_index('information')
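One possible way to get the flattened rows is a small recursive helper that walks dicts and lists and joins the keys with underscores. This is only a sketch: the names flatten and flat_df are made up for illustration, and it assumes that any list in the record column holds dicts whose keys should be merged under the parent label.

```python
import pandas as pd


def flatten(key, value):
    """Yield (flat_key, leaf_value) pairs for a possibly nested value."""
    if isinstance(value, dict):
        # Descend into each entry, extending the key with an underscore.
        for sub_key, sub_value in value.items():
            yield from flatten(f"{key}_{sub_key}", sub_value)
    elif isinstance(value, list):
        # Assumption: list items are dicts that share the parent key.
        for item in value:
            yield from flatten(key, item)
    else:
        yield key, value


df = pd.DataFrame({
    'information': ['name', 'size', 'country', 'partiesrelated'],
    'record': [
        'apple',
        {'weight': {'gram': 300, 'oz': 10.5},
         'description': {'height': 10, 'width': 15}},
        'America',
        [{'nameOfFarmer': 'John Smith'}, {'farmerID': 'A0001'}],
    ],
}).set_index('information')

# Expand every (label, value) pair of the column into one or more flat rows.
rows = [pair
        for key, value in df['record'].items()
        for pair in flatten(key, value)]
flat_df = (pd.DataFrame(rows, columns=['information', 'record'])
           .set_index('information'))
print(flat_df)
```

A list of plain scalars would produce repeated labels with this helper, so that case would need its own rule (for example, appending a position to the key).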
I have a list like the one below:
['a','b','c','d','e','f','g','h','i','j']
I want to split it at the positions given by a list of indices:
[1,4]
In this case, the result would be
[['a'],['b','c','d'],['e','f','g','h','i','j']]
because
[:1] = ['a']
[1:4] = ['b','c','d']
[4:] = ['e','f','g','h','i','j']
Case 2: if the index list is
[0,6]
the result would be
[[],['a','b','c','d','e','f'],['g','h','i','j']]
because
[:0] = []
[0:6] = ['a','b','c','d','e','f']
[6:] = ['g','h','i','j']
Case 3: if the index list is
[2,5,7]
the result would be [['a','b'],['c','d','e'],['f','g'],['h','i','j']] because
[:2] =['a','b']
[2:5] = ['c','d','e']
[5:7] = ['f','g']
[7:] = ['h','i','j']
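The slicing rule in the three cases above can be sketched with one helper that pairs up consecutive boundaries. The function name split_at is hypothetical, and the sketch assumes the index list is sorted in ascending order and stays within the length of the list.

```python
def split_at(items, indices):
    """Split items at each position in indices (assumed sorted ascending)."""
    # Add the implicit start and end so every piece has two boundaries.
    bounds = [0, *indices, len(items)]
    return [items[start:stop] for start, stop in zip(bounds, bounds[1:])]


letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

print(split_at(letters, [1, 4]))
print(split_at(letters, [0, 6]))
print(split_at(letters, [2, 5, 7]))
```

Each piece is an ordinary Python slice, so an index of 0 yields an empty first piece and n indices always yield n + 1 pieces.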
Hi, I have two pandas Series like the ones below.
PnL
Product Name Price
Company A Orange 3000
Company B Apple 2000
Grapes 1000
Tax
Product Name Price
Company A Orange 100
Company B Apple 100
Grapes 10
I want to convert the two pandas Series into the following JSON format:
{'PnL':{'Company A':{'productName':'Orange','price':3000},
'Company B':[{'productName':'Apple','price':2000},
{'productName':'Grapes','price':1000}]
},
'Tax':{'Company A':{'productName':'Orange','price':100},
'Company B':[{'productName':'Apple','price':100},
{'productName':'Grapes','price':10}]
}
}
I tried using the code below:
convertedJson = json.dumps([{'company': k[0], 'productName':k[1],'price': v} for k,v in df.items()])
But I cannot produce the JSON I want. Thanks for your help.
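A possible approach is to group the entries by company first and only then serialize. The sketch below assumes each Series has a two-level MultiIndex (company, product name) with the price as its value. The level names, the helper series_to_nested and the variables pnl and tax are made up for illustration, and int() assumes whole-number prices.

```python
import json

import pandas as pd

# Assumed layout: a two-level MultiIndex of (company, product name).
index = pd.MultiIndex.from_tuples(
    [('Company A', 'Orange'), ('Company B', 'Apple'), ('Company B', 'Grapes')],
    names=['company', 'productName'],
)
pnl = pd.Series([3000, 2000, 1000], index=index, name='price')
tax = pd.Series([100, 100, 10], index=index, name='price')


def series_to_nested(series):
    """Group (company, product) -> price entries by company."""
    grouped = {}
    for (company, product), price in series.items():
        # int() turns NumPy integers into plain ints that json can serialize.
        grouped.setdefault(company, []).append(
            {'productName': product, 'price': int(price)})
    # A company with one product maps to a dict, otherwise to a list of dicts.
    return {company: entries[0] if len(entries) == 1 else entries
            for company, entries in grouped.items()}


result = {'PnL': series_to_nested(pnl), 'Tax': series_to_nested(tax)}
converted_json = json.dumps(result, indent=2)
print(converted_json)
```

Mixing a dict and a list under the same level matches the requested format, but always emitting a list would be easier for consumers to parse.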
I have a df column that contains
Phone number
12399422/930201021
5451354;546325642
789888744,656313214
123456654
I want to split it into two columns:
Phone number1 Phone number2
12399422 930201021
5451354 546325642
789888744 656313214
123456654
I have tried using apply,
df['TELEPHONE1'] = df['TELEPHONE'].str.split(',').str.get(0)
df['TELEPHONE2'] = df['TELEPHONE'].str.split(',').str.get(1)
df['TELEPHONE1'] = df['TELEPHONE'].str.split(';').str.get(0)
df['TELEPHONE2'] = df['TELEPHONE'].str.split(';').str.get(1)
df['TELEPHONE1'] = df['TELEPHONE'].str.split('/').str.get(0)
df['TELEPHONE2'] = df['TELEPHONE'].str.split('/').str.get(1)
But it only splits on '/'. Thank you for your help.
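In the attempt above, each pair of assignments overwrites TELEPHONE1 and TELEPHONE2 from the previous pair, so only the last split (on '/') survives. One way around this is a single split on a character class that covers all three separators. A minimal sketch, assuming the column is named TELEPHONE as in the attempt and that each value holds at most two numbers:

```python
import pandas as pd

df = pd.DataFrame({'TELEPHONE': ['12399422/930201021',
                                 '5451354;546325642',
                                 '789888744,656313214',
                                 '123456654']})

# A multi-character pattern is treated as a regular expression, so
# [/;,] splits on any one of the three separators; n=1 splits only once.
parts = df['TELEPHONE'].str.split(r'[/;,]', n=1, expand=True)

df['TELEPHONE1'] = parts[0]
df['TELEPHONE2'] = parts[1]  # missing where there is no second number
print(df)
```

Rows with a single number keep it in TELEPHONE1 and get a missing value in TELEPHONE2.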
I was following the question "How to create rolling percentage for groupby DataFrame":
import pandas as pd
data = [
('product_a','1/31/2014',53)
,('product_b','1/31/2014',44)
,('product_c','1/31/2014',36)
,('product_a','11/30/2013',52)
,('product_b','11/30/2013',43)
,('product_c','11/30/2013',35)
,('product_a','3/31/2014',50)
,('product_b','3/31/2014',41)
,('product_c','3/31/2014',34)
,('product_a','12/31/2013',50)
,('product_b','12/31/2013',41)
,('product_c','12/31/2013',34)
,('product_a','2/28/2014',52)
,('product_b','2/28/2014',43)
,('product_c','2/28/2014',35)]
product_df = pd.DataFrame( data, columns=['prod_desc','activity_month','prod_count'] )
product_df.sort_values('activity_month', inplace = True, ascending=False)
product_df['pct_ch'] = product_df.groupby('prod_desc')['prod_count'].pct_change() + 1
print(product_df)
However, I cannot reproduce the output shown in the suggested answer.
Output produced:
prod_desc activity_month prod_count pct_ch
0 product_a 1/31/2014 53 NaN
1 product_b 1/31/2014 44 0.830189
2 product_c 1/31/2014 36 0.818182
3 product_a 11/30/2013 52 1.444444
4 product_b 11/30/2013 43 0.826923
5 product_c 11/30/2013 35 0.813953 …
I am testing with the spaCy example NER code. It is copied directly from the spaCy website https://spacy.io/usage/training. I only added the import spacy and import random lines myself.
import spacy
import random
TRAIN_DATA = [
("Uber blew through $1 million a week", {'entities': [(0, 4, 'ORG')]}),
("Google rebrands its business apps", {'entities': [(0, 6, "ORG")]})]
nlp = spacy.blank('en')
optimizer = nlp.begin_training()
for i in range(20):
    random.shuffle(TRAIN_DATA)
    for text, annotations in TRAIN_DATA:
        nlp.update([text], [annotations], sgd=optimizer)
nlp.to_disk('/model')
However, when I run the code, it shows an error:
Warning: Unnamed vectors -- this won't allow multiple vectors models to be loaded. (Shape: (0, 0))
I searched the community but found no clue. Thanks for your help.