Posts by Flo*_*yan

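Changing values in specific rows of a data frame using dplyr

Is it possible to restrict a data frame to specific rows and then change some of the values in one of its columns?

Say I compute GROWTH as (SIZE_{t+1} - SIZE_t) / SIZE_t, and I now see some strange values of GROWTH (e.g. 1000) that are caused by a corrupted value of the corresponding SIZE variable. I would like to find and replace those corrupted SIZE values.

If I type: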

data <- mutate(filter(data, lead(GROWTH)==1000), SIZE = 2600)

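then only the corrupted rows are stored in data, and the rest of my data frame is lost.

What I would like to do instead is filter data on the left-hand side down to the rows with corrupted values, and then mutate the incorrect variable on the right-hand side: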

filter(data, lead(GROWTH)==1000)  <- mutate(filter(data, lead(GROWTH)==1000), SIZE = 2600) 

But this does not seem to work. Is there a way to handle this with dplyr? Thanks in advance.

r dplyr

Score: 6, answers: 1, views: 10k

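Recursively computing DataFrame values

I am trying to compute the values of a pandas DataFrame column "recursively".

Suppose there is data for two different dates, each with 10 observations, and you want to compute some variable r. Only the first value of r is given for each day, and you want to compute the remaining 2*9 entries, where each subsequent value depends on the previous r and on an additional "contemporaneous" variable x.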


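The first problem is that I want to run the computation separately for each day, i.e. I want to use pandas.groupby() in all of my computations. But when I try to subset the data and use shift(1), I only get "NaN" entries: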

data.groupby(data.index)['r'] =   ( (1+data.groupby(data.index)['x']*0.25) * (1+data.groupby(data.index)['r'].shift(1)))

For my second approach, I used a for loop to iterate over the index (dates):

for i in range(2,21):
    data[data['rank'] == i]['r'] = ( (1+data[data['rank'] == i]['x']*0.25) * (1+data[data['rank'] == i]['r'].shift(1)) )

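However, this does not work for me either. Is there a way to perform this kind of computation on DataFrames? Perhaps something like a rolling apply?

Data: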

import numpy as np
import pandas as pd

# Missing values are np.nan rather than the string 'NaN', so that r stays a float column.
df = pd.DataFrame({
  'rank' : [1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10],
  'x' : [0.00275,0.00285,0.0031,0.0036,0.0043,0.0052,0.0063,0.00755,0.00895,0.0105,0.0027,0.00285,0.0031,0.00355,0.00425,0.0051,0.00615,0.00735,0.00875,0.0103],
  'r' : [0.00158,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,0.001485,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan]
  },index=['2014-01-02', '2014-01-02', '2014-01-02', '2014-01-02',
           '2014-01-02', '2014-01-02', '2014-01-02', '2014-01-02',
           '2014-01-02', '2014-01-02', '2014-01-03', '2014-01-03',
           '2014-01-03', '2014-01-03', '2014-01-03', '2014-01-03',
           '2014-01-03', '2014-01-03', '2014-01-03', '2014-01-03'])
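
For illustration, here is a minimal sketch of one possible way to do the per-day recursion. It assumes the recurrence is exactly the one written in the attempts above, r[t] = (1 + 0.25 * x[t]) * (1 + r[t-1]), which may not be the intended formula. The helper name fill_r is hypothetical, and the data is shortened to three ranks per date. A shifted column cannot express this, because each r depends on a value that has only just been computed, so the sketch loops within each date instead.

```python
import numpy as np
import pandas as pd


def fill_r(group: pd.DataFrame) -> pd.DataFrame:
    """Fill r within one date, assuming r[t] = (1 + 0.25 * x[t]) * (1 + r[t-1])."""
    group = group.sort_values("rank").copy()
    # Work on writable NumPy copies so the loop does not touch the frame itself.
    r = group["r"].to_numpy(dtype=float, copy=True)
    x = group["x"].to_numpy(dtype=float, copy=True)
    for t in range(1, len(r)):
        # Each value uses the r that was computed one step earlier.
        r[t] = (1 + 0.25 * x[t]) * (1 + r[t - 1])
    group["r"] = r
    return group


# Shortened version of the data: three ranks per date, first r given per date.
dates = ["2014-01-02"] * 3 + ["2014-01-03"] * 3
df = pd.DataFrame(
    {
        "rank": [1, 2, 3, 1, 2, 3],
        "x": [0.00275, 0.00285, 0.0031, 0.0027, 0.00285, 0.0031],
        "r": [0.00158, np.nan, np.nan, 0.001485, np.nan, np.nan],
    },
    index=dates,
)

# Process each date separately, then stitch the pieces back together.
result = pd.concat([fill_r(g) for _, g in df.groupby(level=0)])
print(result)
```

The explicit loop is a deliberate choice here: it keeps the dependency on the previous r visible, and it runs once per date rather than once over the whole frame, so the first r of one day never leaks into the next.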

python dataframe pandas

Score: 5, answers: 1, views: 1428
