Posts by Ali*_*Zia

Keep only the df column values that contain a string from a list of strings

I have a list of strings like this:

stringlist = ["JAN", "jan", "FEB", "feb", "mar"]

I have a dataframe that looks like this:

**date**            **value**
01MAR16                1
05FEB16                12
10jan17                5
10mar15                9
03jan05                7
04APR12                3

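I only want to keep the dates that contain one of the strings in stringlist (the match is case-sensitive). The result should look like this: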

**date**            **value**
NA                     1
05FEB16                12
10jan17                5
10mar15                9
03jan05                7
NA                     3

I have only just started using regular expressions, so I am having some trouble solving this and would appreciate some help.
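One possible approach, offered as a minimal sketch rather than a definitive answer: join the list into a single regex alternation, test each date with `Series.str.contains`, and blank out the non-matching rows with `Series.where`. It assumes the list entries are plain quoted strings, that the column is named `date`, and that matching is case-sensitive, which is what the expected output implies (01MAR16 becomes NA while 10mar15 is kept).

```python
import re

import pandas as pd

# Strings to look for (quoted so the list is valid Python).
stringlist = ["JAN", "jan", "FEB", "feb", "mar"]

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "date": ["01MAR16", "05FEB16", "10jan17", "10mar15", "03jan05", "04APR12"],
        "value": [1, 12, 5, 9, 7, 3],
    }
)

# Build one alternation pattern, e.g. "JAN|jan|FEB|feb|mar".
# re.escape guards against regex metacharacters in the list entries.
pattern = "|".join(re.escape(s) for s in stringlist)

# str.contains is case-sensitive by default, so "MAR" does not match "mar".
mask = df["date"].str.contains(pattern)

# Series.where keeps values where the mask is True and inserts a missing value elsewhere.
df["date"] = df["date"].where(mask)

print(df)
```

If the match should ignore case instead, passing `case=False` to `str.contains` would keep 01MAR16 as well.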

python dataframe python-3.x pandas python-re

Score: 6 | Answers: 1 | Views: 72

Expand a data frame to get the monthly revenue sum for every unique categorical column value in R

I have a df with data as follows:

sub = c("X001","X002", "X001","X003","X002","X001","X001","X003","X002","X003","X003","X002") 
month = c("201506", "201507", "201506","201507","201507","201508", "201508","201507","201508","201508", "201508", "201508") 
tech = c("mobile", "tablet", "PC","mobile","mobile","tablet", "PC","tablet","PC","PC", "mobile", "tablet") 
brand = c("apple", "samsung", "dell","apple","samsung","apple", "samsung","dell","samsung","dell", "dell", "dell")

revenue = c(20, 15, 10,25,20,20, 17,9,14,12, 9, 11)

df = data.frame(sub, month, brand, tech, revenue)

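I want to use sub and month as keys and get one row per subscriber per month, showing the revenue sum for each unique value of tech and brand for that subscriber in that month. This example is simplified and has fewer columns; because my real dataset is huge, I decided to try doing this with data.table.

I have managed to do this for a single categorical column, either tech or brand, using this: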

df1 <- dcast(df, sub + month ~ tech,  fun=sum, value.var = "revenue")

But I want to do this for two or more categorical columns. So far I have tried this:

df2 <- dcast(df, sub + month ~ tech+brand,  fun=sum, value.var = "revenue")

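That just concatenates the unique values of the categorical columns and sums the revenue for each combination, which is not what I want. I want a separate column for each unique value of every categorical column.

I am new to R and would really appreciate any help.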

r data.table dcast

Score: 5 | Answers: 1 | Views: 65
