Posts by Ria*_*iad

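Aesthetics must either be length one, or the same length as the dataProblems

I want to make a plot where the X values are one subset of the measurements and the Y values are another subset of the measured data.

In the example below, I have 4 products: p1, p2, p3 and p4. Each is priced according to its skew, color and version. I want to create a faceted plot of the P3 products (Y axis) against the P1 products (X axis).

My attempt below failed with the following error:

Error: Aesthetics must either be length one, or the same length as the dataProblems:subset(price, product == "p1"), subset(price, product == "p3")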

library(ggplot2)
product=c("p1","p1","p1","p1","p1","p1","p1","p1","p2","p2","p2","p2","p2","p2","p2","p2","p3","p3","p3","p3","p3","p3","p3","p3","p4","p4","p4","p4","p4","p4","p4","p4")
skew=c("b","b","b","b","a","a","a","a","b","b","b","b","a","a","a","a","b","b","b","b","a","a","a","a","b","b","b","b","a","a","a","a")
version=c(0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2,0.1,0.1,0.2,0.2)
color=c("C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2","C1","C2")
price=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32)
df = data.frame(product, skew, version, color, price)
# First plot all the data
p1 <- ggplot(df, aes(x=price, y=price, colour=factor(skew))) + geom_point(size=2, shape=19)
p1 <- p1 + facet_grid(version ~ color)
p1 # This gave a very good plot. So far so good
# Now plot P3 vs P1
p1 <- ggplot(df, aes(x=subset(price, product=='p1'), y=subset(price, product=='p3'), colour=factor(skew))) + geom_point(size=2, shape=19)
p1
# failed with: Error: Aesthetics must either be length one, …

r ggplot2 aesthetics

Score: 31, answers: 3, views: 170k

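R programming: how to use plyr's ddply to count values in a column

I want to summarize the pass/fail status of my data as shown below. In other words, I want to report the number of pass and fail cases for each product/type.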

library(ggplot2)
library(plyr)
product=c("p1","p1","p1","p1","p1","p1","p1","p1","p1","p1","p1","p1","p2","p2","p2","p2","p2","p2","p2","p2","p2","p2","p2","p2")
type=c("t1","t1","t1","t1","t1","t1","t2","t2","t2","t2","t2","t2","t1","t1","t1","t1","t1","t1","t2","t2","t2","t2","t2","t2")
skew=c("s1","s1","s1","s2","s2","s2","s1","s1","s1","s2","s2","s2","s1","s1","s1","s2","s2","s2","s1","s1","s1","s2","s2","s2")
color=c("c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3","c1","c2","c3")
result=c("pass","pass","fail","pass","pass","pass","fail","pass","fail","pass","fail","pass","fail","pass","fail","pass","pass","pass","pass","fail","fail","pass","pass","fail")
df = data.frame(product, type, skew, color, result)

The following command returns the total number of pass + fail cases, but I want separate columns for pass and fail:

dfSummary <- ddply(df, c("product", "type"), summarise, N=length(result))

The result is:

        product type N
 1      p1      t1   6
 2      p1      t2   6
 3      p2      t1   6
 4      p2      t2   6

The desired result is:

         product type Pass Fail
 1       p1      t1   5    1
 2       p1      t2   3    3
 3       p2      t1   4    2
 4       p2      t2   3    3

I tried something like this:

 dfSummary <- ddply(df, c("product", "type"), summarise, Pass=length(df$product[df$result=="pass"]), Fail=length(df$product[df$result=="fail"]) )

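But this is obviously wrong, because the results are the grand totals of pass and fail over the whole data frame rather than per product/type.

Thanks in advance for your advice! Regards, Riad.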

r plyr

Score: 7, answers: 1, views: 20k

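How to set the bin width with geom_bar stat="identity" in a time-series plot?

I want to plot a time series as a bar chart and set the bin width to 0.9. I can't seem to do it. I have searched everywhere but have not found anything useful so far. Is this a limitation when stat="identity"?

Here are the sample data and the plot. Cheers!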

time <- c('2015-06-08 00:59:00','2015-06-08 02:48:00','2015-06-08 06:43:00','2015-06-08 08:59:00','2015-06-08 10:59:00','2015-06-08 12:59:00','2015-06-08 14:58:00','2015-06-08 16:58:00','2015-06-08 18:59:00','2015-06-08 20:59:00','2015-06-08 22:57:00','2015-06-09 00:59:00','2015-06-09 01:57:00','2015-06-09 03:22:00','2015-06-09 06:14:00','2015-06-09 08:59:00','2015-06-09 10:59:00','2015-06-09 12:59:00','2015-06-09 14:59:00','2015-06-09 16:59:00','2015-06-09 18:59:00','2015-06-09 20:59:00','2015-06-09 22:58:00','2015-06-10 00:57:00','2015-06-10 02:34:00','2015-06-10 04:45:00','2015-06-10 06:24:00','2015-06-10 08:59:00','2015-06-10 10:59:00','2015-06-10 12:59:00','2015-06-10 14:59:00','2015-06-10 16:59:00','2015-06-10 18:59:00','2015-06-10 20:58:00','2015-06-10 22:52:00','2015-06-11 00:59:00','2015-06-11 02:59:00','2015-06-11 04:59:00','2015-06-11 06:59:00','2015-06-11 08:59:00','2015-06-11 10:59:00','2015-06-11 12:59:00','2015-06-11 14:59:00','2015-06-11 16:58:00','2015-06-11 18:58:00','2015-06-11 20:56:00','2015-06-11 21:49:00','2015-06-12 00:59:00','2015-06-12 02:59:00','2015-06-12 04:20:00','2015-06-12 08:55:00','2015-06-12 10:55:00','2015-06-12 12:59:00','2015-06-12 14:59:00','2015-06-12 16:59:00','2015-06-12 18:59:00','2015-06-12 20:55:00','2015-06-12 22:50:00','2015-06-13 00:16:00','2015-06-13 12:59:00','2015-06-13 14:35:00','2015-06-13 16:56:00','2015-06-13 18:59:00','2015-06-13 20:59:00','2015-06-13 22:44:00','2015-06-13 23:19:00','2015-06-14 08:53:00','2015-06-14 10:14:00','2015-06-14 12:59:00','2015-06-14 14:59:00','2015-06-14 16:56:00','2015-06-14 18:58:00','2015-06-14 …

r time-series ggplot2 geom-bar

Score: 5, answers: 1, views: 5803

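Python: write.csv adds an extra carriage return

I am writing an Excel to CSV converter in Python. I am running on Linux, and my Python version is: Python 2.7 (r271:86832, Dec 4 2012, 17:16:32) [GCC 4.1.2 20080704 (Red Hat 4.1.2-51)] on linux2

In the code below, when I comment out the 5 "csvFile.write" lines, the generated CSV file is fine. With those lines in place, however, a carriage return is added at the end of every line produced by "wr.writerow".

Question: why does the csv writer add an extra carriage return when the "csvFile.write" calls are present?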

import sys  # For interacting with the Unix Shell
import os   # For OS information (mainly path manipulation)

import time # For date and time
import xlrd # For reading both XLS/XLSX files
import csv  # Writing CSV files

# Get the Excel file from cmd line arguments
excelFile = sys.argv[1]

def csv_from_excel(excelFile):
  wb = xlrd.open_workbook(excelFile, encoding_override='utf8')
  sh = wb.sheet_by_index(0)
  print sh.name, sh.nrows, sh.ncols
  # Output file
  csvFileName= …
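
A likely explanation, since the posted code is truncated and cannot be checked directly: `csv.writer` terminates each row with `"\r\n"` by default, while lines written by hand with `csvFile.write(... + "\n")` end in a bare `"\n"`. The file then has mixed line endings, and the `\r` stands out on Linux. The sketch below is written for Python 3 (an assumption; the post uses Python 2.7, where `lineterminator` is also available), and `write_report` and `header_lines` are hypothetical names used only for illustration.

```python
import csv
import io


def write_report(header_lines, rows, lineterminator="\r\n"):
    """Write hand-made header lines followed by CSV rows; return the raw text."""
    # newline="" keeps line endings exactly as written (no translation).
    buf = io.StringIO(newline="")
    for line in header_lines:
        buf.write(line + "\n")  # manual writes end in a bare "\n"
    # csv.writer ends every row with `lineterminator` ("\r\n" by default).
    wr = csv.writer(buf, lineterminator=lineterminator)
    for row in rows:
        wr.writerow(row)
    return buf.getvalue()


if __name__ == "__main__":
    # repr() makes the difference between "\n" and "\r\n" visible.
    print(repr(write_report(["# header"], [["a", "b"]])))
    print(repr(write_report(["# header"], [["a", "b"]], lineterminator="\n")))
```

Passing `lineterminator="\n"` makes the rows match the manually written lines; whether that is the right choice depends on which tools will read the file.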

python csv carriage-return python-2.7

Score: 3, answers: 1, views: 5370

os.walk with a wildcard path

I want to walk a directory and search for a given file. Here is some code I wrote:

import os
def find(filename, path):
  for root, dirs, files in os.walk(path):
    for file in files:
      if file==filename:
        print(os.path.join(root, file))

# Python boiler plate call.
if __name__ == "__main__":
  find('myFile.txt', '/path/to/one/user/dir/and/subDir1/and/subDir2')

The code above works well.

Question 1: How can I improve my code to handle something like this:

  find('myFile.txt', '/path/to/one/*/dir/and/*/and/*')

Question 2: What is the Pythonic way to write:

      if file==filename:
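
One possible approach to both questions, offered as a sketch rather than as the accepted answer: expand the wildcard path with `glob.glob`, walk each matching directory with `os.walk`, and compare file names with `fnmatch.filter` so that the file name may contain wildcards too. Returning a sorted list instead of printing is a choice made here for illustration.

```python
import fnmatch
import glob
import os


def find(pattern, path_pattern):
    """Return sorted paths of files matching `pattern` under every
    directory that matches the wildcard `path_pattern`."""
    matches = set()
    for base in glob.glob(path_pattern):      # expand '*' in the path
        if not os.path.isdir(base):
            continue                          # skip plain files matched by the glob
        for root, dirs, files in os.walk(base):
            # fnmatch.filter keeps only the names matching the pattern;
            # a literal name such as 'myFile.txt' matches itself.
            for name in fnmatch.filter(files, pattern):
                matches.add(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    for found in find('myFile.txt', '/path/to/one/*/dir/and/*/and/*'):
        print(found)
```

For an exact name, `if filename in files:` is another idiomatic replacement for the inner loop with `if file==filename:`.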

python find os.walk

Score: 3, answers: 1, views: 9365

R - multiply a data frame by another data frame

I have 2 data frames, df1 and df2. df1 and df2 have the same size (rows and columns) and the same factors. Say:

df1 <- data.frame(a=c('alpha','beta','gamma'), b=c(1,2,3), c=c('x','y','z'), d=c(4,5,6))

      a b c d
1 alpha 1 x 4
2  beta 2 y 5
3 gamma 3 z 6

df2 <- data.frame(a=c('alpha','beta','gamma'), b=c(7,8,9), c=c('x','y','z'), d=c(10,11,12))

      a b c  d
1 alpha 7 x 10
2  beta 8 y 11
3 gamma 9 z 12

I want to multiply these two data frames and get a result like this:

      a  b c  d
1 alpha  7 x 40
2  beta 16 y 55
3 gamma 27 z 72

I did some searching and tried the following code:

M <- merge(df1,df2,by=c('a','c'))
S <- M[,grepl("*\\.x$",names(M))] * …

r multiplication dataframe

Score: 1, answers: 1, views: 2530