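Posts by dan*_*dan

Selecting specific rows and columns from a SQL database

Is it possible to retrieve specific columns of specific rows with a SQL query?

Suppose I am selecting the rows named a and b from my SQL table, my_table, using this query text: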

"select * from my_table where row_names in ('a', 'b') order by row_names"

How can I modify this query text to select only columns 2, 12, 22, 32, 42 rather than all 1000 columns?
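A minimal sketch of one way to build such a query text, assuming it is assembled in R before being sent to the database. As far as I know, a SQL select list takes column names rather than column positions, so the names in `cols` below are hypothetical placeholders for the real names of columns 2, 12, 22, 32 and 42; `rows` and `query` are also names introduced only for this illustration.

```r
# Hypothetical column names: the question only gives positions, so these
# must be replaced with the table's actual column names.
cols <- c("col_2", "col_12", "col_22", "col_32", "col_42")
rows <- c("a", "b")

# List the wanted columns explicitly instead of using "*".
query <- sprintf(
  "select %s from my_table where row_names in (%s) order by row_names",
  paste(cols, collapse = ", "),
  paste(sprintf("'%s'", rows), collapse = ", ")
)
query
```

If the real column names are not known in advance, they would first have to be looked up (for example from the table's metadata) and then indexed by position.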

mysql sql

Score: 4 | Answers: 1 | Views: 50k

Adding a table to a ggplot figure

I have dose-response data:

df <- data.frame(dose=c(10,0.625,2.5,0.15625,0.0390625,0.0024414,0.00976562,0.00061034,10,0.625,2.5,0.15625,0.0390625,0.0024414,0.00976562,0.00061034,10,0.625,2.5,0.15625,0.0390625,0.0024414,0.00976562,0.00061034),viability=c(6.117463479317,105.176885855348,57.9126197628863,81.9068445005286,86.484379347143,98.3093580807309,96.4351897372596,81.831197750164,27.3331232120347,85.2221817678203,80.7904933803092,91.9801454635583,82.4963735273569,110.440066995265,90.1705406346481,76.6265869905362,11.8651732228561,88.9673125759484,35.4484427232156,78.9756635057238,95.836828982968,117.339025930735,82.0786828300557,95.0717213053837),stringsAsFactors=F)

I use the drc R package to fit a log-logistic model to these data:

library(drc)
fit <- drm(viability~dose,data=df,fct=LL.4(names=c("Slope","Lower Limit","Upper Limit","ED50")))

Then I plot this curve together with its standard error (confidence) band, like this:

pred.df <- expand.grid(dose=exp(seq(log(max(df$dose)),log(min(df$dose)),length=100))) 

pred <- predict(fit,newdata=pred.df,interval="confidence") 
pred.df$viability <- pred[,1]
pred.df$viability.low <- pred[,2]
pred.df$viability.high <- pred[,3]



library(ggplot2)
p <- ggplot(df,aes(x=dose,y=viability))+geom_point()+geom_ribbon(data=pred.df,aes(x=dose,y=viability,ymin=viability.low,ymax=viability.high),alpha=0.2)+labs(y="viability")+
geom_line(data=pred.df,aes(x=dose,y=viability))+coord_trans(x="log")+theme_bw()+scale_x_continuous(name="dose",breaks=sort(unique(df$dose)),labels=format(signif(sort(unique(df$dose)),3),scientific=T))+ggtitle(label="all doses")

Finally, I would like to add the parameter estimates to the plot as a table. I tried:

library(gridExtra)  # provides tableGrob(), used below
params.df <- cbind(data.frame(param=gsub(":\\(Intercept\\)","",rownames(summary(fit)$coefficient)),stringsAsFactors=F),data.frame(summary(fit)$coefficient))
rownames(params.df) <- NULL

ann.df <- data.frame(param=gsub(" Limit","",params.df$param),value=signif(params.df[,2],3),stringsAsFactors=F)
rownames(ann.df) <- NULL
xmin <- sort(unique(df$dose))[1]
xmax <- sort(unique(df$dose))[3]
ymin <- df$viability[which(df$dose==xmin)][1]
ymax <- max(pred.df$viability.high)
p <- p+annotation_custom(tableGrob(ann.df),xmin=xmin,xmax=xmax,ymin=ymin,ymax=ymax)

But I get the error:

Error: annotation_custom only works with Cartesian coordinates 

Any ideas?

Also, once the table is plotted, is there a way to suppress its row names?

r ggplot2

Score: 4 | Answers: 1 | Views: 1593

Rounding numbers down

I have numerics like this:

a <- -1.542045

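I want to round them down (for a negative number, that means rounding up in absolute value) to 2 digits after the decimal point. signif(a,3) rounds to the nearest value and gives me -1.54, but for this example the result I want is -1.55.

Any ideas?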
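One possible approach, as a sketch: scale the number, take `floor()`, and scale back. The helper name `round_down` and its `digits` argument are hypothetical, introduced here only for illustration.

```r
# Hypothetical helper (not from the original post): round toward -Inf
# at the given number of decimal places.
round_down <- function(x, digits = 2) {
  floor(x * 10^digits) / 10^digits
}

a <- -1.542045
round_down(a)  # -1.55
```

Note that `floor()` always rounds toward minus infinity, so a positive value such as 1.549 becomes 1.54. If the goal is instead to always round away from zero, `sign(x) * ceiling(abs(x) * 10^digits) / 10^digits` would be an alternative. Values that sit exactly on a 2-decimal boundary may be affected by floating-point representation.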

r significant-digits

Score: 3 | Answers: 1 | Views: 20k

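R cannot find an Rcpp function

Following Hadley's manual, I built an R package (called myUtils) in RStudio that uses a cpp file. My cpp file resides in the src directory that was created after running devtools::use_rcpp(), and under my R directory I have a file named myUtils.R, which contains these lines: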

#' myUtils: A package with various functions for my analyses
#'
#'
#' @docType package
#' @name myUtils
#' @useDynLib myUtils
#' @importFrom Rcpp sourceCpp
NULL

Here is my cpp file:

// [[Rcpp::depends(RcppArmadillo, RcppEigen)]]

#include <RcppArmadillo.h>
#include <RcppEigen.h>

using namespace Rcpp;

// [[Rcpp::export]]
SEXP armaMatMult(arma::mat A, arma::mat B){
  arma::mat C = A * B;

  return Rcpp::wrap(C);
}

// [[Rcpp::export]]
SEXP eigenMatMult(Eigen::MatrixXd A, Eigen::MatrixXd B){
  Eigen::MatrixXd C …

r devtools package rcpp rstudio

Score: 3 | Answers: 1 | Views: 1126

Color-coding a scatter plot by group with a gradient

I have XY data that I want to plot as a scatter plot, using R's plotly package.

set.seed(1)
df <- data.frame(x=c(rnorm(30,1,1),rnorm(30,5,1),rnorm(30,9,1)),
                 y=c(rnorm(30,1,1),rnorm(30,5,1),rnorm(30,9,1)),
                 group=c(rep("A",30),rep("B",30),rep("C",30)),score=runif(90,0,1))

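Each point is assigned to one of three groups (df$group) and has a score in the [0,1] range.

I am looking for a way to plot the data so that each group is drawn in a different color, while the shade (or intensity) of that color reflects the score.

So I thought this would work: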

library(dplyr)
library(plotly)

plot_ly(marker=list(size=10),type='scatter',mode="markers",x=~df$x,y=~df$y,color=~df$score,colors=c("#66C2A5","#FC8D62","#8DA0CB")) %>%
  layout(xaxis=list(title="X",zeroline=F,showticklabels=F),yaxis=list(title="Y",zeroline=F,showticklabels=F))

But I get: [image]

If I just color-code by group, with the following:

plot_ly(marker=list(size=10),type='scatter',mode="markers",x=~df$x,y=~df$y,color=~df$group,colors=c("#66C2A5","#FC8D62","#8DA0CB")) %>%
  layout(xaxis=list(title="X",zeroline=F,showticklabels=F),yaxis=list(title="Y",zeroline=F,showticklabels=F))

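I get: [image]

So it looks like the first attempt mixes the group colors with the score gradient.

What I am looking for is to have the bottom-left group colored in shades of green (say from gray to darkgreen) corresponding to score (from low to high), and likewise for the other two groups in orange and blue, respectively.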

gradient r scatter-plot plotly

Score: 3 | Answers: 1 | Views: 594

Merging legends in plotly subplots

I have several groups, each with several classes, for which I measured continuous values:

set.seed(1)

df <- data.frame(value = c(rnorm(100,1,1), rnorm(100,2,1), rnorm(100,3,1),
                           rnorm(100,3,1), rnorm(100,1,1), rnorm(100,2,1),
                           rnorm(100,2,1), rnorm(100,3,1), rnorm(100,1,1)),
                 class = c(rep("c1",100), rep("c2",100), rep("c3",100),
                           rep("c2",100), rep("c4",100), rep("c1",100),
                           rep("c4",100), rep("c3",100), rep("c2",100)),
                 group = c(rep("g1",300), rep("g2",300), rep("g3",300)))

df$class <- factor(df$class, levels =c("c1","c2","c3","c4"))
df$group <- factor(df$group, levels =c("g1","g2","g3"))

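Not every group in the data has the same classes; in other words, each group has a subset of all the classes.

I am trying to use R plotly to generate density curves for each group, color-coded by class, and then combine them all into a single plot using plotly's subplot function.

This is what I am doing: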

library(dplyr)
library(ggplot2)
library(plotly)


set.seed(1)

df <- data.frame(value = c(rnorm(100,1,1), rnorm(100,2,1), rnorm(100,3,1),
                           rnorm(100,3,1), rnorm(100,1,1), rnorm(100,2,1),
                           rnorm(100,2,1), rnorm(100,3,1), rnorm(100,1,1)),
                 class = c(rep("c1",100), rep("c2",100), rep("c3",100),
                           rep("c2",100), rep("c4",100), rep("c1",100),
                           rep("c4",100), rep("c3",100), rep("c2",100)),
                 group …

r legend subplot plotly r-plotly

Score: 3 | Answers: 1 | Views: 2712

Getting geom_tile to draw square rather than rectangular cells

I am trying to produce a heatmap plot using ggplot's geom_tile. My data have more rows than columns.

  set.seed(1)
  df <- data.frame(val=rnorm(100),gene=rep(letters[1:20],5),cell=c(sapply(LETTERS[1:5],function(l) rep(l,20))))

Running:

library(ggplot2)
ggplot(df,aes(y=gene,x=cell,fill=val))+geom_tile(color="white")

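produces: [image]

How can I make the heatmap cells symmetric in size, i.e. squares rather than rectangles (height = width), without distorting the dimensions of the figure?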

r heatmap ggplot2

Score: 2 | Answers: 2 | Views: 4696

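Sorting a data frame according to a given order

Probably a simple question.

I have a data.frame (sample names, their factor levels, and the replicates of the factor levels):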

df <- data.frame(name=c("DP_A","DP_B","PA_A","PA_B","PA_C"),
                 level=c("DP","DP","PA","PA","PA"),
                 replicate=c("A","B","A","B","C"),
                 stringsAsFactors = F)

And a given desired order for one of the columns, the factor levels:

level.order <- c("PA","DP")

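So I would like to order df by level.order (meaning by df$level).

A bonus would be if, in addition to level.order, I could supply a second ordering by which to sort df$replicate, where the values being ordered can be character (as in this example), integer, or a combination of the two (e.g., A1, A2, etc.).

In this case, the ordered df would be: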

df <- data.frame(name=c("PA_A","PA_B","PA_C","DP_A","DP_B"),
                 level=c("PA","PA","PA","DP","DP"),
                 replicate=c("A","B","C","A","B"),
                 stringsAsFactors = F)
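A minimal base R sketch of the ordering described above, assuming that `match()` against `level.order` is an acceptable way to rank the levels and that replicates can simply be sorted alphabetically. The name `sorted.df` is introduced here only for illustration.

```r
df <- data.frame(name = c("DP_A", "DP_B", "PA_A", "PA_B", "PA_C"),
                 level = c("DP", "DP", "PA", "PA", "PA"),
                 replicate = c("A", "B", "A", "B", "C"),
                 stringsAsFactors = FALSE)
level.order <- c("PA", "DP")

# match() turns each level into its position in level.order;
# order() sorts by that position first and by replicate second.
sorted.df <- df[order(match(df$level, level.order), df$replicate), ]
rownames(sorted.df) <- NULL
sorted.df
```

For a custom replicate order, the second key could be replaced by another `match()` call against a vector holding the desired replicate order. Mixed values such as A1, A2, A10 sort as plain strings here, so a natural-sort step would be needed if numeric suffixes must be respected.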

r dataframe

Score: 2 | Answers: 1 | Views: 2297

Enumerating redundant values in a data.frame using dplyr

I have a data.frame with two sets of IDs, both of which may be redundant.

Here is an example:

df <- data.frame(id1 = c("id.1","id.1","id.1","id.1","id.1","id.2","id.2","id.3"),
                 id2 = c("id.1.a","id.1.b","id.1.a","id.1.c","id.1.b","id.2.a","id.2.b","id.3.a"))

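What I would like to do is add another ID column, in which df$id1 gets a numeric suffix whose value increases following the order of df$id2.

So, for the example above, the resulting data.frame would be: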

res.df <- data.frame(id1 = c("id.1","id.1","id.1","id.1","id.1","id.2","id.2","id.3"),
                     id2 = c("id.1.a","id.1.b","id.1.a","id.1.c","id.1.b","id.2.a","id.2.b","id.3.a"),
                     id3 = c("id.1.01","id.1.03","id.1.02","id.1.05","id.1.04","id.2.01","id.2.02","id.3"))

So, since id.1 maps to id.1.a twice, id.1.b twice, and id.1.c once, it becomes: id.1.01, id.1.03, id.1.02, id.1.05, id.1.04

I am not sure how to pull this off with dplyr or tidyr.
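The numbering described above can be sketched in base R (swapped in here for dplyr or tidyr, which the question asks about). The sketch assumes the suffix is the rank of id2 within each id1 group, with ties broken by row order, and that an id1 with a single row gets no suffix, as in the expected result. The names `id2`, `idx` and `n` are introduced only for illustration.

```r
df <- data.frame(id1 = c("id.1","id.1","id.1","id.1","id.1","id.2","id.2","id.3"),
                 id2 = c("id.1.a","id.1.b","id.1.a","id.1.c","id.1.b","id.2.a","id.2.b","id.3.a"),
                 stringsAsFactors = FALSE)

id2 <- as.character(df$id2)

# Rank of each id2 within its id1 group; ties are numbered in row order.
idx <- ave(seq_along(id2), df$id1,
           FUN = function(i) rank(id2[i], ties.method = "first"))

# Number of rows in each id1 group.
n <- ave(seq_along(id2), df$id1, FUN = length)

# Zero-padded suffix for groups with more than one row; bare id1 otherwise.
df$id3 <- ifelse(n > 1,
                 sprintf("%s.%02d", as.character(df$id1), idx),
                 as.character(df$id1))
df
```

The same grouping logic should carry over to a grouped `mutate()` in dplyr, but that variant is not shown here.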

r dataframe dplyr tidyr

Score: 2 | Answers: 1 | Views: 64

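Rendering a shiny datatable with a caption/title and preselected rows

I am trying to write a shiny app for plotting xy data. Each xy point is associated with several factors: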

set.seed(1)
data.df <- data.frame(x = rnorm(1000), y = rnorm(1000),
                      sex = sample(c("F", "M"), 1000, replace = T),
                      age = sample(c("Y", "O"), 1000, replace = T),
                      group = sample(c("A", "B", "C", "D"), 1000, replace = T),
                      stringsAsFactors = F)

design.df <- data.frame(factor.name = c(c(rep("sex",2), rep("age",2), rep("group",4))),
                        factor.levels = c("F", "M","Y", "O","A", "B", "C", "D"), stringsAsFactors = F)

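I would like the user to be able to subset the xy data (data.df) based on a multiple-row selection of design.df, using DT::renderDT in a renderUI in the server, where the default selection is all rows of design.df. This works fine using this code: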

suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(shiny))
suppressPackageStartupMessages(library(DT)) …

datatable r shiny dt

Score: 2 | Answers: 1 | Views: 8513