Posts by Jac*_*ong

Applying jitter to outlier data in a boxplot with ggplot2

Do you know how to apply jitter to the outlier data of a boxplot? Here is the code:

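library(ggplot2)

# refer to columns by bare name inside aes() instead of a$V8
ggplot(data = a, aes(x = "", y = V8)) +
  geom_boxplot(outlier.size = 0.5) +
  # highlight observation 54 in red
  geom_point(data = a[54, ], aes(x = "", y = V8), colour = "red", size = 3) +
  theme_bw() +
  coord_flip()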


Thanks!!

r ggplot2 boxplot jitter

Jac*_*ack

2019 06-21

8
Votes

2
Answers

2685
Views

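Adjusting the output of fviz_cluster

I would like to change the output of my fviz_cluster plot. Specifically, I want the legend title to read "Cluster" instead of "cluster", and I want to remove the curly lines inside the legend keys (I think they are letters, but I'm not entirely sure).

I know that fviz_cluster works together with other ggplot elements. My first thought was therefore to change the legend title in each scale_..._.. call in the plot, but the original legend was still displayed. Second, I thought I could add a scale_shape_manual() object to the ggplot, but the plot ignored it.

Code: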

library(ggplot2)
library(factoextra)  # provides fviz_cluster()

km.res <- kmeans(iris[, -5], 3)
p <- fviz_cluster(km.res, iris[, -5]) +
  scale_color_brewer(palette = 'Set2') +  # set guide = "none" to remove the legend
  scale_fill_brewer(palette = 'Set2') +
  scale_shape_manual(values = c('1' = 22, '2' = 23, '3' = 24)) +  # plot ignores this
  ggtitle(label = '')
p


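Ideally, I would like to show a legend very similar to the one fviz_cluster produces, but with a shape and a color box around each shape in the legend, and with the title "Cluster".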

r cluster-analysis ggplot2

Jac*_*ong

2018 12-05

6
Votes

1
Answers

7298
Views

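Applying a user-defined function across columns in R

I have two functions in R that convert an angle (in radians) and a radius to Cartesian coordinates, as follows: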

x_cart<-function(theta,r){
  return(r * cos (theta))
}
y_cart<-function(theta,r){
  return(r * sin (theta))
}


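I then want to apply these functions to create two new columns, x and y, in my data frame from the angle and radius columns. When I use apply, I get an error: argument "r" is missing, with no default.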

df$x<-apply(df[,c("angle_adj","hit_distance")],1, x_cart())


Test data

angle<-c(10,15,20)
radius<-c(15,35,10)
df<-data.frame(angle,radius)


r apply

Jac*_*ong

2021 05-27

4
Votes

1
Answers

63
Views

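Finding the position of the first vowel in a word

I am trying to write a program that translates English into Pig Latin. I am currently working on the part that finds the first vowel of a word, so that the program can split the word at the right place and rearrange it correctly.

For example, the string "hello I am a Guy" becomes "ellohay Iyay amyay aayy uygay". (I think my Pig Latin in this list is correct; it is different from an example I created.)

So the word "what" becomes "atwhay". The program finds that the first vowel is in slot 2 and then returns the integer 2.

I was thinking of first comparing the word against a string, vowels = "aeiouy", and going from there, but I am stuck. Here is what I have: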

public static int indexOfFirstVowel(String word){
   int index=0;
   String vowels="aeiouy";
   return index;

}


In theory, index would be updated to the position of the first vowel.
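One way the comparison against "aeiouy" described above could be filled in is sketched here. It is only an illustration under stated assumptions: 'y' is treated as a vowel (as in the question's string), matching is case-insensitive so that "I" and "Guy" work, and -1 is returned when a word has no vowel. The class name FirstVowel and the -1 convention are hypothetical choices, not part of the original post.

```java
// A minimal sketch, not a definitive implementation.
// Assumptions: 'y' counts as a vowel, matching ignores case,
// and -1 signals that the word contains no vowel.
class FirstVowel {
    static int indexOfFirstVowel(String word) {
        String vowels = "aeiouy";
        for (int i = 0; i < word.length(); i++) {
            char c = Character.toLowerCase(word.charAt(i));
            // String.indexOf returns -1 when c is not one of the vowels
            if (vowels.indexOf(c) >= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String word = "what";
        int i = indexOfFirstVowel(word);
        // Move the leading consonants to the end and append "ay"
        System.out.println(word.substring(i) + word.substring(0, i) + "ay");
    }
}
```

For "what" the method returns 2 and main prints "atwhay", matching the example in the question. Words that start with a vowel (index 0) would still need their own suffix rule, such as the "yay" used in "Iyay".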

java arrays sorting string

Jac*_*ong

lucky-day

3
Votes

1
Answers

10k
Views

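Building a random forest with caret

I tried to follow the steps here to build a RandomForest model in caret. Essentially, they set up the RandomForest, then find the best mtry, then the best maxnodes, then the best number of trees. These steps make sense, but wouldn't it be better to search over the interaction of these three factors rather than one at a time?

Second, I understand performing a grid search over mtry and ntrees. But I don't know how to set the minimum or maximum number of nodes. Is it generally recommended to keep the default nodesize, as shown below?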

library(randomForest)
library(caret)
mtrys<-seq(1,4,1)
ntrees<-c(250, 300, 350, 400, 450, 500, 550, 600, 800, 1000, 2000)
combo_mtrTrees<-data.frame(expand.grid(mtrys, ntrees))
colnames(combo_mtrTrees)<-c('mtrys','ntrees')

tuneGrid <- expand.grid(.mtry = c(1: 4))
for (i in 1:length(ntrees)){
  ntree<-ntrees[i]
  set.seed(65)
  rf_maxtrees <- train(Species~.,
                       data = df,
                       method = "rf",
                       importance=TRUE,
                       metric = "Accuracy",
                       tuneGrid = tuneGrid,
                       trControl = trainControl( method = "cv",
                                                 number=5,
                                                 search = 'grid',
                                                 classProbs = TRUE,
                                                 savePredictions = "final"),
                       ntree = ntree
                       )
  Acc1<-rf_maxtrees$results$Accuracy[rf_maxtrees$results$mtry==1]
  Acc2<-rf_maxtrees$results$Accuracy[rf_maxtrees$results$mtry==2] …


r random-forest r-caret

Jac*_*ong

2019 09-15

3
Votes

1
Answers

5061
Views

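Linear SVM and extracting the weights

I am practicing SVMs in R using the iris dataset, and I would like to get the feature weights/coefficients from my model. I think I may be misunderstanding something, because my output gives me 32 support vectors. Since I am analyzing four variables, I assumed I would get four. I know there is a way to do this when using the svm() function, but I am trying to use the train() function from caret to fit my SVM.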

library(caret)

# Define fitControl
fitControl <- trainControl(## 5-fold CV
              method = "cv",
              number = 5,
              classProbs = TRUE,
              summaryFunction = twoClassSummary )

# Define Tune
grid<-expand.grid(C=c(2^-5,2^-3,2^-1))

########## 
df<-iris
head(df)
df<-df[df$Species!='setosa',]
df$Species<-as.character(df$Species)
df$Species<-as.factor(df$Species)

# set random seed and run the model
set.seed(321)
svmFit1 <- train(x = df[-5],
                 y=df$Species,
                 method = "svmLinear", 
                 trControl = fitControl,
                 preProc = c("center","scale"),
                 metric="ROC",
                 tuneGrid=grid )
svmFit1


I thought this would be as simple as svmFit1$finalModel@coef, but I get 32 vectors when I believe I should get 4. Why is that?

r svm r-caret

Jac*_*ong

2019 06-12

1
Votes

1
Answers

86
Views