I am trying to draw a square in R:
library(ggplot2)
ggplot() +
  geom_rect(aes(xmin = 1, xmax = sqrt(pi), ymin = 1, ymax = sqrt(pi)))
But this produces a shape that looks more like a rectangle. I think this is because the axis scaling is not equal?
Can someone tell me how to fix this?
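A possible fix, sketched under the assumption that the goal is for one unit on the x axis to be drawn at the same length as one unit on the y axis. The rectangle is already square in data units (both sides run from 1 to sqrt(pi)), but by default ggplot2 stretches the panel to fill the plotting device. Adding coord_fixed(ratio = 1) locks the aspect ratio. The variable name p is only for illustration.

```r
library(ggplot2)

# Rectangle whose sides have equal length in data units
p <- ggplot() +
  geom_rect(aes(xmin = 1, xmax = sqrt(pi), ymin = 1, ymax = sqrt(pi))) +
  # Draw one x unit and one y unit at the same physical length
  coord_fixed(ratio = 1)

p
```

With the ratio fixed, resizing the plot window should change the size of the square but not its shape.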
I use this code to generate 3 random numbers that add up to 72:
# /sf/ask/1739213661/
rand_vect <- function(N, M, sd = 1, pos.only = TRUE) {
vec <- rnorm(N, M/N, sd)
if (abs(sum(vec)) < 0.01) vec <- vec + 1
vec <- round(vec / sum(vec) * M)
deviation <- M - sum(vec)
for (. in seq_len(abs(deviation))) {
vec[i] <- vec[i <- sample(N, 1)] + sign(deviation)
}
if (pos.only) while (any(vec < 0)) {
negs <- vec < 0
pos <- vec > 0
vec[negs][i] <- vec[negs][i <- sample(sum(negs), 1)] …
I have these files in my global environment:
x <- sapply(sapply(ls(), get), is.data.frame)
n = names(x)[(x==TRUE)]
n
[1] "sample_1" "sample_10" "sample_2" "sample_3" "sample_4" "sample_5" "sample_6" "sample_7" "sample_8" "sample_9" "table_i"
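I want to delete all files whose names start with "samp". I found code that can do this (from a question about clearing only some specific objects from the workspace):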
rm(list = apropos("^samp"))
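Now I would like to learn how to do the same thing in a different way. I found another way to find all files in the global environment whose names start with "samp":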
nn = grep("samp", n, value = TRUE)
[1] "sample_1" "sample_10" "sample_2" "sample_3" "sample_4" "sample_5" "sample_6" "sample_7" "sample_8" "sample_9"
Then I tried to delete these files:
for (file in nn){
nn[i] <- NULL
}
do.call(file.remove, list(nn))
Thank you!
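One possible approach is sketched below. It runs in a scratch environment so that nothing in a real workspace is touched; the name env and the toy objects are illustrative assumptions. Note that the loop in the attempt modifies the character vector nn rather than the objects it names, and file.remove() deletes files on disk, not R objects. rm() with its list argument is the function that removes objects by name.

```r
# Scratch environment standing in for the global environment (assumption)
env <- new.env()
for (k in c(1, 2, 10)) {
  assign(paste0("sample_", k), data.frame(x = k), envir = env)
}
assign("table_i", data.frame(y = 1), envir = env)

# Names of all objects in the environment that are data frames
objs  <- ls(envir = env)
is_df <- vapply(objs, function(nm) is.data.frame(get(nm, envir = env)), logical(1))
n     <- objs[is_df]

# Keep only the names that start with "samp", then remove those objects
nn <- grep("^samp", n, value = TRUE)
rm(list = nn, envir = env)

# Only the objects that did not match remain
ls(envir = env)
```

In an interactive session the same idea reduces to rm(list = grep("^samp", ls(), value = TRUE)), which acts on the global environment.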
I made the following map in R using simulated data:
First, I loaded the libraries:
library(leaflet)
library(leaflet.extras)
library(dplyr)
Then I simulated random data for this example:
myFun <- function(n = 5000) {
a <- do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
paste0(a, sprintf("%04d", sample(9999, n, TRUE)), sample(LETTERS, n, TRUE))
}
Lat = abs(rnorm(1000, 42.2, 0.1))
Long = abs(rnorm(1000, -70, 0.1))
City = myFun(1000)
cities = data.frame(Lat, Long, City)
Finally, I made a map:
# download icon from here: https://leafletjs.com/examples/custom-icons/leaf-green.png
leaflet(cities) %>%
addProviderTiles(providers$OpenStreetMap) %>%
addMarkers( clusterOptions = markerClusterOptions(), popup = ~paste("title: ", City)) %>%
addResetMapButton() %>%
# these markers will be "invisible" on the …
I have this dataset in R:
set.seed(123)
myFun <- function(n = 5000) {
a <- do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
paste0(a, sprintf("%04d", sample(9999, n, TRUE)), sample(LETTERS, n, TRUE))
}
col1 = myFun(100)
col2 = myFun(100)
col3 = myFun(100)
col4 = myFun(100)
group <- c("A","B","C","D")
group = sample(group, 100, replace=TRUE)
example = data.frame(col1, col2, col3, col4, group)
col1 col2 col3 col4 group
1 SKZDZ9876D BTAMF8110T LIBFV6882H ZFIPL4295E A
2 NXJRX7189Y AIZGY5809C HSMIH4556D YJGJP8022H C
3 XPTZB2035P EEKXK0873A PCPNW1021S NMROS4134O A
4 LJMCM3436S KGADK2847O …
I am trying to learn how to use the microbenchmark function in R.
As an example, I simulated some random datasets of different sizes:
# load the lubridate package
library(lubridate)
library(microbenchmark)
library(forecast)
my_list = list()
index = c(100, 1000, 10000, 50000, 100000, 250000, 500000, 750000, 1000000)
for (i in 1:length(index))
{
my_data_i = data.frame(dates = sample(seq(as.Date('2010/01/01'), as.Date('2023/01/01'), by="day"), replace = TRUE, index[i]), visits = 1)
my_list[[i]] = my_data_i
}
Then I created a function that I want to time repeatedly on each dataset:
my_function = function(){
# aggregate the data by week
my_data_i_weekly <- aggregate(my_data_i$visits, list(week = week(my_data_i$dates), year = year(my_data_i$dates)), sum)
# convert the data frame to a time series …
I am using the R programming language.
I have the following dataset:
factor_1 <- c("A", "B", "C", "D", "E")
factor_2 <- c("AA", "BB", "CC", "DD", "EE")
factor_3 <- c("AAA", "BBB", "CCC", "DDD", "EEE")
var_1 <- as.factor(sample(factor_1, 10000, replace=TRUE, prob=c(0.2, 0.2, 0.2, 0.2, 0.2)))
var_2 <- as.factor(sample(factor_2, 10000, replace=TRUE, prob=c(0.2, 0.2, 0.2, 0.2, 0.2)))
var_3 <- as.factor(sample(factor_3, 10000, replace=TRUE, prob=c(0.2, 0.2, 0.2, 0.2, 0.2)))
var_4 <- rnorm(1000,10,10)
var_5 <- rnorm(1000,10,10)
my_data = data.frame(var_1, var_2, var_3, var_4, var_5)
var_1 var_2 var_3 var_4 var_5
1 B AA EEE 13.645347 13.058532
2 …
I am trying to scrape the following page:
http://mywebsite.com
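In particular, I want to get the name of each entry. I noticed that the text I am interested in (MY TEXT) always sits between these two tags: <div class="title"> <a href="your text"> MY TEXT </a>
I know how to search for these tags individually: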
#load libraries
library(rvest)
library(httr)
library(XML)
# set up page
url<-"https://www.mywebsite.com"
page <-read_html(url)
#option 1
b = page %>% html_nodes("title")
option1 <- b %>% html_text() %>% strsplit("\\n")
#option 2
b = page %>% html_nodes("a")
option2 <- b %>% html_text() %>% strsplit("\\n")
Is there a way to specify the "html_nodes" argument so that it picks up "MY TEXT", i.e. scrapes what lies between <div class="title"> and </a>:
<div class="title"> <a href="your text"> MY TEXT </a>
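A minimal sketch of one way to do this, assuming the links are direct children of the div elements. The CSS selector "div.title > a" matches only a elements directly inside a div whose class is title, whereas "title" on its own matches the <title> element rather than the class. The toy page and its link targets below are made up for illustration and stand in for the real site.

```r
library(rvest)

# Toy page standing in for the real site (assumption)
page <- read_html('
  <html><body>
    <div class="title"> <a href="link-1"> MY TEXT </a> </div>
    <div class="other"> <a href="link-2"> IGNORED </a> </div>
    <div class="title"> <a href="link-3"> OTHER TEXT </a> </div>
  </body></html>')

# Select <a> elements that are direct children of <div class="title">,
# then extract their text with surrounding whitespace trimmed
titles <- page %>%
  html_nodes("div.title > a") %>%
  html_text(trim = TRUE)

titles
```

If the links are nested more deeply inside the div, the descendant selector "div.title a" (without the >) would be the variant to try.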
I have the following dataset in R. It contains a list of geocoded addresses (Canada) with their postal codes, longitudes, and latitudes: https://www.dropbox.com/scl/fi/9kjoqsppb85ip0tdc5wmr/stackoverflow_example.csv?rlkey=cwjk222jnoz8c9cgbt01p6bep&dl=0
I loaded this data into R:
library(httr)
library(data.table)
url <- "https://www.dropbox.com/scl/fi/9kjoqsppb85ip0tdc5wmr/stackoverflow_example.csv?rlkey=cwjk222jnoz8c9cgbt01p6bep&dl=1"
response <- GET(url)
df <- fread(content(response, "text"), sep = ",", quote = "", fill = TRUE)
> head(df)
"" "latitude" "longitude" "source_id" "id" "group_id" "street_no" "street" "str_name" "str_type" "str_dir" "unit" "city" "postal_code"
1: "999976" 44.25845 -76.46308 "" "6baaa1692aaaa7b496aa" 2494328 "104" "POINT ST. MARK DR" "POINT ST. MARK" "DR" "" "" "KINGSTON" "K7K 6X8"
2: "999977" 44.26391 -76.45090 "" "1f01e7839e59727d95a3" 2508891 "229" "GREENLEES DR" "GREENLEES" …
Suppose I have the following Sudoku:
problem <- matrix(c(
5, 3, 0, 0, 7, 0, 0, 0, 0,
6, 0, 0, 1, 9, 5, 0, 0, 0,
0, 9, 8, 0, 0, 0, 0, 6, 0,
8, 0, 0, 0, 6, 0, 0, 0, 3,
4, 0, 0, 8, 0, 3, 0, 0, 1,
7, 0, 0, 0, 2, 0, 0, 0, 6,
0, 6, 0, 0, 0, 0, 2, 8, 0,
0, 0, 0, 4, 1, 9, 0, 0, 5,
0, 0, 0, 0 …
I am using the R programming language.
I have the following dataset ("df"):
df <- structure(list(student = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
var1 = c("a", "b", "b", "a", "c", "a", "b"),
start = structure(c(14610, 14610, 15869, 17439, 14610, 16436, 17897), class = "Date"),
end = structure(c(15706, 15706, 16679, 17723, 16071, 17492, 18791), class = "Date")),
row.names = c(NA, -7L), class = "data.frame")
student var1 start end
1 1 a 2010-01-01 2013-01-01
2 1 b 2010-01-01 2013-01-01
3 1 b 2013-06-13 2015-09-01
4 1 a 2017-09-30 2018-07-11
5 …
I am using the R programming language.
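Recently, I came across this post, "How can i make a stacked multiple Density Plot with ggplot?", which shows a very interesting chart:
I am trying to learn how to replicate that plot.
I started by simulating data for the plot: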
text = c(
"Morena, pvem, pt 307[282-326] Actual 413",
"Morena, PT 263[244-280] Actual 303",
"Morena, pvem 265[243-282] Actual 267",
"PAN, PRI, PRD, MC 193 [167-211] Actual 163",
"PAN,PRI, PRD 180[155-199] Actual 137",
"PAN, PRI 152[131-167] Actual 126",
"PAN, PRD 112[95-125] Actual 89",
"PRI, PRF 97[83-111] Actual 59"
)
means = c(300,250,200,150,140,130,120,110)
data = data.frame()
for(i in seq_along(text)){
random_numbers = rnorm(100, means[i], 10) …