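Is there an elegant way to balance an unbalanced panel data set? I want to start with an unbalanced panel (i.e., some individuals are missing some data) and end up with a balanced panel (i.e., no individual has any missing data). Some sample code is below. The correct end result keeps all observations for 'Frank' and 'Edward' and drops all observations for 'Tony', because he has some missing data. Thanks.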
unbal <- data.frame(PERSON=c(rep('Frank',5),rep('Tony',5),rep('Edward',5)), YEAR=c(2001,2002,2003,2004,2005,2001,2002,2003,2004,2005,2001,2002,2003,2004,2005), Y=c(21,22,23,24,25,5,6,NA,7,8,31,32,33,34,35), X=c(1:15))
unbal
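One possible approach in base R, offered as a sketch rather than a definitive answer: flag rows with no missing values using `complete.cases()`, then use `ave()` to keep only persons for whom every row is complete. It assumes that "missing data" means an `NA` in any column; `row_ok`, `person_ok`, and `bal` are illustrative names.

```r
# Same data as in the question, written more compactly.
unbal <- data.frame(
  PERSON = rep(c("Frank", "Tony", "Edward"), each = 5),
  YEAR   = rep(2001:2005, times = 3),
  Y      = c(21, 22, 23, 24, 25, 5, 6, NA, 7, 8, 31, 32, 33, 34, 35),
  X      = 1:15
)

# TRUE for rows that have no NA in any column.
row_ok <- complete.cases(unbal)

# TRUE for every row of a person only if all of that person's rows are complete.
person_ok <- ave(row_ok, unbal$PERSON, FUN = all)

# Balanced panel: Frank and Edward are kept, Tony is dropped.
bal <- unbal[person_ok, ]
bal
```

This only handles gaps recorded as `NA`. If a person could also be missing an entire YEAR row, a count of rows per person would need to be checked as well.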
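
I am working with unbalanced panel data and would like to draw a random sample from it that is not affected by the differing number of observations per unit. For example, in the code below, IBM is twice as likely to be selected as GOOG and five times as likely as MSFT. Is there a way to sample from these data as if every company/year had the same probability of being selected? Possibly by using the sampling package?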
df <- data.frame(COMPANY=c(rep('IBM',50),rep('GOOG',25),rep('MSFT',10)), YEAR=c(1961:2010,1988:2012,1996:2005), PROFIT=rnorm(85))
df
df[sample(nrow(df), 20, replace=FALSE), ]
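A minimal sketch of one way to do this in base R, assuming the goal is that each company is equally likely to be picked on a given draw: weight each row by the inverse of its company's row count and pass the weights to `sample()` through its `prob` argument. The names `n_per_company`, `w`, and `samp` are illustrative, and `set.seed()` is there only to make the sketch reproducible.

```r
set.seed(1)  # only so that the sketch is reproducible

df <- data.frame(
  COMPANY = c(rep("IBM", 50), rep("GOOG", 25), rep("MSFT", 10)),
  YEAR    = c(1961:2010, 1988:2012, 1996:2005),
  PROFIT  = rnorm(85)
)

# Number of rows belonging to each row's company (50, 25, or 10).
n_per_company <- ave(seq_len(nrow(df)), df$COMPANY, FUN = length)

# Inverse-frequency weights: each company's rows sum to the same total weight.
w <- 1 / n_per_company

# Weighted sample of 20 rows.
samp <- df[sample(nrow(df), 20, replace = FALSE, prob = w), ]
table(samp$COMPANY)
```

With `replace = FALSE` the weights are applied draw by draw, so the inclusion probabilities are only approximately equal across companies. Sampling with replacement, or a dedicated survey-sampling routine, may be preferable if exact probabilities matter.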
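
I would like to use the excellent rugarch package to estimate an EGARCH model on two different time series, but the solver fails to converge. I don't want to use the "hybrid" solver option, because it introduces randomness when it cycles through to the "gosolnp" solver. My two questions are: (1) is there something odd about my data that causes the convergence failure, and (2) if not, is there a way to modify the ugarchfit() call so that it "tries harder" to find a solution? The data and code I am using are below.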
library("rugarch")
ABC <- c(-0.003311,-0.009967,-0.010067,-0.023729,0.006944,-0.010345,0.02439,-0.006803,-0.017123,-0.003484,0.017483,0.054983,0.032573,-0.018927,-0.006817,-0.019608,-0.003333,0.006689,0.006645,-0.009901,0,0.01,0.006601,0,0.009836,0.022727,-0.003175,-0.009554,0.022508,-0.003145,0.006309,-0.021944,0.012821,-0.015823,-0.028939,-0.009934,0.020067,0.045902,-0.012539,-0.003175,0.003185,0.012698,-0.003135,0.009434,-0.003115,-0.00625,0.003145,0.003135,-0.025,0.00641,0.012739,-0.003145,0.009464,-0.009375,0.009464,0,-0.0125,0,0.003165,-0.009464,0.006369,-0.028481,0.035831,-0.003145,0.009464,-0.00625,0.003145,-0.00627,-0.009464,-0.012739,0,-0.006452,0.016234,-0.003195,0.012821,0,0.003544,-0.003185,-0.00639,-0.022508,0.009868,0.006515,-0.003236,0,-0.012987,0.013158,-0.003247,-0.013029,0.0033,0,-0.016447,-0.006689,-0.003367,0.003378,0.013468,0.063123,0.0125,0.006173,-0.006135,-0.033951,-0.003195,-0.003205,0.022508,0.025157,0,-0.006135,-0.009259,-0.018692,0.009524,0.006289,-0.003125,0.015674,0.003086,0.003077,-0.009202,0,0.003096,-0.006173,-0.006211,0,-0.009375,-0.006309,-0.006349,0.00639,-0.003175,0,0.003185,-0.009524,0.009615,-0.003175,-0.009554,0.003215,-0.003205,0,0,0.003215,-0.009615,0.006472,-0.003215,0.000387,0.003257,-0.003247,0.006515,-0.003236,0.012987,0.022436,-0.003135,-0.006289,-0.003165,0.009524,0.044025,0.006024,0.005988,-0.005952,0,-0.017964,-0.003049,0,-0.006116,-0.009231,-0.018634,0.009494,-0.00627,-0.003155,0.009494,0.015674,0.021605,0,0.003021,-0.003012,0,-0.003021,-0.006061,-0.003049,-0.006116,-0.003077,0.003086,-0.006154,0.009288,-0.003067,-0.006154,0,0,-0.01548,0.012579,0.009317,-0.003077,-0.003086,0.006192,-0.006154,0,-0.012384,-0.00627,-0.006309,0.003175,-0.018987,0.016129,-0.009524,0.009615,0.003175,0.018987,-0.006211,0.025,-0.005732,-0.009288,0.00625,-0.006211,-0.009375,0.012618,-0.012461,0.009464,-0.00625,0.003145,-0.003135,0,0.003145,-0.003135,0,-0.006289,0.009494,-0.003135,0.009434,-0.006231,-0.015674,-0.009554,-0.025723,0.0033,-0.003289,-0.006601,0.00
6645,-0.013201,-0.006689,0.013468,-0.003322,-0.003333,0.006689,0.013289,-0.019672,0.006689,-0.006645,-0.003344,0.006711,0.036667,0.006431,0,-0.00639,0.009646,0.015924,0.003135,0.03125,0.012121,-0.005988,0.021084)
DEF <- c(0.004876,0.029923,-0.072242,-0.015235,-0.011603,0.015652,-0.021832,-0.015755,-0.008448,-0.038565,0.035914,-0.052679,0.005703,0.02741,-0.028059,0.004733,-0.00895,-0.035646,0.176934,-0.023869,-0.039468,-0.016079,0.00227,-0.015851,-0.02439,-0.021226,0.001928,-0.025493,0.027641,0.036023,0.02828,0.001803,-0.011251,0.015476,-0.035858,-0.003719,-0.0042,0.009372,-0.019499,0.023201,0.018047,0.005,0.037087,0.012647,-0.03273,0.036509,0.016323,0.040152,-0.001219,-0.002441,0.039967,0.023137,0.006899,0.007613,-0.007933,-0.026276,-0.003911,0.006677,0.023875,-0.014144,-0.002714,-0.031104,0.027689,0.003124,0.005839,-0.020898,0.030435,0.034906,0.036694,0.004648,-0.017438,-0.034408,-0.006752,0.010196,0.043738,-0.053725,0.008327,-0.035285,0.002724,-0.006209,-0.052714,-0.006595,0.025726,-0.024272,-0.011194,0.005451,-0.004587,0.002514,0.035102,-0.008478,0.052117,0.010836,0.009188,-0.016692,0.033179,-0.025766,0.013415,-0.00643,0.059764,0.002155,0.005376,-0.001069,-0.00571,0,0.005025,-0.001786,0.030411,0.003125,0.010038,-0.014051,-0.025721,-0.018195,0.005451,0.011926,-0.005714,0.002874,0.022206,0.018921,-0.016162,0.013632,-0.048276,-0.018841,0.038405,0.043385,0.000341,0.001363,-0.006805,0.030832,-0.000332,0.016955,0.019941,-0.019551,-0.033998,0.016582,0,0.008655,-0.00099,0.008259,-0.017038,0.007,-0.011917,0.01206,0.005958,0.009543,0.088983,-0.027237,0,-0.004615,0.007728,0.003681,-0.012836,0.017028,0,0.005784,-0.006659,-0.001828,0.000611,-0.012508,0.022552,-0.01148,0.008863,0.003332,0.003925,0.005714,-0.007775,-0.009946,-0.007915,-0.013194,-0.000622,0.015557,0.026961,-0.002387,-0.009569,-0.020229,0.00678,-0.015611,0.001866,0.007759,-0.020942,0.003146,0.017874,0.029883,0.014358,-0.009142,-0.004167,0.002926,0.003287,-0.010125,-0.000903,-0.003312,-0.010876,0.006109,-0.006679,-0.005807,0.006148,0.001528,-0.00244,0.017431,-0.011422,-0.00304,0.021653,-0.017612,-0.005773,-0.018643,0.000934,-0.009023,-0.000314,-0.009736,-0.001269,-0.005081,-0.019151,0.020827,0.000956,-0.018153,-0.013947,0.
008224,-0.014356,-0.012248,0.009048,-0.003985,-0.012671,-0.008105,0.011236,-0.017508,0.019877,0.014113,-0.003976,-0.018629,0.002373,-0.002705,-0.014242,-0.02924,-0.00567,0.002851,0.000711,0.01598,0.019224,0.00823,-0.009524,0.015797,-0.025693,0.001388,-0.014553,0.014065,-0.003467,-0.008699,-0.004914,0.00388,-0.002811,-0.003524,-0.004597,-0.004263)
ugarchfit(spec = ugarchspec(mean.model = list(armaOrder = c(0, 0), include.mean = TRUE), variance.model = list(model = "eGARCH", garchOrder = c(1, 1))), data = ABC)
ugarchfit(spec = ugarchspec(mean.model = list(armaOrder = c(0, 0), include.mean = TRUE), variance.model = list(model = "eGARCH", garchOrder = c(1, 1))), data = DEF)
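One thing that might be worth trying, sketched here under assumptions rather than as a confirmed fix: loop over the deterministic solvers that `ugarchfit()` accepts through its `solver` argument, and turn on data scaling with `fit.control = list(scale = 1)`. The helper `fit_with_fallback` is hypothetical (it is not part of rugarch), and nothing here has been checked against the ABC or DEF series.

```r
# Hypothetical helper, not part of rugarch: try several deterministic solvers
# in turn and return the first fit that reports convergence.
fit_with_fallback <- function(x, solvers = c("solnp", "nlminb", "lbfgs")) {
  spec <- rugarch::ugarchspec(
    mean.model     = list(armaOrder = c(0, 0), include.mean = TRUE),
    variance.model = list(model = "eGARCH", garchOrder = c(1, 1))
  )
  for (s in solvers) {
    fit <- tryCatch(
      rugarch::ugarchfit(spec = spec, data = x, solver = s,
                         fit.control = list(scale = 1)),  # rescale data before optimizing
      error = function(e) NULL
    )
    # convergence() returns 0 when the solver reports success.
    if (!is.null(fit) && rugarch::convergence(fit) == 0) return(fit)
  }
  NULL  # no solver converged
}
```

Usage would be `fit_ABC <- fit_with_fallback(ABC)`, with a `NULL` result meaning that none of the listed solvers converged. Because none of these solvers relies on random starting values, repeated runs should give the same result.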
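
Using base R, I would like to use the mapply function on a nested list. For example, in the code below, I am trying to remove the letter "a" from every element of a nested list. I would like to replace the last two lines with a single line of code.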
mylist <- list(
list(c("a", "b", "c"), c("d", "e", "f")),
list(c("a", "v", "w"), c("x", "y"), c("c", "b", "a"))
)
mylist
not_a <- lapply(mylist, lapply, `!=`, "a")
not_a
mylist[[1]] <- mapply(`[`, mylist[[1]], not_a[[1]], SIMPLIFY = FALSE)
mylist[[2]] <- mapply(`[`, mylist[[2]], not_a[[2]], SIMPLIFY = FALSE)
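A sketch of two possible one-liners in base R. The first skips the intermediate `not_a` mask and subsets inside a nested `lapply()`. The second keeps the mask and nests `Map()`, which is `mapply()` with `SIMPLIFY = FALSE`. The names `cleaned` and `cleaned2` are illustrative.

```r
mylist <- list(
  list(c("a", "b", "c"), c("d", "e", "f")),
  list(c("a", "v", "w"), c("x", "y"), c("c", "b", "a"))
)

# Option 1: nested lapply, no mask needed.
cleaned <- lapply(mylist, lapply, function(x) x[x != "a"])

# Option 2: keep the logical mask and apply `[` pairwise at both levels.
not_a <- lapply(mylist, lapply, `!=`, "a")
cleaned2 <- Map(function(l, keep) Map(`[`, l, keep), mylist, not_a)

cleaned
```

Option 1 is shorter. Option 2 stays closer to the original mapply-based code and generalizes to masks computed elsewhere.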
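
I am trying to cross join a data.table by three variables (group, id, and date). The R code below does exactly what I want, i.e., each id within each group is expanded to include all of the wanted dates. But is there a way to do the same thing more efficiently using the excellent data.table package?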
library(data.table)
data <- data.table(
group = c(rep("A", 10), rep("B", 10)),
id = c(rep("frank", 5), rep("tony", 5), rep("arthur", 5), rep("edward", 5)),
date = seq(as.IDate("2020-01-01"), as.IDate("2020-01-20"), by = "day")
)
data
dates_wanted <- seq(as.IDate("2020-01-01"), as.IDate("2020-01-31"), by = "day")
names_A <- data[group == "A"][["id"]]
names_B <- data[group == "B"][["id"]]
names_A <- CJ(group = "A", id = names_A, date = dates_wanted, unique = TRUE)
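names_B <- CJ(group = "B", id = names_B, date = dates_wanted, …

I am using the R BEST package to test for a difference in means between two groups, and I would like to set prior beliefs by group.
In the R code below, I can set the prior mean by group (see priors2), but I cannot set the prior standard deviation by group (see priors3).
Am I doing something wrong?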
library(BEST)
y1 <- c(5.77, 5.33, 4.59, 4.33, 3.66, 4.48)
y2 <- c(3.88, 3.55, 3.29, 2.59, 2.33, 3.59)
priors1 <- list(muM = 6, muSD = 2)
BESTout1 <- BESTmcmc(y1, y2, priors=priors1, parallel=FALSE)
priors2 <- list(muM = c(6, 4), muSD = 2)
BESTout2 <- BESTmcmc(y1, y2, priors=priors2, parallel=FALSE)
priors3 <- list(muM = c(6, 4), muSD = c(2, 2))
BESTout3 <- BESTmcmc(y1, y2, priors=priors3, parallel=FALSE)
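
I would like to understand how the R kknn package computes weights, distances, and class probabilities for a binary classification problem. In the R code below, there are 3 observations in the training sample and 1 observation in the holdout sample. The two predictors are height and weight. Using Euclidean distance, the distance from the holdout observation to each training observation is: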
sqrt((6-8)^2 + (4-5)^2) = 2.24
sqrt((6-3)^2 + (4-7)^2) = 4.24
sqrt((6-7)^2 + (4-3)^2) = 1.41
With k=3 and equal weights, I get a holdout probability of:
(1/3 * 1) + (1/3 * 0) + (1/3 * 1) = 0.67
With k=2 and equal weights, I get a holdout probability of:
(1/2 * 1) + (1/2 * 1) = 1.00
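The hand calculation above can be reproduced in base R as a sanity check. This is only a sketch of the unweighted arithmetic, not of kknn's internals. The holdout values (height 6, weight 4) are inferred from the distance formulas, because the holdout data frame is truncated in the question, and `knn_prob` is a hypothetical helper.

```r
training <- data.frame(class = c(1, 0, 1), height = c(8, 3, 7), weight = c(5, 7, 3))

# Assumed holdout point, inferred from the distance formulas in the question.
holdout <- c(height = 6, weight = 4)

# Euclidean distance from the holdout point to each training observation.
d <- sqrt((training$height - holdout[["height"]])^2 +
          (training$weight - holdout[["weight"]])^2)
round(d, 2)  # 2.24 4.24 1.41

# Equal-weight probability of class 1 among the k nearest neighbours.
knn_prob <- function(k) {
  nearest <- order(d)[seq_len(k)]
  mean(training$class[nearest] == 1)
}

round(knn_prob(3), 2)  # 0.67
knn_prob(2)            # 1
```

kknn standardizes the predictors by default (`scale = TRUE`), so the distances it uses internally may differ from these raw Euclidean values unless scaling is switched off.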
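I would like to understand how the R kknn package performs the same calculations with "triangular", "gaussian", and "inverse" weights (and more generally).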
library(kknn)
training <- data.frame(class = c(1, 0, 1), height = c(8, 3, 7), weight = c(5, 7, 3))
holdouts <- data.frame(class = 1, …