是否可以从库中的使用函数列中选择所有唯一值?用符号表示" " .data.frameselectdplyrSELECT DISTINCT field1 FROM table1SQL
谢谢!
我试图估计汉密尔顿(1989)的基本马尔可夫转换模型,如电子观点网页中的帖子.该模型本身就是RATS中现有的精确复制.
这是示例的时间序列:
gnp <-
structure(c(2.59316410021381, 2.20217123302681, 0.458275619103479,
0.968743815568942, -0.241307564718414, 0.896474791426144, 2.05393216767198,
1.73353647046698, 0.938712869506845, -0.464778333117193, -0.809834082445603,
-1.39763692441103, -0.398860927649558, 1.1918415768741, 1.4562004729396,
2.1180822079447, 1.08957867423914, 1.32390272784813, 0.87296368144358,
-0.197732729861307, 0.45420214345009, 0.0722187603196887, 1.10303634435563,
0.820974907499614, -0.0579579499110212, 0.584477722838197, -1.56192668045796,
-2.05041027007508, 0.536371845140342, 2.3367684244086, 2.34014568267516,
1.23392627573662, 1.88696478737248, -0.459207909351867, 0.84940472194713,
1.70139850766727, -0.287563102546191, 0.095946277449187, -0.860802907461483,
1.03447124467041, 1.23685943797014, 1.42004498680119, 2.22410642769683,
1.3021017302965, 1.0351769691057, 0.925342521818, -0.165599507925585,
1.3444381723048, 1.37500136316918, 1.73222186043569, 0.716056342342333,
2.21032138350616, 0.853330335823775, 1.00238777849592, 0.427254413549543,
2.14368353713136, 1.4378918561536, 1.5795993028646, 2.27469837381376,
1.95962653201067, 0.2599239932111, 1.01946919515563, 0.490163994319276,
0.563633789161385, 0.595954621290765, 1.43082852218349, 0.562301244017229,
1.15388388887095, 1.68722847001462, 0.774382052478202, -0.0964704476805431,
1.39600141863966, 0.136467982223878, 0.552237133917267, …Run Code Online (Sandbox Code Playgroud) 我使用R来估计4个时间序列的多变量GARCH(1,1)模型.我用rmgarch包试了一下.好像我使用它错了,但我不知道我的错误是什么.第一次使用.
library(quantmod)
library(fBasics)
library(rmgarch)
#load data, time series closing prices, 10 year sample
#DAX 30
getSymbols('^GDAXI', src='yahoo', return.class='ts',from="2005-01-01", to="2015-01-31")
GDAXI.DE=GDAXI[ , "GDAXI.Close"]
#S&P 500
getSymbols('^GSPC', src='yahoo', return.class='ts',from="2005-01-01", to="2015-01-31")
GSPC=GSPC[ , "GSPC.Close"]
#Credit Suisse Commodity Return Strat I
getSymbols('CRSOX', src='yahoo', return.class='ts',from="2005-01-01", to="2015-01-31")
CRSOX=CRSOX[ , "CRSOX.Close"]
#iShares MSCI Emerging Markets
getSymbols('EEM', src='yahoo', return.class='ts',from="2005-01-01", to="2015-01-31")
EEM=EEM[ , "EEM.Close"]
#calculating log returns of the time series
log_r1=diff(log(GDAXI.DE[39:2575]))
log_r2=diff(log(GSPC))
log_r3=diff(log(CRSOX))
log_r4=diff(log(EEM))
#return vector
r_t=cbind(log_r1, log_r2,log_r3, log_r4)
#specifying and fitting the model
model = multispec(replicate(4, ugarchspec(variance.model …Run Code Online (Sandbox Code Playgroud) 如果您有一个data.frame带有数字列的转换是没有问题的,如解释here。
dtf=data.frame(matrix(rep(5,10),ncol=2))
#str(dtf)
dtfz <- zoo(dtf)
class(dtfz)
#[1] "zoo"
str(as.data.frame(dtfz))
#'data.frame': 5 obs. of 2 variables:
# $ X1: num 5 5 5 5 5
# $ X2: num 5 5 5 5 5
Run Code Online (Sandbox Code Playgroud)
但是,如果您有一个data.frame带有文本列的所有内容都将转换为因子,即使在设置时stringsAsFactors = FALSE
dtf=data.frame(matrix(rep("d",10),ncol=2),stringsAsFactors = FALSE)
#str(dtf)
dtfz <- zoo(dtf)
#class(dtfz)
#dtfz
Run Code Online (Sandbox Code Playgroud)
以下所有将字符串转换为因子:
str(as.data.frame(dtfz))
str(as.data.frame(dtfz,stringsAsFactors = FALSE))
str(data.frame(dtfz))
str(data.frame(dtfz,stringsAsFactors = FALSE))
str(as.data.frame(dtfz, check.names=FALSE, row.names=NULL,stringsAsFactors = FALSE))
#'data.frame': 5 obs. of 2 variables:
# $ X1: Factor w/ 1 level …Run Code Online (Sandbox Code Playgroud) 如果您的data.frame因素不包含NAs作为级别,则可以毫无问题地过滤数据.
set.seed(123)
df=data.frame(a = factor(as.character(c(1, 1, 2, 2, 3, NA,3,NA)),exclude=NULL),
b= runif(8))
#str(df)
df[df$a==3,]
# a b
# 5 3 0.9404673
# 7 3 0.5281055
Run Code Online (Sandbox Code Playgroud)
如果您需要按NA级别进行过滤,则会出现问题.以下不起作用:
df[df$a==NA,]
df[df$a=="NA",]
df[is.na(df$a),]
Run Code Online (Sandbox Code Playgroud)
我发现的唯一方法是将因子转换为数字并将其与级别数进行比较.
df[as.numeric(df$a)==4,]
# a b
#6 <NA> 0.0455565
#8 <NA> 0.8924190
Run Code Online (Sandbox Code Playgroud)
有没有其他更直观/更优雅的方法来获得相同的结果?
I am conducting a log binomial regression in R. I want to control for covariates in the model (age and BMI- both continuous variables) whereas the dependent variable is Outcome(Yes or No) and independent variable is Group (1 or 2).
fit<-glm(Outcome~Group, data=data.1, family=binomial(link="log"))
Run Code Online (Sandbox Code Playgroud)
and it works fine.
When I try putting age in the model, it still works fine. However, when I put BMI in the model, it gives me the following:
Error: no valid set of coefficients has been …Run Code Online (Sandbox Code Playgroud) 我是 R/ggplot 的新手,我已经调查了 StackOverflow 的类似问题,但无济于事。对于家里的ggplot高手来说可能是个小问题,如果是这样,我期待一个快速的答案!
所以这里是:我试图从存储在数据框中的变量绘制 4 条曲线,同时使用颜色和线型。不幸的是,图例标签与曲线不匹配,这违背了图例的全部目的。
这是生成它的代码:
# declare variables
alpha = 0.5
m = 500
x = seq(m)
fdr_thresh = x/m*alpha
lvl_thresh = rep(alpha,m)
pvals = sin(2*pi*x/20)
pvalsA = exp(-x/100)*cos(2*pi*x/20)
# plot
df <- data.frame(pvals=pvals, pvalsA = pvalsA, FDR = fdr_thresh, level = lvl_thresh, x=x)
p4 <- ggplot(data = df) + geom_line(aes(x=x, y=pvals,color="Pure Sinusoid",linetype="Pure Sinusoid"))
p4 <- p4 + geom_line(aes(x=x, y=pvalsA,color="Damped Sinusoid",linetype="Damped Sinusoid"))
p4 <- p4 + geom_line(aes(x=x, y=FDR,color = 'FDR', linetype='FDR'))
p4 <- …Run Code Online (Sandbox Code Playgroud) 当您在Eviews中进行回归时,您会获得如下统计信息的面板:

在R中是否有一种方法可以在一个列表中获得关于R中的回归的所有/大部分统计数据?
我正在寻找一种简单的方法将至少有一个NA的所有行移动到dataframe/datatable的底部.例如 :
> df <- data.table(aaa=c(1,2,3,4,NA,6,7),
bbb=c(1,9,5,NA,3,NA,9),
ccc=c(NA,3,NA,4,8,NA,2)
)
> df
aaa bbb ccc
1: 1 1 NA
2: 2 9 3
3: 3 5 NA
4: 4 NA 4
5: NA 3 8
6: 6 NA NA
7: 7 9 2
Run Code Online (Sandbox Code Playgroud)
会变成这样的:
> df2 <- moveNAtoBottom(df)
> df2
aaa bbb ccc
1: 2 9 3
2: 7 9 2
3: 1 1 NA
4: 3 5 NA
5: 4 NA 4
6: NA 3 8
7: 6 NA NA …Run Code Online (Sandbox Code Playgroud) r ×9
dataframe ×2
eviews ×2
data.table ×1
dplyr ×1
filtering ×1
ggplot2 ×1
glm ×1
legend ×1
regression ×1
select ×1
time-series ×1
unique ×1
zoo ×1