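I am working in R. I have a series of coordinates in decimal degrees, and I want to sort them by the number of decimal places they have (that is, I want to discard coordinates with too few decimal places).
Is there a function in R that returns the number of decimal places a number has, which I could incorporate into a function of my own?
Example input: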
AniSom4 -17.23300000 -65.81700
AniSom5 -18.15000000 -63.86700
AniSom6 1.42444444 -75.86972
AniSom7 2.41700000 -76.81700
AniLac9 8.6000000 -71.15000
AniLac5 -0.4000000 -78.00000
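Ideally, I would write a script that discards AniLac9 and AniLac5, because those coordinates were not recorded with enough precision. I want to discard coordinates for which both longitude and latitude have fewer than 3 non-zero decimal places.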
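A minimal sketch of the kind of helper described above. As far as I know base R has no dedicated function for this, so the example formats each number with a fixed number of decimals and strips the trailing zeros. The function name `count_decimals`, the `max_digits` cap, and the column names `id`, `lat` and `lon` are assumptions made for illustration.

```r
# Count decimal places by formatting to a fixed width and stripping trailing zeros.
# Assumption: no coordinate carries more than `max_digits` meaningful decimals.
count_decimals <- function(x, max_digits = 8) {
  s <- formatC(abs(x), format = "f", digits = max_digits)  # e.g. "17.23300000"
  s <- sub("0+$", "", s)                                   # drop trailing zeros
  nchar(sub("^[0-9]*\\.", "", s))                          # length of the fractional part
}

# Sample rows from the question; the column names are hypothetical.
coords <- data.frame(
  id  = c("AniSom4", "AniSom5", "AniSom6", "AniSom7", "AniLac9", "AniLac5"),
  lat = c(-17.233, -18.15, 1.42444444, 2.417, 8.6, -0.4),
  lon = c(-65.817, -63.867, -75.86972, -76.817, -71.15, -78),
  stringsAsFactors = FALSE
)

# Discard rows where BOTH coordinates have fewer than 3 decimal places.
too_coarse <- count_decimals(coords$lat) < 3 & count_decimals(coords$lon) < 3
kept <- coords[!too_coarse, ]
print(kept)
```

Working on the formatted string rather than on the number avoids most floating-point surprises, at the cost of capping the precision that can be detected at `max_digits`.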
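
I have been trying to turn this problem into a data.table solution. (To keep things simple, I will use the same dataset.)
When V2 == "b", I want to swap the values of columns V1 and V3 (V1 <-> V3).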
dt <- data.table(V1=c(1,2,4), V2=c("a","a","b"), V3=c(2,3,1))
#V1 V2 V3
#1: 1 a 2
#2: 2 a 3
#3: 4 b 1
The code below is a working solution for a data.frame. However, after the frustration of using a data.table without realizing it, I am now determined to find a data.table solution.
dt <- data.table(V1=c(1,2,4), V2=c("a","a","b"), V3=c(2,3,1))
df <- as.data.frame(dt)
df[df$V2 == "b", c("V1", "V3")] <- df[df$V2 == "b", c("V3", "V1")]
# V1 V2 V3
#1 1 a 2
#2 2 a 3
#3 1 b 4
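One way the same swap might be written directly on a data.table is sketched below. It assumes the data.table package is installed and relies on `:=` evaluating its right-hand side before any column is overwritten.

```r
library(data.table)

dt <- data.table(V1 = c(1, 2, 4), V2 = c("a", "a", "b"), V3 = c(2, 3, 1))

# For rows where V2 == "b", assign V3 into V1 and V1 into V3 in one step.
# The right-hand side is evaluated before any column is overwritten,
# so no temporary copy of either column is needed.
dt[V2 == "b", c("V1", "V3") := .(V3, V1)]
print(dt)
```

The update happens by reference, so `dt` itself is modified and nothing needs to be reassigned.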
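I tried writing an lapply function that loops over a list of target swaps, tried narrowing the problem down to replacing a single value, and tried referring to the column names in different ways, all without success.
This is the closest attempt I have managed:
> dt[dt$V2 …

I used the following code: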
library(XML)
library(RCurl)

# Build a Google search URL for the given term.
getGoogleURL <- function(search.term, domain = '.co.uk', quotes = TRUE)
{
  search.term <- gsub(' ', '%20', search.term)
  if (quotes) search.term <- paste('%22', search.term, '%22', sep = '')
  # Return the URL directly instead of assigning it to a local variable.
  paste('http://www.google', domain, '/search?q=', search.term, sep = '')
}

# Download the results page and extract an attribute from every matching link.
getGoogleLinks <- function(google.url)
{
  doc <- getURL(google.url, httpheader = c("User-Agent" = "R(2.10.0)"))
  html <- htmlTreeParse(doc, useInternalNodes = TRUE, error = function(...){})
  nodes <- getNodeSet(html, "//a[@href][@class='l']")
  return(sapply(nodes, function(x) xmlAttrs(x)[[1]]))
}

search.term <- "cran"
quotes <- FALSE  # a logical value rather than the string "FALSE"
search.url <- getGoogleURL(search.term = search.term, quotes = quotes)
links <- getGoogleLinks(search.url)
I want to find all the links my search produces, but I get the following result:
> links
list()
How can I get the links? I would also like to get the headlines and summaries of the Google search results; how can I obtain them? Finally, is there a way to get the links in the ChillingEffects.org results?
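
To extract the mismatches between the two data frames below, I have managed to create a new data frame in which the mismatches are replaced.
What I need now is a list of the mismatches: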
dfA <- structure(list(animal1 = c("AA", "TT", "AG", "CA"), animal2 = c("AA", "TB", "AG", "CA"), animal3 = c("AA", "TT", "AG", "CA")), .Names = c("animal1", "animal2", "animal3"), row.names = c("snp1", "snp2", "snp3", "snp4"), class = "data.frame")
# > dfA
# animal1 animal2 animal3
# snp1 AA AA AA
# snp2 TT TB TT
# snp3 AG AG AG
# snp4 CA CA CA
dfB <- structure(list(animal1 = c("AA", "TT", "AG", "CA"), animal2 = c("AA", "TB", "AG", "DF"), animal3 = c("AA", …

For two days I have been trying to install OpenBLAS/ATLAS with LAPACK and use it from R. It is driving me crazy, and I have reached the point where I can no longer think straight.
My server runs:
Red Hat Enterprise Linux Server release 6.6 (Santiago)
Here is what I have installed so far:
[root@tpdb05 atlas]# yum install atlas.x86_64 blas.x86_64 lapack.x86_64
Loaded plugins: product-id, refresh-packagekit, rhnplugin, security, subscription-manager
Setting up Install Process
Package atlas-3.8.4-2.el6.x86_64 already installed and latest version
Package blas-3.2.1-4.el6.x86_64 already installed and latest version
Package lapack-3.2.1-4.el6.x86_64 already installed and latest version
[root@tpdb05 ruser]# yum install lapack.i686
Installed:
lapack.i686 0:3.2.1-4.el6
Dependency Installed:
blas.i686 0:3.2.1-4.el6 glibc.i686 0:2.12-1.166.el6_7.3 libgfortran.i686 0:4.4.7-16.el6
nss-softokn-freebl.i686 0:3.14.3-23.el6_7
Dependency Updated:
glibc.x86_64 0:2.12-1.166.el6_7.3 glibc-common.x86_64 0:2.12-1.166.el6_7.3 glibc-devel.x86_64 …

I still find the ifelse construct in R somewhat confusing. I have the following data frame:
df <- structure(list(snp = structure(1:11, .Label = c("AL0009", "AL00014", "AL0021", "AL00046", "AL0047", "AS0005", "AS0014", "AS00021", "AS0047", "AS0071", "DR0001" ), class = "factor"), CHROMOSOME = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), COUNT_ALLELE = structure(c(1L, 1L, 1L, 3L, 1L, 1L, 1L, 2L, 3L, 3L, 1L), .Label = c("A", "C", "G"), class = "factor"), OTHER_ALLELE = structure(c(3L, 3L, 2L, 1L, 3L, 2L, 2L, 1L, 1L, 1L, 3L), .Label = c("A", "C", "G"), class = "factor"), `116601888` = …

With my own dataset I have too many coefficients. I just want to print the summary without printing the coefficients, or while printing only some of them.
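One possible approach, sketched here under the assumption that picking individual components out of the `summary.lm` object is acceptable, is to print only the pieces of interest. The variable names `fit`, `s`, `n` and `top_coefs` are illustrative.

```r
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)
s <- summary(fit)

# Print selected parts of the summary instead of the full object.
cat("Residual standard error:", signif(s$sigma, 4),
    "on", s$df[2], "degrees of freedom\n")
cat("Multiple R-squared:", signif(s$r.squared, 4),
    " Adjusted R-squared:", signif(s$adj.r.squared, 4), "\n")

# Show only the first n rows of the coefficient table (n is arbitrary here).
n <- 1
top_coefs <- head(coef(s), n)
print(top_coefs)
```

Because `coef(s)` is an ordinary matrix, it can also be subset by row name when only particular terms matter.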
Example script:
lm.fit <- lm(iris$Sepal.Length ~ iris$Petal.Length)
summary(lm.fit)
Output:
> summary(lm.fit)
Call:
lm(formula = iris$Sepal.Length ~ iris$Petal.Length)
Residuals:
Min 1Q Median 3Q Max
-1.24675 -0.29657 -0.01515 0.27676 1.00269
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.30660 0.07839 54.94 <2e-16 ***
iris$Petal.Length 0.40892 0.01889 21.65 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4071 on 148 degrees of freedom
Multiple R-squared: 0.76, Adjusted R-squared: 0.7583
F-statistic: 468.6 on …

I have the following data:
> dput(bla)
structure(list(V1 = structure(c(4L, 4L, 4L, 2L), .Label = c("DDDD",
"EEEE", "NNNN", "PPPP", "ZZZZ"), class = "factor"), V2 = c(100014096L,
100014098L, 100014099L, 100014995L), V3 = c(0.742, 0.779, 0.744,
0.42), V4 = c(1.077, 1.054, 1.049, 0.984), V5 = c(0.662, 0.663,
0.671, 0.487), V6 = c(1.107, 1.14, 1.11, 0.849), V7 = c(0.456,
0.459, 0.459, 1.278)), .Names = c("V1", "V2", "V3", "V4", "V5",
"V6", "V7"), class = "data.frame", row.names = c(NA, 4L))
> bla
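V1 V2 V3 V4 V5 V6 V7 …

Because of the size of my dataset, I have to use speedlm, fastLm or biglm. Unfortunately I am stuck with speedlm, because fastLm has no update function and biglm only supports a single core.
With speedlm I want to display all the residuals. I know that with lm or fastLm I can simply use the residuals() function. However, it turns out that speedlm does not support this.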
lmfit <- speedglm(formula , res)
print(names(lmfit))
[1] "coefficients" "coef" "df.residual" "XTX" "Xy" "nobs" "nvar" "ok" "A" "RSS" "rank" "pivot" "sparse" "yy" "X1X" "intercept" "method" "terms" "call"
lmfit <- fastLm(formula, res)
print(names(lmfit))
[1] "coefficients" "stderr" "df.residual" "fitted.values" "residuals" "call" "intercept" "formula"
Is there a way to display all the residuals with speedlm?
Calling print(residuals(lmfit)) just prints NULL.
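If a fitted object exposes only its coefficients, the residuals can in principle be recomputed as the observed response minus the design matrix times the coefficient vector. The sketch below uses base `lm` on `iris` as a stand-in so the result can be checked against `residuals()`. Applying the same arithmetic to a `speedlm` fit is an assumption that has not been tested here.

```r
# Stand-in model: base lm, so the manual result can be verified.
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)

# Rebuild the design matrix from the same formula and data.
X <- model.matrix(Sepal.Length ~ Petal.Length, data = iris)

# Residual = observed response - fitted value (X %*% coefficients).
# Note: a single NA coefficient would turn every fitted value into NA.
manual_resid <- iris$Sepal.Length - drop(X %*% coef(fit))

# Compare with the residuals that lm stores itself.
print(all.equal(unname(manual_resid), unname(residuals(fit))))
```

For a dataset too large to hold in memory, the same subtraction would presumably have to be done chunk by chunk.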
Edit:
When using the approach mentioned by @Roland, it returns nothing but NAs.
lmfit <- speedlm(formula , res, …