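Posts by Gau*_*hal

Conditional formatting - color scale an entire row based on one column

Suppose I want to color an entire row based on the values in one column (using Excel's built-in color scale option in the Conditional Formatting menu). How can I achieve this? Please see the image below.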

excel vba conditional-formatting excel-vba

Gau*_*hal

2018 07-10

13
Score

3
Answers

30k
Views

R plots in Jupyter - unable to start png() device

I am using R in Jupyter, but I cannot draw plots in the notebook itself.

Here is a reproducible example:

set.seed(123)
mat = as.matrix(x = rnorm(100), y = rnorm(100))
plot(mat)

In Jupyter:

Error in png(tf, width, height, "in", pointsize, bg, res, antialias = antialias): unable to start png() device
Traceback:

If I use the following, I can save the image as a PNG file in the current working directory.

png('test.png')
plot(mat)
dev.off()

Edit:

sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)

I went through the following, but none of them solved my problem.

png r jupyter-notebook jupyter-irkernel

Gau*_*hal

2018 06-20

6
Score

1
Answers

3094
Views

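Optimizing R code to create a distance matrix based on a custom distance function

I am trying to create a distance matrix for strings based on a custom distance function (for clustering). I ran the code on a list of 6000 words, and after 90 minutes it is still running. I have 8 GB of RAM and an Intel i5, so the problem lies only in the code. Here is my code: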

library(stringdist)
#Calculate distance between two monograms/bigrams
stringdist2 <- function(word1, word2)
{
    #for bigrams - phrases with two words
    if (grepl(" ",word1)==TRUE) {
        #"Hello World" and "World Hello" are not so different for me
        d=min(stringdist(word1, word2),
        stringdist(word1, gsub(word2, 
                          pattern = "(.*) (.*)", 
                          repl="\\2,\\1")))
    }
    #for monograms(words)
    else{
        #add penalty of 5 points if first character is not same
        #brave and crave are more different than brave and bravery
        d=ifelse(substr(word1,1,1)==substr(word2,1,1),
                            stringdist(word1,word2),
                            stringdist(word1,word2)+5)
    }   
    d
}
#create distance matrix
stringdistmat2 …


string performance r edit-distance levenshtein-distance

Gau*_*hal

2015 09-02

5
Score

1
Answers

1137
Views

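Handling the byte order mark (BOM) in R

Sometimes a byte order mark (BOM) appears at the beginning of a .CSV file. The mark is not visible when you open the file in Notepad or Excel; however, when you read the file into R using various methods, you will see different symbols in the name of the first column. Here is an example.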

A sample CSV file with a BOM at the beginning:

ID,title,clean_title,clean_title_id
1,0 - 0,,0
2,"""0 - 1,000,000""",,0
27448,"20yr. rope walker
igger",Rope Walker Igger,1832700817

Reading it with read.csv from the base R package:

(x1 = read.csv("file1.csv",stringsAsFactors = FALSE))
#   ï..ID                raw_title        semi_clean semi_clean_id
# 1     1                    0 - 0                               0
# 2     2          "0 - 1,000,000"                               0
# 3 27448 20yr. rope walker\nigger Rope Walker Igger    1832700817

Reading it with fread from the data.table package:

(x2 = data.table::fread("file1.csv"))
#    ï»¿ID                raw_title        semi_clean semi_clean_id
# 1:     1                    0 - 0                               0
# 2:     2 …
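One commonly used way to deal with this in base R, sketched here as an assumption and not as the answer the question received, is to tell read.csv that the file is UTF-8 with a BOM via fileEncoding = "UTF-8-BOM". The temporary file and its two-column contents below are made up for illustration; they are not the file1.csv from the question.

```r
# Write a small CSV that starts with the three UTF-8 BOM bytes (EF BB BF).
# The file name and contents are hypothetical, for illustration only.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("ID,title\n1,a\n2,b\n"), con)
close(con)

# "UTF-8-BOM" asks R to drop the BOM while reading,
# so the first column name comes back clean.
x <- read.csv(tmp, fileEncoding = "UTF-8-BOM", stringsAsFactors = FALSE)
names(x)
```

With this encoding the first column should be named ID with no stray leading characters. Whether the other readers mentioned in the question need a similar setting depends on the package version.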


byte-order-mark r data.table read.csv readr

Gau*_*hal

2016 09-20

5
Score

1
Answers

3197
Views

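Merging all "occurrences" of a non-key variable

I have two datasets, and what I want could loosely be called an "outer join on a non-key variable".

Here are the datasets:

Dataset 1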

oc  oc2 state_id    r_state 
A011    A01 1808    1.00    
A011    A01 1810    0.50    
A012    A01 1810    0.50    
A011    A01 1814    0.33    
A012    A01 1814    0.33    
A013    A01 1814    0.33

Dataset 2

oc  r_country
A011    0.62
A012    0.14
A013    0.24

My desired output is as follows:

oc  oc2 state_id    r_state r_country
A011    A01 1808    1.00    0.62
A012    A01 1808    NA      0.14
A013    A01 1808    NA      0.24
A011    A01 1810    0.50    0.62
A012    A01 1810    0.50    0.14
A013    A01 1810    NA      0.24
A011    A01 1814    0.33 …
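The visible rows of the desired output suggest that every oc from dataset 2 should appear once for each (oc2, state_id) pair in dataset 1, with r_state set to NA where dataset 1 has no matching row. A minimal base R sketch under that assumption follows; the names d1, d2, states, grid and out are hypothetical, and this is not necessarily the approach the answer used.

```r
# The two datasets from the question, typed in by hand.
d1 <- data.frame(
  oc       = c("A011", "A011", "A012", "A011", "A012", "A013"),
  oc2      = "A01",
  state_id = c(1808, 1810, 1810, 1814, 1814, 1814),
  r_state  = c(1.00, 0.50, 0.50, 0.33, 0.33, 0.33),
  stringsAsFactors = FALSE
)
d2 <- data.frame(
  oc        = c("A011", "A012", "A013"),
  r_country = c(0.62, 0.14, 0.24),
  stringsAsFactors = FALSE
)

# Every distinct (oc2, state_id) pair, crossed with every row of d2.
# merge() with by = NULL returns the Cartesian product.
states <- unique(d1[, c("oc2", "state_id")])
grid   <- merge(states, d2, by = NULL)

# Left join r_state onto the full grid; missing combinations become NA.
out <- merge(grid, d1, by = c("oc", "oc2", "state_id"), all.x = TRUE)
out <- out[order(out$state_id, out$oc),
           c("oc", "oc2", "state_id", "r_state", "r_country")]
rownames(out) <- NULL
out
```

The same idea carries over to data.table (the question is tagged with it), where a cross join followed by a keyed join would play the role of the two merge calls.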


merge r outer-join data.table

Gau*_*hal

2017 12-20

4
Score

1
Answers

59
Views

Counting trailing zeros in R

How can I count the trailing zeros in a vector of strings? For example, if my string vector is:

x = c('0000','1200','1301','X230','9900')

The answer should be

> numZeros
[1] 4 2 0 1 2

I don't want to use multiple ifelse calls because I think a more elegant and faster solution should exist. I tried using the modulus, like this:

y = as.integer(x)
numZeros = (!(y%%10000))+(!(y%%1000))+(!(y%%100))+(!(y%%10))

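But this requires two conditions to hold:

The maximum length of the strings is fixed (which is true in my case), and
all strings in the vector can be converted to integers, which is not true in my case.

I then used the stringr package and created a solution, but it is very verbose.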

library(stringr)
numZeros = 
4*str_detect(x,"0000") + 
3*str_detect(x,"[1-9 A-Z]000") + 
2*str_detect(x,"[1-9 A-Z]{2}00") + 
str_detect(x,"[1-9 A-Z]{3}0")

Also, I couldn't figure out whether str_detect uses ifelse by looking at the definition of str_detect.

I found the same question here, but for Python. If this has already been answered for R, please provide a link.
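One compact base R possibility, given here as a sketch and not as the answer the question received, is to strip the trailing run of zeros with a regular expression and compare string lengths. It needs neither a fixed string length nor conversion to integer. The name num_zeros is chosen for illustration.

```r
x <- c("0000", "1200", "1301", "X230", "9900")

# "0+$" matches one or more zeros anchored at the end of the string.
# Removing that run and comparing lengths gives the number of trailing zeros;
# strings without a trailing zero are left unchanged, so the difference is 0.
num_zeros <- nchar(x) - nchar(sub("0+$", "", x))
num_zeros
# [1] 4 2 0 1 2
```

Both sub and nchar are vectorized, so no ifelse or explicit loop is involved.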

regex string r stringr

Gau*_*hal

lucky-day

2
Score

1
Answers

339
Views