The data frame AEbySOC contains two columns: a factor SOC with character levels and an integer count Count:
> str(AEbySOC)
'data.frame': 19 obs. of 2 variables:
$ SOC : Factor w/ 19 levels "","Blood and lymphatic system disorders",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Count: int 25 50 7 3 1 49 49 2 1 9 ...
One of the levels of SOC is an empty string:
> l = levels(AEbySOC$SOC)
> l[1]
[1] ""
I would like to replace the value of this level with a non-empty string, e.g. "Not specified". This does not work:
> library(plyr)
> revalue(AEbySOC$SOC, c(""="Not specified"))
Error: attempt to use zero-length variable name
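The error above seems to come from the `c(""=...)` call itself, since R rejects a zero-length argument name. A minimal sketch of one way around it, assuming it is acceptable to rename the level in place: assign to `levels()` directly so that no named vector is needed. The three-row data frame here is a hypothetical stand-in for the real `AEbySOC`, and the level "Cardiac disorders" is made up for illustration.

```r
# Toy stand-in for AEbySOC (hypothetical rows; the real data has 19 levels)
AEbySOC <- data.frame(
  SOC   = factor(c("", "Blood and lymphatic system disorders", "Cardiac disorders")),
  Count = c(25L, 50L, 7L)
)

# Rename the empty level by position in the levels vector;
# no zero-length name is involved, so the revalue() error is avoided
levels(AEbySOC$SOC)[levels(AEbySOC$SOC) == ""] <- "Not specified"

print(levels(AEbySOC$SOC))
```

Because only the level label changes, the underlying integer codes and the level order stay as they were.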
Neither does this:
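> AEbySOC$SOC[AEbySOC$SOC==""] …

I have a text string containing digits, letters, and spaces. Some of its substrings are month abbreviations. I want to perform a condition-based pattern replacement, i.e. surround a month abbreviation with spaces if and only if a given condition is met. For example, let the condition be: "preceded by a digit and followed by a letter".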
I tried the stringr package, but I could not work out how to combine the functions str_replace_all() and str_locate_all():
# Input:
txt = "START1SEP2 1DECX JANEND"
# Desired output:
# "START1SEP2 1 DEC X JANEND"
# (A) What I could do without checking the condition:
library(stringr)
patt_month = paste("(", paste(toupper(month.abb), collapse = "|"), ")", sep='')
str_replace_all(string = txt, pattern = patt_month, replacement = " \\1 ")
# "START1 SEP 2 1 DEC X JAN END"
# (B) But I actually only need replacements inside the condition-based bounds: …
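
One possible approach, sketched under the assumption that the condition means "a digit immediately before and a letter immediately after": express the bounds as a lookbehind and a lookahead around the month group, so that only the month itself is replaced. The sketch uses base R `gsub()` with `perl = TRUE` to stay self-contained; `patt_cond` and `res` are names introduced here for illustration.

```r
# Input string from the question
txt <- "START1SEP2 1DECX JANEND"

# Alternation of upper-case month abbreviations, wrapped in a capture group
patt_month <- paste0("(", paste(toupper(month.abb), collapse = "|"), ")")

# Assumed condition: digit right before the month, letter right after it.
# Lookarounds check the neighbours without consuming them.
patt_cond <- paste0("(?<=[0-9])", patt_month, "(?=[A-Za-z])")

# perl = TRUE enables lookbehind/lookahead in base R
res <- gsub(patt_cond, " \\1 ", txt, perl = TRUE)
print(res)
# "START1SEP2 1 DEC X JANEND"
```

The same pattern should also work as the `pattern` argument of `str_replace_all()`, since stringr's regex engine supports lookarounds, though that variant is not shown here.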