R: remove data frame rows whose entry contains only digits

Question


siu*_*shi 5 regex r filter dataframe dplyr

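I am reading a data frame from an online csv file, but the person who created the file accidentally entered some numbers in the column that should contain city names. A sample of the cities.data table: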

City        Population   Foo   Bar
Seattle     10           foo1  bar1
98125       20           foo2  bar2
Kent 98042  30           foo3  bar3
98042 Kent  30           foo4  bar4


Desired output after removing the rows whose City column contains only digits:

City        Population   Foo   Bar
Seattle     10           foo1  bar1
Kent 98042  30           foo3  bar3
98042 Kent  30           foo4  bar4


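I want to drop the rows whose City column contains only digits. Both Kent 98042 and 98042 Kent are fine because they contain a city name, but 98125 is not a city, so that row should be dropped.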

I can't use is.numeric because the numbers are read from the csv file as strings. I tried a regular expression,

cities.data <- cities.data[which(grepl("[0-9]+", cities.data) == FALSE)]


but this removes every row that contains any digit, not just the rows that contain only digits, e.g.

City        Population   Foo   Bar
Seattle     10           foo1  bar1


The "Kent 98042" row is removed even though I want to keep it. Any suggestions? Thanks in advance!

Answer 1

Jan*_*Jan 1

In plain R:

df <- data.frame(City = c('Seattle', '98125', 'Kent 98042'),
                 Population = c(10, 20, 30),
                 Foo = c('foo1', 'foo2', 'foo3'))
df2 <- df[-grep('^\\d+$', df$City),]
df2


This yields

        City Population  Foo
1    Seattle         10 foo1
3 Kent 98042         30 foo3


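The idea is to look for ^\d+$ (digits only) and remove the matching rows from the set. Note the anchors on both sides: without them the pattern would match any value that merely contains a digit.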
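As a variation on the same anchored pattern, here is a minimal sketch that indexes with a logical vector from grepl instead of negative positions from grep. It is not part of the original answer. The data frame name cities and its values are assumptions modelled on the question's sample table. One reason to consider this form: when no row matches, grep returns an empty vector and df[-grep(...), ] selects zero rows, while !grepl(...) keeps every row.

```r
# Sample data modelled on the question's table (illustrative values).
cities <- data.frame(
  City = c("Seattle", "98125", "Kent 98042", "98042 Kent"),
  Population = c(10, 20, 30, 30),
  Foo = c("foo1", "foo2", "foo3", "foo4"),
  stringsAsFactors = FALSE
)

# TRUE where the entire City value consists of digits only.
digits_only <- grepl("^[0-9]+$", cities$City)

# Keep every row that is not digits-only.
kept <- cities[!digits_only, ]
kept$City
# "Seattle"    "Kent 98042" "98042 Kent"

# Edge case: a data frame with no digits-only rows.
clean <- cities[cities$City != "98125", ]
n_logical  <- nrow(clean[!grepl("^[0-9]+$", clean$City), ])  # 3: all rows kept
n_negative <- nrow(clean[-grep("^[0-9]+$", clean$City), ])   # 0: all rows lost
```

Both "Kent 98042" and "98042 Kent" survive because the anchors require the whole string to be digits, which is the behaviour the question asks for.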
