我想从我的数据框中删除一些行.我认为使用subset它将是最简单的方法.
我之前使用以下代码删除了一些行:
data_selected <- subset(tbl_data, Name.x != "XXX" & Name.y != "YYY")
Run Code Online (Sandbox Code Playgroud)
问题是如何从我的表中删除两个单元格(同一行)中具有相同字符串的行.
我认为这mtcars可以作为一个例子:
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 …Run Code Online (Sandbox Code Playgroud) 前段时间我做了类似的线程但不幸的是我在那里使用的方法并没有给我任何有希望的结果.我想到了如何以不同的方式做到这一点.所以我在这里.
当然示例数据:
structure(list(Name1 = c("Mazda RX4", "Mazda RX4", "KIA Ceed",
"Mazda RX4 Wag", "KIA Ceed", "Valiant", "KIA Classic", "Mazda RX4",
"Dacia", "Merc 280", "Duster 360", "Merc 230"), Name2 = c("Mazda RX4 Wag",
"Merc 230", "KIA Sport", "Merc 230", "KIA Classic", "Merc 230",
"KIA Sport", "Merc 240D", "Mazda RX4 Wag", "Merc 450SE", "Valiant",
"Duster 360")), .Names = c("Name1", "Name2"), class = "data.frame", row.names = c(NA,
12L))
Run Code Online (Sandbox Code Playgroud)
该数据框仅包含两列.原始数据有更多,但这次我将只关注那些列.
只是为了表明我想如何将这些人聚集在一起,我会提出所需的输出:
structure(list(Name1 = c("Mazda RX4", "Mazda RX4", "KIA Ceed",
"Mazda RX4 Wag", …Run Code Online (Sandbox Code Playgroud) 我有这样的数据:
dput(tbl_data[1:5])
structure(list(Name = c("Mark", "Anders", "Tom", "Vin", "Marcel",
"Tyta", "Gerta", "Moses", "Hank", "Rita", "Margary"), Col = c(1769380097.5,
1444462500, 1499146687.5, 1276309375, 22279500, 3114023471, 2961012500,
3978937423.5, 1703925000, 1838885550, 1483386250), dKO1 = c(1534931323.07692,
1794881375, 2292661687.5, 855786250, 21915500, 3056061512.25,
3581940000, 3766909703.25, 2043300000, 2135859875, 1482031250
), dKO2 = c(1628137500, 1781982737.5, 1659391250, 741220687.5,
41242000, 2833327766.38514, 3675450000, 3592650662.5, 1586512500,
1934575000, 1467271250), sdi1 = c(1545572702.88461, 1748600000,
1745026687.5, 1556481250, NaN, 3551716021.25, 3108137500, 3718036445,
1380278750, 2217526000, 1026813750)), .Names = c("Name", "Col",
"dKO1", "dKO2", "sdi1"), row.names = c(29L, 30L, 1278L, …Run Code Online (Sandbox Code Playgroud) 假设我有4个向量:
a <- c("Mark","Kate","Greg", "Mathew")
b <- c("Mark","Tobias","Mary", "Mathew", "Greg")
c <- c("Mary","Chuck","Igor", "Mathew", "Robin", "Tobias")
d <- c("Kate","Mark","Igor", "Greg", "Robin", "Mathew")
Run Code Online (Sandbox Code Playgroud)
我想从这些向量中选择重叠名称,并假设名称必须出现在这4个向量中的至少3个中.当然,我希望能够轻松地使用名称必须存在的向量百分比.
我能intersect以某种方式修改吗?
我有四个数据集,我想用它来创建条形图.我必须为昨天制作那些条款......所以如果答案能让我满意的话,我可以保证创造一个价值100分的赏金.谢谢!让我们首先向您展示数据:
data1 <- structure(c(1.071378962, 0.845918605, 0.943329547, 0.814648308, 1.190586037, 0.842555028, 1.006615522, 0.943341723, 0.970762595, 0.842846286, 0.554258129, 0.87289711, 0.890745502, 0.941635877, 0.999470449, 0.973126826, 0.857023562, 0.868671406, 0.959087679, 0.931290542, 1.240037136, 1.262066016, 0.776554623, 1.032276723, 1.39086975, 0.89611471, 1.022911942, 0.99951195, 0.943864517, 1.021282723, 0.997836642, 1.095591277, 0.851299889, 1.066302784, 1.049684459, 0.751507841, 0.698681059, 1.12356928, 1.11429691,1.377308152, 1.075640032, 1.011348603, 0.712689025, 1.160469927, 0.995125559, 1.00048805, 1.090401663, 1.10721464, 0.931480983, 0.941864276, 0.876112393, 1.018348149, 1.082099793, 1.006545876, 1.233818591, 1.053432293, 1.088147123, 1.421639524, 1.20566246, 1.234880201),
.Dim = c(6L, 10L), .Dimnames = list(c("Glutamic acid (3TMS)", "Glutamine (3TMS)", "Glutamine [-H2O] (3TMS) MP", "Glycine (3TMS)", "Homoserine …Run Code Online (Sandbox Code Playgroud) 我想使用以下数据来创建可视化:
> dput(data)
structure(c(1264L, 2190L, 2601L, 1441L, 1129L, 2552L, 1820L, 306L,
1124L, 298L, 233L, 493L, 2316L, 461L, 294L, 238L, 15L, 89L, 152L,
163L, 116L, 60L, 80L, 31L, 27L, 41L, 33L, 58L, 55L, 89L, 7L,
7L, 13L, 23L, 54L), .Dim = c(7L, 5L), .Dimnames = list(c("Mark1",
"Mark2", "Greg1", "Greg2", "Tom1", "Tom2", "Martin"), c("1", "2", "3", "4",
"5")))
Run Code Online (Sandbox Code Playgroud)
我希望以类似于下图所示的条形图的方式对其进行可视化:
但是,我不想将条形堆叠在一起,而是将条形图彼此放置.总共应该有4个窗口.这里的窗口指的是一组两个条形图(例如,对于Mark1和Mark2)或者在Martin的情况下是单个条形图.
为了更清楚:
Mark1和Mark2 - 应位于同一窗口中/上
Greg1和Greg2 - 如上所述
Tom1和Tom2 - 如上所述
马丁 - 窗口由单个barplot组成
那是一个数据:
structure(list(Fasta.headers = c("Person01050.1", "Person01080.1",
"Person01090.1", "Person01100.4", "Person01140.1", "Person01220.1"),
ToRemove.Gr_1 = c(0, 1107200, 17096000, 0, 0, 0), ToRemove.Gr_10 = c(0,
37259000, 1104800000, 783870, 0, 1308600), ToRemove.Gr_11 = c(1835800,
53909000, 623960000, 0, 0, 0), ToRemove.Gr_12 = c(0, 19117000,
808600000, 0, 0, 719400), ToRemove.Gr_13 = c(2544200, 2461400,
418770000, 0, 0, 0), ToRemove.Gr_14 = c(5120400, 1373700,
117330000, 0, 0, 0), ToRemove.Gr_15 = c(6623500, 0, 73336000,
0, 0, 0), ToRemove.Gr_16 = c(0, 0, 31761000, 0, 0, 0), ToRemove.Gr_17 = c(13475000,
0, 29387000, 0, 0, …Run Code Online (Sandbox Code Playgroud) 我想连接两个向量。我已经知道怎么做,但只有一个问题 - 订单。
所以,我使用下面的函数来链接向量,但输出不是我想要的:
var_names <- c("Tim", "Jack", "Tom")
var_tp<- c("0", "40", "-u")
vec <- paste(rep(var_names, 3), var_tp, sep="_")
Run Code Online (Sandbox Code Playgroud)
输出:
> vec
[1] "Tim_0" "Jack_40" "Tom_-u" "Tim_0" "Jack_40" "Tom_-u" "Tim_0" "Jack_40" "Tom_-u"
Run Code Online (Sandbox Code Playgroud)
我想要的输出是保持原始向量中的顺序,优先names于tp.
期望的输出:
> vec
[1] "Tim_0" "Tim_40" "Tim_-u" "Jack_0" "Jack_40" "Jack_-u" "Tom_0" "Tom_40" "Tom_-u"
Run Code Online (Sandbox Code Playgroud) 所以我对这个问题有类似的问题: 删除R中的重复行
在我的情况下,我想保留所有列(不像它建议使用unique前3列的功能).我想从数据框中只考虑2列,如果两个提到的列中的"值"相同,则只保留1行.
数据看起来像:
structure(list(P1 = structure(c(1L, 1L, 3L, 3L, 5L, 5L, 5L, 5L,
4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 2L, 2L), .Label = c("Apple",
"Grape", "Orange", "Peach", "Tomato"), class = "factor"), P2 = structure(c(4L,
4L, 3L, 3L, 5L, 5L, 5L, 5L, 6L, 6L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 6L, 6L), .Label = c("Banana", "Cucumber", "Lemon", "Orange",
"Potato", "Tomato"), class = "factor"), P1_location_subacon = structure(c(NA,
NA, 1L, 1L, 1L, 1L, 1L, …Run Code Online (Sandbox Code Playgroud) 这就是我的数据的样子:
> dput(data)
structure(list(Name = c("Mark", "Tere", "Marcus", "Heidi", "Georg",
"Tieme", "Joan", "Tytus", "Mark", "Tere", "Marcus", "Heidi",
"Georg", "Tieme", "Joan", "Tytus", "Mark", "Tere", "Marcus",
"Heidi", "Georg", "Tieme", "Joan", "Tytus", "Mark", "Tere", "Marcus",
"Heidi", "Georg", "Tieme", "Joan", "Tytus", "Mark", "Tere", "Marcus",
"Heidi", "Georg", "Tieme", "Joan", "Tytus", "Mark", "Tere", "Marcus",
"Heidi", "Georg", "Tieme", "Joan", "Tytus"), position = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, …Run Code Online (Sandbox Code Playgroud)