How to sample a specific proportion of lines from a big file in R?

Question

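I have a huge file of coordinates, about 125 million lines. I want to sample these lines to get 1% of them so that I can plot them. Is there a way to do this in R? The file is very simple: it has only 3 columns, and I am only interested in the first two. A sample of the file looks like this: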

Any help or pointers would be highly appreciated.

Answer 1

Jil*_*ina 1

As far as I understand your question, this may help:

> set.seed(1)
> big.file <- matrix(rnorm(1e3, 100, 3), ncol=2) # simulating your big data (500 rows x 2 columns)
>
> # choosing 1% of the rows at random, without replacement
> n <- nrow(big.file)
> one.percent <- big.file[sample(n, round(0.01 * n)), ]
> one.percent
          [,1]      [,2]
[1,]  99.40541 106.50735
[2,]  98.44774  98.53949
[3,] 101.50289 102.74602
[4,]  96.24013 104.97964
[5,] 101.67546 102.30483

Then you can plot it:

>  plot(one.percent)
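
The example above samples from a matrix that is already in memory. With about 125 million lines, loading the whole file first may not be practical, so one possible alternative is to read the file in chunks and keep each line with probability 0.01. What follows is only a minimal sketch under some assumptions: the file is taken to be whitespace-separated text with no header row (its actual layout is not shown in the question), and `sample_lines`, `prop`, and `chunk_size` are hypothetical names introduced here for illustration.

```r
# Hypothetical helper: keep each line with probability `prop`, reading the
# file chunk by chunk so it never has to fit in memory all at once.
sample_lines <- function(path, prop = 0.01, chunk_size = 100000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  kept <- list()
  repeat {
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0L) break
    # One independent uniform draw per line; roughly prop * N lines survive.
    kept[[length(kept) + 1L]] <- chunk[runif(length(chunk)) < prop]
  }
  as.character(unlist(kept))
}

# Small demonstration on a simulated 3-column file (not the real data).
set.seed(1)
tmp <- tempfile(fileext = ".txt")
demo <- data.frame(x = rnorm(10000, 100, 3),
                   y = rnorm(10000, 100, 3),
                   z = 1:10000)
write.table(demo, tmp, row.names = FALSE, col.names = FALSE)

sampled <- sample_lines(tmp, prop = 0.01, chunk_size = 1000L)

# Assumption: whitespace-separated columns; keep only the first two.
coords <- read.table(text = sampled)[, 1:2]
# plot(coords) would then draw the sampled points.
```

Because every line is drawn independently, this returns approximately 1% of the lines rather than exactly 1%. If an exact count is required, the total number of lines has to be known first so that row indices can be drawn with `sample()`, as in the matrix example.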
