Posts by Yan*_*ang

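How can I clearly label points in a simple ggplot2 scatter plot?

See the examples at http://had.co.nz/ggplot2/geom_text.html; they look terrible. The labels overlap each other, run outside the plot area, and so on.

I thought directlabels might help, but it doesn't: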

direct.label(qplot(wt,mpg,data=mtcars,colour=rownames(mtcars)))

Positioning each label by hand is tedious. I'm hoping there is something that makes the labels more usable. Is there anything that might fit the bill?

r ggplot2

Yan*_*ang

2011-06-20

7 votes, 1 answer, 1200 views

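R memory management advice (caret, model matrices, data frames)

I'm running out of memory on an ordinary 8GB server while working with a fairly small dataset in a machine learning context: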

> dim(basetrainf) # this is a dataframe
[1] 58168   118

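The only pre-modeling step I take that significantly increases memory consumption is converting the data frame to a model matrix. This is because caret, cor, etc. only work with (model) matrices. Even after removing factors with many levels, the matrix (mergem below) is fairly large. (sparse.model.matrix/Matrix is poorly supported in general, so I can't use it.)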

> lsos()
                 Type      Size PrettySize   Rows Columns
mergem         matrix 879205616   838.5 Mb 115562     943
trainf     data.frame  80613120    76.9 Mb 106944     119
inttrainf      matrix  76642176    73.1 Mb    907   10387
mergef     data.frame  58264784    55.6 Mb 115562      75
dfbase     data.frame  48031936    45.8 Mb  54555     115
basetrainf data.frame  40369328    38.5 Mb  58168     118
df2        data.frame  34276128    32.7 Mb  54555     103
tf         data.frame  33182272    31.6 …
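As a rough cross-check of the lsos() figure for mergem, here is a minimal sketch of the arithmetic, assuming the model matrix is stored as a dense matrix of 8-byte doubles (an assumption; the post does not state the storage mode). It is written in Python purely for the calculation, and dense_matrix_bytes is a hypothetical helper, not part of any library.

```python
def dense_matrix_bytes(rows, cols, bytes_per_cell=8):
    """Estimate the payload size of a dense numeric matrix.

    Assumes every cell is a fixed-width value (8 bytes for a double)
    and ignores attributes such as row and column names.
    """
    return rows * cols * bytes_per_cell


# Dimensions of mergem as reported by lsos() in the post.
size = dense_matrix_bytes(115562, 943)
print(size, "bytes")
print(round(size / 2**20, 1), "MiB")
```

Under that assumption the cells alone come to roughly 831 MiB, slightly below the 838.5 Mb that lsos() reports; the remainder is plausibly attributes such as dimnames. In other words, the size follows directly from the dense rows-by-columns layout rather than from anything unusual in the data.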

memory r

Yan*_*ang

2011-06-24

6 votes, 2 answers, 5139 views

Is there any way to *not* use server-side prepared statements in Postgresql?

In (say) Python, I can issue:

psycopg2.connect(...).cursor().execute("select * from account where id='00100000006ONCrAAO'")

This results in the following log entry on the server:

2011-07-18 18:56:08 PDT LOG:  duration: 6.112 ms  statement: select * from account where id='00100000006ONCrAAO'

However, in Java, issuing:

conn.createStatement().executeQuery("select * from account where id = '00100000006ONCrAAO'");

results in:

2011-07-18 18:44:59 PDT LOG:  duration: 4.353 ms  parse <unnamed>: select * from account where id = '00100000006ONCrAAO'
2011-07-18 18:44:59 PDT LOG:  duration: 0.230 ms  bind <unnamed>: select * from account where id = '00100000006ONCrAAO'
2011-07-18 18:44:59 PDT LOG:  duration: 0.246 ms  execute <unnamed>: select * …

postgresql jdbc prepared-statement

Yan*_*ang

lucky-day

6 votes, 1 answer, 3548 views

Understanding the "type arguments do not conform to type parameter bounds" error in Scala

Why don't the following work?

scala> abstract class Foo[B<:Foo[B]]
defined class Foo

scala> class Goo[B<:Foo[B]](x: B)
defined class Goo

scala> trait Hoo[B<:Foo[B]] { self: B => new Goo(self) }
<console>:9: error: inferred type arguments [Hoo[B] with B] do not conform to class Goo's type parameter bounds [B <: Foo[B]]
       trait Hoo[B<:Foo[B]] { self: B => new Goo(self) }
                                         ^

scala> trait Hoo[B<:Foo[B]] extends Foo[B] { new Goo(this) }
<console>:9: error: inferred type arguments [Hoo[B]] do not conform to class Goo's type parameter bounds [B …

scala bounded-quantification

Yan*_*ang

2014-07-08

6 votes, 1 answer, 2669 views

Difference in R .libPaths() between RStudio and command-line R

When I run R from the command line:

> library(ggplot2)
...
> path.package('ggplot2')
[1] "/home/yang/R/x86_64-pc-linux-gnu-library/2.13/ggplot2"
> .libPaths()
[1] "/home/yang/R/x86_64-pc-linux-gnu-library/2.13"
[2] "/usr/local/lib/R/site-library"                
[3] "/usr/lib/R/site-library"                      
[4] "/usr/lib/R/library"                           
> Sys.getenv('R_LIBS_USER')
[1] "~/R/x86_64-pc-linux-gnu-library/2.13"

(Note: that environment variable doesn't actually exist when I check from my shell.)

But from RStudio Server running on the same box, logged in as the same user:

> path.package('ggplot2')
[1] "/home/yang/R/library/ggplot2"
> .libPaths()
[1] "/home/yang/R/library"              "/usr/local/lib/R/site-library"    
[3] "/usr/lib/R/site-library"           "/usr/lib/R/library"               
[5] "/usr/lib/rstudio-server/R/library"
> Sys.getenv('R_LIBS_USER')
[1] "/home/yang/R/library"

Can you explain why these differ by default? Is this an RStudio customization? (If so, why?) Thanks in advance.

r rstudio

Yan*_*ang

lucky-day

6 votes, 2 answers, 5686 views

sklearn random forest does not parallelize

I'm using sklearn 0.16 on Ubuntu 12.04 and running:

from sklearn.ensemble import RandomForestClassifier
import numpy as np
X=np.random.rand(5000,500)
y=(np.random.rand(5000).round())
RandomForestClassifier(n_jobs=10,n_estimators=1000).fit(X,y)

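However, it doesn't use up my cores, and it takes the same amount of time as with n_jobs=1. Any ideas on how to debug what is going on here?

This screenshot shows some other busy things running, but htop consistently shows CPUs available: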

[screenshot of htop; image not included]
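One thing that may be worth ruling out when debugging this (an assumption, not something the post confirms) is a restricted CPU affinity mask: if the process is only allowed to run on a single core, the n_jobs workers end up sharing it even though htop shows idle CPUs. A minimal sketch of that check follows; usable_cpus is a hypothetical helper, and os.sched_getaffinity only exists on some platforms such as Linux, hence the fallback.

```python
import os


def usable_cpus():
    """Return how many CPUs the current process is allowed to run on."""
    # os.sched_getaffinity is platform-specific (available on Linux).
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback: total CPU count, which ignores any affinity restriction.
    return os.cpu_count() or 1


if __name__ == "__main__":
    total = os.cpu_count() or 1
    allowed = usable_cpus()
    print(f"total CPUs: {total}, usable by this process: {allowed}")
    if allowed < total:
        print("affinity mask is restricted; parallel workers may share cores")
```

Running this in the same interpreter session after importing numpy and sklearn shows whether the imports changed the mask. If the usable count matches the total, affinity is probably not the cause and the next step would be to look elsewhere, for example at how the workers are being spawned.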

scikit-learn

Yan*_*ang

2015-03-29

6 votes, 1 answer, 2238 views

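In Scala, how to work around the inefficient intermediate Stream of TraversableLike.toIterator

(Gen)TraversableOnce.toIterator is overridden in TraversableLike as toStream.iterator, which causes an intermediate stream to be created.

As a simple example, say I'm trying to implement a simple izip utility that always coerces its arguments to iterators before calling zip, so as to yield an efficient iterator over both collections.

The following is inefficient (due to the intermediate stream):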

def izip[A,B](xs: TraversableOnce[A], ys: TraversableOnce[B]) =
  xs.toIterator zip ys.toIterator

and must be expanded to:

def izip[A,B](xs: Iterable[A], ys: Iterable[B]) =
  xs.iterator zip ys.iterator
def izip[A,B](xs: Iterator[A], ys: Iterable[B]) =
  xs zip ys.iterator
def izip[A,B](xs: Iterable[A], ys: Iterator[B]) =
  xs.iterator zip ys
def izip[A,B](xs: Iterator[A], ys: Iterator[B]) =
  xs zip ys
// .. and more needed to handle Traversables as well

Is there a better way?

collections scala

Yan*_*ang

2011-07-14

5 votes, 1 answer, 252 views

How do I specify a Postgresql schema in ScalaQuery?

I tried, for instance:

object WebCache extends Table[(...)]("myschema.mytable") {
  ...
}

But that doesn't work.

scalaquery

Yan*_*ang

lucky-day

5 votes, 1 answer, 1197 views

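How do I convert a `raw` to an integer vector in R?

I need to pass raw data back to Java via JRI, but it doesn't support raw, only various other vector types (e.g. integer). How do I convert a raw (byte vector) into an integer vector?

I tried passing the data back as a vector of strings, but that breaks because JRI doesn't decode the strings properly (e.g. '\x89' is discarded as "").

It would also be nice if the conversion were more efficient (unboxed). as.integer doesn't work: it doesn't return the byte values of the characters in the array, not to mention that rawToChar produces "" for nuls.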

Yan*_*ang

2011-08-24

5 votes, 2 answers, 4501 views

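Something like a Scala view bound that works with subtypes?

Is there anything in Scala that is like a view bound but can match subtypes?

Since views in Scala don't chain, I currently have the following: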

implicit def pimpIterable[A, I[_]](x: I[A])(implicit f: I[A] => Iterable[A]) =
  new { def mylength = x.size }

which lets me write:

Array(1,2,3).mylength
Seq(1,2,3).mylength

The form above seems necessary, because if I try to simplify my function signature to:

implicit def pimpIterable[A, I <% Iterable[A]](x: I) =
  new { def mylength = x.size }

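then the implicit conversion doesn't work for Arrays, because there is no direct view from Array to Iterable (only to a subclass of Iterable, which the first form is able to find).

This also forces all the other shorthands to be written out in long form. What could have been: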

implicit def pimpIterable[A: Scalar, I <% Iterable[A]](x: I) = ...

must now be written as:

implicit def pimpIterable[A, I[_]](x: I[A])(implicit f: I[A] => Iterable[A], m: Scalar[A]) = ...

Is there a better way?

scala

Yan*_*ang

2011-10-31

5 votes, 1 answer, 348 views

Tag statistics

r ×4

scala ×3

bounded-quantification ×1

collections ×1

ggplot2 ×1

jdbc ×1

memory ×1

postgresql ×1

prepared-statement ×1

rstudio ×1

scalaquery ×1

scikit-learn ×1
