I have fitted a model where:
Y~A + A ^ 2 + B + mixed.effect(C)
Y is continuous
A is continuous
B actually refers to DAY and currently looks like this:
Levels: 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 11 < 12
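I can easily change the data type, but I am not sure whether it is more appropriate to treat B as numeric, as a factor, or as an ordered factor. I am also not quite sure how to interpret the output when B is treated as numeric or as an ordered factor.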
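For context on the ordered-factor case: R's default contrasts for an ordered factor are orthogonal polynomials (`contr.poly`), which is where coefficient names ending in `.L` (linear) and `.Q` (quadratic) come from. By default those contrasts treat the levels as equally spaced, so day 11 counts as one step after day 9 even though day 10 is missing. The sketch below is in Python rather than R and is only an illustration. The helper `poly_contrasts` is hypothetical, not part of any library. It builds the linear and quadratic columns by Gram-Schmidt orthogonalisation, once with equally spaced scores and once with the actual day values.

```python
import math

def poly_contrasts(scores, degree=2):
    """Return orthonormal polynomial contrast columns (linear, quadratic, ...).

    Hypothetical helper for illustration: powers of the centered scores are
    orthogonalised against all lower-degree columns and scaled to unit length.
    """
    n = len(scores)
    mean = sum(scores) / n
    centered = [s - mean for s in scores]
    basis = [[1.0] * n]  # constant column, used only for orthogonalisation
    for power in range(1, degree + 1):
        col = [c ** power for c in centered]
        for prev in basis:
            coef = sum(a * b for a, b in zip(col, prev)) / sum(b * b for b in prev)
            col = [a - coef * b for a, b in zip(col, prev)]
        norm = math.sqrt(sum(a * a for a in col))
        basis.append([a / norm for a in col])
    return basis[1:]  # drop the constant column

days = [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12]  # levels from the question

# Equally spaced scores 1..11: the spacing an ordered factor assumes by default
equal_linear, equal_quadratic = poly_contrasts(list(range(1, len(days) + 1)))

# Scores taken from the actual day values, so the gap at day 10 is respected
day_linear, day_quadratic = poly_contrasts(days)

for day, a, b in zip(days, equal_linear, day_linear):
    print(f"day {day:2d}  linear (equal spacing) {a: .4f}  linear (day scores) {b: .4f}")
```

Read this way, a `B.L` estimate describes a linear trend across the ordered levels and `B.Q` a curvature, both on the scale of these unit-length columns rather than per day. That differs from a numeric B, whose single slope is the change in Y per one-day increase.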
When B is treated as an ordered factor, summary(my.model) outputs something like this:
Linear mixed model fit by REML ['lmerMod']
Formula: Y ~ A + I(A^2) + B + (1 | mixed.effect.C)
Fixed effects:
Estimate Std. Error t value
(Intercept) 19.04821 0.40926 46.54
A -151.01643 7.19035 -21.00
I(A^2) 457.19856 31.77830 14.39
B.L -3.00811 0.29688 -10.13
B.Q -0.12105 0.24561 …

I am adapting an existing Perl script proposed here: Fast alternative to grep -f
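I need to filter many very large files (map files), each roughly 10 million lines long by 5 fields wide, using a long list (the filter file), and print the lines of the map file that match. I tried grep -f, but it simply took too long. I read that this approach would be faster.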
This is what my files look like:
Filter file:
DB775P1:276:C2R0WACXX:2:1101:10000:77052
DB775P1:276:C2R0WACXX:2:1101:10003:51920
DB775P1:276:C2R0WACXX:2:1101:10004:36433
DB775P1:276:C2R0WACXX:2:1101:10004:57256
Map file:
DB775P1:276:C2R0WACXX:2:1101:10000:70401 chr5 21985760 21985780 -
DB775P1:276:C2R0WACXX:2:1101:10000:77052 chr18 14723904 14723924 -
DB775P1:276:C2R0WACXX:2:1101:10000:77052 chr18 14745586 14745606 -
DB775P1:276:C2R0WACXX:2:1101:10000:77052 chr4 7944241 7944261 -
DB775P1:276:C2R0WACXX:2:1101:10000:77052 chr4 8402856 8402876 +
DB775P1:276:C2R0WACXX:2:1101:10000:77052 chr8 10864708 10864728 +
DB775P1:276:C2R0WACXX:2:1101:10002:88487 chr17 5681227 5681249 -
DB775P1:276:C2R0WACXX:2:1101:10004:74842 chr13 2569168 2569185 +
DB775P1:276:C2R0WACXX:2:1101:10004:74842 chr14 13253418 13253435 -
DB775P1:276:C2R0WACXX:2:1101:10004:74842 chr14 13266344 13266361 -
I would like the output lines to look like this, since they contain a string that appears in both the map and filter files.
DB775P1:276:C2R0WACXX:2:1101:10000:77052 chr18 14723904 14723924 -
DB775P1:276:C2R0WACXX:2:1101:10000:77052 chr18 14745586 14745606 -
DB775P1:276:C2R0WACXX:2:1101:10000:77052 chr4 7944241 7944261 -
DB775P1:276:C2R0WACXX:2:1101:10000:77052 …
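
The filtering described above can be sketched as a set lookup. This is a minimal Python illustration, not the Perl script being adapted. It assumes the read ID is always the first whitespace-delimited field of a map line, that IDs must match exactly, and that the whole filter list fits in memory. The function names `load_ids` and `matching_lines` are made up for this example, and the in-memory sample rows stand in for the real files.

```python
import io

def load_ids(filter_lines):
    """Collect the non-empty IDs from the filter file into a set."""
    return {line.strip() for line in filter_lines if line.strip()}

def matching_lines(map_lines, ids):
    """Yield map lines whose first whitespace-delimited field is in ids."""
    for line in map_lines:
        fields = line.split(None, 1)
        if fields and fields[0] in ids:
            yield line

# Small in-memory stand-ins for the real files (rows taken from the samples above)
filter_file = io.StringIO(
    "DB775P1:276:C2R0WACXX:2:1101:10000:77052\n"
    "DB775P1:276:C2R0WACXX:2:1101:10003:51920\n"
)
map_file = io.StringIO(
    "DB775P1:276:C2R0WACXX:2:1101:10000:70401 chr5 21985760 21985780 -\n"
    "DB775P1:276:C2R0WACXX:2:1101:10000:77052 chr18 14723904 14723924 -\n"
    "DB775P1:276:C2R0WACXX:2:1101:10002:88487 chr17 5681227 5681249 -\n"
)

ids = load_ids(filter_file)
for line in matching_lines(map_file, ids):
    print(line, end="")
```

With real files, the `io.StringIO` objects would be replaced by `open(...)` handles. The map file is read one line at a time, so only the set of filter IDs has to be held in memory, and each line costs one hash lookup instead of a scan over every pattern.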