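I get a strange error
Error in `[.data.frame`(data, , lvls[1]) : undefined columns selected
message when I train a glmnet model using caret. I used essentially the same code and the same predictors for an ordinal model (y was then just a different factor) and it worked fine. That model took 400 core hours to compute, so I cannot show it here.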
#Source a small subset of data
source("https://gist.githubusercontent.com/FredrikKarlssonSpeech/ebd9fccf1de6789a3f529cafc496a90c/raw/efc130e41c7d01d972d1c69e59bf8f5f5fea58fa/voice.R")
trainIndex <- createDataPartition(notna$RC, p = .75,
                                  list = FALSE,
                                  times = 1)
training <- notna[ trainIndex[,1],] %>%
  select(RC,FCoM_envel:ATrPS_freq,`Jitter->F0_abs_dif`:RPDE)
testing <- notna[-trainIndex[,1],] %>%
  select(RC,FCoM_envel:ATrPS_freq,`Jitter->F0_abs_dif`:RPDE)
fitControl <- trainControl(## 10-fold CV
  method = "CV",
  number = 10,
  allowParallel=TRUE,
  savePredictions="final",
  summaryFunction=twoClassSummary)
vtCVFit <- train(x=training[-1],y=training[,"RC"],
  method = "glmnet",
  trControl = fitControl,
  preProcess=c("center", "scale"),
  metric="Kappa"
)
I cannot find anything obviously wrong with the data. There are no NAs:
table(is.na(training))
FALSE
43166
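and I do not understand why it would try to index outside the number of columns.
Any suggestions?
I hope I have not missed an obvious answer, but if I have, I apologize.

System.Posix.User has a function getLoginName for finding the login name of the currently logged-in user. Now, what is the way to get the same information on the Windows platform?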
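One portable fallback, sketched here only as an assumption about what is acceptable for this use case, is to read the user name from the environment. Windows normally sets USERNAME and Unix shells usually set USER. The helper name loginNameFromEnv is hypothetical, and the result is not equivalent to getLoginName because environment variables can be overridden.

```haskell
import System.Environment (lookupEnv)

-- Hypothetical helper: try USERNAME (normally set on Windows) first,
-- then USER (common on Unix). Assumption: environment variables are an
-- acceptable source; a user can override them, unlike getLoginName.
loginNameFromEnv :: IO (Maybe String)
loginNameFromEnv = do
  windowsName <- lookupEnv "USERNAME"
  case windowsName of
    Just name -> return (Just name)
    Nothing   -> lookupEnv "USER"

main :: IO ()
main = loginNameFromEnv >>= putStrLn . maybe "unknown user" id
```

If the name has to come from the operating system rather than the environment, a Windows API binding would be needed instead; that is outside this sketch.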

I am generating SQL queries in Haskell and submitting them to an SQLite (3) database using HDBC. Now, this function returns a query:
import Database.HDBC.Sqlite3
import Database.HDBC
data UmeQuery = UmeQuery String [SqlValue] deriving Show
tRunUmeQuery :: UmeQuery -> FilePath -> IO [[SqlValue]]
tRunUmeQuery (UmeQuery q args) dbFile = do
    conn <- connectSqlite3 dbFile
    stat <- prepare conn q
    s <- execute stat args
    res <- fetchAllRows' stat
    disconnect conn
    return $ res
selectPos targetlt parentlt op pos = let
q= "select TARGET.* from levels tl, labeltypes tlt, segments TARGET,
(select TARGET.session_id session_id,SECONDARY.labeltype_id labeltype_id,
SECONDARY.label_id label_id,min(TARGET.label_id) min_childlabel_id from
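levels tl, labeltypes tlt, …

I want to create a function that takes a function and applies it once per row of a tibble, with the arguments taken from the correspondingly named columns of that tibble.
I realize this sounds a bit odd, but I want the user-facing function to be simple.
In most cases the processing will take a long time, so I would really prefer to have a progress bar, and that is where I run into big trouble:
This code works (but without a progress bar):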
library(tibble)
library(dplyr)
library(purrr)
library(furrr)
library(tidyr)
library(wrassp)
library(progressr)
xf <- function(x,trim,na.rm,ds="ded"){
return(x*trim*na.rm)
}
xf2 <- function(x,trim,na.rm,ds="ded"){
return(list("a"=x,"b"=trim))
}
xf3 <- function(x,trim,na.rm,ds="ded"){
return(data.frame("a"=x,"b"=trim))
}
mymap <- function(f,...){
plan(multisession)
exDF <- tribble(
~x, ~trim, ~na.rm, ~notarg, ~listOfFiles, ~toFile,
0.5, 0, TRUE, 11.2, "~/Desktop/a1.wav", FALSE,
0.4, 0.5, TRUE, 12, "~/Desktop/a1.wav", FALSE
)
dotArgs <- list(...)
dotArgsRT <- as_tibble_row(dotArgs)
dotArgsNames <- names(dotArgs)
allArgsNames <- formalArgs(f)
exDF %>%
select(-any_of(!!dotArgsNames)) %>%
bind_cols(dotArgsRT) %>%
select(any_of(allArgsNames)) %>%
rowwise() …

I am struggling to defuse my ... arguments in one specific case, and I do not understand why.
I can create a function like this and have ... defused properly:
library(dplyr)
library(tidyr)

fill_na <- function(.x,...){

  dotArgs <- rlang::dots_list(...,.named=TRUE,.homonyms="last")
  tidyr::replace_na(.x,dotArgs)

}

df <- tibble::tribble(
  ~colA, ~colB,
  "a",   1,
  "b",   2,
  "c",   NA,
  NA,    4
)

> fill_na(df,colA="c",colB=2)
# A tibble: 4 × 2
  colA   colB
  <chr> <dbl>
1 a         1
2 b         2
3 c         2
4 c         4
Great, but if I write this function
myFun <- function(inside_of,from_what, ... ,.metadata_defaults=list("Gender"="Undefined","Age"=35),.by_maxFormantHz=TRUE,.recompute=FALSE,.package="superassp"){

  dotArgs <- rlang::dots_list(...,.named=TRUE,.homonyms="last")
  return(1)

}
I get this result:
> myFun(inside_of=ae,from_what=forest,fs=fm, fbw=bw)
Error in rlang::dots_list(..., .named = TRUE, .homonyms = "last") : 
  object 'fm' not found
Why are the arguments not defused here, when they are in the first example? …

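I have a large number of IConnection conn => conn -> IO () functions that I need to run to set up the database properly. Right now it is not very pretty, but I am too much of a beginner in Haskell to make it better.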
setup :: IConnection conn => conn -> IO ()
setup conn = do
    setupUtterances conn
    commit conn
    setupSegments conn
    commit conn
    setupLevels conn
    commit conn
    setupLevelLevel conn
    commit conn
    setupTCLevelLevel conn
    commit conn
    setupPaths conn
    commit conn
    setupLabelTypes conn
    commit conn
    setupLegalLabels conn
    commit conn
    setupTracks conn
    commit conn
    setupVariables conn
    commit conn
    setupFeatures conn
    commit conn
    setupAssociations conn
    commit conn
    return ()
Is there any way to shorten it? I was playing with
sequence $ map ($ conn) [func1, func2,...]
but I could not get it to work. Suggestions?
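One likely reason the sequence attempt fails is its result type: sequence returns IO [()], while setup is declared as IO (). Using mapM_ (or sequence_) discards the results. The sketch below shows the pattern with a stand-in connection type so that it runs without HDBC; FakeConn, logStep and the three setup functions are hypothetical placeholders for the real connection, commit and setupXxx functions.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef, readIORef)

-- Hypothetical stand-in for an HDBC connection: it only records what ran.
newtype FakeConn = FakeConn (IORef [String])

logStep :: String -> FakeConn -> IO ()
logStep msg (FakeConn ref) = modifyIORef ref (++ [msg])

-- Placeholders for setupUtterances, setupSegments, ... and for commit.
setupUtterances, setupSegments, setupLevels, commit :: FakeConn -> IO ()
setupUtterances = logStep "utterances"
setupSegments   = logStep "segments"
setupLevels     = logStep "levels"
commit          = logStep "commit"

-- Run every step against the connection, committing after each one.
-- mapM_ throws away the list of () results, so the type stays IO ().
setup :: FakeConn -> IO ()
setup conn = mapM_ (\step -> step conn >> commit conn) steps
  where
    steps = [setupUtterances, setupSegments, setupLevels]

main :: IO ()
main = do
  ref <- newIORef []
  setup (FakeConn ref)
  readIORef ref >>= mapM_ putStrLn
```

With the real functions the same shape should carry over, since inside setup the connection type is fixed and every list element can be used at that type; that has not been checked against HDBC here.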

I need to allow users of a function to add new columns containing strings built from values in the supplied tibble. A simple example:
df <- data.frame(session=rep(LETTERS[1:3],2),
                 bundle=rep(letters[1:2],3),
                 xn1=sample(1:100,6),
                 xn2=sample(1:100,6))

myfun <- function(df, ...){
  dplyr::mutate(.data=df,!!!rlang::enexprs(...)) |>
    dplyr::mutate(across(where(is.character),~ glue::glue))
}

myfun(df,a=session,b="dsds{session}")

If df looks like this:
> df
  session bundle xn1 xn2
1       A      a  31  95
2       B      b  45  64
3       C      a  12  38
4       A      b  56  13
5       B      a  70  93
6       C      b  53  73
then I would like to get this output from myfun:
> myfun(df,a=session,b="dsds{a}")
  session bundle xn1 xn2 a     b
1       A      a  31  95 A dsdsA
2       B      b  45  64 B dsdsB
3       C      a …

I have several ROC curves for the performance of tidymodels/parsnip models, and I would like to show them against each other in one plot for visual comparison:
roc1 <- structure(list(.threshold = c(-Inf, 0.188422381048697, 0.23446542423272,
0.241282102642437, 0.259726705912688, 0.29097010004365, 0.309897370938121,
0.33607659920306, 0.348797482584728, 0.371543061749991, 0.37849110465008,
0.403024193339376, 0.408074451522232, 0.425203432699806, 0.43288528993523,
0.437168077386449, 0.441435377101706, 0.454812465942723, 0.46890082819098,
0.469324015885685, 0.471191285258535, 0.473285736958109, 0.484067175067965,
0.501634453233048, 0.502895404815678, 0.505260074955513, 0.509400496728661,
0.512826032440735, 0.514474796037162, 0.520894854910534, 0.52482313756493,
0.544137627333669, 0.546168394598085, 0.555557692971751, 0.562118235565918,
0.564565992908277, 0.572138872116962, 0.5792082477202, 0.611888118194463,
0.621908020887883, 0.623655143605973, 0.629887735979754, 0.632025630132792,
0.636193886667259, 0.638203230744601, 0.646775289308722, 0.655148011873394,
0.658581199234482, 0.658707835285112, 0.66292920495746, 0.6753497980617,
0.691520083977918, 0.702288194696498, 0.704440842146043, 0.724494989785773,
0.735933141947951, 0.756427437462373, 0.785412673453098, 0.831367501773009,
0.831554130258554, 0.840204698487284, 0.845340108802608, 0.876022993703215,
Inf), specificity = c(0, 0, 0.032258064516129, 0.0645161290322581, …

Trying to install criterion, I run into trouble with other packages that it wants to reinstall. Reinstalling them breaks everything (I have tried).
$ cabal install criterion
Resolving dependencies...
In order, the following would be installed:
Glob-0.7.5 (new package)
abstract-deque-0.3 (new package)
abstract-par-0.3.3 (new package)
blaze-builder-0.4.0.1 (new package)
cereal-0.4.1.1 (new package)
erf-2.0.0.0 (new package)
ieee754-0.7.6 (new package)
hastache-0.6.1 (new package)
monad-par-extras-0.3.3 (new package)
parallel-3.2.0.6 (new package)
primitive-0.5.4.0 (latest: 0.6) (new version)
vector-0.10.12.3 (reinstall) changes: primitive-0.6 -> 0.5.4.0
aeson-0.8.0.2 +old-locale (reinstall) changes: mtl-2.1.3.1 -> 2.2.1
cassava-0.4.2.2 (new package)
mwc-random-0.13.3.2 (new package)
monad-par-0.3.4.7 (new package)
vector-algorithms-0.6.0.3 (new package)
vector-binary-instances-0.2.1.0 (new package)
vector-th-unbox-0.2.1.2 …

I just need to understand something related to Parsec.
parseTest (many1 alphaNum) "re2re1Δ"
"re2re1\916"
:t parseTest (many1 alphaNum)
parseTest (many1 alphaNum) :: Text.Parsec.Prim.Stream s Data.Functor.Identity.Identity Char =>
s -> IO ()
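So the Unicode output (which should be UTF-8, since I am on OS X) is printed as a hex (?) code, where it should be the Greek delta character. Now, putChar does not perform the same conversion within the same ghci session (and the same terminal):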
Text.Parsec.Char> putChar 'Δ'
Δ
How come? Both should just be of type 'Char' in some way...?
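The difference is probably not in Char itself but in how the value is printed: parseTest prints its result with print, which goes through show, and show escapes non-ASCII characters using decimal code points (916 is U+0394), whereas putChar and putStrLn write the characters directly. A minimal sketch using only base, with the encoding set explicitly as an assumption about the terminal:

```haskell
import System.IO (hSetEncoding, stdout, utf8)

main :: IO ()
main = do
  -- Assumption: the terminal accepts UTF-8; set it explicitly to be safe.
  hSetEncoding stdout utf8
  let parsed = "re2re1\916"   -- \916 is the decimal escape for U+0394
  print parsed                -- uses show: prints "re2re1\916" with quotes
  putStrLn (show parsed)      -- same output as the line above
  putStrLn parsed             -- writes the characters themselves
```

So to see the character itself after parsing, one option is to call parse and print the Right value with putStrLn instead of relying on parseTest.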

I need some help trying to understand why these definitions
data SegmentList
    = SegmentList SegmentlistHeader [Segment]
    | AugmentedSegmentList SegmentlistHeader [AugmentedSegment]
    deriving (Show)
data SegmentlistHeader
    = SegmentlistHeader DatabaseName Query LabelType TimeStamp
    deriving (Show)
data Segment
    = Segment SegmentLabel SegmentStart SegmentEnd Session Checksum
    | AugmentedSegment SegmentLabel SegmentStart SegmentEnd Session Checksum Metadata
    deriving (Show)
type DatabaseName = String
type SegmentLabel = String
type SegmentStart = Double
type SegmentEnd = Double
type Session = String
type LabelType = String
type Query = String
type TimeStamp = String
type Checksum = String
type Metadata …

I have to admit that I have not used ggplot for a while, but this seems a bit silly. Either I am missing something basic when trying to make a density plot, or there is a bug in ggplot2 (v3.3.2):
test <- data.frame(Time=rnorm(100),Age=rnorm(100))
ggplot(test,aes(y=Time,x=Age)) +
geom_density(aes(y=Time,x=Age))
produces
ggplot(test,aes(y=Time,x=Age)) +
+ geom_density(aes(y=Time,x=Age))
Error: geom_density requires the following missing aesthetics: y
How can the 'y' aesthetic be missing?
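
I am using the System.FilePath.Find module from filemanip to recursively find all the files I need to process (here I just print to the console as the action to perform, so as not to complicate things). Now, this code: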
import System.Environment (getArgs)
import System.FilePath (FilePath)
import System.Directory (doesDirectoryExist, getDirectoryContents,doesFileExist)
import Control.Monad
import System.FilePath.Find (find,always,fileType,(==?),FileType(..),(&&?),extension)
main = do
    [dbFile,input] <- getArgs
    files <- findFiles input
    mapM_ putStrLn files
    return ()
searchExtension :: String
searchExtension = ".hs"
findFiles :: FilePath -> IO [String]
findFiles = find (always) ( fileType ==? RegularFile &&? extension ==? searchExtension)
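works for this call
./myprog tet .
In this case the tet argument is ignored (it will later be the output database file), and the second argument is searched recursively for matching files. It also lets me specify just a single file, which is perfect!
However, I would like to be able to specify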
./myprog tet path1 path2 path4 file1
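but this of course fails in the pattern match:
./myprog tet . .
myprog: user error (Pattern match failure in do expression at myprog.hs:11:9-22)
Now, how do I make this program more flexible so that it can take more than two arguments?
Sorry for the question; my Haskell knowledge is limited, but it grows with every new thing I have to do in my first project.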
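One way this could be approached is to match on the head and tail of the argument list instead of on a list of exactly two elements, then search every remaining path and concatenate the results. The sketch below uses only base so that it runs on its own; splitArgs is a hypothetical helper, and findFiles is a placeholder for the filemanip-based findFiles above.

```haskell
import System.Environment (getArgs)

-- Split the arguments into the database file and one or more search paths.
-- Returns Nothing when fewer than two arguments are given.
splitArgs :: [String] -> Maybe (FilePath, [FilePath])
splitArgs (dbFile : inputs@(_ : _)) = Just (dbFile, inputs)
splitArgs _                         = Nothing

-- Placeholder for the real findFiles; it just echoes the path it is given.
findFiles :: FilePath -> IO [FilePath]
findFiles path = return [path]

main :: IO ()
main = do
  args <- getArgs
  case splitArgs args of
    Nothing -> putStrLn "usage: myprog DBFILE PATH [PATH ...]"
    Just (_dbFile, inputs) -> do
      -- Search each path in turn and flatten the per-path results.
      files <- concat <$> mapM findFiles inputs
      mapM_ putStrLn files
```

Handling the short-argument case explicitly also replaces the pattern match failure with a usage message.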