Why does dmy() in the lubridate package not work with NA? What is a good workaround?

Chr*_*h_J 11 r lubridate

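I stumbled upon a strange behavior in the lubridate package: dmy(NA) throws an error instead of simply returning NA. This causes problems when I want to convert a column in which some elements are NA and the rest are date strings that normally convert without any trouble.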

Here is a minimal example:

library(lubridate)
df <- data.frame(ID=letters[1:5],
              Datum=c("01.01.1990", NA, "11.01.1990", NA, "01.02.1990"))
df_copy <- df
#Question 1: Why does dmy(NA) not return NA, but throws an error?
df$Datum <- dmy(df$Datum)
Error in function (..., sep = " ", collapse = NULL)  : invalid separator
df <- df_copy
#Question 2: What's a work around?
#1. Idea: Only convert those elements that are not NAs
#The RHS works, but assigning it to the LHS doesn't (most likely problem:
#column "Datum" is still of class factor, while the RHS is of class POSIXct)
df[!is.na(df$Datum), "Datum"] <- dmy(df[!is.na(df$Datum), "Datum"])
Using date format %d.%m.%Y.
Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = c(NA_integer_, NA_integer_,  :
invalid factor level, NAs generated
df #Only NAs, apparently problem with class of column "Datum"
ID Datum
1  a  <NA>
2  b  <NA>
3  c  <NA>
4  d  <NA>
5  e  <NA>
df <- df_copy
#2. Idea: Use mapply and apply dmy only to those elements that are not NA
df[, "Datum"] <- mapply(function(x) {if (is.na(x)) {
                                 return(NA)
                               } else {
                                 return(dmy(x))
                               }}, df$Datum)
df #Meaningless numbers returned instead of date-objects
ID     Datum
1  a 631152000
2  b        NA
3  c 632016000
4  d        NA
5  e 633830400

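To summarize, I have two questions: 1) Why doesn't dmy(NA) work? Judging by most other functions, I would consider it good programming practice for any conversion of NA (such as dmy()) to return NA again (just as 2 + NA does). 2) If this behavior is intended, how do I convert a data.frame column that contains NAs with dmy()?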

jth*_*zel 6

The error "Error in function (..., sep = " ", collapse = NULL) : invalid separator" is caused by the lubridate:::guess_format() function. The NA gets passed as sep in a call to paste(), specifically in fmts <- unlist(mlply(with_seps, paste)). You could patch lubridate:::guess_format() to fix this.
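The underlying failure can be reproduced without lubridate. The following is a minimal sketch in base R only; the format pieces "%d", "%m", "%Y" are illustrative stand-ins and not the actual arguments that guess_format() builds internally.

```r
# paste() requires `sep` to be a single, non-missing character string.
# Passing NA as the separator raises an error, which is what happens
# when an NA input ends up in the `sep` position.
res <- tryCatch(
  paste("%d", "%m", "%Y", sep = NA),
  error = function(e) e
)

inherits(res, "error")    # TRUE: the call failed
conditionMessage(res)     # the text of the error raised by paste()
```

So the error does not come from the date parsing itself, but from the NA being used as a separator while the candidate formats are assembled.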

Alternatively, could you simply change the NA values to the character string "NA"?

require(lubridate)
df <- data.frame(ID=letters[1:5],
    Datum=c("01.01.1990", "NA", "11.01.1990", "NA", "01.02.1990")) #NAs are quoted
df_copy <- df

df$Datum <- dmy(df$Datum)
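If you would rather keep real NA values, another option is to parse only the non-missing entries and write them into a pre-allocated Date vector. This is a sketch of that idea, not something lubridate provides: it swaps in base R's as.Date() with an explicit format so that it runs without lubridate, and the names raw, parsed and ok are made up for illustration.

```r
df <- data.frame(
  ID = letters[1:5],
  Datum = c("01.01.1990", NA, "11.01.1990", NA, "01.02.1990"),
  stringsAsFactors = TRUE   # mimic the factor column from the question
)

# Work on a character copy, so that assigning dates back does not
# collide with the factor levels of the original column.
raw <- as.character(df$Datum)

# Pre-allocate an all-NA Date vector, then parse only the entries
# that are actually present.
parsed <- rep(as.Date(NA), length(raw))
ok <- !is.na(raw)
parsed[ok] <- as.Date(raw[ok], format = "%d.%m.%Y")

# Replace the whole column at once, so it takes the class of `parsed`.
df$Datum <- parsed
```

Replacing the whole column (rather than assigning into a subset of the factor column) avoids the "invalid factor level" warning from the question, and keeping the values in a Date vector avoids the bare numbers that mapply() produced. If you want to stay with lubridate, the parsing line could presumably be written as parsed[ok] <- as.Date(dmy(raw[ok])) instead, since dmy() is then never handed an NA.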