Numpy isnat() returns a ValueError on datetime objects

and*_*son 7 python datetime numpy pandas

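I have been working through the options listed in "Converting between datetime, Timestamp and datetime64"; however, numpy's isnat() does not seem to recognize datetime objects, or I am missing some other kind of datetime object that the function requires as input.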

Here is an overview of the dataframe:

>>> time_data.head()
     Date          Name               In AM              Out AM  \
0  2017-12-04  AUSTIN LEWIS 1900-01-01 07:03:11 1900-01-01 12:01:50   
1  2017-12-05  AUSTIN LEWIS 1900-01-01 05:24:07 1900-01-01 12:08:21   
2  2017-12-06  AUSTIN LEWIS 1900-01-01 11:58:32                 NaT   
3  2017-12-07  AUSTIN LEWIS 1900-01-01 08:31:23 1900-01-01 12:49:51   
4  2017-12-11  AUSTIN LEWIS 1900-01-01 06:55:21 1900-01-01 12:02:08   

            In PM              Out PM Sick Time  
0 1900-01-01 12:28:52 1900-01-01 17:34:53       NaT  
1 1900-01-01 12:35:12 1900-01-01 16:15:17       NaT  
2                 NaT 1900-01-01 23:59:01       NaT  
3 1900-01-01 13:18:34 1900-01-01 18:10:35       NaT  
4 1900-01-01 12:30:49 1900-01-01 17:39:54       NaT  

>>> time_data.dtypes
Date                 object
Name                 object
In AM        datetime64[ns]
Out AM       datetime64[ns]
In PM        datetime64[ns]
Out PM       datetime64[ns]
Sick Time    datetime64[ns]
dtype: object

>>> type(time_data['In AM'][3])
<class 'pandas._libs.tslib.Timestamp'>

>>> type(time_data['In AM'][3].to_datetime())
<type 'datetime.datetime'>

if np.isnat(time_data['Out AM'][row].to_datetime()) & np.isnat(time_data['In PM'][row].to_datetime()):

throws "ValueError: ufunc 'isnat' is only defined for datetime and timedelta"

What am I missing here?!

wim*_*wim 5

Ugh, that is a pretty bad error message! np.isnat ("not a time") only applies to numpy's own datetimes. Typical usage of the ufunc is with arrays of np.datetime64 or np.timedelta64 dtype:

>>> from datetime import datetime
>>> import numpy as np
>>> dt = datetime.now()
>>> np.isnat(np.array([dt], dtype=np.datetime64))
array([False])
>>> np.isnat(np.array([dt], dtype=object))
TypeError: ufunc 'isnat' is only defined for datetime and timedelta.

See the documentation for the supported input types.
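As a minimal sketch of what this means for the check in the question: np.isnat can be applied to the datetime64 array behind a Series, and for a single element pd.isna or Timestamp.to_datetime64() avoids going through datetime.datetime at all. The two short Series below are made-up stand-ins for the 'Out AM' and 'In PM' columns, not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for two datetime64 columns; None becomes NaT.
out_am = pd.Series(pd.to_datetime(["1900-01-01 12:01:50", None]))
in_pm = pd.Series(pd.to_datetime(["1900-01-01 12:28:52", None]))

# Whole-column check: np.isnat accepts the underlying datetime64 array.
nat_mask = np.isnat(out_am.to_numpy()) & np.isnat(in_pm.to_numpy())

# Single-element check: pd.isna understands Timestamp and NaT directly.
row = 1
both_missing = pd.isna(out_am[row]) and pd.isna(in_pm[row])

# If np.isnat is required for a scalar, convert to np.datetime64 first
# instead of to datetime.datetime.
scalar_is_nat = bool(np.isnat(out_am[row].to_datetime64()))

print(nat_mask, both_missing, scalar_is_nat)
```

Using `and` for the scalar check (rather than `&`) keeps the condition a plain boolean expression.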


Ian*_*son 2

You can also use pd.to_datetime to convert all of the desired datetime columns from the start:

df = pd.DataFrame({
    'date' : [
        '2017-12-04',
        '2017-12-05',
        '2017-12-06',
        '2017-12-07',
        '2017-12-11'
    ],
    'name' : ['AUSTIN LEWIS'] * 5,
    'in_am' : [
        '1900-01-01 07:03:11',
        '1900-01-01 05:24:07',
        '1900-01-01 11:58:32',
        '1900-01-01 08:31:23',
        '1900-01-01 06:55:21'
    ],
    'out_am' : [
        '1900-01-01 12:01:50',
        '1900-01-01 12:08:21',
        '',
        '1900-01-01 12:49:51',
        '1900-01-01 12:02:08'
    ],
    'in_pm' : [
        '1900-01-01 12:28:52',
        '1900-01-01 12:35:12',
        '',
        '1900-01-01 13:18:34',
        '1900-01-01 12:30:49'
    ],
    'out_pm' : [
        '1900-01-01 17:34:53',
        '1900-01-01 16:15:17',
        '1900-01-01 23:59:01',
        '1900-01-01 18:10:35',
        '1900-01-01 17:39:54'
    ],
    'sick_time' : [''] * 5
})

Input

# all dtypes should be object
df.dtypes

dtypes

# convert to datetimes
for col in df.columns.drop('name').tolist():
    df[col] = pd.to_datetime(df[col])

# name should be only object
df.dtypes

New dtypes
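If some cells hold text that is not a valid timestamp at all, the conversion step can be made more forgiving. The sketch below assumes that unparseable values should simply become NaT, which is what `errors="coerce"` does; the sample strings are invented for illustration.

```python
import pandas as pd

# Hypothetical raw column: one valid timestamp, one empty cell, one junk value.
raw = pd.Series(["1900-01-01 12:01:50", "", "not a time"])

# errors="coerce" turns anything that cannot be parsed into NaT
# instead of raising an exception.
parsed = pd.to_datetime(raw, errors="coerce")

print(parsed.isna().tolist())
```

Without `errors="coerce"`, the junk value would raise during parsing rather than show up as a missing time.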

# np.isnat should now work
np.isnat(df.loc[:, df.dtypes == 'datetime64[ns]'])

Output
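A variation on the last step, offered as a sketch rather than as part of the answer above: selecting columns with `select_dtypes` avoids comparing against the exact string 'datetime64[ns]', and `DataFrame.isna()` gives the same kind of boolean mask without calling the ufunc directly. The small frame here is a cut-down, hypothetical version of the one built earlier.

```python
import numpy as np
import pandas as pd

# Cut-down, made-up version of the frame above; None marks a missing time.
df = pd.DataFrame({
    "name": ["AUSTIN LEWIS"] * 3,
    "out_am": ["1900-01-01 12:01:50", None, "1900-01-01 12:49:51"],
    "in_pm": ["1900-01-01 12:28:52", None, "1900-01-01 13:18:34"],
})
for col in ["out_am", "in_pm"]:
    df[col] = pd.to_datetime(df[col])

# Pick every datetime64 column, whatever its time unit.
datetime_cols = df.select_dtypes(include="datetime")

# Boolean mask of missing times, one column per datetime column.
nat_mask = datetime_cols.isna()

# Rows where both the morning clock-out and afternoon clock-in are missing.
rows_missing_both = nat_mask.all(axis=1)

# np.isnat gives the same answer on a single column's datetime64 array.
out_am_nat = np.isnat(df["out_am"].to_numpy())

print(rows_missing_both.tolist(), out_am_nat.tolist())
```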