Prevent pandas read_csv from interpreting NA as NaN, but keep NaN for empty values

Gau*_*sal 6 python csv nan pandas

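My question is related to this one. I have a file named "test.csv" in which "NA" is a value of region. I want to read it as "NA", not as "NaN". However, other columns in test.csv have missing values, and I want those to stay "NaN". How can I do this?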

# test.csv looks like this:

[image: contents of test.csv]

Here is what I tried:

import pandas as pd
# This reads NA as NaN
df = pd.read_csv('test.csv')
df
    region  date    expenses
0   NaN   1/1/2019  53
1   EU    1/2/2019  NaN

# This reads NA as NA, but doesn't read missing expense as NaN
df = pd.read_csv('test.csv', keep_default_na=False, na_values='_')
df
    region  date    expenses
0   NA    1/1/2019  53
1   EU    1/2/2019  

# What I want:
    region  date    expenses
0   NA    1/1/2019  53
1   EU    1/2/2019  NaN

The problem with adding keep_default_na=False is that the second value of expenses is then not read in as NaN. So if I subsequently try pd.isnull(df['expenses'][1]), it returns False.

Qua*_*ang 5

This works for me:

df = pd.read_csv('file.csv', keep_default_na=False, na_values=[''])

This gives:

  region      date  expenses
0     NA  1/1/2019      53.0
1     EU  1/2/2019       NaN
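
To check this without a file on disk, here is a minimal sketch that feeds the same call an in-memory CSV. The `csv_text` string is an assumption: it is a hypothetical stand-in for the file, reconstructed from the output shown in the question, since the original file is only available as an image.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file (assumed contents, not the real file).
csv_text = "region,date,expenses\nNA,1/1/2019,53\nEU,1/2/2019,\n"

# keep_default_na=False turns off the built-in NA strings such as "NA",
# and na_values=[""] marks only empty cells as missing.
df = pd.read_csv(io.StringIO(csv_text), keep_default_na=False, na_values=[""])

print(df["region"].tolist())           # region keeps the literal string "NA"
print(df["expenses"].isna().tolist())  # only the empty expenses cell is missing
```

With these settings, other default markers such as "NaN" or "null" are also read as plain strings, so this assumes empty cells are the only way missing data appears in the file.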

But I would rather be safe, since there might be other NaN values in other columns, and do this instead:

df = pd.read_csv('file.csv')
df['region'] = df['region'].fillna('NA')
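
The same two steps can be tried on an in-memory CSV. This is only a sketch: `csv_text` is a hypothetical stand-in for the file, with contents assumed from the output shown in the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file (assumed contents, not the real file).
csv_text = "region,date,expenses\nNA,1/1/2019,53\nEU,1/2/2019,\n"

# Default parsing: both the "NA" region and the empty expenses cell become NaN.
df = pd.read_csv(io.StringIO(csv_text))

# Put the literal "NA" back in the region column only;
# NaN values in every other column are left untouched.
df["region"] = df["region"].fillna("NA")

print(df)
```

Note that fillna cannot tell a region that was written as "NA" from one that was left empty, so this assumes the region column has no genuinely missing entries.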