Changing values with pandas .map

CAB*_*CAB 2 python dictionary pandas

I am trying to use the map function to convert the string values in my data to numbers.

Here is the data:

    label   sms_message
0   ham     Go until jurong point, crazy.. Available only ...
1   ham     Ok lar... Joking wif u oni...
2   spam    Free entry in 2 a wkly comp to win FA Cup fina...
3   ham     U dun say so early hor... U c already then say...
4   ham     Nah I don't think he goes to usf, he lives aro...

I am trying to change 'spam' to 1 and 'ham' to 0 with the following command:

df['label'] = df.label.map({'ham':0, 'spam':1})

But the result is:

    label   sms_message
0   NaN     Go until jurong point, crazy.. Available only ...
1   NaN     Ok lar... Joking wif u oni...
2   NaN     Free entry in 2 a wkly comp to win FA Cup fina...
3   NaN     U dun say so early hor... U c already then say...
4   NaN     Nah I don't think he goes to usf, he lives aro...

Can anyone spot the problem?

hyg*_*ull 6

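Your statement is correct. I think you executed the same statement twice, one run after the other. The following statements, executed in the Python interactive terminal, illustrate this.

Note: if you pass a dictionary, map() replaces every value in the Series that does not match one of the dictionary's keys with NaN. I think that is what happened here: on the second run the column already held 0 and 1, neither of which is a key. See the pandas documentation for map() and apply().

Pandas documentation note: when arg is a dictionary, values in the Series that are not in the dictionary (as keys) are converted to NaN.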

>>> import pandas as pd
>>>
>>> d = {
...     "label": ["ham", "ham", "spam", "ham", "ham"],
...     "sms_message": [
...     "Go until jurong point, crazy.. Available only ...",
...     "Ok lar... Joking wif u oni...",
...     "Free entry in 2 a wkly comp to win FA Cup fina...",
...     "U dun say so early hor... U c already then say...",
...     "Nah I don't think he goes to usf, he lives aro..."
...     ]
... }
>>>
>>> df = pd.DataFrame(d)
>>> df
  label                                        sms_message
0   ham  Go until jurong point, crazy.. Available only ...
1   ham                      Ok lar... Joking wif u oni...
2  spam  Free entry in 2 a wkly comp to win FA Cup fina...
3   ham  U dun say so early hor... U c already then say...
4   ham  Nah I don't think he goes to usf, he lives aro...
>>>
>>> # First call: 'ham' and 'spam' match the dictionary keys.
>>> df['label'] = df.label.map({'ham':0, 'spam':1})
>>> df
   label                                        sms_message
0      0  Go until jurong point, crazy.. Available only ...
1      0                      Ok lar... Joking wif u oni...
2      1  Free entry in 2 a wkly comp to win FA Cup fina...
3      0  U dun say so early hor... U c already then say...
4      0  Nah I don't think he goes to usf, he lives aro...
>>>
>>> # Second call: the column now holds 0 and 1, which match no key, so every value becomes NaN.
>>> df['label'] = df.label.map({'ham':0, 'spam':1})
>>> df
   label                                        sms_message
0    NaN  Go until jurong point, crazy.. Available only ...
1    NaN                      Ok lar... Joking wif u oni...
2    NaN  Free entry in 2 a wkly comp to win FA Cup fina...
3    NaN  U dun say so early hor... U c already then say...
4    NaN  Nah I don't think he goes to usf, he lives aro...
>>>
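One way to make this mistake visible instead of silently getting NaN is to check the column against the dictionary keys before mapping. The following is only a minimal sketch: `strict_map` is a hypothetical helper name, not part of pandas, and it assumes you want an error whenever a value has no matching key.

```python
import pandas as pd


def strict_map(series, mapping):
    """Map values through `mapping`, raising if any value has no matching key.

    Hypothetical helper for illustration; it is not a pandas function.
    """
    # Collect the non-missing values that are absent from the dictionary keys.
    unknown = set(series.dropna().unique()) - set(mapping)
    if unknown:
        raise ValueError(f"values without a mapping: {unknown!r}")
    return series.map(mapping)


df = pd.DataFrame({"label": ["ham", "ham", "spam"]})
codes = {"ham": 0, "spam": 1}

# First call: every label is a key, so the mapping succeeds.
df["label"] = strict_map(df["label"], codes)
first_result = df["label"].tolist()

# Second call: the column now holds 0 and 1, which are not keys, so it raises.
try:
    df["label"] = strict_map(df["label"], codes)
    second_call_failed = False
except ValueError:
    second_call_failed = True
```

With this check, re-running the cell leaves the already converted column untouched and reports the unmapped values instead of overwriting them with NaN.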

Other ways to get the same result

>>> import pandas as pd
>>>
>>> d = {
...     "label": ['spam', 'ham', 'ham', 'ham', 'spam'],
...     "sms_message": ["M1", "M2", "M3", "M4", "M5"]
... }
>>>
>>> df = pd.DataFrame(d)
>>> df
  label sms_message
0  spam          M1
1   ham          M2
2   ham          M3
3   ham          M4
4  spam          M5
>>>

First way: use map() with a dictionary argument

>>> new_values = {'spam': 1, 'ham': 0}
>>>
>>> df
  label sms_message
0  spam          M1
1   ham          M2
2   ham          M3
3   ham          M4
4  spam          M5
>>>
>>> df.label = df.label.map(new_values)
>>> df
   label sms_message
0      1          M1
1      0          M2
2      0          M3
3      0          M4
4      1          M5
>>>

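Second way: use map() with a function argument (starting again from the original df, because label was already converted to 0/1 above)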

>>> df.label = df.label.map(lambda v: 0 if v == 'ham' else 1)
>>> df
   label sms_message
0      1          M1
1      0          M2
2      0          M3
3      0          M4
4      1          M5
>>>
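The function form has its own version of the double-run problem: it never produces NaN, but the `else 1` branch catches every value that is not 'ham', including values that were already converted. A small sketch of that behaviour, where the names `to_code`, `once` and `twice` are made up for this illustration:

```python
import pandas as pd

labels = pd.Series(["spam", "ham", "ham", "ham", "spam"])


def to_code(value):
    """Return 0 for 'ham' and 1 for anything else."""
    return 0 if value == "ham" else 1


# Applied to the original strings, the result is as expected.
once = labels.map(to_code)

# Applied again, the values are already 0/1, none equals 'ham', so all become 1.
twice = once.map(to_code)
```

So if this statement is executed twice on the same column, the result is wrong without any visible sign such as NaN.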

Third way: use apply() with a function argument (again starting from the original df)

>>> df.label = df.label.apply(lambda v: 0 if v == "ham" else 1)
>>>
>>> df
   label sms_message
0      1          M1
1      0          M2
2      0          M3
3      0          M4
4      1          M5
>>>
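For a two-class label like this one, a vectorized comparison is another option that avoids a per-row Python function. This is a sketch that assumes the column contains only 'ham' and 'spam'; any other value would also end up as 0.

```python
import pandas as pd

df = pd.DataFrame({
    "label": ["spam", "ham", "ham", "ham", "spam"],
    "sms_message": ["M1", "M2", "M3", "M4", "M5"],
})

# Comparing the column with "spam" gives a boolean Series;
# astype(int) then turns True/False into 1/0.
df["label"] = (df["label"] == "spam").astype(int)
```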

Thanks.