Adding two Series with NaNs

joe*_*otz 19 python pandas

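I'm working through "Python For Data Analysis" and I don't understand one particular behavior. Adding two pandas Series objects automatically aligns the data by index, but if one of the objects does not contain a given index label, the result for that label is NaN. For example, from the book: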

from numpy import nan as NaN
from pandas import Series

a = Series([35000, 71000, 16000, 5000], index=['Ohio', 'Texas', 'Oregon', 'Utah'])
b = Series([NaN, 71000, 16000, 35000], index=['California', 'Texas', 'Oregon', 'Ohio'])

Result:

    In [63]: a
    Out[63]: Ohio          35000
             Texas         71000
             Oregon        16000
             Utah           5000
    In [64]: b
    Out[64]: California      NaN
             Texas         71000
             Oregon        16000
             Ohio          35000

When I add them together, I get this:

    In [65]: a+b
    Out[65]: California       NaN
             Ohio           70000
             Oregon         32000
             Texas         142000
             Utah             NaN

So why is the Utah value NaN and not 5000? It seems that 5000 + NaN should be 5000. What gives? I must be missing something, so please explain.

Update:

    In [92]: # fill NaN with zero
             b = b.fillna(0)
             b
    Out[92]: California        0
             Texas         71000
             Oregon        16000
             Ohio          35000

    In [93]: a
    Out[93]: Ohio      35000
             Texas     71000
             Oregon    16000
             Utah       5000

    In [94]: # a is still good
             a+b
    Out[94]: California       NaN
             Ohio           70000
             Oregon         32000
             Texas         142000 
             Utah             NaN

Dan*_*lan 26

pandas does not assume that 5000 + NaN = 5000, but it is easy to ask it to do so: a.add(b, fill_value=0)
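To make the difference concrete, here is a minimal sketch using the two Series from the question (the names `plain`, `filled`, and `filled_both` are just illustrative). It also suggests why `b.fillna(0)` in the update did not help: 'Utah' is not in `b`'s index at all, so the NaN appears during alignment, after the fill. Note that, as far as I understand the documented behavior, `fill_value` only substitutes for a value missing on one side; a label missing on both sides still gives NaN.

```python
import numpy as np
import pandas as pd

a = pd.Series([35000, 71000, 16000, 5000],
              index=['Ohio', 'Texas', 'Oregon', 'Utah'])
b = pd.Series([np.nan, 71000, 16000, 35000],
              index=['California', 'Texas', 'Oregon', 'Ohio'])

# Plain addition aligns on the union of both indexes; any label that is
# missing from either side ends up as NaN ('Utah' and 'California' here).
plain = a + b

# fill_value=0 stands in for a value that is missing on ONE side only,
# so 'Utah' becomes 5000 + 0. 'California' is absent from a and NaN in b,
# i.e. missing on both sides, so it stays NaN.
filled = a.add(b, fill_value=0)

# Filling b's own NaN first leaves only one side missing for 'California',
# so fill_value can cover it as well.
filled_both = a.add(b.fillna(0), fill_value=0)

print(filled)
print(filled_both)
```

Whether 'California' should end up as 0 or stay NaN depends on what the missing value means in your data, so the last variant is an option rather than a recommendation.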

  • Since you mention the book, you can refer to the section "Arithmetic and data alignment" on page 128 for a discussion of this. (2 upvotes)