pandas - Convert index type from RangeIndex to Int64Index

Reb*_*kah 11 python indexing pandas

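How do I convert a RangeIndex to an Int64Index? I have two dataframes, both imported from .csv files in the same way. Pandas automatically gave one an Int64Index and the other a RangeIndex. When I run the code below on each dataframe (it creates a new column from the values in three other columns), one of them raises an error. I want both dataframes to have the same index type so that the code works on both and creates the new column, which I will later use for merging.

This code works with the Int64Index but not with the RangeIndex, and I have confirmed that the relevant fields (columns) are the same in both dataframes.

This works fine for the Int64Index dataframe (df_new):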

# create new column by combining data in 3 other columns
df_new['ExpWLTh']=df_new['ExpNum'].astype(str)+'-'+df_new['WL'].astype(str)+'-'+df_new['Threshold'].astype(str)

The same code does not work on the RangeIndex dataframe (df_val), even though the relevant columns have the same data types:

# create new column, combine 3 columns to make new one - for graphing
df_val['ExpWLTh']=df_val['ExpNum'].astype(str)+'-'+df_val['WL'].astype(str)+'-'+df_val['Threshold'].astype(str)

When I try to create the new column, the RangeIndex dataframe (df_val) gives me this error:

unorderable types: str() < int()

Here are the details of the data types in each df:

df_val:
None
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 218 entries, 0 to 217
Data columns (total 15 columns):
Person               218 non-null object
Threshold            218 non-null int64
WL                   218 non-null int64
Threshold            218 non-null float64
Energy sum           218 non-null float64
White sum            218 non-null float64
Diff (energy)        218 non-null float64
Scaled energy        218 non-null float64
Sens (energy)        218 non-null float64
Sens (quanta)        218 non-null float64
Log sens (quanta)    218 non-null float64
Add 3                218 non-null float64
BkgdLt               218 non-null int64
BkgdLt_b             218 non-null object
ExpNum               218 non-null object
dtypes: float64(9), int64(3), object(3)
memory usage: 25.6+ KB
None
df_new:
None
<class 'pandas.core.frame.DataFrame'>
Int64Index: 7043 entries, 0 to 7839
Data columns (total 15 columns):
File             7043 non-null object
Threshold        7043 non-null int64
StepSize         7043 non-null object
RevNum           7028 non-null float64
WL               7043 non-null int64
RevPos           7028 non-null float64
BkgdLt           7043 non-null int32
Date             7043 non-null datetime64[ns]
Person           7043 non-null object
AbRevPos         7028 non-null float64
ExpNum           7043 non-null object
ExpNumPerWLTh    7043 non-null object
Stair            7043 non-null object
ExpWLTh          7043 non-null object
ExpPer           7043 non-null object
dtypes: datetime64[ns](1), float64(3), int32(1), int64(2), object(8)
memory usage: 852.9+ KB
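
One detail in the df_val listing above may be worth checking: Threshold appears twice, once as int64 and once as float64. With a duplicated label, df_val['Threshold'] returns a two-column DataFrame rather than a Series, so the concatenation is no longer Series plus Series. Whether that is what triggers the error here is an assumption; the listing alone does not confirm it. A minimal sketch on made-up miniature data (the values and the four-column layout are hypothetical):

```python
import pandas as pd

# Hypothetical miniature of df_val that reproduces the duplicated "Threshold" label
df_val = pd.DataFrame(
    [[1, 488, -30.2, "Four"], [0, 488, 28.1, "Four"]],
    columns=["Threshold", "WL", "Threshold", "ExpNum"],
)

# List every column label that occurs more than once
dupes = df_val.columns[df_val.columns.duplicated()].tolist()

# Selecting a duplicated label yields a DataFrame holding both columns
selected = df_val["Threshold"]

print(dupes)
print(type(selected).__name__, selected.shape)
```

If the list of duplicates is not empty, renaming one of the columns before building the new column would rule this out as a cause.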

Sample data from both dataframes. From df_new:

    File    Threshold   StepSize    RevNum  WL  RevPos  BkgdLt  Date    Person  AbRevPos    ExpNum  ExpNumPerWLTh   Stair   ExpWLTh ExpPer
7835    ZBL-2018-05-23_50_440   1   1.5 10.0    440 -12.012382  50  2018-05-23  ZBL 12.012382   Four    Four-ZBL-440-1  Four-ZBL-1  Four-440-1  Four-ZBL
7836    ZBL-2018-05-23_50_440   1   0.82    11.0    440 -13.512382  50  2018-05-23  ZBL 13.512382   Four    Four-ZBL-440-1  Four-ZBL-1  Four-440-1  Four-ZBL
7837    ZBL-2018-05-23_50_440   0   0.82    11.0    476 50.000000   50  2018-05-23  ZBL 50.000000   Four    Four-ZBL-476-0  Four-ZBL-0  Four-476-0  Four-ZBL
7838    ZBL-2018-05-23_50_440   0   1.5 12.0    476 50.000000   50  2018-05-23  ZBL 50.000000   Four    Four-ZBL-476-0  Four-ZBL-0  Four-476-0  Four-ZBL
7839    ZBL-2018-05-23_50_440   1   1.5 12.0    440 -11.052382  50  2018-05-23  ZBL 11.052382   Four    Four-ZBL-440-1  Four-ZBL-1  Four-440-1  Four-ZBL

From df_val:

    Person  Threshold   WL  Threshold   Energy sum  White sum   Diff (energy)   Scaled energy   Sens (energy)   Sens (quanta)   Log sens (quanta)   Add 3   BkgdLt  BkgdLt_b    ExpNum
213 RJI 1   488 -30.224442  0.011540    0.013391    -0.001851   -185.08 -0.005403   -0.006422   -2.192351   0.807649    50  50  Four
214 SFO 0   488 28.068598   0.014332    0.013391    0.000941    94.12   0.010625    0.012628    -1.898674   1.101326    50  50  Four
215 SFO 1   488 -20.585589  0.012202    0.013391    -0.001189   -118.92 -0.008409   -0.009994   -2.000247   0.999753    50  50  Four
216 ZBL 0   488 30.690436   0.014410    0.013391    0.001019    101.88  0.009815    0.011666    -1.933081   1.066919    50  50  Four
217 ZBL 1   488 -30.671511  0.011497    0.013391    -0.001894   -189.40 -0.005280   -0.006275   -2.202372   0.797628    50  50  Four

The code originally used to import one of the .csv files:

# create data frame from values in csv file
df_val = pd.read_csv('Lum_Thresh_2_3_4.csv', sep=',', delimiter=None, header='infer', 
    names=['Person', 'Inc/dec (0 = inc)', 'Wavelength', 'Threshold', 'Energy sum', 'White sum', 
           'Diff (energy)', 'Scaled energy', 'Sens (energy)', 'Sens (quanta)', 'Log sens (quanta)', 
           'Add 3', 'BkgdLt_a'],
    engine='python', skiprows=1, infer_datetime_format=True)
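
For context on where the two index types come from: read_csv without index_col returns a RangeIndex, and removing rows afterwards leaves an explicit integer index. The df_new listing (7043 entries, labels 0 to 7839) suggests rows were dropped after import, though that is an inference from the listing. A small sketch, with hypothetical CSV content standing in for the real file:

```python
import io
import pandas as pd

# Hypothetical CSV content; the real file is not available here
csv_text = (
    "Person,WL,Threshold\n"
    "RJI,488,1\n"
    "SFO,488,0\n"
    "SFO,476,1\n"
    "ZBL,488,0\n"
    "ZBL,476,1\n"
)

# A fresh import with no index_col gets a RangeIndex
df = pd.read_csv(io.StringIO(csv_text))
print(type(df.index).__name__)

# Filtering keeps rows 0, 1 and 3, so the labels no longer form a range
filtered = df[df["WL"] == 488]
filtered_labels = list(filtered.index)
print(type(filtered.index).__name__, filtered_labels)
```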

576*_*76i 12

This worked for me:

df_val.index = list(df_val.index)
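
A short sketch of what the assignment does, using a hypothetical three-row stand-in for df_val: the labels stay the same, but the index is rebuilt from a plain list and is no longer a RangeIndex.

```python
import pandas as pd

# Hypothetical stand-in for df_val; a new DataFrame starts with a RangeIndex
df_val = pd.DataFrame({"WL": [488, 488, 476]})
was_range = isinstance(df_val.index, pd.RangeIndex)

# Materialize the index labels as a list and assign them back
df_val.index = list(df_val.index)

is_range_now = isinstance(df_val.index, pd.RangeIndex)
labels = df_val.index.tolist()
print(was_range, is_range_now, labels)
```

On older pandas releases the result shows up as Int64Index. From pandas 2.0 onward the Int64Index class no longer exists, and the same result is reported as a plain Index with dtype int64.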