Pandas to_sql() inserting the index

Ale*_*rMP 7 python sqlalchemy pandas

I'm using Pandas 0.18.1 and experimenting with this code:

import pandas as pd

def getIndividualDf(item):
    var1 = []
    # ... populate this list of numbers
    var2 = []
    # ... populate this other list of numbers

    newDf = pd.DataFrame({'var1': var1, 'var2': var2})
    newDf['extra_column'] = someIntScalar
    return newDf

dfs = []
for item in someList:
    dfs.append(getIndividualDf(item))

resultDf = pd.concat(dfs)
resultDf['segment'] = segmentId # this is an integer scalar

from sqlalchemy import create_engine
engine = create_engine('postgresql://'+user+':'+password+'@'+host+'/'+dbname)
resultDf.reset_index().to_sql('table_name', engine, schema="schema_name", if_exists="append", index=False)

I get this exception:

(psycopg2.ProgrammingError) column "index" of relation "table_name" does not exist

Indeed, the table has no such column, simply because the DataFrame has no such explicit column. That is why the error is strange.

Running

print(list(resultDf))

right before the to_sql() call yields

['var1','var2','extra_column','segment']

Removing index=False from the to_sql() call changes the error to:

(psycopg2.ProgrammingError) column "level_0" of relation "table_name" does not exist

I'm confused. How do I get rid of the index column?

Update
print(resultDf.head()) produced this output:

     var1       var2  extra_column  segment
0       8   0.101653    2077869737   201606
1       9   0.303694    2077869737   201606
2      10   0.493210    2077869737   201606
3      11   0.661064    2077869737   201606
4      12   0.820924    2077869737   201606

Ste*_*n G 20

You don't need to reset the index before writing to SQL, for example:

resultDf.to_sql('table_name', engine, schema="schema_name", if_exists="append", index=False)
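As a minimal sketch of the call above, the following uses an in-memory SQLite database in place of the PostgreSQL engine from the question (an assumption, made only so the snippet runs without a server). The sample values are made up, and the schema argument is omitted because it does not apply to SQLite.

```python
import sqlite3

import pandas as pd

# Two small frames standing in for the per-item frames in the question.
dfs = [
    pd.DataFrame({"var1": [8, 9], "var2": [0.10, 0.30]}),
    pd.DataFrame({"var1": [10, 11], "var2": [0.49, 0.66]}),
]
result_df = pd.concat(dfs)

# In-memory SQLite stands in for the PostgreSQL engine (assumption).
conn = sqlite3.connect(":memory:")

# index=False: only the DataFrame's own columns are written, no index column.
result_df.to_sql("table_name", conn, if_exists="append", index=False)

# Inspect the columns and row count of the table that was created.
written_columns = [row[1] for row in conn.execute("PRAGMA table_info(table_name)")]
row_count = conn.execute("SELECT COUNT(*) FROM table_name").fetchone()[0]
print(written_columns, row_count)
```

With index=False and no reset_index() call, the table should contain only the data columns, so appending to an existing table without an index column would not trigger the error from the question.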

  • @AlexanderMP When you reset the index, a column named level_0 is created and populated with the index. (2 upvotes)
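To illustrate where the extra column comes from, here is a small sketch with made-up data. It relies on pandas' default naming: reset_index() on a frame with an unnamed index adds a column called index, and the name level_0 is used when a column called index already exists.

```python
import pandas as pd

df = pd.DataFrame({"var1": [8, 9], "var2": [0.10, 0.30]})

# The unnamed index becomes a regular column called "index".
once = df.reset_index()
print(list(once))

# With "index" already taken, the new column is named "level_0".
twice = once.reset_index()
print(list(twice))

# drop=True discards the old index instead of turning it into a column.
dropped = df.reset_index(drop=True)
print(list(dropped))
```

If a fresh 0..n-1 index is wanted after pd.concat, reset_index(drop=True) renumbers the rows without adding any column that to_sql() would then try to insert.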