How to copy from a CSV file to a PostgreSQL table with headers in the CSV file?

Question

Sta*_*hil 78 csv postgresql postgresql-copy

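I want to copy a CSV file to a Postgres table. The table has about 100 columns, so I do not want to retype them if I don't have to.

I am using the `\copy table from 'table.csv' delimiter ',' csv;` command, but without an existing table I get `ERROR: relation "table" does not exist`. If I add a blank table I get no error, but nothing happens. I tried the command two or three times and there was no output or message, and the table had not been updated when I checked it through PGAdmin.

Is there a way to import a table with headers included, like I am trying to do?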

Answer 1

G. *_*ito 111

This worked. The first row had column names in it.

COPY wheat FROM 'wheat_crop_data.csv' DELIMITER ';' CSV HEADER


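`COPY` does not create a table or add columns to it; it adds rows to an existing table using its existing columns. Presumably the asker wants to automate the creation of the ~100 columns, and `COPY` does not have this functionality, at least as of PG 9.3. (27 upvotes)
I think the problem with this command is that you have to be the database superuser. `\copy` works as a normal user, too. (5 upvotes)
@Exocom good catch. Since I am never an admin or superuser for the DBs on the postgres systems I use (pgadmin makes me owner of the databases I use and gives me limited privileges/roles), I must have used `\COPY`. Cheers (2 upvotes)
@Daniel I understood that the user's table already existed and had all the columns they needed, and that they simply wanted to `ADD` data. (2 upvotes)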
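
Since `COPY` will not create the table, one way to avoid retyping ~100 column names is to generate the `CREATE TABLE` statement from the CSV header first. Below is a minimal sketch in Python, assuming every column can start out as `text` and be cast later. The helper names `quote_ident` and `create_table_sql` and the sample data are made up for illustration and are not part of any answer above.

```python
import csv
import io


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_table_sql(csv_text: str, table: str, delimiter: str = ",") -> str:
    """Build a CREATE TABLE statement with one text column per header field."""
    # Only the first CSV record (the header row) is read.
    header = next(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    columns = ",\n".join(f"    {quote_ident(col.strip())} text" for col in header)
    return f"CREATE TABLE {quote_ident(table)} (\n{columns}\n);"


# Made-up sample data; in practice you would read the first line of your file.
sample = "id;region;yield_t\n1;north;3.2\n"
print(create_table_sql(sample, "wheat", delimiter=";"))
```

Once the generated statement has been run, the `COPY ... CSV HEADER` command from the answer (or `\copy` from psql) should be able to load the rows. Keep in mind that quoted identifiers are case-sensitive in PostgreSQL.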

Answer 2

joe*_*lom 21

With the Python library pandas, you can easily create the column names and infer the data types from a csv file.

from sqlalchemy import create_engine
import pandas as pd

engine = create_engine('postgresql://user:pass@localhost/db_name')
df = pd.read_csv('/path/to/csv_file')
df.to_sql('pandas_db', engine)


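The `if_exists` parameter can be set to replace or append to an existing table, e.g. `df.to_sql('pandas_db', engine, if_exists='replace')`. This also works for other input file types; see the pandas documentation.

I find that pd.DataFrame.from_csv gives me fewer problems, but this answer is by far the easiest way to do this, IMO. (2 upvotes)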
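
To give a feel for what "inferring data types" involves, here is a rough standard-library sketch that picks a PostgreSQL type per column. It is not how pandas does it, only an illustration under simple assumptions: empty strings count as missing, only `bigint`, `double precision` and `text` are considered, and integer range is not checked. The function names and sample data are hypothetical.

```python
import csv
import io
import re

# Deliberately strict patterns (assumption: plain decimal notation only).
INT_RE = re.compile(r"[+-]?[0-9]+")
FLOAT_RE = re.compile(r"[+-]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][+-]?[0-9]+)?")


def infer_pg_type(values):
    """Return the narrowest of bigint, double precision, text fitting all non-empty values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "text"  # nothing to go on, so fall back to text
    if all(INT_RE.fullmatch(v) for v in non_empty):
        return "bigint"
    if all(FLOAT_RE.fullmatch(v) for v in non_empty):
        return "double precision"
    return "text"


def infer_column_types(csv_text, delimiter=","):
    """Map each header name to an inferred PostgreSQL type."""
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    return {
        name: infer_pg_type([row[i] for row in data if i < len(row)])
        for i, name in enumerate(header)
    }


# Made-up sample data for illustration.
sample = "id,region,yield_t\n1,north,3.2\n2,south,\n"
print(infer_column_types(sample))
```

For real data, letting pandas do the inference as in the answer is likely the more robust choice; the sketch is only meant to show the idea.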

Answer 3

Pet*_*uss 11

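Alternative from the terminal, with no special permissions

The pg documentation says under NOTES:

The path will be interpreted relative to the working directory of the server process (normally the cluster's data directory), not the client's working directory.

So, in general, when using psql or any other client, even against a local server, you will run into problems. And if you are writing out a COPY command for other users, e.g. in a Github README, your readers will run into problems too.

The only way to express a relative path with client permissions is to use STDIN:

When STDIN or STDOUT is specified, data is transmitted via the connection between the client and the server.

As a reminder, it looks like this: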

psql -h remotehost -d remote_mydb -U myuser -c \
   "copy mytable (column1, column2) from STDIN with delimiter as ','" \
   < ./relative_path/file.csv

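If you want to script the same STDIN approach, the shell redirect can be reproduced from Python by handing the open file to psql as its standard input. This is only a sketch: `build_psql_copy_args` and `copy_csv_via_stdin` are hypothetical helper names, the connection values are the placeholders from the command above, and the table and column names are interpolated unquoted, so they are assumed to be trusted.

```python
import subprocess


def build_psql_copy_args(host, dbname, user, table, columns, delimiter=","):
    """Return the argv list for a psql call that runs COPY ... FROM STDIN."""
    column_list = ", ".join(columns)
    copy_sql = (
        f"copy {table} ({column_list}) from STDIN "
        f"with delimiter as '{delimiter}'"
    )
    return ["psql", "-h", host, "-d", dbname, "-U", user, "-c", copy_sql]


def copy_csv_via_stdin(csv_path, **connection):
    """Stream a client-side file into psql, like `< ./relative_path/file.csv`."""
    with open(csv_path, "rb") as csv_file:
        # check=True raises CalledProcessError if psql exits with an error.
        subprocess.run(build_psql_copy_args(**connection), stdin=csv_file, check=True)


# Only builds and prints the command; nothing is executed here.
args = build_psql_copy_args(
    "remotehost", "remote_mydb", "myuser", "mytable", ["column1", "column2"]
)
print(args)
```

Because the file is opened by the client process, the relative path is resolved against the client's working directory, which is the point of using STDIN.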

Answer 4

meh*_*met 5

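I have been using this function for a while with no problems. You just need to provide the number of columns in the csv file, and it will take the header names from the first row and create the table for you: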

create or replace function data.load_csv_file
    (
        target_table  text, -- name of the table that will be created
        csv_file_path text,
        col_count     integer
    )

    returns void

as $$

declare
    iter      integer; -- dummy integer to iterate columns with
    col       text; -- to keep column names in each iteration
    col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet

begin
    set schema 'data';

    create table temp_table ();

    -- add just enough number of columns
    for iter in 1..col_count
    loop
        execute format ('alter table temp_table add column col_%s text;', iter);
    end loop;

    -- copy the data from csv file
    execute format ('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_file_path);

    iter := 1;
    col_first := (select col_1
                  from temp_table
                  limit 1);

    -- update the column names based on the first row which has the column names
    for col in execute format ('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
    loop
        execute format ('alter table temp_table rename column col_%s to %s', iter, col);
        iter := iter + 1;
    end loop;

    -- delete the columns row // using quote_ident or %I does not work here!?
    execute format ('delete from temp_table where %s = %L', col_first, col_first);

    -- change the temp table name to the name given as parameter, if not blank
    if length (target_table) > 0 then
        execute format ('alter table temp_table rename to %I', target_table);
    end if;
end;

$$ language plpgsql;

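The function needs the column count up front. If you would rather not count ~100 columns by hand, a small script can read it from the header and print the call. A minimal sketch, with hypothetical helper names, a made-up sample and a made-up file path; it assumes the default `standard_conforming_strings = on`, so doubling single quotes is enough to escape a string literal:

```python
import csv
import io


def count_csv_columns(csv_text, delimiter=","):
    """Count the fields in the first (header) record of the CSV text."""
    return len(next(csv.reader(io.StringIO(csv_text), delimiter=delimiter)))


def sql_literal(value):
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def load_csv_file_call(target_table, csv_file_path, col_count):
    """Build the SELECT that invokes data.load_csv_file from the answer."""
    return (
        f"select data.load_csv_file({sql_literal(target_table)}, "
        f"{sql_literal(csv_file_path)}, {col_count});"
    )


# Made-up sample data and path for illustration.
sample = "id,region,yield_t\n1,north,3.2\n"
n_cols = count_csv_columns(sample)
print(load_csv_file_call("wheat", "/tmp/wheat.csv", n_cols))
```

Since the function runs a plain `copy ... from` with a file path, the file has to be readable by the server process, as discussed in Answer 3.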

Asked:	12 years, 7 months ago
Viewed:	137893 times
Last activity:	7 years, 2 months ago