Import multiple delimited text files into a SQL Server database and automatically create the tables

LBo*_*rdt 2 sql sql-server csv import create-table

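I have multiple delimited text files (e.g. .csv files), each containing columns, rows and a header.

I want to import all of these input files into SQL Server as easily as possible. Specifically, I want the output tables into which I import these files to be created on the fly.

Some of these input files will need to be imported into one and the same output table, while others will need to be imported into different tables. You can assume that all files which will be imported into the same table have the same header.

SQL Server Management Studio has an Import Wizard which allows you to import delimited text files (and other formats) and automatically create the output table. However, it does not allow you to import multiple files at the same time. Furthermore, it requires a lot of manual work and is not reproducible.

One can find many scripts online which import multiple text files into a table. However, most of these require the output table to be created first. This, too, requires additional work per table.

Is there a way to list all the relevant input files and their corresponding output tables, such that the tables are created automatically and the data is then imported into them?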

LBo*_*rdt 5

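This script allows you to import multiple delimited text files into a SQL database. The tables into which the data is imported, including all required columns, are created automatically. The script includes some documentation.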

/*
**  This file was created by Laurens Bogaardt, Advisor Data Analytics at EY Amsterdam on 2016-11-03.
**  This script allows you to import multiple delimited text files into a SQL database. The tables 
**  into which the data is imported, including all required columns, are created automatically. This 
**  script uses tab-delimited (tsv) files and SQL Server Management Studio. The script may need some 
**  minor adjustments for other formats and tools. The script makes several assumptions which need 
**  to be valid before it can run properly. First of all, it assumes none of the output tables exist 
**  in the SQL tool before starting. Therefore, it may be necessary to clean the database and delete 
**  all the existing tables. Secondly, the script assumes that, if multiple text files are imported 
**  into the same output table, the number and order of the columns of these files are identical. If 
**  this is not the case, some manual work may need to be done to the text files before importing.
**  Finally, please note that this script only imports data as strings (to be precise, as NVARCHAR 
**  values
**  of length 255). It does not allow you to specify the datatype per column. This would need to be 
**  done using another script after importing the data as strings.
*/

-- 1.   Import Multiple Delimited Text Files into a SQL Database

-- 1.1  Define the path to the input and define the terminators

/*
**  In this section, some initial parameters are set. Obviously, the 'DatabaseName' refers to the 
**  database in which you want to create new tables. The '@Path' parameter sets the folder in 
**  which the text files are located which you want to import. Delimited files are defined by 
**  two characters: one which separates columns and one which separates rows. Usually, the 
**  row-terminator is the newline character CHAR(10), also written '\n'. Files created in Windows 
**  usually end each row with a carriage return followed by a newline, CHAR(13) + CHAR(10), also 
**  written '\r\n'. Often, a tab is used to separate columns. This is given by CHAR(9) or '\t'. 
**  Other useful characters include the comma CHAR(44), the semi-colon CHAR(59) and the pipe 
**  CHAR(124).
*/

USE [DatabaseName]
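-- T-SQL string literals do not use backslash escapes, so single backslashes suffice; keep the trailing one
DECLARE @Path NVARCHAR(255) = 'C:\PathToFiles\'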
DECLARE @RowTerminator NVARCHAR(5) = CHAR(13) + CHAR(10)
DECLARE @ColumnTerminator NVARCHAR(5) = CHAR(9)
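-- Illustrative alternative (an assumption, not part of the original script): for comma-separated
-- files whose rows end in a bare newline, the two terminator declarations above could be replaced by:
--     DECLARE @RowTerminator NVARCHAR(5) = CHAR(10)
--     DECLARE @ColumnTerminator NVARCHAR(5) = CHAR(44)
-- Note that a plain terminator split like this does not handle quoted fields that contain commas.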

-- 1.2  Define the list of input and output in a temporary table

/*
**  In this section, a temporary table is created which lists all the filenames of the delimited 
**  files which need to be imported, as well as the names of the tables which are created and into 
**  which the data is imported. Multiple files may be imported into the same output table. Each row 
**  is prepended with an integer ID which starts at 1 and increases by 1 without gaps. This is essential 
**  because the loop in section 1.3 looks up each file by this ID. The temporary table is deleted at 
**  the end of this script.
*/

IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Files_Temporary];
CREATE TABLE [dbo].[Files_Temporary]
(
    [ID] INT
    , [FileName] NVARCHAR(255)
    , [TableName] NVARCHAR(255)
);

INSERT INTO [dbo].[Files_Temporary] SELECT 1,   'MyFileA.txt',  'NewTable1'
INSERT INTO [dbo].[Files_Temporary] SELECT 2,   'MyFileB.txt',  'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 3,   'MyFileC.tsv',  'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 4,   'MyFileD.csv',  'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 5,   'MyFileE.dat',  'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 6,   'MyFileF',      'NewTable3'
INSERT INTO [dbo].[Files_Temporary] SELECT 7,   'MyFileG.text', 'NewTable4'
INSERT INTO [dbo].[Files_Temporary] SELECT 8,   'MyFileH.txt',  'NewTable5'
INSERT INTO [dbo].[Files_Temporary] SELECT 9,   'MyFileI.txt',  'NewTable5'
INSERT INTO [dbo].[Files_Temporary] SELECT 10,  'MyFileJ.txt',  'NewTable5'
INSERT INTO [dbo].[Files_Temporary] SELECT 11,  'MyFileK.txt',  'NewTable6'
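
-- Optional sanity check (an addition, not part of the original script): the loop in section 1.3
-- looks files up by ID, so the IDs should run from 1 to the number of rows without gaps or duplicates.
-- RAISERROR at severity 16 only reports the problem; it does not by itself stop the batch.
IF EXISTS
(
    SELECT 1
    FROM [dbo].[Files_Temporary]
    HAVING MIN([ID]) <> 1 OR MAX([ID]) <> COUNT(*) OR COUNT(DISTINCT [ID]) <> COUNT(*)
)
    RAISERROR('The IDs in [dbo].[Files_Temporary] must run from 1 to N without gaps or duplicates.', 16, 1)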

-- 1.3  Loop over the list of input and output and import each file to the correct table

/*
**  In this section, the 'WHILE' statement is used to loop over all input files. A counter is defined 
**  which starts at '1' and increments with each iteration. The filename and tablename are retrieved 
**  from the previously defined temporary table. The next step of the script is to check whether the 
**  output table already exists or not.
*/

DECLARE @Counter INT = 1

WHILE @Counter <= (SELECT COUNT(*) FROM [dbo].[Files_Temporary])
BEGIN
    PRINT 'Counter is ''' + CONVERT(NVARCHAR(5), @Counter) + '''.'

    DECLARE @FileName NVARCHAR(255)
    DECLARE @TableName NVARCHAR(255)
    DECLARE @Header NVARCHAR(MAX)
    DECLARE @SQL_Header NVARCHAR(MAX)
    DECLARE @CreateHeader NVARCHAR(MAX) = ''
    DECLARE @SQL_CreateHeader NVARCHAR(MAX)

    SELECT @FileName = [FileName], @TableName = [TableName] FROM [dbo].[Files_Temporary] WHERE [ID] = @Counter

    IF OBJECT_ID('[dbo].[' + @TableName + ']', 'U') IS NULL
    BEGIN
/*
**  If the output table does not yet exist, it needs to be created. This requires the list of all 
**  columnnames for that table to be retrieved from the first line of the text file, which includes 
**  the header. A piece of SQL code is generated and executed which imports the header of the text 
**  file. A second temporary table is created which stores this header as a single string.
*/
        PRINT 'Creating new table with name ''' + @TableName + '''.'

        IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL
        DROP TABLE [dbo].[Header_Temporary];
        CREATE TABLE [dbo].[Header_Temporary]
        (
            [Header] NVARCHAR(MAX)
        );

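        -- [Header_Temporary] has a single column, so the whole first line up to the row terminator
        -- is read into [Header]. The field terminator is set to the row terminator so that the
        -- column terminator inside the header is not treated as a separator.
        SET @SQL_Header = '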
            BULK INSERT [dbo].[Header_Temporary]
            FROM ''' + @Path + @FileName + '''
            WITH
            (
                FIRSTROW = 1,
                LASTROW = 1,
                MAXERRORS = 0,
                FIELDTERMINATOR = ''' + @RowTerminator + ''',
                ROWTERMINATOR = ''' + @RowTerminator + '''
            )'
        EXEC(@SQL_Header)

        SET @Header = (SELECT TOP 1 [Header] FROM [dbo].[Header_Temporary])
        PRINT 'Extracted header ''' + @Header + ''' for table ''' + @TableName + '''.'
/*
**  The columnnames in the header are separated using the column-terminator. This can be used to loop 
**  over each columnname. A new piece of SQL code is generated which will create the output table 
**  with the correctly named columns.
*/
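        -- Worked example with a hypothetical header: for 'Id<TAB>Name<TAB>City' the loop below appends
        -- '[Id] NVARCHAR(255), ' and '[Name] NVARCHAR(255), ' in turn, leaving 'City' in @Header.
        -- The statement after the loop then appends '[City] NVARCHAR(255)', which completes the column list.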
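        WHILE CHARINDEX(@ColumnTerminator, @Header) > 0
        BEGIN
            -- Any ']' inside a columnname is doubled so that the bracketed name stays valid
            SET @CreateHeader = @CreateHeader + '[' + REPLACE(LTRIM(RTRIM(SUBSTRING(@Header, 1, CHARINDEX(@ColumnTerminator, @Header) - 1))), ']', ']]') + '] NVARCHAR(255), '
            -- Skip the full terminator; DATALENGTH / 2 is its length in characters for NVARCHAR
            SET @Header = SUBSTRING(@Header, CHARINDEX(@ColumnTerminator, @Header) + DATALENGTH(@ColumnTerminator) / 2, LEN(@Header))
        END
        -- The last columnname is trimmed and escaped in the same way as the others
        SET @CreateHeader = @CreateHeader + '[' + REPLACE(LTRIM(RTRIM(@Header)), ']', ']]') + '] NVARCHAR(255)'

        -- The schema is stated explicitly to match the OBJECT_ID check above and the BULK INSERT below
        SET @SQL_CreateHeader = 'CREATE TABLE [dbo].[' + @TableName + '] (' + @CreateHeader + ')'
        EXEC(@SQL_CreateHeader)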
    END

/*
**  Finally, the data from the text file is imported into the newly created table. The first line, 
**  including the header information, is skipped. If multiple text files are imported into the same 
**  output table, it is essential that the number and the order of the columns are identical, as the 
**  table will only be created once, using the header information of the first text file.
*/
    PRINT 'Inserting data from ''' + @FileName + ''' to ''' + @TableName + '''.'
    DECLARE @SQL NVARCHAR(MAX)
    SET @SQL = '
        BULK INSERT [dbo].[' + @TableName + ']
        FROM ''' + @Path + @FileName + '''
        WITH
        (
            FIRSTROW = 2,
            MAXERRORS = 0,
            FIELDTERMINATOR = ''' + @ColumnTerminator + ''',
            ROWTERMINATOR = ''' + @RowTerminator + '''
        )'
    EXEC(@SQL)

    SET @Counter = @Counter + 1
END;

-- 1.4  Cleanup temporary tables

/*
**  In this section, the temporary tables which were created and used by this script are deleted. 
**  Alternatively, the script could have used 'real' temporary tables (identified by the '#' character 
**  in front of the name) or a table variable. These would have deleted themselves once they were no 
**  longer in use. However, the end result is the same.
*/
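
/*
**  A minimal sketch of the '#' variant, as an assumption rather than a tested part of this script: 
**  the file list from section 1.2 could be declared as
**      CREATE TABLE #Files_Temporary ([ID] INT, [FileName] NVARCHAR(255), [TableName] NVARCHAR(255));
**  with every later reference changed to '#Files_Temporary'. An existence check would then read 
**      IF OBJECT_ID('tempdb..#Files_Temporary') IS NOT NULL DROP TABLE #Files_Temporary;
**  A table variable would only suit the file list, not the header table: the BULK INSERT runs inside 
**  EXEC(), where a table variable declared by the calling batch is not visible.
*/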

IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Files_Temporary];

IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Header_Temporary];
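
The header comment notes that every column is imported as NVARCHAR(255) and that datatypes have to be set by a separate script afterwards. A minimal sketch of such a follow-up step is given below. It assumes SQL Server 2012 or later (for TRY_CONVERT) and dates written as yyyy-mm-dd. The column names [Amount] and [BookingDate] and the output table [NewTable1_Typed] are hypothetical and only serve as illustration.

```sql
-- Hypothetical follow-up step, not part of the import script above.
-- 'NewTable1' is taken from the file list; [Amount], [BookingDate] and
-- [NewTable1_Typed] are assumed names. The target table must not exist yet.
SELECT
    TRY_CONVERT(DECIMAL(18, 2), [Amount]) AS [Amount]
    , TRY_CONVERT(DATE, [BookingDate], 23) AS [BookingDate]   -- style 23 = yyyy-mm-dd
INTO [dbo].[NewTable1_Typed]
FROM [dbo].[NewTable1];
```

TRY_CONVERT returns NULL instead of raising an error when a value cannot be converted, so rows with unexpected values can be found afterwards by filtering the typed table on NULL.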