How to import multiple csv files with pandas and concatenate them into one DataFrame

Ale*_*ues 4 python csv dataframe pandas

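I'm getting the error "No objects to concatenate". I can't import the .csv files from main and its subdirectories and concatenate them into a single DataFrame. I'm using pandas. Older answers didn't help me, so please don't mark this as a duplicate.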

The folder structure looks like this:

main/*.csv
main/name1/name1/*.csv
main/name1/name2/*.csv
main/name2/name1/*.csv
main/name3/*.csv
import pandas as pd
import os
import glob

folder_selected = 'C:/Users/jacob/Documents/csv_files'
  1. Doesn't work
frame = pd.concat(map(pd.read_csv, glob.iglob(os.path.join(folder_selected, "/*.csv"))))
  2. Doesn't work
csv_paths = glob.glob('*.csv')
dfs = [pd.read_csv(folder_selected) for folder_selected in csv_paths]
df = pd.concat(dfs)
  3. Doesn't work
all_files = []

all_files = glob.glob(folder_selected + "/*.csv")

file_path = []
for file in all_files:
    df = pd.read_csv(file, index_col=None, header=0)
    file_path.append(df)

frame = pd.concat(file_path, axis=0, ignore_index=False)

ati*_*tin 7

You need to search the subdirectories recursively.
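
A likely reason the attempts in the question find no files at all: in os.path.join, a component that starts with a slash is treated as absolute, so the folder is discarded and the pattern points at the drive root. Likewise, glob.glob('*.csv') is resolved against the current working directory, not against folder_selected. With no matches, pd.concat receives an empty sequence and raises ValueError: No objects to concatenate. The sketch below only illustrates the join behaviour. It uses ntpath and posixpath so the result is the same on any OS, and the POSIX path is a made-up example, not taken from the question.

```python
import ntpath      # Windows path rules, importable on any OS
import posixpath   # POSIX path rules, importable on any OS

folder = 'C:/Users/jacob/Documents/csv_files'

# A second component that starts with "/" counts as absolute,
# so join() drops the directory part of `folder`.
windows_pattern = ntpath.join(folder, "/*.csv")
print(windows_pattern)  # C:/*.csv

# Same effect under POSIX rules (hypothetical path, for illustration only).
posix_pattern = posixpath.join("/home/jacob/csv_files", "/*.csv")
print(posix_pattern)    # /*.csv

# Without the leading slash the folder stays in the pattern.
fixed_pattern = posixpath.join(folder, "*.csv")
print(fixed_pattern)    # C:/Users/jacob/Documents/csv_files/*.csv
```

Fixing the join alone would still only match files directly inside the folder, which is why the recursive patterns are needed.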

folder = 'C:/Users/jacob/Documents/csv_files'
path = folder+"/**/*.csv"
  1. Using glob.iglob
df = pd.concat(map(pd.read_csv, glob.iglob(path, recursive=True)))
  2. Using glob.glob
csv_paths = glob.glob(path, recursive=True)
dfs = [pd.read_csv(csv_path) for csv_path in csv_paths]
df = pd.concat(dfs)
  3. Using os.walk
import fnmatch
import os

file_paths = []
for base, dirs, files in os.walk(folder):
    # keep only the .csv files found in this directory
    for file in fnmatch.filter(files, '*.csv'):
        file_paths.append(os.path.join(base, file))
df = pd.concat([pd.read_csv(file) for file in file_paths])
  4. Using pathlib
from pathlib import Path
files = Path(folder).rglob('*.csv')
df = pd.concat(map(pd.read_csv, files))
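
Whichever variant you pick, it may help to check that files were actually found before calling pd.concat, so that a wrong folder produces a clearer message than "No objects to concatenate". A minimal sketch built on the pathlib variant follows. The function name load_csv_tree is made up for illustration, and it assumes all files share the same columns.

```python
from pathlib import Path

import pandas as pd


def load_csv_tree(folder):
    """Read every .csv under `folder`, recursively, into one DataFrame."""
    # sorted() makes the row order reproducible between runs
    files = sorted(Path(folder).rglob('*.csv'))
    if not files:
        # Fail early and name the folder instead of letting pd.concat raise
        raise FileNotFoundError(f"no .csv files found under {folder!r}")
    frames = [pd.read_csv(f) for f in files]
    # ignore_index=True renumbers the rows 0..n-1 across all files
    return pd.concat(frames, ignore_index=True)
```

It would be called as df = load_csv_tree('C:/Users/jacob/Documents/csv_files'). Whether to renumber the index is a design choice: drop ignore_index=True if the per-file row numbers matter.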