Python Pandas "Unnamed" column keeps appearing

Question

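I'm running into a problem: every time I run my program (which reads a DataFrame from a .csv file), a new column named "Unnamed" shows up.

Example output columns after running it 3 times: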

  Unnamed: 0  Unnamed: 0.1            Subreddit  Appearances

Here is my code. For each row, the value in the "Unnamed" column just increases by 1.

df = pd.read_csv(Location)
while counter < 50:
    #gets just the subreddit name
    e = str(elem[counter].get_attribute("href"))
    e = e.replace("https://www.reddit.com/r/", "")
    e = e[:-1]
    if e in df['Subreddit'].values:
        #adds 1 to Appearances if the subreddit is already in the DF
        df.loc[df['Subreddit'] == e, 'Appearances'] += 1
    else:
        #adds new row with the subreddit name and sets the amount of appearances to 1.
        df = df.append({'Subreddit': e, 'Appearances': 1}, ignore_index=True)
    df.reset_index(inplace=True, drop=True)
    print(e)
    counter = counter + 2
#(doesn't work) df.drop(df.columns[df.columns.str.contains('Unnamed', case=False)], axis=1)

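The first time I ran it with a clean .csv file it worked fine, but on every run after that another "Unnamed" column appeared. I just want the "Subreddit" and "Appearances" columns to show up each time.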

Answer 1

mir*_*ixx 5

every time I run my program (...) a new column named "Unnamed" shows up.

I suspect this is caused by reset_index, or perhaps by a to_csv somewhere in your code, as @jpp suggested. To fix the to_csv call, make sure to pass index=False:

df.to_csv(path, index=False)
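
To make the mechanism concrete, here is a minimal sketch of the round trip. The sample data is made up for illustration, and an in-memory buffer stands in for the real file. By default to_csv writes the index as a first column with an empty header, and read_csv then has to invent a name for it.

```python
import io

import pandas as pd

# Made-up sample data standing in for the real CSV contents.
df = pd.DataFrame({"Subreddit": ["python", "pandas"], "Appearances": [3, 1]})

# Default to_csv writes the index as a first column with an empty header.
with_index = df.to_csv()
# read_csv has no name for that column, so it labels it "Unnamed: 0".
reloaded_bad = pd.read_csv(io.StringIO(with_index))
print(list(reloaded_bad.columns))

# With index=False the index is not written, so nothing extra comes back.
without_index = df.to_csv(index=False)
reloaded_good = pd.read_csv(io.StringIO(without_index))
print(list(reloaded_good.columns))
```

Each further save-and-load cycle without index=False would add one more such column, which matches the symptom described in the question.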

I just want the "Subreddit" and "Appearances" columns

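In general, here is how I would approach your task.

The idea is to first count all appearances (keyed by e), then build a new DataFrame from those counts and merge it with the one you already have (how='outer' adds the rows that don't exist yet). This avoids resetting the index for every element, which sidesteps the problem and also performs better.

Here is code incorporating these ideas: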

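from collections import Counter

import pandas as pd

base_df = pd.read_csv(location)
appearances = Counter()
while counter < 50:
    # gets just the subreddit name
    e = str(elem[counter].get_attribute("href"))
    e = e.replace("https://www.reddit.com/r/", "")
    e = e[:-1]
    appearances[e] += 1
    counter = counter + 2
# one row per subreddit seen in this run; the key column must match base_df
appearances_df = pd.DataFrame(
    [{'Subreddit': e, 'appearances': c} for e, c in appearances.items()])
# the new counts land in 'appearances', next to the existing 'Appearances'
df = base_df.merge(appearances_df, how='outer', on='Subreddit')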
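
As a self-contained sketch of the count-then-merge idea, the example below replaces the CSV and the scraped links with made-up data, and adds one step the answer does not spell out: summing the old and new counts after the merge. The names scraped and NewAppearances are hypothetical and chosen only for illustration.

```python
from collections import Counter

import pandas as pd

# Hypothetical stand-ins for the CSV contents and the scraped subreddit names.
base_df = pd.DataFrame({"Subreddit": ["python", "pandas"], "Appearances": [3, 1]})
scraped = ["python", "learnpython", "python"]

# Count every name seen in this run in one pass.
appearances = Counter(scraped)
new_df = pd.DataFrame(
    [{"Subreddit": name, "NewAppearances": count}
     for name, count in appearances.items()]
)

# The outer merge keeps subreddits that exist on either side.
merged = base_df.merge(new_df, how="outer", on="Subreddit")
# Rows missing on one side come back as NaN; treat them as zero before summing.
merged["Appearances"] = (
    merged["Appearances"].fillna(0) + merged["NewAppearances"].fillna(0)
).astype(int)
df = merged.drop(columns="NewAppearances")
print(df)
```

Writing the result back with df.to_csv(path, index=False) should then keep the file at exactly the two columns.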

Answer 2

F B*_*het 5

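Another solution is to read the csv with the index_col=0 parameter, so that the first column is used as the index instead of being treated as data: df = pd.read_csv(Location, index_col=0).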
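
A short sketch of this option, using a made-up CSV string in place of the real file. It also shows one way to clean a file that has already picked up extra columns. The drop call commented out in the question likely had no effect because drop returns a new DataFrame and the result was never assigned.

```python
import io

import pandas as pd

# Hypothetical file contents: saved once with the default index column.
csv_text = ",Subreddit,Appearances\n0,python,3\n1,pandas,1\n"

# index_col=0 turns the first column into the index instead of a data column.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(df.columns))

# Without index_col the stray column shows up and can be dropped after loading.
polluted = pd.read_csv(io.StringIO(csv_text))
mask = polluted.columns.str.contains("Unnamed", case=False)
# drop returns a new frame, so the result has to be assigned.
cleaned = polluted.drop(columns=polluted.columns[mask])
print(list(cleaned.columns))
```

Note that index_col=0 only absorbs the first column. If the file was written with index=False, it would turn "Subreddit" into the index, so it is best paired with a matching to_csv that writes the index.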
