Pab*_* LR 4 python csv multiple-columns
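I'm new to Python and I'm trying to get the average of each column (or row) of a csv file, and then select the values that are more than double the average of their column (or row). My file has hundreds of columns and contains float values like these: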
845.123,452.234,653.23,...
432.123,213.452.421.532,...
743.234,532,432.423,...
I've tried several changes to my code to get the average of each column (separately), but at the moment my code looks like this:
def AverageColumn (c):
    f=open(csv,"r")
    average=0
    Sum=0
    column=len(f)
    for i in range(0,column):
        for n in i.split(','):
            n=float(n)
            Sum += n
        average = Sum / len(column)
    return 'The average is:', average
    f.close()

csv="MDT25.csv"
print AverageColumn(csv)
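But I always get errors like "f has no len()" or "'int' object is not iterable"...
I'd really appreciate it if someone could show me how to get the average of every column (or row, whichever you prefer), and then select the values that are more than double the average of their column (or row). I'd rather not import modules such as csv, but I'll leave that up to you. Thanks!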
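Here's a cleaned-up version of your function, but it probably doesn't do what you want it to do. As written, it computes a single average over all values in all columns: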
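def average_column (csv):
    f = open(csv,"r")
    average = 0
    Sum = 0
    value_count = 0
    for row in f:
        if not row.strip():
            continue  # skip blank lines
        for column in row.split(','):
            n=float(column)
            Sum += n
            value_count += 1  # count every value, not just rows
    # divide by the number of values read (guarding against an empty file)
    if value_count:
        average = Sum / value_count
    f.close()
    return 'The average is:', average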
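Since you said you'd rather not import anything, here is a minimal sketch of a per-column version that uses no modules at all. It assumes Python 3, that every field parses as a float, and that the input is any iterable of comma-separated lines (an open file works). The name `column_averages` is just something I made up for illustration.

```python
def column_averages(lines):
    """Return the average of each column for an iterable of comma-separated lines."""
    totals = []  # running sum per column
    counts = []  # number of values seen per column
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        for idx, field in enumerate(line.split(',')):
            if idx == len(totals):
                # first time this column index appears
                totals.append(0.0)
                counts.append(0)
            totals[idx] += float(field)
            counts[idx] += 1
    return [total / count for total, count in zip(totals, counts)]


sample = ["1.0,2.0,3.0", "3.0,4.0,5.0"]
print(column_averages(sample))  # [2.0, 3.0, 4.0]
```

To run it on a file, pass the open file object, e.g. `with open("MDT25.csv") as f: averages = column_averages(f)`. Keeping a count per column means rows of different lengths don't skew the result.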
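I would use the csv module (which makes csv parsing easier), with a Counter object to manage the column totals and a context manager to open the file (no need for a close()):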
import csv
from collections import Counter

def average_column (csv_filepath):
    column_totals = Counter()
    # Python 3: open in text mode with newline="" as the csv module recommends
    with open(csv_filepath, newline="") as f:
        reader = csv.reader(f)
        row_count = 0.0
        for row in reader:
            if not row:
                continue  # skip blank lines so they are not counted as rows
            for column_idx, column_value in enumerate(row):
                try:
                    n = float(column_value)
                    column_totals[column_idx] += n
                except ValueError:
                    print("Error -- ({}) Column({}) could not be converted to float!".format(column_value, column_idx))
            row_count += 1.0

    # make sure column index keys are in order
    column_indexes = sorted(column_totals)

    # calculate per column averages using a list comprehension
    # (assumes every counted row has a value in every column)
    averages = [column_totals[idx]/row_count for idx in column_indexes]
    return averages
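For the second half of the question (picking out the values that are more than double their column's average), something along these lines might work. This is only a sketch under a few assumptions: Python 3, every field converts cleanly to float, and all rows have the same number of columns (`zip` silently truncates to the shortest row otherwise). `read_rows` and `values_above_twice_average` are hypothetical helper names, not part of any library.

```python
import csv


def read_rows(csv_filepath):
    """Load a csv file into a list of rows of floats, skipping blank lines."""
    with open(csv_filepath, newline="") as f:
        return [[float(v) for v in row] for row in csv.reader(f) if row]


def values_above_twice_average(rows):
    """Return (row_index, column_index, value) for each value that is
    greater than twice the average of its column."""
    # zip(*rows) transposes the rows into columns
    averages = [sum(col) / len(col) for col in zip(*rows)]
    return [
        (r, c, value)
        for r, row in enumerate(rows)
        for c, value in enumerate(row)
        if value > 2 * averages[c]
    ]


rows = [[1.0, 10.0], [1.0, 10.0], [10.0, 10.0]]
print(values_above_twice_average(rows))  # [(2, 0, 10.0)]
```

In the sample, column 0 averages 4.0, so only the 10.0 in the last row exceeds 8.0; column 1 averages 10.0 and nothing exceeds 20.0. For a real file you would call `values_above_twice_average(read_rows("MDT25.csv"))`. Swapping `zip(*rows)` for a plain loop over `rows` would give the per-row variant instead.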