Ama*_*ram 5 python numpy matplotlib pandas
I wrote the following code to compute the variance of each column:
import pandas as pd
import xlrd
import numpy as np
import matplotlib.pyplot as plt
credit_card=pd.read_csv("default_of_credit_card_clients_Data.csv",skiprows=1)
print(credit_card.head())
for col in credit_card:
    var[col]=np.var(credit_card(col))
print(var)
I get this error:
Traceback (most recent call last):
  File "C:/Python34/project.py", line 11
    var[col]=np.var(credit_card(col))
TypeError: 'DataFrame' object is not callable
A solution would be appreciated.
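It seems you need DataFrame.var, which computes the variance of every column in one call. From its documentation:
Normalized by N-1 by default. This can be changed using the ddof argument.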
var1 = credit_card.var()
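The TypeError in the question comes from credit_card(col): parentheses try to call the DataFrame like a function, while columns are selected with square brackets. The loop also assigns into var before it exists. A minimal sketch of the original loop with both points fixed, assuming a small made-up DataFrame in place of the CSV (the data and the name variances are illustrative, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV data in the question.
credit_card = pd.DataFrame({"A": [8, 0, 2, 4, 4], "B": [8, 4, 2, 0, 1]})

# The dict must exist before items are assigned to it.
variances = {}
for col in credit_card:
    # Square brackets select a column; parentheses would try to call the DataFrame.
    variances[col] = np.var(credit_card[col].to_numpy())

print(variances)
```

Because np.var divides by N by default, this loop returns population variances, which are smaller than the values credit_card.var() reports for the same columns.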
Sample:
#random dataframe
np.random.seed(100)
credit_card = pd.DataFrame(np.random.randint(10, size=(5,5)), columns=list('ABCDE'))
print (credit_card)
   A  B  C  D  E
0  8  8  3  7  7
1  0  4  2  5  2
2  2  2  1  0  8
3  4  0  9  6  2
4  4  1  5  3  4
var1 = credit_card.var()
print (var1)
A     8.8
B    10.0
C    10.0
D     7.7
E     7.8
dtype: float64
var2 = credit_card.var(axis=1)
print (var2)
0     4.3
1     3.8
2     9.8
3    12.2
4     2.3
dtype: float64
If you need a NumPy solution, use numpy.var:
print (np.var(credit_card.values, axis=0))
[ 7.04 8. 8. 6.16 6.24]
print (np.var(credit_card.values, axis=1))
[ 3.44 3.04 7.84 9.76 1.84]
The results differ because pandas uses ddof=1 by default while numpy.var uses ddof=0. You can make pandas match by passing ddof=0:
var1 = credit_card.var(ddof=0)
print (var1)
A    7.04
B    8.00
C    8.00
D    6.16
E    6.24
dtype: float64
var2 = credit_card.var(ddof=0, axis=1)
print (var2)
0    3.44
1    3.04
2    7.84
3    9.76
4    1.84
dtype: float64
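The gap between the two libraries can also be closed from the NumPy side. With N observations the divisor is N - ddof, so the ddof=1 result equals the ddof=0 result multiplied by N/(N-1). A short check on column A of the sample, offered as a sketch rather than a required step (the variable names are illustrative):

```python
import numpy as np

a = np.array([8, 0, 2, 4, 4])  # column A of the sample DataFrame
n = a.size

pop_var = np.var(a)             # ddof=0, divides by N
sample_var = np.var(a, ddof=1)  # divides by N - 1, same as the pandas default

print(pop_var, sample_var)
# The two estimates differ only by the factor N / (N - 1).
assert np.isclose(sample_var, pop_var * n / (n - 1))
```

For large N the factor N/(N-1) approaches 1, so the choice matters mostly for small samples like this one.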