Related troubleshooting solutions (0)

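Change the data type of a column in Pandas

I want to convert a table, represented as a list of lists, into a Pandas DataFrame. As an extremely simplified example: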

import pandas as pd

a = [['a', '1.2', '4.2'], ['b', '70', '0.03'], ['x', '5', '0']]
df = pd.DataFrame(a)

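What is the best way to convert the columns to the appropriate types, in this case columns 2 and 3 to floats? Is there a way to specify the types while converting to a DataFrame? Or is it better to create the DataFrame first and then loop through the columns to change the type of each one? Ideally I would like to do this dynamically, because there can be hundreds of columns and I don't want to specify exactly which columns are of which type. All I can guarantee is that each column contains values of the same type.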
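
One possible way to do this dynamically is sketched below. It is a minimal illustration, not the accepted answer: the helper name `to_numeric_if_possible` is hypothetical, and the sketch assumes that a column should become numeric only when every value in it parses as a number, which matches the guarantee that each column holds values of a single type.

```python
import pandas as pd

a = [['a', '1.2', '4.2'], ['b', '70', '0.03'], ['x', '5', '0']]
df = pd.DataFrame(a)


def to_numeric_if_possible(col):
    """Return the column converted to a numeric dtype when every value
    parses as a number; otherwise return the column unchanged."""
    try:
        return pd.to_numeric(col)
    except (ValueError, TypeError):
        # At least one value is not a number, so leave this column alone.
        return col


# DataFrame.apply calls the helper once per column and reassembles the result.
converted = df.apply(to_numeric_if_possible)
print(converted.dtypes)
```

When the target types are known in advance, passing a mapping to `DataFrame.astype`, for example `df.astype({1: float, 2: float})`, is a more explicit alternative.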

python types casting dataframe pandas

688
Score
11
Answers
1.32 million
Views

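Convert a Pandas column containing NaN to dtype `int`

I read data from a .csv file into a Pandas DataFrame as shown below. For one of the columns, namely id, I want to specify the column type as int. The problem is that the id series has missing/empty values.

When I try to cast the id column to integer while reading the .csv, I get: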

df = pd.read_csv("data.csv", dtype={'id': int})
error: Integer column has NA values

Alternatively, I tried converting the column type after reading, as shown below, but this time I get:

df = pd.read_csv("data.csv")
df[['id']] = df[['id']].astype(int)
error: Cannot convert NA to integer

How can I solve this?
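
One approach that may help, sketched here under assumptions, is the nullable integer dtype `"Int64"` (capital I), which recent pandas versions provide for integer columns with missing values. The inline CSV text and its `name` column are hypothetical stand-ins for `data.csv`, whose real contents are not shown in the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv: the second row has an empty id.
csv_text = "id,name\n1,a\n,b\n3,c\n"

# Option 1: request the nullable integer dtype while reading.
df = pd.read_csv(io.StringIO(csv_text), dtype={"id": "Int64"})
print(df["id"].dtype)

# Option 2: read with default inference (id becomes float64 because of
# the missing value), then convert the column afterwards.
df2 = pd.read_csv(io.StringIO(csv_text))
df2["id"] = df2["id"].astype("Int64")
print(df2["id"].isna().sum())
```

In both cases the missing entry is kept as `<NA>` rather than raising an error, and the remaining values stay integers instead of being promoted to floats.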

python pandas na

132
Score
9
Answers
130,000
Views

Unsupported operand type(s) for +: 'int' and 'str' when using Pandas mean

When I try to get the mean of one of the DataFrame columns, it shows this error:

TypeError: unsupported operand type(s) for +: 'int' and 'str'

This is my code:

import numpy as np
import pandas as pd

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data"
df = pd.read_csv(url, header=None)

headers = ["symboling","normalized-losses","make","fuel-type","aspiration","num-of-doors","body-style","drive-wheels","engine-location","wheel-base","lenght","width","height","curb-weight","engine-type","num-of-cylinders","engine-size","fuel-system","bore","stroke","compression-ratio","horsepower","peak-rpm","city-mpg","highway-mpg","price"]
df.columns = headers

df.replace('?', np.nan, inplace=True)

mean_val = df['normalized-losses'].mean()
print(mean_val)
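
A likely cause is that the `?` placeholders make pandas read `normalized-losses` as strings, so the column still holds text after the replacement and `mean()` ends up adding an integer to a string. The sketch below illustrates two possible fixes on a tiny hypothetical three-row, three-column sample rather than the real file; the sample values and the shortened column list are assumptions made for illustration.

```python
import io

import numpy as np
import pandas as pd

# Tiny hypothetical sample in the same style as the real data file.
sample = "3,?,alfa-romero\n1,164,audi\n2,158,audi\n"
cols = ["symboling", "normalized-losses", "make"]

# Fix 1: after replacing '?', convert the column to a numeric dtype.
df = pd.read_csv(io.StringIO(sample), header=None, names=cols)
df = df.replace("?", np.nan)
df["normalized-losses"] = pd.to_numeric(df["normalized-losses"], errors="coerce")
mean_val = df["normalized-losses"].mean()  # NaN is skipped by default
print(mean_val)

# Fix 2: tell read_csv that '?' marks a missing value, so the column
# is parsed as numeric from the start.
df_alt = pd.read_csv(io.StringIO(sample), header=None, names=cols, na_values="?")
mean_alt = df_alt["normalized-losses"].mean()
print(mean_alt)
```

With `errors="coerce"`, any value that cannot be parsed becomes NaN instead of raising, which is convenient here but can hide unexpected bad values in other data.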

python dataframe pandas

7
Score
1
Answers
9,599
Views

Tag statistics

pandas ×3

python ×3

dataframe ×2

casting ×1

na ×1

types ×1