Ash*_*nha 1 python azure pandas azure-machine-learning-studio
We run into an error when a column has the string data type and holds values like these:
col1 col2
1 .89
So, when we use:
import numpy as np
import pandas as pd
from sklearn.kernel_approximation import RBFSampler

def azureml_main(dataframe1=None, dataframe2=None):
    # Execution logic goes here
    print('Input pandas.DataFrame #1:')
    # Feature columns, selected by position
    x = dataframe1.iloc[:, 2:1080]
    print(x)
    # Flatten the single column 'colname' into a 1-D array
    df1 = dataframe1[['colname']]
    change = np.array(df1)
    b = change.ravel()
    print(b)
    rbf_feature = RBFSampler(gamma=1, n_components=100, random_state=1)
    print(rbf_feature)
    print("test")
    # This call fails if x still contains string values
    X_features = rbf_feature.fit_transform(x)
After this we get an error saying that a non-int value cannot be converted to float.
Use astype(float), for example:
df['col'] = df['col'].astype(float)
or use convert_objects:
df = df.convert_objects(convert_numeric=True)
Example:
In [379]:
df = pd.DataFrame({'a':['1.23', '0.123']})
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 1 columns):
a 2 non-null object
dtypes: object(1)
memory usage: 32.0+ bytes
In [380]:
df['a'].astype(float)
Out[380]:
0 1.230
1 0.123
Name: a, dtype: float64
In [382]:
df = df.convert_objects(convert_numeric=True)
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 1 columns):
a 2 non-null float64
dtypes: float64(1)
memory usage: 32.0 bytes
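The same cast can be applied to a whole block of columns at once, which is closer to the `iloc` slice in the question. The following is only a minimal sketch: the small `dataframe1` here is a hypothetical stand-in for the real input, and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for dataframe1: an id column plus string-typed features.
dataframe1 = pd.DataFrame({
    "id": ["a", "b"],
    "col1": ["1", "2"],
    "col2": [".89", "0.5"],
})

# Select the feature columns by position, then cast them all to float at once.
x = dataframe1.iloc[:, 1:].astype(float)
print(x.dtypes)
```

If any cell in the slice cannot be parsed as a number, `astype(float)` raises a `ValueError`, so this only works when every value is a valid numeric string.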
UPDATE
If you are running version 0.17.0 or later, convert_objects has been deprecated in favor of the functions to_numeric, to_datetime, and to_timedelta, so instead of:
df['col'] = df['col'].astype(float)
you can do:
df['col'] = pd.to_numeric(df['col'])
Note that by default any value that cannot be converted raises an error. To force such values to NaN instead, do the following:
df['col'] = pd.to_numeric(df['col'], errors='coerce')
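To make the difference between the default behavior and `errors='coerce'` concrete, here is a small sketch. The sample values are hypothetical and chosen so that one entry cannot be parsed as a number.

```python
import pandas as pd

# Hypothetical sample data: two parseable strings and one that is not a number.
df = pd.DataFrame({"col": ["1", ".89", "abc"]})

# Default behaviour (errors="raise"): the unparseable value raises ValueError.
try:
    pd.to_numeric(df["col"])
    raised = False
except ValueError:
    raised = True

# errors="coerce" replaces unparseable values with NaN instead of raising.
converted = pd.to_numeric(df["col"], errors="coerce")
print(raised)
print(converted)
```

Coercing hides bad input rather than fixing it, so it is usually worth checking `converted.isna().sum()` afterwards to see how many values were lost.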