
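"'DataFrame' object has no attribute 'apply'" when trying to apply a lambda to create a new column

My goal is to add a new column to a Pandas DataFrame, but I am running into a strange error.

The new column should be a transformation of an existing column, which can be done with a lookup in a dictionary/hashmap.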

# Loading data
df = sqlContext.read.format(...).load(train_df_path)

# Instantiating the map
some_map = {
    'a': 0,
    'b': 1,
    'c': 1,
}

# Creating a new column using the map
df['new_column'] = df.apply(lambda row: some_map(row.some_column_name), axis=1)
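Independently of the error reported in this question, the lambda calls `some_map(...)` as if the dict were a function, whereas a Python dict is looked up with square brackets or `.get`. A minimal pure-Python sketch, reusing `some_map` from the snippet; the `lookup` helper and its default value of `-1` are assumptions made only for illustration:

```python
some_map = {
    'a': 0,
    'b': 1,
    'c': 1,
}


def lookup(key, default=-1):
    """Return the mapped value, or `default` for keys missing from some_map."""
    # Hypothetical helper; the default of -1 is an assumption, not from the question.
    return some_map.get(key, default)


# Calling the dict like a function fails before any lookup happens.
try:
    some_map('a')
except TypeError as exc:
    call_error = str(exc)

print(call_error)     # message of the TypeError raised by the call
print(some_map['b'])  # subscript lookup, raises KeyError for unknown keys
print(lookup('z'))    # .get-based lookup, falls back to the default
```

Whether a missing key should raise or fall back to a default depends on the data, so the choice between `some_map[key]` and `some_map.get(key, default)` is left open here.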

This results in the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-12-aeee412b10bf> in <module>()
     25 df= train_df
     26 
---> 27 df['new_column'] = df.apply(lambda row: some_map(row.some_column_name), axis=1)

/usr/lib/spark/python/pyspark/sql/dataframe.py in __getattr__(self, name)
    962         if name not in self.columns:
    963             raise AttributeError(
--> 964                 "'%s' object has no attribute …
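The traceback is raised from `__getattr__` in `pyspark/sql/dataframe.py`, which suggests that `df` is a Spark DataFrame (it was loaded through `sqlContext.read`) rather than a pandas one. The toy class below is a hypothetical stand-in, not the real PySpark class; it assumes only the logic visible in the traceback and the message quoted in the title, and illustrates why any name that is not a column, such as `apply`, produces this error:

```python
class ToyFrame:
    """Hypothetical stand-in for illustration; not the real pyspark DataFrame."""

    def __init__(self, columns):
        self.columns = list(columns)

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup fails,
        # so every unknown name is treated as a column lookup.
        if name not in self.columns:
            raise AttributeError(
                "'%s' object has no attribute '%s'" % (type(self).__name__, name)
            )
        return "Column<%s>" % name


def describe_access(frame, name):
    """Return the attribute value, or the error message if the lookup fails."""
    try:
        return getattr(frame, name)
    except AttributeError as exc:
        return str(exc)


frame = ToyFrame(["some_column_name"])
print(describe_access(frame, "some_column_name"))  # resolved as a column
print(describe_access(frame, "apply"))             # no such column, so an error message
```

If that reading is right, pandas-style `df.apply(...)` and `df['new_column'] = ...` do not apply here. On a Spark DataFrame, new columns are usually added with `DataFrame.withColumn`, and a dictionary lookup can be expressed, for example, as a map expression built with `pyspark.sql.functions.create_map`. Converting with `toPandas()` is another option when the data fits in driver memory.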

python apache-spark-sql pyspark pyspark-sql

Score: 2 · Answers: 1 · Views: 8946
