I am using Databricks. For my data, I created a Delta Lake. I then tried to modify a column using the pandas API, but for some reason the following error message appears:
ValueError: Cannot combine the series or dataframe because it comes from a different dataframe. In order to allow this operation, enable 'compute.ops_on_diff_frames' option.
I use the following code to rewrite the data in the table:
df_new = spark.read.format('delta').load(f"abfss://{container}@{storage_account_name}.dfs.core.windows.net/{delta_name}")
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from math import *
from pyspark.pandas.config import set_option
import pyspark.pandas as ps
%matplotlib inline
win_len = 5000
# For this be sure you …
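The error message names the setting involved, `compute.ops_on_diff_frames`. Below is a minimal sketch of switching it on only around the assignment that combines two different frames. It assumes the column being assigned is a pandas-on-Spark Series built from another DataFrame; `add_column_from_other_frame` is a hypothetical helper name, not something from the code above, and whether this matches the truncated code is a guess.

```python
def add_column_from_other_frame(psdf, name, values):
    """Assign `values` (a pandas-on-Spark Series from another frame) to psdf[name].

    Hypothetical helper for illustration only.
    """
    # Imported lazily so the helper can be defined before Spark is available.
    import pyspark.pandas as ps

    # option_context sets the option only inside the `with` block and
    # restores the previous value afterwards.
    with ps.option_context("compute.ops_on_diff_frames", True):
        psdf[name] = values
    return psdf
```

On Databricks this might be called as `add_column_from_other_frame(df_new.pandas_api(), "new_col", other_series)`, where `new_col` and `other_series` are placeholders. Combining different frames makes Spark join them behind the scenes, which can be expensive on large tables, so deriving the new column from the same DataFrame is usually preferable when that is possible.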