I want to replace certain values in a DataFrame that contains several categorical columns.
df = pd.DataFrame({'s1': ['a', 'b', 'c'], 's2': ['a', 'c', 'd']}, dtype='category')
If I apply .replace to a single column, the result is as expected:
>>> df.s1.replace('a', 1)
0    1
1    b
2    c
Name: s1, dtype: object
If I apply the same operation to the whole DataFrame, it raises an error (abbreviated):
>>> df.replace('a', 1)
ValueError: Cannot setitem on a Categorical with a new category, set the categories first
During handling of the above exception, another exception occurred:
ValueError: Wrong number of dimensions
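For illustration only, here is a minimal sketch of one way to relabel a value while keeping the columns categorical. It is not a drop-in equivalent of `.replace`. It uses `Series.cat.rename_categories` column by column, and the helper name `rename_category` is hypothetical, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'s1': ['a', 'b', 'c'], 's2': ['a', 'c', 'd']}, dtype='category')


def rename_category(frame, old, new):
    """Hypothetical helper: relabel category `old` as `new` in every categorical column."""
    out = frame.copy()
    for col in out.columns:
        is_categorical = isinstance(out[col].dtype, pd.CategoricalDtype)
        # Only touch columns that actually contain the old category.
        if is_categorical and old in out[col].cat.categories:
            out[col] = out[col].cat.rename_categories({old: new})
    return out


result = rename_category(df, 'a', 1)
print(result)
print(result.dtypes)
```

Renaming only works when the new label is not already a category of that column, since categories must be unique. Merging a value into an existing category, as in the integer example, needs a different approach.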
If the DataFrame contains integers as categories, the following happens:
df = pd.DataFrame({'s1': [1, 2, 3], 's2': [1, 3, 4]}, dtype='category')
>>> df.replace(1, 3)
s1 s2
0 3 3 …

I want to submit a dynamically loaded function to concurrent.futures.ProcessPoolExecutor. Here is an example, where module.py contains the function.
# Content of module.py
def func():
    return 1
The rest is in file.py:
# Content of file.py
from concurrent.futures import ProcessPoolExecutor
import multiprocessing
import importlib.util  # a bare "import importlib" does not guarantee importlib.util is loaded
from pathlib import Path
import inspect


def load_function_from_module(path):
    # Build a module object from a file path and execute it.
    spec = importlib.util.spec_from_file_location(path.stem, str(path))
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    return mod


def func_top_level():
    return 2


if __name__ == '__main__':
    # Dynamically load function from other module.
    path = Path(__file__).parent / "module.py"
    func = dict(inspect.getmembers(load_function_from_module(path)))["func"]
    with ProcessPoolExecutor(2) as executor:
        future = executor.submit(func)
        future_ = executor.submit(func_top_level)
        # …
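The rest of the question is cut off, so the exact failure is not shown here. As background, `ProcessPoolExecutor` sends submitted functions to workers with `pickle`, and `pickle` stores a function as a reference (module name plus function name) rather than its code. The sketch below isolates that behaviour under some assumptions: it writes a throwaway module named `dyn_module_example` to a temporary directory, and the helpers `load_module`, `can_pickle`, and `demo` are hypothetical names introduced for illustration.

```python
import importlib.util
import pickle
import sys
import tempfile
from pathlib import Path


def load_module(path, register=False):
    """Load a module from a file path; optionally register it in sys.modules."""
    spec = importlib.util.spec_from_file_location(path.stem, str(path))
    mod = importlib.util.module_from_spec(spec)
    if register:
        # Registering lets pickle resolve the module by name in this process.
        sys.modules[spec.name] = mod
    spec.loader.exec_module(mod)
    return mod


def can_pickle(obj):
    """Return True if pickle can serialize obj, False on PicklingError."""
    try:
        pickle.dumps(obj)
        return True
    except pickle.PicklingError:
        return False


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Throwaway module in a directory that is not on sys.path.
        path = Path(tmp) / "dyn_module_example.py"
        path.write_text("def func():\n    return 1\n")

        # Not registered: pickle cannot import "dyn_module_example" by name.
        before = can_pickle(load_module(path).func)

        # Registered: pickle finds the module in sys.modules.
        registered = load_module(path, register=True).func
        after = can_pickle(registered)
        value = pickle.loads(pickle.dumps(registered))()

        sys.modules.pop("dyn_module_example", None)
    return before, after, value


if __name__ == "__main__":
    print(demo())
```

Registering the module only helps within the current process, or in workers that inherit its state through the `fork` start method. With `spawn`, each worker has to import the module by name itself, so the module's directory would also need to be importable there, for example by being on `sys.path`.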