Solving a system of equations with pandas

hda*_*tas 3 python math numpy scipy pandas

I have this DataFrame as an example:

       Col1         Col2     ...    Col5       Price
 0     Wood         Wood            Plastic     50
 1     Iron         Wood            Wood        70
                            ...
3000   Iron         Iron            Wood        110

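I would like to know whether it is possible to build a linear solver for N equations in N unknowns (in this case, to find the prices of wood, iron, plastic, and so on).

Thank you very much!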

Eli*_*sha 5

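The frame can be converted into a system of linear equations, where each row of the frame is a constraint (one equation) and each material is a variable. We can then solve the system with NumPy's solver, numpy.linalg.solve (as Rajan Chahan mentioned in the question comments).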

import numpy as np
import pandas as pd

from numpy.linalg import solve

# Create a simple frame, with two materials - Wood & Iron.
df = pd.DataFrame({'Col1': ['Iron', 'Wood'], 'Col2': ['Wood', 'Wood'], 'Price': [3,2]})

# Extract the materials and map each material to a unique integer
# For example, "Iron"=0 and "Wood"=1
materials = pd.Series(np.unique(df.to_numpy()[:, :-1])).astype('category')

# Create the coefficient matrix, where each row is a constraint
# For example "Iron + Wood" translates into "1*x0 + 1*x1"
# And "Wood + Wood" translates into "0*x0 + 2*x1"
A = np.zeros((len(df), len(materials)))

# Iterate over all constraints and material columns and fill in the coefficients
for i in range(len(df)):
    for j in range(1, df.shape[1]):
        A[i, materials.cat.categories.get_loc(df.at[i, 'Col{}'.format(j)])] += 1

# Solve the system; the solution is an array.
# Each entry in the array corresponds to a material price.
solution = solve(A, df['Price'])  # [ 2. 1.]

# Convert to a mapping per-material
material_prices = pd.Series(solution, index=materials.cat.categories)
# Iron    2.0
# Wood    1.0
# dtype: float64
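For the two-row example frame above, the system being solved can be written out explicitly. This is only an illustration: the symbols x_Iron and x_Wood (the unknown unit prices) are notation introduced here and do not appear in the code.

```latex
\begin{aligned}
\text{row 0 (Iron + Wood):}\quad & 1\cdot x_{\text{Iron}} + 1\cdot x_{\text{Wood}} = 3 \\
\text{row 1 (Wood + Wood):}\quad & 0\cdot x_{\text{Iron}} + 2\cdot x_{\text{Wood}} = 2
\end{aligned}
\qquad\Longleftrightarrow\qquad
\begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}
\begin{pmatrix} x_{\text{Iron}} \\ x_{\text{Wood}} \end{pmatrix}
=
\begin{pmatrix} 3 \\ 2 \end{pmatrix}
```

The second row gives x_Wood = 1, and substituting that into the first row gives x_Iron = 2, which matches the output of solve.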

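If the number of materials differs from the number of constraints, you can compute a least-squares solution instead. Replace the line solution = solve(A, df['Price']) in the code above with: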

from numpy.linalg import lstsq
# rcond=None selects NumPy's machine-precision-based cutoff for small singular values
solution = lstsq(A, df['Price'], rcond=None)[0]
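As a rough end-to-end sketch of the least-squares case, the snippet below uses a hypothetical frame with three rows but only two materials, so there are more equations than unknowns. The data and the names material_cols and b are made up for illustration, and the coefficient matrix is built by counting occurrences per material instead of with the nested loop, which is just one possible way to do it.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three equations (rows) but only two unknowns (materials).
df = pd.DataFrame({
    'Col1': ['Iron', 'Wood', 'Iron'],
    'Col2': ['Wood', 'Wood', 'Iron'],
    'Price': [3, 2, 4],
})

# All columns except the price hold material names.
material_cols = df.drop(columns='Price')

# Sorted array of the distinct materials; these are the unknowns.
materials = np.unique(material_cols.to_numpy())

# A[i, k] = number of times material k appears in row i.
A = np.column_stack(
    [(material_cols == m).sum(axis=1).to_numpy() for m in materials]
).astype(float)

b = df['Price'].to_numpy(dtype=float)

# Least-squares solution of A @ x = b; rank reports how many
# unknowns are actually determined by the data.
solution, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)

material_prices = pd.Series(solution, index=materials)
print(material_prices)
print('rank of A:', rank)
```

If rank comes out smaller than the number of materials, the prices are not uniquely determined by the rows, and lstsq returns the minimum-norm solution among the candidates.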