hda*_*tas 3 python math numpy scipy pandas
I have this data frame as an example:
Col1 Col2 ... Col5 Price
0 Wood Wood Plastic 50
1 Iron Wood Wood 70
...
3000 Iron Iron Wood 110
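I would like to know whether it is possible to build a linear solver for N equations in N unknowns (in this example, to find the prices of wood, iron, plastic, etc.).
Thank you very much!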
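The frame can be converted into a system of linear equations in which each row of the frame is a constraint (one equation) and each material is a variable. The system can then be solved with the NumPy solver, numpy.linalg.solve (as Rajan Chahan mentioned in the comments on the question).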
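As a worked illustration, the two-row toy frame used in the code that follows (Iron + Wood priced at 3, Wood + Wood priced at 2) would translate into the system below. The symbols x_Iron and x_Wood are notation assumed here for the unknown unit prices; they do not appear in the original answer.

```latex
\[
\begin{aligned}
x_{\text{Iron}} + x_{\text{Wood}} &= 3 \\
2\,x_{\text{Wood}} &= 2
\end{aligned}
\quad\Longleftrightarrow\quad
\begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}
\begin{pmatrix} x_{\text{Iron}} \\ x_{\text{Wood}} \end{pmatrix}
=
\begin{pmatrix} 3 \\ 2 \end{pmatrix}
\]
```

The second equation gives x_Wood = 1, and substituting into the first gives x_Iron = 2.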
import numpy as np
import pandas as pd
from numpy.linalg import solve

# Create a simple frame with two materials: Wood and Iron.
df = pd.DataFrame({'Col1': ['Iron', 'Wood'], 'Col2': ['Wood', 'Wood'], 'Price': [3, 2]})

# Extract the materials and map each material to a unique integer.
# For example, "Iron"=0 and "Wood"=1.
# (DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() replaces it.)
materials = pd.Series(np.unique(df.iloc[:, :-1].to_numpy())).astype('category')

# Create the coefficient matrix, where each row is a constraint.
# For example, "Iron + Wood" translates into "1*x0 + 1*x1",
# and "Wood + Wood" translates into "0*x0 + 2*x1".
A = np.zeros((len(df), len(materials)))

# Iterate over all constraints and material columns and fill in the coefficients.
# (DataFrame.get_value() was removed in pandas 1.0; .at replaces it.)
for i in range(len(df)):
    for j in range(1, df.shape[1]):
        A[i, materials.cat.categories.get_loc(df.at[i, 'Col{}'.format(j)])] += 1

# Solve the system; the solution is an array
# in which each entry corresponds to a material price.
solution = solve(A, df['Price'].to_numpy())  # [2. 1.]

# Convert to a per-material mapping.
material_prices = pd.Series(solution, index=materials.cat.categories)
# Iron    2.0
# Wood    1.0
# dtype: float64
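For a frame with thousands of rows, the nested loop can be replaced by a vectorized construction of the coefficient matrix. The following is only a sketch of one possible approach, not part of the original answer; the helper names `mats`, `names`, `codes`, and `rows` are introduced here for illustration, and it assumes every non-Price column holds a material name.

```python
import numpy as np
import pandas as pd

# Same toy frame as in the answer above.
df = pd.DataFrame({'Col1': ['Iron', 'Wood'],
                   'Col2': ['Wood', 'Wood'],
                   'Price': [3, 2]})

# 2-D array of material names, one row per product (assumes all
# columns except 'Price' contain materials).
mats = df.drop(columns='Price').to_numpy()

# Sorted unique material names, plus an integer code for every cell.
names, codes = np.unique(mats, return_inverse=True)
codes = codes.ravel()  # flatten; the shape of the inverse differs across NumPy versions

# Row number of every cell, in the same (row-major) order as `codes`.
rows = np.repeat(np.arange(mats.shape[0]), mats.shape[1])

# A[i, j] counts how many times material j occurs in row i.
A = np.zeros((mats.shape[0], len(names)))
np.add.at(A, (rows, codes), 1)

solution = np.linalg.solve(A, df['Price'].to_numpy(dtype=float))
material_prices = pd.Series(solution, index=names)
print(material_prices)
```

`np.add.at` is used rather than `A[rows, codes] += 1` because the latter would count a material only once when it appears several times in the same row.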
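If the number of materials differs from the number of constraints, A is not square and solve cannot be used (it requires a square, non-singular matrix); you can compute a least-squares solution instead. Replace the solution = solve(...) line in the code above with: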
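from numpy.linalg import solve, lstsq
# lstsq returns (solution, residuals, rank, singular values); keep the solution.
# rcond=None selects the machine-precision cutoff and avoids a FutureWarning on NumPy 1.x.
solution = lstsq(A, df['Price'].to_numpy(), rcond=None)[0]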
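To make the least-squares variant concrete, here is a minimal self-contained sketch with more rows than materials. The three-row frame is hypothetical data made up for this illustration (its prices are consistent with Iron = 2 and Wood = 1), and the rank check is an addition that the original answer does not include.

```python
import numpy as np
import pandas as pd

# Hypothetical overdetermined example: 3 products, 2 materials.
df = pd.DataFrame({'Col1': ['Iron', 'Wood', 'Iron'],
                   'Col2': ['Wood', 'Wood', 'Iron'],
                   'Price': [3, 2, 4]})

mats = df.drop(columns='Price').to_numpy()
names = np.unique(mats)  # sorted material names: Iron, Wood

# A[i, j] = number of times material j appears in row i.
A = np.stack([(mats == name).sum(axis=1) for name in names], axis=1).astype(float)
b = df['Price'].to_numpy(dtype=float)

# Least-squares solution of A @ x ~= b.
solution, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)

# If the rank is below the number of materials, the prices are not
# uniquely determined and lstsq returns the minimum-norm solution.
if rank < len(names):
    print('Warning: prices are not uniquely determined by the data')

material_prices = pd.Series(solution, index=names)
print(material_prices)
```

With noisy real-world prices the equations will generally not be satisfied exactly, and the result is the set of material prices that minimizes the sum of squared differences between predicted and observed product prices.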