I have implemented a simple linear regression (single variable) example in C++ to help me get my head around the concepts. I am pretty sure the key algorithm is right, but my performance is terrible.

Here is the method which actually performs the gradient descent:
void LinearRegression::BatchGradientDescent(std::vector<std::pair<int,int>>& data, float& theta1, float& theta2)
{
    float weight = (1.0f / static_cast<float>(data.size()));
    float theta1Res = 0.0f;
    float theta2Res = 0.0f;

    // Accumulate the gradient over the whole data set.
    for (const auto& p : data)
    {
        float cost = Hypothesis(p.first, theta1, theta2) - p.second;
        theta1Res += cost;
        theta2Res += cost * p.first;
    }

    // Simultaneous update of both parameters.
    theta1 = theta1 - (m_LearningRate * weight * theta1Res);
    theta2 = theta2 - (m_LearningRate * weight * theta2Res);
}
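For reference, the update this method performs is the standard batch gradient descent step for the hypothesis h(x) = theta1 + theta2 * x:

\theta_1 \leftarrow \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right)

\theta_2 \leftarrow \theta_2 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right) x_i

where \alpha is m_LearningRate and m is data.size(). Note that the \theta_2 component of the gradient carries a factor of x_i, so its magnitude grows with the scale of the inputs.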
The other key functions are given below:
float LinearRegression::Hypothesis(float x, float theta1, float theta2) const
{
    // h(x) = theta1 + theta2 * x
    return theta1 + x * theta2;
}

float LinearRegression::CostFunction(std::vector<std::pair<int,int>>& data,
                                     float theta1,
                                     float theta2) const
{
    float error = 0.0f;
    for (const auto& p : data)
    {
        float prediction = (Hypothesis(p.first, theta1, theta2) - p.second);
        error += prediction * prediction;
    }
    // Mean squared error, halved by convention.
    error *= 1.0f / (data.size() * 2.0f);
    return error;
}

void LinearRegression::Regress(std::vector<std::pair<int,int>>& data)
{
    for (unsigned int itr = 0; itr < MAX_ITERATIONS; ++itr)
    {
        BatchGradientDescent(data, m_Theta1, m_Theta2);
        //Some visualisation code
    }
}
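For completeness, CostFunction computes the conventional halved mean squared error:

J(\theta_1, \theta_2) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right)^2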
Now the issue is that if the learning rate is any larger than around 0.000001, the value of the cost function after gradient descent is higher than it was before. In other words, the algorithm runs in reverse. The line quickly forms into a straight line through the origin, but then it takes millions of iterations to actually reach a reasonably well-fitting line.
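A rough stability estimate (assuming x is roughly uniform over [50, 850], as generated below) is consistent with that threshold: for least squares, batch gradient descent only converges when

\alpha < \frac{2}{\lambda_{\max}}, \quad \text{where} \quad \lambda_{\max} \approx \frac{1}{m} \sum_i x_i^2 \approx \bar{x}^2 \approx 450^2 \approx 2 \times 10^5,

which puts the largest stable learning rate on the order of 10^{-5}, close to the 0.000001 observed here.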
With a learning rate of 0.01, the output after six iterations is: (where difference is costAfter - costBefore)
Cost before 102901.945312, cost after 517539430400.000000, difference 517539332096.000000
Cost before 517539430400.000000, cost after 3131945127824588800.000000, difference 3131944578068774912.000000
Cost before 3131945127824588800.000000, cost after 18953312418560698826620928.000000, difference 18953308959796185006080000.000000
Cost before 18953312418560698826620928.000000, cost after 114697949347691988409089177681920.000000, difference 114697930004878874575022382383104.000000
Cost before 114697949347691988409089177681920.000000, cost after inf, difference inf
Cost before inf, cost after inf, difference nan
In this example the thetas were set to zero, the learning rate to 0.000001, and it ran for 8,000,000 iterations! The visualisation code only updates the graph after every 100,000 iterations.

Function that creates the data points:
#include <cstdlib>
#include <ctime>

static void SetupRegressionData(std::vector<std::pair<int,int>>& data)
{
    srand(time(NULL));

    // x ranges over roughly [50, 850); y is uniform noise around 450.
    for (int x = 50; x < 750; x += 3)
    {
        data.push_back(std::pair<int,int>(x + (rand() % 100), 400 + (rand() % 100)));
    }
}
In short, if my learning rate is too high, the gradient descent algorithm effectively runs backwards and tends towards infinity, and if it is lowered to the point where it actually converges to the minimum, the number of iterations required to do so is unacceptably high.
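For reference, a common remedy for inputs on this scale is to rescale x into [0, 1] before regressing, then apply the same transform when predicting. Below is a minimal sketch, not part of my actual class; the NormalizeX helper is hypothetical, and the regression routines would need to be adapted to take float pairs.

#include <vector>
#include <utility>

// Hypothetical helper (not part of the original class): rescales x into
// [0, 1] using the observed min/max. Assumes data is non-empty.
static std::vector<std::pair<float,float>> NormalizeX(
    const std::vector<std::pair<int,int>>& data)
{
    int minX = data.front().first;
    int maxX = data.front().first;
    for (const auto& p : data)
    {
        if (p.first < minX) minX = p.first;
        if (p.first > maxX) maxX = p.first;
    }

    const float range = static_cast<float>(maxX - minX);
    std::vector<std::pair<float,float>> scaled;
    scaled.reserve(data.size());
    for (const auto& p : data)
    {
        // Only x is rescaled; y is left unchanged.
        scaled.emplace_back((p.first - minX) / range,
                            static_cast<float>(p.second));
    }
    return scaled;
}

With x rescaled to [0, 1], learning rates on the order of 0.1 to 1.0 are typically stable, so convergence no longer needs millions of iterations.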
Have I missed an error / bug somewhere in the core algorithm?