I want to run gradient descent optimization to minimize the cost of a variable instantiation. My program is computationally intensive, so I'm looking for a popular library with a fast implementation of GD. What is the recommended library/reference?
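Whatever library ends up being chosen, it will need the same two callbacks: the cost and its gradient. A minimal sketch of that interface in Python (the quadratic objective is just a placeholder, not the asker's problem):

import numpy as np

def cost(x):
    return float(np.sum((x - 3.0) ** 2))   # placeholder objective

def grad(x):
    return 2.0 * (x - 3.0)                 # its gradient

x = np.zeros(5)
alpha = 0.1                                # step size
for _ in range(100):
    x -= alpha * grad(x)                   # the basic GD update a library would accelerate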
c++ optimization visual-studio-2010 numerical-methods gradient-descent
I'm trying to implement stochastic gradient descent, but I don't know whether it is 100% correct.

Here are some results I obtained with a training set of 10,000 elements and num_iter = 100 or 500:
FMINUNC:
Iteration #100 | Cost: 5.147056e-001

BATCH GRADIENT DESCENT, 500 ITER:
Iteration #500 - Cost = 5.535241e-001

STOCHASTIC GRADIENT DESCENT, 100 ITER:
Iteration #100 - Cost = 5.683117e-001 % first run
Iteration #100 - Cost = 7.047196e-001 % second run
Gradient descent implementation for logistic regression:
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    [J, gradJ] = lrCostFunction(theta, X, y, lambda);
    theta = theta - alpha * gradJ;
    J_history(iter) = J;
    fprintf('Iteration #%d - …

matlab machine-learning gradient-descent logistic-regression
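For comparison, a minimal NumPy sketch of what a per-sample stochastic update usually looks like for regularized logistic regression (the sigmoid hypothesis and all names here are assumptions, not the asker's lrCostFunction):

import numpy as np

def sgd_logistic(X, y, alpha=0.01, lam=0.1, num_iters=100, seed=0):
    """Per-sample SGD for L2-regularized logistic regression (sketch)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        for i in rng.permutation(m):                      # reshuffle every pass
            h = 1.0 / (1.0 + np.exp(-X[i] @ theta))       # sigmoid hypothesis
            grad = (h - y[i]) * X[i] + (lam / m) * theta  # per-sample gradient + L2 term
            theta -= alpha * grad
    return theta

Because each update depends on the random visiting order, the cost after a fixed number of iterations varies from run to run, which is consistent with the two different SGD costs reported above.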
I'm trying to implement batch gradient descent on a dataset with a single feature and multiple training examples (m).

I get the right answer when I use the normal equations, but the wrong answer from the MATLAB code below, which performs batch gradient descent.
function [theta] = gradientDescent(X, y, theta, alpha, iterations)
    m = length(y);
    delta = zeros(2,1);
    for iter = 1:1:iterations
        for i = 1:1:m
            delta(1,1) = delta(1,1) + (X(i,:)*theta - y(i,1));
            delta(2,1) = delta(2,1) + ((X(i,:)*theta - y(i,1))*X(i,2));
        end
        theta = theta - (delta*(alpha/m));
        computeCost(X, y, theta)
    end
end
y is a vector with the target values, and X is a matrix whose first column is a column of ones (for the intercept) and whose second column holds the feature values.
I implemented this with the vectorized update

theta = theta - (alpha/m)*delta

where delta is a 2-element column vector initialized to zeros.
The cost function is J(theta) = 1/(2m) * sum from i=1 to m of (h_theta(x^(i)) - y^(i))^2.
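As a point of reference, here is a minimal NumPy sketch of the same batch update; note that the gradient accumulator starts from zero on every iteration (the names are assumptions, not the asker's code):

import numpy as np

def gradient_descent(X, y, theta, alpha, iterations):
    """Batch gradient descent for linear regression (sketch).
    X is (m, 2) with a leading column of ones; y is (m,); theta is (2,)."""
    m = len(y)
    for _ in range(iterations):
        residual = X @ theta - y                # h_theta(x) - y for all m examples at once
        delta = X.T @ residual                  # fresh accumulator each iteration
        theta = theta - (alpha / m) * delta
        cost = np.sum(residual ** 2) / (2 * m)  # J(theta) from above
    return theta

A delta that is carried over between iterations instead of being reset would keep growing and pull theta away from the normal-equation solution.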
matlab gradient machine-learning linear-regression gradient-descent
PyTorch has a new torch.inference_mode context as of v1.9, which is "analogous to torch.no_grad… Code run under this mode gets better performance by disabling view tracking and version counter bumps."

If I am just evaluating my model at test time (i.e., not training), is there any situation in which torch.no_grad is preferable to torch.inference_mode? I plan to replace every instance of the former with the latter, and I expect to use runtime errors as a guardrail (i.e., I trust that any issue will reveal itself as a runtime error, and if none shows up, then I assume using torch.inference_mode really is better).

More details on why inference mode was developed are mentioned in the PyTorch Developer Podcast.
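A minimal sketch of the swap at evaluation time (the model and input are placeholders):

import torch
import torch.nn as nn

model = nn.Linear(4, 2).eval()    # placeholder model
x = torch.randn(8, 4)

with torch.no_grad():             # disables gradient recording
    out1 = model(x)

with torch.inference_mode():      # additionally skips view tracking and version-counter bumps
    out2 = model(x)

# The guardrail: tensors created under inference_mode cannot be used in autograd later;
# e.g. out2.requires_grad_(True) raises a RuntimeError, while out1 would not.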
artificial-intelligence inference machine-learning gradient-descent pytorch
I have a working implementation of multivariable linear regression using gradient descent in R. I'd like to see whether I can use what I have to run stochastic gradient descent, and I'm not sure whether what I have is really inefficient. For example, for each value of alpha I want to perform 500 SGD iterations and be able to specify the number of randomly picked samples in each iteration. It would be nice to do this so I can see how the number of samples influences the results. I'm having trouble with mini-batching, and I want to be able to easily plot the results.

Here is what I have so far:
# Read and process the datasets
# download the files from GitHub
download.file("https://raw.githubusercontent.com/dbouquin/IS_605/master/sgd_ex_data/ex3x.dat", "ex3x.dat", method="curl")
x <- read.table('ex3x.dat')
# we can standardize the x values using scale()
x <- scale(x)
download.file("https://raw.githubusercontent.com/dbouquin/IS_605/master/sgd_ex_data/ex3y.dat", "ex3y.dat", method="curl")
y <- read.table('ex3y.dat')
# combine the datasets
data3 <- cbind(x,y)
colnames(data3) <- c("area_sqft", "bedrooms","price")
str(data3)
head(data3)
################ Regular Gradient Descent
# http://www.r-bloggers.com/linear-regression-by-gradient-descent/
# vector populated with 1s for the intercept coefficient
x1 <- rep(1, length(data3$area_sqft))
# appends to dfs
# create x-matrix of independent variables …
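For the mini-batch part, here is a sketch of the sampling idea, written in Python rather than R for compactness (the batch size, squared-error loss, and all names are assumptions):

import numpy as np

def sgd_minibatch(X, y, alpha=0.01, n_iters=500, batch_size=10, seed=0):
    """Mini-batch SGD for linear regression (sketch)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    theta = np.zeros(n)
    cost_history = np.empty(n_iters)     # record full-data cost for plotting later
    for t in range(n_iters):
        idx = rng.choice(m, size=batch_size, replace=False)  # random sample per iteration
        grad = X[idx].T @ (X[idx] @ theta - y[idx]) / batch_size
        theta -= alpha * grad
        cost_history[t] = np.mean((X @ theta - y) ** 2) / 2
    return theta, cost_history

Sweeping batch_size from 1 up to m moves from pure stochastic updates to full batch gradient descent, and plotting cost_history for each setting shows how the sample count affects convergence.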
I'm working through this course on convolutional neural networks. I've been trying to implement the gradient of a loss function for an SVM, and (I have a copy of a solution) I can't understand why the solution is correct.

On the page in question, the gradient of the multiclass hinge loss is defined (for margins m_j = w_j^T x_i - w_{y_i}^T x_i + delta) along these lines:

grad of L_i w.r.t. w_{y_i} = -(sum over j != y_i of 1(m_j > 0)) * x_i
grad of L_i w.r.t. w_j = 1(m_j > 0) * x_i, for j != y_i

In my code, my analytic gradient matches the numeric gradient when implemented as follows:
dW = np.zeros(W.shape)  # initialize the gradient as zero

# compute the loss and the gradient
num_classes = W.shape[1]
num_train = X.shape[0]
loss = 0.0
for i in xrange(num_train):
    scores = X[i].dot(W)
    correct_class_score = scores[y[i]]
    for j in xrange(num_classes):
        if j == y[i]:
            continue
        margin = scores[j] - correct_class_score + 1  # note delta = 1
        if margin > 0:
            loss += margin
            dW[:, y[i]] += -X[i]
            dW[:, j] += X[i]  # …

python svm computer-vision linear-regression gradient-descent
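A quick way to convince yourself the indicator-function gradient is right is to compare it against a centered-difference numeric gradient. A minimal sketch of such a check (f is any callable returning the scalar loss; all names are placeholders):

import numpy as np

def numeric_gradient(f, W, h=1e-5):
    """Centered differences: dL/dW_ij ~ (f(W + h*e_ij) - f(W - h*e_ij)) / (2h)."""
    grad = np.zeros_like(W)
    it = np.nditer(W, flags=["multi_index"])
    while not it.finished:
        ix = it.multi_index
        old = W[ix]
        W[ix] = old + h
        fxph = f(W)               # loss with W bumped up in coordinate ix
        W[ix] = old - h
        fxmh = f(W)               # loss with W bumped down in coordinate ix
        W[ix] = old               # restore
        grad[ix] = (fxph - fxmh) / (2 * h)
        it.iternext()
    return grad

If the loop above and this check agree to several decimal places on small random W, X, y, the analytic gradient is implemented correctly.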
I run into an in-place operation error when I try to index a leaf variable to update its gradients with a customized shrink function. I cannot work around it. Any help is highly appreciated!
import torch.nn as nn
import torch
import numpy as np
from torch.autograd import Variable, Function
# hyper parameters
batch_size = 100 # batch size of images
ld = 0.2 # sparse penalty
lr = 0.1 # learning rate
x = Variable(torch.from_numpy(np.random.normal(0,1,(batch_size,10,10))), requires_grad=False) # original
# depends on size of the dictionary, number of atoms.
D = Variable(torch.from_numpy(np.random.normal(0,1,(500,10,10))), requires_grad=True)
# hx sparse representation
ht = Variable(torch.from_numpy(np.random.normal(0,1,(batch_size,500,1,1))), requires_grad=True)
# Dictionary loss function
loss = nn.MSELoss()
# customized shrink function to update …

python neural-network gradient-descent deep-learning pytorch
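Without the full traceback this is only a guess, but the usual cause is mutating a leaf tensor that requires grad while autograd is still tracking it. A common pattern, sketched here as an assumption about the fix, is to apply the gradient step and the soft-threshold shrink inside torch.no_grad():

import torch

ht = torch.randn(100, 500, 1, 1, requires_grad=True)  # stand-in for the sparse code above
ld, lr = 0.2, 0.1
ht.grad = torch.randn_like(ht)   # placeholder: normally filled in by loss.backward()

with torch.no_grad():            # updates here are invisible to autograd, so no in-place error
    ht -= lr * ht.grad                                                   # gradient step
    ht.copy_(torch.sign(ht) * torch.clamp(ht.abs() - ld * lr, min=0.0))  # soft-threshold shrink
    ht.grad.zero_()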
I'm trying to implement full gradient descent in Keras, meaning that for each epoch I train on the entire dataset. That is why the batch size is defined to be the length of the training set.
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD,Adam
from keras import regularizers
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import random
from numpy.random import seed
def xrange(start_point, end_point, N, base):
    temp = np.logspace(0.1, 1, N, base=base, endpoint=False)
    temp = temp - temp.min()
    temp = (0.0 + temp) / (0.0 + temp.max())  # this is between 0 and 1
    return (end_point - start_point) * temp + start_point  # this is the range

def train_model(x_train, y_train, x_test):
    #seed(1)
    model = Sequential()
    num_units = 100
    act = 'relu'
    model.add(Dense(num_units, input_shape=(1,), activation=act))
    model.add(Dense(num_units, activation=act))
    model.add(Dense(num_units, activation=act))
    model.add(Dense(num_units, activation=act))
    model.add(Dense(1, activation='tanh'))  # output layer, 1 unit, activation='tanh'
    model.compile(Adam(), 'mean_squared_error', metrics=['mse'])
    history = model.fit(x_train, y_train, batch_size=len(x_train), epochs=500, verbose=0, validation_split=0.2)  # train …

python machine-learning gradient-descent deep-learning keras
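One detail worth noting: with validation_split=0.2, fit() holds out the last 20% of x_train, so each "full" batch is actually the remaining 80% (still one weight update per epoch). A sketch that makes the one-update-per-epoch behavior explicit, using the standard train_on_batch API and reusing the model compiled above:

# assumes `model`, `x_train`, `y_train` from train_model above
num_epochs = 500
losses = []
for epoch in range(num_epochs):
    loss, mse = model.train_on_batch(x_train, y_train)  # exactly one weight update per epoch
    losses.append(loss)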
I want to visualize the pattern learned by a given feature map in a CNN (in this case I'm using vgg16). To do this, I create a random image, feed it through the network up to the desired convolutional layer, select a feature map, and find the gradients with respect to the input. The idea is to change the input in a way that maximizes the activation of the desired feature map. Using tensorflow 2.0 I have a GradientTape that follows the function and then computes the gradient, but the gradient returns None. Why can it not compute the gradient?
import tensorflow as tf
import matplotlib.pyplot as plt
import time
import numpy as np
from tensorflow.keras.applications import vgg16
class maxFeatureMap():

    def __init__(self, model):
        self.model = model
        self.optimizer = tf.keras.optimizers.Adam()

    def getNumLayers(self, layer_name):
        for layer in self.model.layers:
            if layer.name == layer_name:
                weights = layer.get_weights()
                num = weights[1].shape[0]
                return "There are {} feature maps in {}".format(num, layer_name)

    def getGradient(self, layer, feature_map):
        pic = vgg16.preprocess_input(np.random.uniform(size=(1, 96, 96, 3)))  # creates values between 0 and 1
        pic …
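The usual culprit for a None gradient in this setup is that the tape never watched the input: a plain NumPy array fed through the network is a constant to autodiff. A minimal sketch of the fix under that assumption, wrapping the image in a tf.Variable (the layer name and weights=None are placeholders to keep the sketch self-contained):

import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import vgg16

model = vgg16.VGG16(weights=None, include_top=False, input_shape=(96, 96, 3))  # weights=None: skip download
extractor = tf.keras.Model(model.input, model.get_layer("block3_conv1").output)  # placeholder layer name

pic = tf.Variable(vgg16.preprocess_input(np.random.uniform(size=(1, 96, 96, 3))), dtype=tf.float32)

with tf.GradientTape() as tape:
    activation = extractor(pic)                # recorded because pic is a tf.Variable
    loss = tf.reduce_mean(activation[..., 0])  # mean activation of feature map 0

grad = tape.gradient(loss, pic)                # no longer None

Alternatively, keep the input a plain tensor via tf.convert_to_tensor and call tape.watch(pic) inside the tape context.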
I'm new to machine learning and I'm trying to analyze a classification algorithm for one of my projects. I came across SGDClassifier in the sklearn library, but a lot of papers refer to SGD as an optimization technique. Could someone please explain how SGDClassifier is implemented?

classification machine-learning gradient-descent scikit-learn
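In short, SGDClassifier is a family of regularized linear models (linear SVM, logistic regression, etc., chosen by the loss parameter) whose weights are fitted by stochastic gradient descent, so "SGD" names the optimizer, not the model. A small usage sketch with synthetic data:

from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# loss="hinge" gives a linear SVM; other losses select other linear models
clf = SGDClassifier(loss="hinge", penalty="l2", max_iter=1000, tol=1e-3, random_state=0)
clf.fit(X, y)   # weights updated one sample at a time, over several passes
print(clf.coef_.shape, clf.score(X, y))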