I recently switched from MATLAB to R, and I want to run an optimization routine.
In MATLAB, I was able to do:
options = optimset('GradObj', 'on', 'MaxIter', 400);
[theta, cost] = fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
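For comparison, a rough R counterpart of the `fminunc` call above can probably be built on base R's `optim()`, which takes the cost function as `fn` and an optional gradient function as `gr`. The sketch below is only an illustration: `cost_fn`, `grad_fn`, `sigmoid` and the synthetic data are hypothetical stand-ins written for this example (they are not the poster's code), and `method = "BFGS"` is one possible choice of optimizer, not necessarily the algorithm `fminunc` uses.

```r
# Hypothetical helpers written for this sketch; they are not the poster's code.
sigmoid <- function(z) 1 / (1 + exp(-z))

# Logistic regression cost: negative mean log-likelihood.
cost_fn <- function(theta, X, y) {
  p <- sigmoid(X %*% theta)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

# Analytic gradient of cost_fn with respect to theta.
grad_fn <- function(theta, X, y) {
  p <- sigmoid(X %*% theta)
  as.vector(t(X) %*% (p - y)) / nrow(X)
}

# Small synthetic data set, made up purely for illustration.
set.seed(1)
X <- cbind(1, rnorm(100))
y <- rbinom(100, 1, sigmoid(X %*% c(-0.5, 1)))
initial_theta <- rep(0, ncol(X))

# Extra arguments (X, y) are forwarded to fn and gr;
# maxit = 400 mirrors 'MaxIter', 400 in the MATLAB options.
fit <- optim(par = initial_theta, fn = cost_fn, gr = grad_fn,
             X = X, y = y, method = "BFGS",
             control = list(maxit = 400))

theta <- fit$par    # optimized parameters
cost  <- fit$value  # cost at the optimum
```

Unlike `fminunc` with `'GradObj', 'on'`, where a single function returns both the cost and the gradient, `optim()` expects them as two separate functions, so a combined cost function would need to be split or wrapped.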
This is the equivalent of costFunctionReg (which I call logisticRegressionCost here):
logisticRegressionCost <- function(theta, X, y) {
J = 0;
theta = as.matrix(theta);
X = as.matrix(X);
y = as.matrix(y);
rows = dim(theta)[2];
cols = dim(theta)[1];
grad = matrix(0, rows, cols);
predicted = sigmoid(X %*% theta);
J = (-y) * log(predicted) - (1 - y) * log(1 - predicted);
J = sum(J) / dim(y)[1];
grad = t(predicted - y);
grad = grad %*% X;
grad …

When grouping by multiple criteria, I would like to keep empty groups (with a default value such as NA or 0).
dt = data.table(user = c("A", "A", "B"), date = c("t1", "t2", "t1"), duration = c(1, 2, 1))
dt[, .("total" = sum(duration)), by = .(date, user)]
Result:
date user total
1: t1 A 1
2: t2 A 2
3: t1 B 1
Desired result:
date user total
1: t1 A 1
2: t2 A 2
3: t1 B 1
4: t2 B NA
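One solution might be to add rows with a value of 0 before grouping, but that requires building the Cartesian product of many columns and manually checking whether a value already exists for each combination. I would prefer something built-in or simpler.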
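One possible approach, sketched here under the assumption that a join is acceptable, is to let data.table's `CJ()` build the combinations and then join the aggregated table onto them, which avoids checking each combination by hand. The names `totals`, `all_groups` and `result` are introduced only for this illustration.

```r
library(data.table)

dt <- data.table(user = c("A", "A", "B"),
                 date = c("t1", "t2", "t1"),
                 duration = c(1, 2, 1))

# Aggregate first, exactly as in the question.
totals <- dt[, .(total = sum(duration)), by = .(date, user)]

# CJ() builds every (date, user) combination of the observed values.
all_groups <- CJ(date = unique(dt$date), user = unique(dt$user))

# Joining on the combinations keeps one row per combination;
# groups with no data get NA in 'total'.
result <- totals[all_groups, on = .(date, user)]

# If 0 is preferred over NA as the default, uncomment:
# result[is.na(total), total := 0]
```

The rows of `result` follow the sorted order of `all_groups` rather than the order of the original grouping, so an explicit ordering step may be needed if row order matters.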