My .gitlab-ci.yml file looks like this:
anomalydetector:
  image: continuumio/miniconda:4.7.10
  stage: build
  tags:
    - docker
  script:
    - conda env create -f environment.yml
    - conda activate my-env
    - pytest tests/.
On GitLab, the job starts fine, and the log reads:
$ conda env create -f environment.yml
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done
==> WARNING: A newer version of conda exists. <==
current version: 4.7.10
latest version: 4.7.11
OK, so I'm using a conda version later than 4.4, which means conda activate should work. However, the job fails with the following:
# To activate this environment, use
#
# $ conda activate my-env
#
# …

I have a Dockerfile that starts with:
FROM python:3.7-slim
RUN apt-get update && apt-get install build-essential -y
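The problem is that this layer always changes, so when I run docker build -t <mytag> ., this layer (and every subsequent one) is rebuilt, which takes a lot of time.
Is there a way to install build-essential in my Dockerfile in a layer that doesn't keep changing?
Edit: I had a COPY line before the RUN, which I removed from the question because I didn't want to include the names of private files. It hadn't occurred to me that this was what caused the RUN step to be rerun.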
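The effect described in the edit can be illustrated with a toy model of layer caching. This is only a sketch under simplifying assumptions, not Docker's actual implementation: each layer's cache key is modelled as a hash of the parent key plus the instruction text, and for COPY the copied file's contents are hashed too. The file name app.py and the helper names are hypothetical.

```python
import hashlib


def layer_keys(instructions, file_contents):
    """Return one cache key per instruction in a toy model of layer caching.

    Each key hashes the parent key plus the instruction text; for COPY,
    the contents of the copied file are hashed as well.
    """
    keys = []
    parent = ""
    for instruction in instructions:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        if instruction.startswith("COPY "):
            source = instruction.split()[1]
            h.update(file_contents.get(source, "").encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def reused_layers(old_keys, new_keys):
    """Count the leading layers whose keys match (layers served from cache)."""
    count = 0
    for old, new in zip(old_keys, new_keys):
        if old != new:
            break
        count += 1
    return count


copy_first = [
    "FROM python:3.7-slim",
    "COPY app.py /app/",  # hypothetical file name, not from the question
    "RUN apt-get update && apt-get install build-essential -y",
]
# Same instructions, but with the RUN step moved ahead of the COPY step.
run_first = [copy_first[0], copy_first[2], copy_first[1]]

# Simulate editing the copied file between two builds.
before = {"app.py": "print('v1')"}
after = {"app.py": "print('v2')"}

copy_first_reused = reused_layers(
    layer_keys(copy_first, before), layer_keys(copy_first, after)
)
run_first_reused = reused_layers(
    layer_keys(run_first, before), layer_keys(run_first, after)
)
print("COPY before RUN, layers reused:", copy_first_reused)
print("RUN before COPY, layers reused:", run_first_reused)
```

In this model, a change to the copied file invalidates the COPY layer and everything after it, so placing the RUN step before the COPY step keeps the build-essential layer reusable.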

I have installed minikube and kubectl:
$ minikube version
minikube version: v1.4.0
commit: 7969c25a98a018b94ea87d949350f3271e9d64b6
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-18T14:36:53Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-18T14:27:17Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Then I followed the instructions at https://helm.sh/docs/using_helm/:
$ tar -xzvf Downloads/helm-v2.13.1-linux-amd64.tar.gz linux-amd64/
linux-amd64/LICENSE
linux-amd64/tiller
linux-amd64/helm
linux-amd64/README.md
But now, if I check my helm version, I get this:
$ helm version
Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Error: could not find tiller
I tried running helm init, but got the following:
$ helm init
$HELM_HOME has been configured at …

I'm working on a project that uses pyspark, and I'd like to set up automated tests.
My .gitlab-ci.yml file looks like this:
image: "myimage:latest"
stages:
  - Tests
pytest:
  stage: Tests
  script:
    - pytest tests/.
I built the Docker image myimage from a Dockerfile that looks like this (see this excellent answer):
FROM python:3.7
RUN python --version
# Create app directory
WORKDIR /app
# copy requirements.txt
COPY local-src/requirements.txt ./
# Install app dependencies
RUN pip install -r requirements.txt
# Bundle app source
COPY src /app
However, when I run this, the GitLab CI job fails with the following error:
/usr/local/lib/python3.7/site-packages/pyspark/java_gateway.py:95: in launch_gateway
raise Exception("Java gateway process exited before sending the driver its port number")
E Exception: …
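One common cause of this error, which may or may not be what is happening here, is that no Java runtime is available inside the image. The following is a minimal diagnostic sketch that could be run in the CI job before pytest; find_java is a hypothetical helper name, and the lookup order (JAVA_HOME first, then PATH) is an assumption about what is useful to check, not a description of pyspark's own logic.

```python
import os
import shutil


def find_java(env=None):
    """Return a path to a `java` executable, or None if none is found.

    Checks $JAVA_HOME/bin/java first, then falls back to searching PATH.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if java_home:
        candidate = os.path.join(java_home, "bin", "java")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # shutil.which returns None when the executable is not on the given path.
    return shutil.which("java", path=env.get("PATH", ""))


java = find_java()
if java is None:
    print("No java executable found via JAVA_HOME or PATH")
else:
    print("Found java at:", java)
```

If the check reports that no executable was found, installing a JRE in the Dockerfile would be the next thing to try.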

I can't manually reproduce LightGBM's CV score.
Here's an MCVE:
from sklearn.datasets import load_breast_cancer
import pandas as pd
from sklearn.model_selection import train_test_split, KFold
from sklearn.metrics import roc_auc_score
import lightgbm as lgb
import numpy as np
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = pd.Series(data.target)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
folds = KFold(5, random_state=42)
params = {'random_state': 42}
results = lgb.cv(params, lgb.Dataset(X_train, y_train), folds=folds, num_boost_round=1000, early_stopping_rounds=100, metrics=['auc'])
print('LGBM\'s cv score: ', results['auc-mean'][-1])
clf = lgb.LGBMClassifier(**params, n_estimators=len(results['auc-mean']))
val_scores = []
for train_idx, val_idx …
python machine-learning scikit-learn cross-validation lightgbm

If I run
from sklearn.datasets import load_breast_cancer
import lightgbm as lgb
breast_cancer = load_breast_cancer()
data = breast_cancer.data
target = breast_cancer.target
params = {
    "task": "convert_model",
    "convert_model_language": "cpp",
    "convert_model": "test.cpp",
}
gbm = lgb.train(params, lgb.Dataset(data, target))
then I expect a file named test.cpp to be created, with the model saved in C++ format.
However, nothing appears in my current directory.
I've read the documentation (https://lightgbm.readthedocs.io/en/latest/Parameters.html#io-parameters), but I can't tell what I'm doing wrong.
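As far as I can tell from the linked parameter documentation, the task and convert_model_language parameters are marked as usable only in the CLI version, which could explain why lgb.train does not create the file. A possible workaround, sketched below under that assumption, is to save the trained model to a text file from Python (for example with gbm.save_model("model.txt")) and then convert it with the lightgbm command-line tool. The helper name write_convert_config and the file names model.txt and convert.conf are hypothetical, and the sketch only writes the config and builds the command; it does not run the CLI.

```python
import os
import tempfile
from pathlib import Path


def write_convert_config(model_file, output_file, config_file):
    """Write a LightGBM CLI config for model conversion and return the command.

    The config uses the parameters from the question plus input_model,
    which points the CLI at a previously saved model file.
    """
    lines = [
        "task = convert_model",
        f"input_model = {model_file}",
        "convert_model_language = cpp",
        f"convert_model = {output_file}",
    ]
    Path(config_file).write_text("\n".join(lines) + "\n")
    # Assumes the CLI binary is installed and named `lightgbm`.
    return ["lightgbm", f"config={config_file}"]


with tempfile.TemporaryDirectory() as workdir:
    config_path = os.path.join(workdir, "convert.conf")
    command = write_convert_config("model.txt", "test.cpp", config_path)
    config_text = Path(config_path).read_text()

print(config_text)
print("Command to run:", " ".join(command))
```

The returned command could then be passed to subprocess.run in an environment where the CLI is available.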