Posts by Avi*_*ava

Unable to specify "edition2021" in order to use an unstable package in Rust

I want to run an example via Cargo, but I am running into an error:

error: failed to parse manifest at `/Users/aviralsrivastava/dev/subxt/Cargo.toml`

The full stack trace is:

error: failed to parse manifest at `/Users/aviralsrivastava/dev/subxt/Cargo.toml`

Caused by:
  feature `edition2021` is required

  The package requires the Cargo feature called `edition2021`, but that feature is not stabilized in this version of Cargo (1.56.0-nightly (b51439fd8 2021-08-09)).
  Consider adding `cargo-features = ["edition2021"]` to the top of Cargo.toml (above the [package] table) to tell Cargo you are opting in to use this unstable feature.
  See https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#edition-2021 for more information about the status …

rust substrate polkadot

43 votes · 3 answers · 30k views

How to avoid a KeyError named 'kernelspec' in Papermill?

I am running a papermill command using Airflow (in Docker). The script is stored on S3 and I run it using papermill's Python client. It ends with a completely incomprehensible error:

Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/ipython_genutils/ipstruct.py", line 132, in __getattr__
result = self[key]
KeyError: 'kernelspec'

I tried looking through the documentation, but to no avail.

The code I am using to run the papermill command is:

import time
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from mypackage.datastore import db
from mypackage.workflow.transform.jupyter_notebook import run_jupyter_notebook


dag_id = "jupyter-test-dag"
default_args = {
    'owner': "aviral",
    'depends_on_past': False,
    'start_date': "2019-02-28T00:00:00",
    'email': "aviral@some_org.com",
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 0,
    'retry_delay': timedelta(minutes=5),
    'provide_context': True
}

dag = DAG(
    dag_id,
    catchup=False,
    default_args=default_args,
    schedule_interval=None, …
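This KeyError usually points at a notebook whose metadata contains no kernelspec entry. A minimal sketch of a common workaround is to pass kernel_name explicitly when calling papermill directly; the S3 paths and parameters below are hypothetical placeholders, not values from the question:

import papermill as pm

# Sketch only: input/output paths and parameters are placeholders.
pm.execute_notebook(
    "s3://some-bucket/input-notebook.ipynb",
    "s3://some-bucket/output-notebook.ipynb",
    parameters={"run_date": "2019-02-28"},
    kernel_name="python3",  # supplies a kernel when the notebook metadata lacks a kernelspec
)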

python python-3.x jupyter jupyter-notebook papermill

10 votes · 1 answer · 1,053 views

How can two reference variables be equal in Rust?

My code has an error in one of its functions:

fn is_five(x: &i32) -> bool {
    x == 5
}

fn main() {
    assert!(is_five(&5));
    assert!(!is_five(&6));
    println!("Success!");
}

When run, it reports the error:

error[E0277]: can't compare `&i32` with `{integer}`
 --> main.rs:2:7
  |
2 |     x == 5
  |       ^^ no implementation for `&i32 == {integer}`
  |
  = help: the trait `std::cmp::PartialEq<{integer}>` is not implemented for `&i32`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0277`.

I fixed it by changing the logic to compare two values rather than an address and a value.

fn is_five(x: &i32) -> bool {
    *x == 5
}

However, I also tried (randomly) using the borrow method and, to my surprise, it worked. …

rust

8 votes · 1 answer · 2,410 views

Unable to install Airflow even after setting SLUGIFY_USES_TEXT_UNIDECODE and AIRFLOW_GPL_UNIDECODE

Airflow is installed via the command python3 setup.py install. It includes a requirements file, requirements/athena.txt, which contains:

apache-airflow[celery,postgres,hive,crypto,password]==1.10.1

I get an error:

RuntimeError: By default one of Airflow's dependencies installs a GPL dependency (unidecode). To avoid this dependency set SLUGIFY_USES_TEXT_UNIDECODE=yes in your environment when you install or upgrade Airflow. To force installing the GPL version set AIRFLOW_GPL_UNIDECODE

To get rid of this error, I set export SLUGIFY_USES_TEXT_UNIDECODE=yes and export AIRFLOW_GPL_UNIDECODE=yes. However, running the command python3 setup.py install still gives the same error, with no change. Checking the environment variables:

?  athena-py git:(pyspark-DataFrameStatFunctions) echo $SLUGIFY_USES_TEXT_UNIDECODE
yes
?  athena-py git:(pyspark-DataFrameStatFunctions) echo $AIRFLOW_GPL_UNIDECODE
yes
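One possible explanation, assuming the variables are exported in a different shell or stripped by sudo, is that they never reach the Python process that actually runs setup.py. A minimal sketch for checking this from inside Python:

import os

# Sketch only: prints what the current Python process actually sees.
for var in ("SLUGIFY_USES_TEXT_UNIDECODE", "AIRFLOW_GPL_UNIDECODE"):
    print(var, "=", os.environ.get(var, "<not set>"))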

python pip python-3.x airflow

7 votes · 1 answer · 182 views

Why is recursion not recommended in Rust?

I am familiar with the general advice about recursion: avoid it, because it is not a good memory-management practice. However, that should hold for every programming language, unless a language handles memory management under recursion particularly well.

While reading the documentation of Educative's Rust course, I came across this statement:

Recursion is possible in Rust, but it is not really encouraged. Instead, Rust favors something called iteration, also known as looping.

I cannot understand why this is so. Is there something uncommon about Rust compared to other languages that makes recursion inadvisable, or does Rust handle iteration better than other languages do?

recursion rust

7 votes · 1 answer · 719 views

How to resolve duplicate column names while joining two dataframes in PySpark?

I have files A and B which are exactly the same. I am trying to perform inner and outer joins on these two dataframes. Since all of my columns are duplicates, the existing answers were of no help. The other questions I have gone through contain only a column or two as duplicates; my issue is that the whole files are duplicates of each other, both in data and in column names.

My code:

import sys
from …
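Since the code above is truncated, here is a minimal sketch of one common way to handle fully duplicated column names: alias the columns of one dataframe before joining. The file paths and the id column are hypothetical placeholders, not taken from the question:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dedupe-join-sketch").getOrCreate()

# Hypothetical inputs; the question's actual files are not shown.
df_a = spark.read.csv("file_a.csv", header=True)
df_b = spark.read.csv("file_b.csv", header=True)

# Prefix every column of the right-hand side so no name collides after the join.
df_b_renamed = df_b.select([df_b[c].alias("b_" + c) for c in df_b.columns])

joined = df_a.join(df_b_renamed, df_a["id"] == df_b_renamed["b_id"], "inner")
joined.show()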

python apache-spark apache-spark-sql pyspark

6 votes · 1 answer · 20k views

AWS Glue: Failed to start job run due to missing metadata

According to the documentation, only JobName is required to run a job using boto3. However, my code:

    def start_job_run(self, name):
        print("The name of the job to be run via client is: {}".format(name))
        self.response_de_start_job = self.client.start_job_run(
            JobName=name
        )
        print(self.response_de_start_job)

The client is:

    self.client = boto3.client(
            'glue',
            region_name='ap-south-1',
            aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
            aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY'),
        )

When executed via Python 3, this gives the error:

botocore.errorfactory.EntityNotFoundException: An error occurred (EntityNotFoundException) when calling the StartJobRun operation: Failed to start job run due to missing metadata

But when I do the same thing for the same job from the UI and from the CLI (aws glue start-job-run --job-name march15_9), it works fine.
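One frequent cause of this particular error, though not confirmed by the question, is the boto3 client targeting a different region or account than the one where the job was defined. A minimal sketch for verifying that the same client can actually see the job, reusing the job name from the CLI example above:

import os
import boto3

# Sketch only: checks that the job is visible to the exact client the code uses.
client = boto3.client(
    "glue",
    region_name="ap-south-1",
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
)

print(client.meta.region_name)              # region the client actually targets
print(client.get_job(JobName="march15_9"))  # raises EntityNotFoundException if the job is not visible here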

python python-3.x boto3 aws-glue

5 votes · 2 answers · 20k views

How to deploy a release after changing the config?

I had successfully deployed the jhub release in my cluster. I then changed the config to pull a different Docker image, as described in the documentation.

This time, when running the same old command:

# Suggested values: advanced users of Kubernetes and Helm should feel
# free to use different values.
RELEASE=jhub
NAMESPACE=jhub

helm upgrade --install $RELEASE jupyterhub/jupyterhub \
  --namespace $NAMESPACE  \
  --version=0.8.2 \
  --values jupyter-hub-config.yaml

where the jupyter-hub-config.yaml file is:

proxy:
  secretToken: "<a secret token>"
singleuser:
  image:
    # Get the latest image tag at:
    # https://hub.docker.com/r/jupyter/datascience-notebook/tags/
    # Inspect the Dockerfile at:
    # https://github.com/jupyter/docker-stacks/tree/master/datascience-notebook/Dockerfile
    name: jupyter/datascience-notebook
    tag: 177037d09156

I run into the following issue:

UPGRADE FAILED
ROLLING BACK
Error: "jhub" has no deployed releases
Error: …

kubernetes jupyter devops jupyterhub kubernetes-helm

5 votes · 1 answer · 1,342 views

How to use both the Celery Executor and the Kubernetes Executor in Apache Airflow?

I have multiple DAGs running with the Celery Executor, but I want to run one specific DAG with the Kubernetes Executor, and I cannot work out a good way to achieve this.

I have already declared CeleryExecutor as the executor in my airflow.cfg, and I do not want to change that, because every other DAG genuinely needs it; only this one DAG is different.

# The executor class that airflow should use. Choices include
# SequentialExecutor, LocalExecutor, CeleryExecutor
executor = CeleryExecutor

My problematic code:

from datetime import datetime, timedelta

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import \
    KubernetesPodOperator
from airflow.operators.dummy_operator import DummyOperator

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime.utcnow(),
    'email': ['airflow@example.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5)
}

dag = DAG(
    'kubernetes_sample_1', default_args=default_args)


start = DummyOperator(task_id='run_this_first', dag=dag)

passing = KubernetesPodOperator(namespace='default',
                                image="Python:3.6",
                                cmds=["Python", "-c"],
                                arguments=["print('hello …
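The DAG above is cut off, so as a point of reference here is a self-contained sketch of a KubernetesPodOperator task using the same Airflow 1.10-era contrib import; the image, namespace, and command are placeholders, not the asker's values. The operator launches its own pod regardless of which executor is configured, which is one way to get Kubernetes-backed work for a single DAG while the others stay on Celery:

from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

# Sketch only: all values below are placeholders.
dag = DAG(
    "kubernetes_pod_sketch",
    start_date=datetime(2019, 2, 28),
    schedule_interval=None,
)

hello = KubernetesPodOperator(
    task_id="hello_from_pod",
    name="hello-from-pod",
    namespace="default",
    image="python:3.6",          # lowercase image name, as Docker Hub expects
    cmds=["python", "-c"],
    arguments=["print('hello from a pod')"],
    get_logs=True,
    dag=dag,
)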

python celery python-3.x kubernetes airflow

5 votes · 1 answer · 175 views

How to resolve cached table partition metadata being evicted from memory when repartitioning a large amount of data from S3 with Spark?

While trying to repartition a dataframe from S3, I get a generic error:

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 33 in stage 1.0 failed 4 times, most recent failure: Lost task 33.4 in stage 1.0 (TID 88, 172.44.16.141, executor 7): ExecutorLostFailure (executor 7 exited caused by one of the running tasks) Reason: worker lost

When I check the driver logs, I see the same generic error following this warning:

20/07/22 15:47:21 WARN SharedInMemoryCache: Evicting cached table partition metadata from memory due to size constraints (spark.sql.hive.filesourcePartitionFileCacheSize = 262144000 bytes). This may impact query planning performance.

Even though I have tuned Spark sensibly, I cannot understand why I am running into this warning.

I have read about this warning here.

My …
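The warning itself names the relevant setting, spark.sql.hive.filesourcePartitionFileCacheSize. A minimal sketch of raising it when building the session follows; the 512 MB figure is purely illustrative, not a value suggested by the question:

from pyspark.sql import SparkSession

# Sketch only: the cache size below is an illustration, not a recommendation.
spark = (
    SparkSession.builder
    .appName("repartition-s3-sketch")
    .config("spark.sql.hive.filesourcePartitionFileCacheSize", 512 * 1024 * 1024)
    .getOrCreate()
)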

python hadoop hive apache-spark pyspark

5 votes · 0 answers · 2,080 views