Posts by Avi*_*ava

Unable to specify "edition2021" in order to use an unstable package in Rust

I want to run an example via Cargo, but I am running into an error:

error: failed to parse manifest at `/Users/aviralsrivastava/dev/subxt/Cargo.toml`

The full stack trace is:

error: failed to parse manifest at `/Users/aviralsrivastava/dev/subxt/Cargo.toml`

Caused by:
  feature `edition2021` is required

  The package requires the Cargo feature called `edition2021`, but that feature is not stabilized in this version of Cargo (1.56.0-nightly (b51439fd8 2021-08-09)).
  Consider adding `cargo-features = ["edition2021"]` to the top of Cargo.toml (above the [package] table) to tell Cargo you are opting in to use this unstable feature.
  See https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#edition-2021 for more information about the status …

rust substrate polkadot

43 votes · 3 answers · 30k views

How to avoid a KeyError named 'kernelspec' in Papermill?

I am running a papermill command using Airflow (in Docker). The script is stored on S3 and I run it using papermill's Python client. It ends with a completely incomprehensible error:

Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/ipython_genutils/ipstruct.py", line 132, in __getattr__
result = self[key]
KeyError: 'kernelspec'

I tried looking through the documentation, but to no avail.

The code I am using to run the papermill command is:

import time
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from mypackage.datastore import db
from mypackage.workflow.transform.jupyter_notebook import run_jupyter_notebook


dag_id = "jupyter-test-dag"
default_args = {
    'owner': "aviral",
    'depends_on_past': False,
    'start_date': "2019-02-28T00:00:00",
    'email': "aviral@some_org.com",
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 0,
    'retry_delay': timedelta(minutes=5),
    'provide_context': True
}

dag = DAG(
    dag_id,
    catchup=False,
    default_args=default_args,
    schedule_interval=None, …
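This KeyError usually points at a notebook whose metadata contains no kernelspec entry. A minimal sketch of a common workaround is to pass kernel_name explicitly when calling papermill directly; the S3 paths and parameters below are hypothetical placeholders, not values from the question:

import papermill as pm

# Sketch only: input/output paths and parameters are placeholders.
pm.execute_notebook(
    "s3://some-bucket/input-notebook.ipynb",
    "s3://some-bucket/output-notebook.ipynb",
    parameters={"run_date": "2019-02-28"},
    kernel_name="python3",  # supplies a kernel when the notebook metadata lacks a kernelspec
)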

python python-3.x jupyter jupyter-notebook papermill

10 votes · 1 answer · 1,053 views

How can two reference variables be equal in Rust?

My code has an error in one of its functions:

fn is_five(x: &i32) -> bool {
    x == 5
}

fn main() {
    assert!(is_five(&5));
    assert!(!is_five(&6));
    println!("Success!");
}

When run, it reports the error:

error[E0277]: can't compare `&i32` with `{integer}`
 --> main.rs:2:7
  |
2 |     x == 5
  |       ^^ no implementation for `&i32 == {integer}`
  |
  = help: the trait `std::cmp::PartialEq<{integer}>` is not implemented for `&i32`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0277`.

I fixed it by changing the logic to compare two values rather than an address and a value.

fn is_five(x: &i32) -> bool {
    *x == 5
}

However, I also tried (randomly) using the borrow method and, to my surprise, it worked. …

rust

8 votes · 1 answer · 2,410 views

Unable to install Airflow even after setting SLUGIFY_USES_TEXT_UNIDECODE and AIRFLOW_GPL_UNIDECODE

Airflow is installed via the command python3 setup.py install. It includes a requirements file, requirements/athena.txt, which contains:

apache-airflow[celery,postgres,hive,crypto,password]==1.10.1

I get an error:

RuntimeError: By default one of Airflow's dependencies installs a GPL dependency (unidecode). To avoid this dependency set SLUGIFY_USES_TEXT_UNIDECODE=yes in your environment when you install or upgrade Airflow. To force installing the GPL version set AIRFLOW_GPL_UNIDECODE

To get rid of this error, I set export SLUGIFY_USES_TEXT_UNIDECODE=yes and export AIRFLOW_GPL_UNIDECODE=yes. However, running the command python3 setup.py install still gives the same error, with no change. Checking the environment variables:

?  athena-py git:(pyspark-DataFrameStatFunctions) echo $SLUGIFY_USES_TEXT_UNIDECODE
yes
?  athena-py git:(pyspark-DataFrameStatFunctions) echo $AIRFLOW_GPL_UNIDECODE
yes
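One possible explanation, assuming the variables are exported in a different shell or stripped by sudo, is that they never reach the Python process that actually runs setup.py. A minimal sketch for checking this from inside Python:

import os

# Sketch only: prints what the current Python process actually sees.
for var in ("SLUGIFY_USES_TEXT_UNIDECODE", "AIRFLOW_GPL_UNIDECODE"):
    print(var, "=", os.environ.get(var, "<not set>"))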

python pip python-3.x airflow

7 votes · 1 answer · 182 views

Why is recursion not recommended in Rust?

I am familiar with the general advice about recursion: avoid it, because it is not a good memory-management practice. However, that should hold for every programming language, unless a language handles memory management under recursion particularly well.

While reading the documentation of Educative's Rust course, I came across this statement:

Recursion is possible in Rust, but it is not really encouraged. Instead, Rust favors something called iteration, also known as looping.

I cannot understand why this is so. Is there something uncommon about Rust compared to other languages that makes recursion inadvisable, or does Rust handle iteration better than other languages do?

recursion rust

7 votes · 1 answer · 719 views

How to resolve duplicate column names while joining two dataframes in PySpark?

I have files A and B which are exactly the same. I am trying to perform inner and outer joins on these two dataframes. Since all of my columns are duplicates, the existing answers were of no help. The other questions I have gone through contain only a column or two as duplicates; my issue is that the whole files are duplicates of each other, both in data and in column names.

My code:

import sys
from …
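Since the code above is truncated, here is a minimal sketch of one common way to handle fully duplicated column names: alias the columns of one dataframe before joining. The file paths and the id column are hypothetical placeholders, not taken from the question:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dedupe-join-sketch").getOrCreate()

# Hypothetical inputs; the question's actual files are not shown.
df_a = spark.read.csv("file_a.csv", header=True)
df_b = spark.read.csv("file_b.csv", header=True)

# Prefix every column of the right-hand side so no name collides after the join.
df_b_renamed = df_b.select([df_b[c].alias("b_" + c) for c in df_b.columns])

joined = df_a.join(df_b_renamed, df_a["id"] == df_b_renamed["b_id"], "inner")
joined.show()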

python apache-spark apache-spark-sql pyspark

6 votes · 1 answer · 20k views

AWS Glue: Failed to start job run due to missing metadata

According to the documentation, only JobName is required to run a job using boto3. However, my code:

    def start_job_run(self, name):
        print("The name of the job to be run via client is: {}".format(name))
        self.response_de_start_job = self.client.start_job_run(
            JobName=name
        )
        print(self.response_de_start_job)

The client is:

    self.client = boto3.client(
            'glue',
            region_name='ap-south-1',
            aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
            aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY'),
        )

When executed via Python 3, this gives the error:

botocore.errorfactory.EntityNotFoundException: An error occurred (EntityNotFoundException) when calling the StartJobRun operation: Failed to start job run due to missing metadata

But when I do the same thing for the same job from the UI and from the CLI (aws glue start-job-run --job-name march15_9), it works fine.
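One frequent cause of this particular error, though not confirmed by the question, is the boto3 client targeting a different region or account than the one where the job was defined. A minimal sketch for verifying that the same client can actually see the job, reusing the job name from the CLI example above:

import os
import boto3

# Sketch only: checks that the job is visible to the exact client the code uses.
client = boto3.client(
    "glue",
    region_name="ap-south-1",
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
)

print(client.meta.region_name)              # region the client actually targets
print(client.get_job(JobName="march15_9"))  # raises EntityNotFoundException if the job is not visible here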

python python-3.x boto3 aws-glue

5 votes · 2 answers · 20k views

How to deploy a release after changing the config?

I had successfully deployed the jhub release in my cluster. I then changed the config to pull a different Docker image, as described in the documentation.

This time, when running the same old command:

# Suggested values: advanced users of Kubernetes and Helm should feel
# free to use different values.
RELEASE=jhub
NAMESPACE=jhub

helm upgrade --install $RELEASE jupyterhub/jupyterhub \
  --namespace $NAMESPACE  \
  --version=0.8.2 \
  --values jupyter-hub-config.yaml

where the jupyter-hub-config.yaml file is:

proxy:
  secretToken: "<a secret token>"
singleuser:
  image:
    # Get the latest image tag at:
    # https://hub.docker.com/r/jupyter/datascience-notebook/tags/
    # Inspect the Dockerfile at:
    # https://github.com/jupyter/docker-stacks/tree/master/datascience-notebook/Dockerfile
    name: jupyter/datascience-notebook
    tag: 177037d09156

I run into the following issue:

UPGRADE FAILED
ROLLING BACK
Error: "jhub" has no deployed releases
Error: …

kubernetes jupyter devops jupyterhub kubernetes-helm

5 votes · 1 answer · 1,342 views

How to use both the Celery Executor and the Kubernetes Executor in Apache Airflow?

I have multiple DAGs running with the Celery Executor, but I want to run one specific DAG with the Kubernetes Executor, and I cannot work out a good way to achieve this.

I have already declared CeleryExecutor as the executor in my airflow.cfg, and I do not want to change that, because every other DAG genuinely needs it; only this one DAG is different.

# The executor class that airflow should use. Choices include
# SequentialExecutor, LocalExecutor, CeleryExecutor
executor = CeleryExecutor

My problematic code:

from datetime import datetime, timedelta

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import \
    KubernetesPodOperator
from airflow.operators.dummy_operator import DummyOperator

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime.utcnow(),
    'email': ['airflow@example.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5)
}

dag = DAG(
    'kubernetes_sample_1', default_args=default_args)


start = DummyOperator(task_id='run_this_first', dag=dag)

passing = KubernetesPodOperator(namespace='default',
                                image="Python:3.6",
                                cmds=["Python", "-c"],
                                arguments=["print('hello …
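The DAG above is cut off, so as a point of reference here is a self-contained sketch of a KubernetesPodOperator task using the same Airflow 1.10-era contrib import; the image, namespace, and command are placeholders, not the asker's values. The operator launches its own pod regardless of which executor is configured, which is one way to get Kubernetes-backed work for a single DAG while the others stay on Celery:

from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

# Sketch only: all values below are placeholders.
dag = DAG(
    "kubernetes_pod_sketch",
    start_date=datetime(2019, 2, 28),
    schedule_interval=None,
)

hello = KubernetesPodOperator(
    task_id="hello_from_pod",
    name="hello-from-pod",
    namespace="default",
    image="python:3.6",          # lowercase image name, as Docker Hub expects
    cmds=["python", "-c"],
    arguments=["print('hello from a pod')"],
    get_logs=True,
    dag=dag,
)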

python celery python-3.x kubernetes airflow

5 votes · 1 answer · 175 views

How to resolve cached table partition metadata being evicted from memory when repartitioning a large amount of data from S3 with Spark?

While trying to repartition a dataframe from S3, I get a generic error:

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 33 in stage 1.0 failed 4 times, most recent failure: Lost task 33.4 in stage 1.0 (TID 88, 172.44.16.141, executor 7): ExecutorLostFailure (executor 7 exited caused by one of the running tasks) Reason: worker lost

When I check the driver logs, I see the same generic error following this warning:

20/07/22 15:47:21 WARN SharedInMemoryCache: Evicting cached table partition metadata from memory due to size constraints (spark.sql.hive.filesourcePartitionFileCacheSize = 262144000 bytes). This may impact query planning performance.

Even though I have tuned Spark sensibly, I cannot understand why I am running into this warning.

I have read about this warning here.

My …
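The warning itself names the relevant setting, spark.sql.hive.filesourcePartitionFileCacheSize. A minimal sketch of raising it when building the session follows; the 512 MB figure is purely illustrative, not a value suggested by the question:

from pyspark.sql import SparkSession

# Sketch only: the cache size below is an illustration, not a recommendation.
spark = (
    SparkSession.builder
    .appName("repartition-s3-sketch")
    .config("spark.sql.hive.filesourcePartitionFileCacheSize", 512 * 1024 * 1024)
    .getOrCreate()
)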

python hadoop hive apache-spark pyspark

5 votes · 0 answers · 2,080 views