Posts by Bri*_*ian

Hyperopt Spark 3.0 issues

I am running Runtime 8.1 (includes Apache Spark 3.1.1, Scala 2.12) and trying to get hyperopt working as described in

https://docs.databricks.com/applications/machine-learning/automl-hyperparam-tuning/hyperopt-spark-mlflow-integration.html

py4j.Py4JException: Method maxNumConcurrentTasks([]) does not exist

when I try

from hyperopt import SparkTrials

spark_trials = SparkTrials()

Is there anything special I need to do to get this working?

This is the cluster I am using:

{
    "autoscale": {
        "min_workers": 1,
        "max_workers": 2
    },
    "cluster_name": "mlops_tiny_ml",
    "spark_version": "8.2.x-cpu-ml-scala2.12",
    "spark_conf": {},
    "aws_attributes": {
        "first_on_demand": 1,
        "availability": "SPOT_WITH_FALLBACK",
        "zone_id": "us-west-2b",
        "instance_profile_arn": "arn:aws:iam::112437402463:instance-profile/databricks_instance_role_s3",
        "spot_bid_price_percent": 100,
        "ebs_volume_type": "GENERAL_PURPOSE_SSD",
        "ebs_volume_count": 3,
        "ebs_volume_size": 100
    },
    "node_type_id": "m4.large",
    "driver_node_type_id": "m4.large",
    "ssh_public_keys": [],
    "custom_tags": {},
    "spark_env_vars": {},
    "autotermination_minutes": 120,
    "enable_elastic_disk": false,
    "cluster_source": "UI",
    "init_scripts": [],
    "cluster_id": "0xxxxxt404"
}

This is the code I am using: https://docs.databricks.com/applications/machine-learning/automl-hyperparam-tuning/hyperopt-model-selection.html

apache-spark databricks hyperopt

5 votes
1 answer
1219 views

Populating a nested record with an array in Avro using GenericRecord

I have the following schema:

{
    "name": "AgentRecommendationList",
    "type": "record",
    "fields": [
        {
            "name": "userid",
            "type": "string"
        },
        {
            "name": "friends",
            "type": {
                "type": "array",
                "items": {
                    "name": "SchoolFriends",
                    "type": "record",
                    "fields": [
                        {
                            "name": "Name",
                            "type": "string"
                        },
                        {
                            "name": "phoneNumber",
                            "type": "string"
                        },
                        {
                            "name": "email",
                            "type": "string"
                        }
                    ]
                }
            }
        }
    ]
}

I am using GenericRecord and want to put an array of arrays in for SchoolFriends.

val avschema = new RestService(URL).getLatestVersion(name)
val schema = new Schema.Parser().parse(avschema.getSchema)
val record = new GenericData.Record(schema)

I want to do something like record.put(x).
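For illustration, here is a minimal sketch of one way the "friends" field could be filled. It assumes that what is wanted is what the schema above declares, namely an array of SchoolFriends records rather than a nested array. The schema is inlined as a string so the sketch does not need the schema registry. The object name PopulateFriends, the method build, and all sample values are hypothetical.

```scala
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}

object PopulateFriends {
  // Inline copy of the schema from the question (assumption: the registry returns this same schema).
  val schemaJson: String =
    """{"name": "AgentRecommendationList", "type": "record", "fields": [
      |  {"name": "userid", "type": "string"},
      |  {"name": "friends", "type": {"type": "array", "items": {
      |    "name": "SchoolFriends", "type": "record", "fields": [
      |      {"name": "Name", "type": "string"},
      |      {"name": "phoneNumber", "type": "string"},
      |      {"name": "email", "type": "string"}]}}}
      |]}""".stripMargin

  def build(): GenericRecord = {
    val schema = new Schema.Parser().parse(schemaJson)

    // Schema of the "friends" field (an array) and of its elements (the SchoolFriends record).
    val friendsSchema = schema.getField("friends").schema()
    val friendSchema = friendsSchema.getElementType

    // Build one nested record; the values are made-up sample data.
    val friend = new GenericData.Record(friendSchema)
    friend.put("Name", "Alice")
    friend.put("phoneNumber", "555-0100")
    friend.put("email", "alice@example.com")

    // GenericData.Array carries the array schema together with its elements.
    val friends = new GenericData.Array[GenericRecord](1, friendsSchema)
    friends.add(friend)

    // Fill the top-level record, including the nested array.
    val record = new GenericData.Record(schema)
    record.put("userid", "user-1")
    record.put("friends", friends)
    record
  }
}
```

The point of the sketch is that each nested record gets its own GenericData.Record built from the element schema, and the collection is then passed to record.put("friends", ...). GenericData.get().validate(schema, record) can be used to check that the populated record matches the schema.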

java scala avro

0 votes
1 answer
2653 views

Where are Snowflake tables stored?

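I am considering Snowflake for a client, but I can't tell from the documentation where they store the data. It looks like S3, but then why is the storage cost so high? Is the data in the user's S3 or in Snowflake's S3?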

snowflake-cloud-data-platform

0 votes
1 answer
2406 views