I have a Lambda function that occasionally polls an API for the latest data. The data has a unique key, and I want to use Glue to update a table in MySQL. Is it possible to overwrite the data matched on this key (similar to Spark's mode=overwrite)? If not, can I truncate the table from Glue before inserting all the new data?
Thanks.
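A minimal sketch of both options, assuming the Glue job drops down to the plain Spark DataFrame API (a Glue DynamicFrame can be converted with toDF()); the JDBC URL, table name, credentials, and the id/payload columns are placeholders, not values from the original post:

import java.sql.DriverManager
import java.util.Properties
import org.apache.spark.sql.{SaveMode, SparkSession}

object MySqlLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("glue-to-mysql").getOrCreate()
    val df = spark.read.json("s3://my-bucket/latest/") // placeholder source

    val url = "jdbc:mysql://my-host:3306/my_db" // placeholder endpoint
    val props = new Properties()
    props.setProperty("user", "admin")
    props.setProperty("password", sys.env("DB_PASSWORD"))

    // Option 1: full reload. With truncate=true, Spark issues TRUNCATE TABLE
    // instead of DROP/CREATE, so indexes and grants on the table survive.
    df.write
      .mode(SaveMode.Overwrite)
      .option("truncate", "true")
      .jdbc(url, "my_table", props)

    // Option 2: key-based upsert. Spark's JDBC writer has no upsert mode,
    // so open plain JDBC connections per partition and let MySQL resolve
    // conflicts on the unique key.
    df.rdd.foreachPartition { rows =>
      val conn = DriverManager.getConnection(url, "admin", sys.env("DB_PASSWORD"))
      val stmt = conn.prepareStatement(
        "INSERT INTO my_table (id, payload) VALUES (?, ?) " +
          "ON DUPLICATE KEY UPDATE payload = VALUES(payload)")
      try {
        rows.foreach { row =>
          stmt.setString(1, row.getAs[String]("id"))
          stmt.setString(2, row.getAs[String]("payload"))
          stmt.addBatch()
        }
        stmt.executeBatch()
      } finally {
        stmt.close()
        conn.close()
      }
    }
  }
}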
I'm trying to launch a Fargate (ECS) task from Lambda and I'm getting an error. I tried digging into the source, but since the error just comes back in the response it isn't clear what's happening. I'd appreciate any advice. The error message and my code are pasted below.
The key line is: com.amazonaws.services.ecs.model.InvalidParameterException: name cannot be blank
{
"errorMessage": "name cannot be blank. (Service: AmazonECS; Status Code: 400; Error Code: InvalidParameterException; Request ID: 15746fff-35e7-11e8-90bf-fb7a32bec470)",
"errorType": "com.amazonaws.services.ecs.model.InvalidParameterException",
"stackTrace": [
"com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1630)",
"com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1302)",
"com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)",
"com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)",
"com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)",
"com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)",
"com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)",
"com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)",
"com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)",
"com.amazonaws.services.ecs.AmazonECSClient.doInvoke(AmazonECSClient.java:2742)",
"com.amazonaws.services.ecs.AmazonECSClient.invoke(AmazonECSClient.java:2718)",
"com.amazonaws.services.ecs.AmazonECSClient.executeRunTask(AmazonECSClient.java:2042)",
"com.amazonaws.services.ecs.AmazonECSClient.runTask(AmazonECSClient.java:2017)",
"Lambda.triggerLoad(Lambda.scala:79)",
"Lambda.$anonfun$handleRequest$2(Lambda.scala:27)",
"Lambda.$anonfun$handleRequest$2$adapted(Lambda.scala:27)",
"scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:59)",
"scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:52)",
"scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)",
"Lambda.handleRequest(Lambda.scala:27)",
"sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)",
"sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)",
"sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)",
"java.lang.reflect.Method.invoke(Method.java:498)"
]
}
Here is the code:
import com.amazonaws.ClientConfiguration
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain
import com.amazonaws.regions.{Region, Regions}
import com.amazonaws.services.ecs.{AmazonECSClient, AmazonECSClientBuilder}
import com.amazonaws.services.ecs.model._
import com.amazonaws.services.lambda.runtime.events.S3Event
import com.amazonaws.services.lambda.runtime.{Context, RequestHandler}
import scala.collection.JavaConverters._
class Lambda extends RequestHandler[S3Event, Unit] …
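With ECS, "name cannot be blank" generally means an empty string reached the RunTaskRequest: a blank cluster name, a blank task-definition ARN, or a ContainerOverride built without .withName(...). A minimal sketch of a request that fills in all three; the cluster, task definition, subnet, container, and environment names below are placeholders, not values from the original code:

import com.amazonaws.services.ecs.AmazonECSClientBuilder
import com.amazonaws.services.ecs.model._
import scala.collection.JavaConverters._

object RunFargateTask {
  def main(args: Array[String]): Unit = {
    val ecs = AmazonECSClientBuilder.defaultClient()

    val request = new RunTaskRequest()
      .withCluster("my-cluster")             // must be non-empty
      .withTaskDefinition("my-task-def:1")   // family:revision or full ARN
      .withLaunchType(LaunchType.FARGATE)
      .withNetworkConfiguration(new NetworkConfiguration()
        .withAwsvpcConfiguration(new AwsVpcConfiguration()
          .withSubnets("subnet-0123456789abcdef0")
          .withAssignPublicIp(AssignPublicIp.ENABLED)))
      .withOverrides(new TaskOverride()
        .withContainerOverrides(new ContainerOverride()
          .withName("my-container")          // a blank name here also triggers the error
          .withEnvironment(new KeyValuePair()
            .withName("SOURCE_BUCKET").withValue("my-bucket"))))

    val result = ecs.runTask(request)
    result.getTasks.asScala.foreach(t => println(t.getTaskArn))
  }
}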
I've been trying to import some data from a CSV file using the Datastax spark-cassandra-connector (https://github.com/datastax/spark-cassandra-connector). I know that in most cases case classes can be used for the import, but my rows have around 500 fields, so I can't use them without nesting (due to the 22-field limit on case classes). Storing the rows directly as maps is also possible, but I don't think it's ideal since the columns have several different data types.
I may be missing something in the RDD[String] -> RDD[(String, String, ...)] conversion, since .split(",") only yields an RDD[Array[String]].
I've done a fair amount of searching without much luck, so any help would be appreciated! Thanks.
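One way around the tuple/case-class limit, assuming Spark 2.x and a connector version with DataFrame support (1.4+), is to skip RDD[(String, ...)] entirely: read the CSV as a DataFrame, which has no 22-column limit and keeps a distinct type per column, and write it with the connector's Spark SQL source. The host, keyspace, table, and path below are placeholders, and the CSV header names are assumed to match the Cassandra column names:

import org.apache.spark.sql.SparkSession

object CsvToCassandra {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("csv-to-cassandra")
      .config("spark.cassandra.connection.host", "127.0.0.1") // placeholder host
      .getOrCreate()

    val df = spark.read
      .option("header", "true")      // assumes the CSV has a header row
      .option("inferSchema", "true") // let Spark type each of the ~500 columns
      .csv("s3://my-bucket/input.csv")

    df.write
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "my_ks", "table" -> "my_table"))
      .mode("append")
      .save()
  }
}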