Posts by Alw*_*nny

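Reducing TCP connection resets on AWS Fargate

I'm using Amazon ECS on AWS Fargate. My instances can access the internet, but connections drop after 350 seconds. On average, out of 100 calls, my service gets ConnectionResetError: [Errno 104] Connection reset by peer about 5 times. I found some suggestions for fixing this in my server-side code; see here and here.

Cause

If a connection that's using a NAT gateway is idle for 350 seconds or more, the connection times out.

When a connection times out, a NAT gateway returns an RST packet to any resources behind the NAT gateway that attempt to continue the connection (it does not send a FIN packet).

Solution

To prevent the connection from being dropped, you can initiate more traffic over the connection. Alternatively, you can enable TCP keepalive on the instance with a value less than 350 seconds.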
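The keepalive option described above can be sketched in Python with the standard `socket` module. This is a minimal sketch, not the poster's code: the 120-second idle value, the probe interval and count, and the helper names `keepalive_options` and `enable_keepalive` are assumptions chosen only to stay under the 350-second limit. The `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` constants are platform-dependent, so the sketch applies them only where Python exposes them.

```python
import socket

# Assumed values: anything comfortably below the 350 s NAT gateway idle timeout.
IDLE_SECONDS = 120      # idle time before the first keepalive probe
INTERVAL_SECONDS = 30   # delay between unanswered probes
PROBE_COUNT = 4         # unanswered probes before the connection is dropped


def keepalive_options(idle=IDLE_SECONDS, interval=INTERVAL_SECONDS, count=PROBE_COUNT):
    """Return (level, option, value) tuples that turn on TCP keepalive."""
    options = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
    tuning = (("TCP_KEEPIDLE", idle), ("TCP_KEEPINTVL", interval), ("TCP_KEEPCNT", count))
    for name, value in tuning:
        # These constants do not exist on every platform, so check first.
        if hasattr(socket, name):
            options.append((socket.IPPROTO_TCP, getattr(socket, name), value))
    return options


def enable_keepalive(sock):
    """Apply the keepalive options to an existing socket and return it."""
    for level, option, value in keepalive_options():
        sock.setsockopt(level, option, value)
    return sock


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        enable_keepalive(s)
        print("keepalive enabled:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
```

With requests, the same option tuples could presumably be handed to urllib3 through a custom `HTTPAdapter` that forwards `socket_options`; that wiring is not shown here and should be checked against the installed versions.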

Existing code:

url = "url to call http"
params = {
   "year": year,
   "month": month
}
response = self.session.get(url, params=params)


To work around this, I'm currently using a band-aid retry-logic solution based on tenacity,

@retry(
    retry=(
        retry_if_not_exception_type(
            HTTPError
        )  # specific: requests.exceptions.ConnectionError
    ),
    reraise=True,
    wait=wait_fixed(2),
    stop=stop_after_attempt(5),
)
…

python amazon-web-services amazon-ecs python-requests aws-fargate

Alw*_*nny

2023 03-28

5
votes

1
answer

1931
views

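pandas explode function not working on a column of list-like strings

To explode list-likes from a column into rows, we can use the pandas explode() function. My pandas version is '0.25.3'.

The documented example works for me, and another answer on Stackoverflow.com works as expected, but it does not work for my dataset.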

   city        nested_city
0  soto        ['Soto']
1  tera-kora   ['Daniel']
2  jan-thiel   ['Jan Thiel']
3  westpunt    ['Westpunt']
4  nieuwpoort  ['Nieuwpoort', 'Santa Barbara Plantation']
What I tried:

test_data['nested_city'].explode()
and

test_data.set_index(['nested_city']).apply(pd.Series.explode).reset_index()
Output

0    ['Soto']
1    ['Daniel']
2    ['Jan Thiel']
3    ['Westpunt']
4    ['Nieuwpoort', 'Santa Barbara Plantation']
Name: neighbors, dtype: object
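One possible explanation, assuming the `nested_city` values are strings that merely look like lists (for example after loading from CSV), is that `explode()` has nothing to expand. The sketch below uses plain Python to show the idea: parse each string with `ast.literal_eval` first, then expand. The sample rows mirror the data above; the helper name `explode_rows` is hypothetical.

```python
import ast

# Two sample rows where the "list" is actually stored as a string (assumption).
rows = [
    ("soto", "['Soto']"),
    ("nieuwpoort", "['Nieuwpoort', 'Santa Barbara Plantation']"),
]


def explode_rows(rows):
    """Turn (city, list-like) pairs into one (city, value) pair per list element."""
    exploded = []
    for city, nested in rows:
        # Parse the string into a real list; leave real lists untouched.
        values = ast.literal_eval(nested) if isinstance(nested, str) else nested
        for value in values:
            exploded.append((city, value))
    return exploded


for city, value in explode_rows(rows):
    print(city, value)
```

In pandas the equivalent would likely be `test_data['nested_city'].apply(ast.literal_eval)` followed by `explode()`, provided every cell holds a valid Python literal.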

python pandas

Alw*_*nny


4
votes

1
answer

3222
views

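Converting an Elasticsearch Kibana query string to URI Search format

Since last week I have been using the Elasticsearch Service on AWS. My current Elasticsearch version is 6.XX with Kibana 6.XX, and I am now fairly comfortable with the query format that runs in the Kibana client. My problem is that I cannot convert a query into a URI format that will run in a browser URL or Postman. For example, how do I convert this to URI Search format?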

GET my_index/_search
{
  "query": {
    "geo_bounding_box": {
      "location": {
        "top_left": { "lat": 42, "lon": -72 },
        "bottom_right": { "lat": 40, "lon": -74 }
      }
    }
  }
}
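I have seen the documentation on the URI Search format, with its parameters such as q, df, and so on, here: https://www.elastic.co/guide/en/elasticsearch/reference/6.0/search-uri-request.html but I could not convert the query string above into URI Search format. I am actually very comfortable with the SOLR query format, which supports q, fq, sort, start, rows, boost, facet, group, and so on. As far as I understand, Elasticsearch is also built on Lucene indexes, so my basic questions are:

1. How do I convert the ES query string above into URI Search format?

2. How can I easily convert a SOLR query to ES format?

If you help me convert the query string above into URI Search format, then my existing complex …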
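As far as I know, the `q` parameter of URI Search only accepts query-string syntax, so a `geo_bounding_box` query cannot be written in it directly. One workaround, assuming the cluster honors the `source` and `source_content_type` query-string parameters that Elasticsearch 6.x documents for clients that cannot send a GET body, is to URL-encode the whole JSON body. The sketch below only builds such a URL; the host `http://localhost:9200` and the helper name `build_search_url` are hypothetical.

```python
import json
from urllib.parse import urlencode

# The request body from the question, as a Python dict.
query = {
    "query": {
        "geo_bounding_box": {
            "location": {
                "top_left": {"lat": 42, "lon": -72},
                "bottom_right": {"lat": 40, "lon": -74},
            }
        }
    }
}


def build_search_url(base_url, index, body):
    """Encode a JSON search body into the query string of a _search URL."""
    params = {
        # Compact JSON keeps the URL shorter; urlencode handles the escaping.
        "source": json.dumps(body, separators=(",", ":")),
        "source_content_type": "application/json",
    }
    return f"{base_url}/{index}/_search?{urlencode(params)}"


# Hypothetical host; replace with the real cluster endpoint.
url = build_search_url("http://localhost:9200", "my_index", query)
print(url)
```

The resulting URL can be pasted into a browser or Postman; very large bodies may exceed URL length limits, in which case a POST with a JSON body is the safer choice.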

lucene solr amazon-web-services elasticsearch kibana-5

Alw*_*nny

2018 01-21

3
votes

1
answer

2367
views

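Calculating the sum of a time column in PostgreSQL

Can anyone suggest the easiest way to find the sum of a time field in POSTGRESQL? I have only found a solution for MYSQL, but I need the POSTGRESQL version.

MYSQL: /sf/ask/213846041/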

SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(timespent))) FROM myTable;
Demo data

id  time
1   1:23:23
2   4:00:23
3   9:23:23
Desired output

14:47:09
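The arithmetic behind the desired output can be checked with a short Python sketch using `datetime.timedelta`; it is only a cross-check of the demo data, not a PostgreSQL answer. On the database side, a query along the lines of `SELECT SUM(timespent::interval) FROM myTable;` may work, on the assumption that `timespent` is a `time` column that can be cast to `interval`. The helper name `parse_hms` is hypothetical.

```python
from datetime import timedelta

# The demo rows from the question.
times = ["1:23:23", "4:00:23", "9:23:23"]


def parse_hms(text):
    """Convert an 'H:MM:SS' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# sum() needs a timedelta start value, otherwise it starts from the int 0.
total = sum((parse_hms(t) for t in times), timedelta())
print(total)  # 14:47:09
```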

postgresql sum

Alw*_*nny

2017 05-23

2
votes

1
answer

20k
views

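GraphQL: requiring a module outside vs. inside GraphQLObjectType

The title may not fit my question well, but let me explain my situation. I am working with a GraphQL schema. This is my initial schema.js file: https://github.com/sany2k8/graphql-udemy/blob/master/schema/schema.js

It worked fine, then I decided to split it into separate small files, e.g. root_query_type.js, mutation.js, user_type.js and company_type.js. All files are exported as modules and require each other circularly. For example:

user_type.js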

const graphql = require('graphql');
const axios = require('axios');
const { GraphQLObjectType, GraphQLString, GraphQLInt } = graphql;

//const CompanyType = require('./company_type'); // *this line causing error*

const UserType = new GraphQLObjectType({
    name: "User",
    fields: () => ({
        id: { type: GraphQLString },
        firstName: { type: GraphQLString },
        age: { type: GraphQLInt },
        company: {
            type: require('./company_type'), // *this line fix the error*
            resolve(parentValue, args) {
                return axios.get(`http://localhost:3000/companies/${parentValue.companyId}`)
                    .then(res => res.data)
            }
…

javascript node.js graphql graphql-js

Alw*_*nny


2
votes

1
answer

735
views

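ElasticSearch Scroll API with multithreading

First of all, I want to make clear that I know the basic working logic of the ElasticSearch Scroll API. To use the Scroll API, we first call the search method with some scroll value such as 1m; it returns a _scroll_id, which is then used for the next consecutive Scroll calls in a loop until all documents are returned. The problem is that I want to run the same process on multiple threads rather than serially. For example:

If I have 300000 documents, then I want to process/fetch them this way:

The first thread will process the initial 100000 documents

The second thread will process the next 100000 documents

The third thread will process the remaining 100000 documents

So my question is: I did not find any way to set a from value on the Scroll API, so how can I make the scrolling process faster with threads, rather than processing the documents serially?

My sample Python code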

if index_name is not None and doc_type is not None and body is not None:
    es = init_es()
    page = es.search(index_name, doc_type, scroll='30s', size=10, body=body)
    sid = page['_scroll_id']
    scroll_size = page['hits']['total']

    # Start scrolling
    while (scroll_size > …
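The Scroll API has no `from` offset, but Elasticsearch does document a sliced scroll, where each request body carries a `slice` object with an `id` and a `max` so that independent scrolls can run in parallel. The sketch below shows only the fan-out pattern with the standard library; `scroll_one_slice` is a hypothetical stand-in for the per-slice search-and-scroll loop, and the slice count of 3 is an assumption matching the three threads described above.

```python
from concurrent.futures import ThreadPoolExecutor


def slice_bodies(query, max_slices):
    """Build one search body per slice, each tagged with its slice id."""
    return [
        {"slice": {"id": slice_id, "max": max_slices}, "query": query}
        for slice_id in range(max_slices)
    ]


def scroll_one_slice(body):
    # Hypothetical placeholder: a real worker would run the initial search with
    # this body and a scroll timeout, then keep scrolling until no hits remain.
    return f"would scroll slice {body['slice']['id']} of {body['slice']['max']}"


def scroll_in_parallel(query, max_slices=3, worker=scroll_one_slice):
    """Run one worker per slice in its own thread and collect the results."""
    bodies = slice_bodies(query, max_slices)
    with ThreadPoolExecutor(max_workers=max_slices) as pool:
        return list(pool.map(worker, bodies))


results = scroll_in_parallel({"match_all": {}})
print(results)
```

Slices partition the documents but are not guaranteed to be exactly equal in size, so the three threads would not necessarily receive 100000 documents each.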

multithreading elasticsearch elasticsearch-py

Alw*_*nny


2
votes

1
answer

6459
views

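PostgreSQL: updating a column to integer values

I have a jsonb column containing lists of elements in either string or integer format. What I want now is to make them all the same type, e.g. all int or all string.

Tried: this way I get a single element, but I need to update all the elements in the list.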

SELECT parent_path -> 1 AS path FROM abc LIMIT 10
or

Update abc SET parent_path = ARRAY[parent_path]::TEXT[] AS parent_path FROM abc
or

UPDATE abc SET parent_path = replace(parent_path::text, '"', '') where id=123
Current output

path
[6123697, 178, 6023099]
[625953521394212864, 117, 6023181]
["153", "6288361", "553248635949090971"]
[553248635358954983, 178320, 174, 6022967]
[6050684, 6050648, 120, 6022967]
[653, 178238, 6239135, 38, 6023117]
["153", "6288496", "553248635977039112"]
[553248635998143523, 6023185]
[553248635976194501, 6022967]
[553248635976195634, 6022967]
Expected output

path
[6123697, 178, 6023099]
[625953521394212864, 117, 6023181]
…

postgresql

Alw*_*nny

2020 11-02

2
votes

1
answer

31
views

Problem generating a list of dictionaries with a list comprehension

feed_mapping = {'BC': 11, 'HA':12, 'AB':16,'GR':18}
x = ['AB-16007891', 'HA-4625798','GR-4444545','BC-4447764','HA-46257854']
feed = [{"feed": feed_mapping[i.split('-')[0]],"id":[i]} for i in x]
print(feed)
With the list comprehension above I can generate a list of dictionaries. If the feed value is the same, I need to append the value to id.

Current output:

[{'feed': 16, 'id': ['AB-16007891']}, {'feed': 12, 'id': ['HA-4625798']}, {'feed': 18, 'id': ['GR-4444545']}, {'feed': 11, 'id': ['BC-4447764']}, {'feed': 12, 'id': ['HA-46257854']}]
Expected output:

[{'feed': 16, 'id': ['AB-16007891']}, {'feed': 12, 'id': ['HA-4625798','HA-46257854']}, {'feed': 18, 'id': ['GR-4444545']}, {'feed': 11, 'id': ['BC-4447764']}]
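A single list comprehension cannot easily merge entries that share a feed, so one option is to group first and build the list afterwards. The sketch below reuses `feed_mapping` and `x` from the question; the intermediate name `grouped` is an assumption. It relies on dictionaries preserving insertion order (Python 3.7+) to keep the feeds in first-seen order.

```python
feed_mapping = {'BC': 11, 'HA': 12, 'AB': 16, 'GR': 18}
x = ['AB-16007891', 'HA-4625798', 'GR-4444545', 'BC-4447764', 'HA-46257854']

# Step 1: collect the ids under their feed number.
grouped = {}
for item in x:
    prefix = item.split('-')[0]
    grouped.setdefault(feed_mapping[prefix], []).append(item)

# Step 2: turn the grouping into the list-of-dicts shape from the question.
feed = [{"feed": feed_id, "id": ids} for feed_id, ids in grouped.items()]
print(feed)
```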

python list-comprehension python-3.x

Alw*_*nny


1
vote

1
answer

44
views

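AWS Lambda: put multiple image objects at once

I am trying to resize a source image into multiple dimensions + extensions.

For example: when I upload a source image, say abc.jpg, I need to resize it to .jpg and .webp at several sizes using an S3 event trigger, e.g. abc_320.jpg, abc_320.webp, abc_640.jpg, abc_640.webp. With my current Python Lambda handler I can do this with multiple put_object calls to the destination bucket, but I want to make it more optimized, because my dimensions + extensions may grow in the future. So how can I store all the resized images in the destination bucket with a single call?

Current Lambda handler: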

import json
import boto3
import os
from os import path
from io import BytesIO
from PIL import Image

# boto3 S3 initialization
s3_client = boto3.client("s3")


def lambda_handler(event, context):
    destination_bucket_name = 'destination-bucket'

    # event contains all information about uploaded object
    print("Event :", event)

    # Bucket Name where file was uploaded
    source_bucket_name = event['Records'][0]['s3']['bucket']['name']

    # Filename of object …
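As far as I know, S3 has no API that stores several objects in one `put_object` call, so each resized variant still needs its own upload. What can be made easier to extend is the list of variants: the sketch below derives every size and extension combination with `itertools.product` and hands each one to an upload callable. The `SIZES` and `EXTENSIONS` values follow the example above, while `variant_keys`, `upload_all` and the `upload` callable are hypothetical names; in a real handler `upload` would wrap the resize and `s3_client.put_object` steps.

```python
from itertools import product
from os import path

# Target widths and formats; extend these lists as requirements grow.
SIZES = [320, 640]
EXTENSIONS = ["jpg", "webp"]


def variant_keys(source_key, sizes=SIZES, extensions=EXTENSIONS):
    """Return (destination_key, size, extension) for every size/format pair."""
    stem, _ = path.splitext(source_key)
    return [
        (f"{stem}_{size}.{ext}", size, ext)
        for size, ext in product(sizes, extensions)
    ]


def upload_all(source_key, upload):
    """Call upload(key, size, ext) once per variant and collect the results."""
    return [upload(key, size, ext) for key, size, ext in variant_keys(source_key)]


# Stand-in uploader that just echoes the key instead of touching S3.
uploaded = upload_all("abc.jpg", lambda key, size, ext: key)
print(uploaded)
```

Adding a new width or format then only means appending to `SIZES` or `EXTENSIONS`; the individual uploads could also be dispatched concurrently if latency matters.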

python amazon-web-services python-imaging-library boto3 aws-lambda

Alw*_*nny

2021 08-02

1
vote

1
answer

29
views

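Python ijson cannot process multiple elements at once

I have thousands of very large JSON files and need to process specific elements in them. To avoid memory overload I am using a Python library called ijson. It works fine when I process only a single element from the JSON file, but when I try to process multiple elements at once it throws

IncompleteJSONError: parse error: premature EOF

Partial JSON: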

{
  "info": {
    "added": 1631536344.112968,
    "started": 1631537322.81162,
    "duration": 14,
    "ended": 1631537337.342377
  },
  "network": {
    "domains": [
      { "ip": "231.90.255.25", "domain": "dns.msfcsi.com" },
      { "ip": "12.23.25.44", "domain": "teo.microsoft.com" },
      { "ip": "87.101.90.42", "domain": "www.msf.com" }
    ]
  }
}
Working code: (opening multiple files)

my_file_list = [f for f in glob.glob("data/jsons/*.json")]
final_result = []

for filename in my_file_list:
    row = {}
    with open(filename, 'r') as f:
        info = ijson.items(f, 'info')
        for o …

python json ijson

Alw*_*nny

2021 12-04

0
votes

1
answer

1154
views

Tag statistics

python ×5

amazon-web-services ×3

elasticsearch ×2

postgresql ×2

amazon-ecs ×1

aws-fargate ×1

aws-lambda ×1

boto3 ×1

elasticsearch-py ×1

graphql ×1

graphql-js ×1

ijson ×1

javascript ×1

json ×1

kibana-5 ×1

list-comprehension ×1

lucene ×1

multithreading ×1

node.js ×1

pandas ×1

python-3.x ×1

python-imaging-library ×1

python-requests ×1

solr ×1

sum ×1
