Afe*_*ziz 8 python postgresql sqlalchemy psycopg2
我有一张超过1000万行的表.大约有50多列.该表存储传感器数据/参数.假设我需要查询全天或86,400秒的数据.完成此查询需要大约20秒或更长时间.
我在几个列上添加了单独的索引,例如recordTimestamp(捕获数据时存储),deviceId(传感器的标识),positionValid(GPS地理定位是否有效).然后我添加了一个包含所有三列的复合索引.
以下是我的查询:
t1 = time.time()
conn = engine.connect()
select_statement = select([Datatable]).where(and_(
Datatable.recordTimestamp >= start_date,
Datatable.recordTimestamp <= end_date,
Datatable.deviceId == device_id,
Datatable.positionValid != None,
Datatable.recordTimestamp % query_interval == 0))
lol_data = conn.execute(select_statement).fetchall()
conn.close()
t2 = time.time()
time_taken = t2 - t1
print('Select: ' + time_taken)
Run Code Online (Sandbox Code Playgroud)
以下是我的EXPLAIN ANALYZE语句:
EXPLAIN ANALYZE SELECT datatable.id, datatable."createdAt", datatable."analogInput01", datatable."analogInput02", datatable."analogInput03", datatable."analogInput04", datatable."analogInput05", datatable."analogInput06", datatable."analogInput07", datatable."canEngineRpm", datatable."canEngineTemperature", datatable."canFuelConsumedLiters", datatable."canFuelLevel", datatable."canVehicleMileage", datatable."deviceId", datatable."deviceTemperature", datatable."deviceInternalVoltage", datatable."deviceExternalVoltage", datatable."deviceAntennaCut", datatable."deviceEnum", datatable."deviceVehicleMileage", datatable."deviceSimSignal", datatable."deviceSimStatus", datatable."iButton01", datatable."iButton02", datatable."recordSequence", datatable."recordTimestamp", datatable."accelerationAbsolute", datatable."accelerationBrake", datatable."accelerationBump", datatable."accelerationTurn", datatable."accelerationX", datatable."accelerationY", datatable."accelerationZ", datatable."positionAltitude", datatable."positionDirection", datatable."positionSatellites", datatable."positionSpeed", datatable."positionLatitude", datatable."positionLongitude", datatable."positionHdop", datatable."positionMovement", datatable."positionValid", datatable."positionEngine" FROM datatable WHERE datatable."recordTimestamp" >= 1519744521 AND datatable."recordTimestamp" <= 1519745181 AND datatable."deviceId" = '864495033990901' AND datatable."positionValid" IS NOT NULL AND datatable."recordTimestamp" % 1 = 0;
Run Code Online (Sandbox Code Playgroud)
以下是SELECT的EXPLAIN ANALYZE的结果:
Index Scan using "ix_dataTable_recordTimestamp" on dataTable (cost=0.44..599.35 rows=5 width=301) (actual time=0.070..10.487 rows=661 loops=1)
Index Cond: (("recordTimestamp" >= 1519744521) AND ("recordTimestamp" <= 1519745181))
Filter: (("positionValid" IS NOT NULL) AND (("deviceId")::text = '864495033990901'::text) AND (("recordTimestamp" % 1) = 0))
Rows Removed by Filter: 6970
Planning time: 0.347 ms
Execution time: 10.658 ms
Run Code Online (Sandbox Code Playgroud)
以下是Python计算的时间结果:
Select: 47.98712515830994
JSON: 0.19731807708740234
Run Code Online (Sandbox Code Playgroud)
以下是我的代码分析:
10302 function calls (10235 primitive calls) in 12.612 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 12.595 12.595 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py:882(execute)
1 0.000 0.000 12.595 12.595 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/sql/elements.py:267(_execute_on_connection)
1 0.000 0.000 12.595 12.595 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py:1016(_execute_clauseelement)
1 0.000 0.000 12.592 12.592 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py:1111(_execute_context)
1 0.000 0.000 12.590 12.590 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/engine/default.py:506(do_execute)
1 12.590 12.590 12.590 12.590 {method 'execute' of 'psycopg2.extensions.cursor' objects}
1 0.000 0.000 0.017 0.017 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/engine/result.py:1113(fetchall)
1 0.000 0.000 0.017 0.017 /Users/afeezaziz/Projects/Bursa/backend/env/lib/python3.6/site-packages/sqlalchemy/engine/result.py:1080(_fetchall_impl)
1 0.008 0.008 0.017 0.017 {method 'fetchall' of 'psycopg2.extensions.cursor' objects}
Run Code Online (Sandbox Code Playgroud)
小智 -1
"recordTimestamp","deviceId"和"positionValid",因此,只需确保您已经从 3 列创建了索引。select([Datatable])”中,我猜您选择了所有列,因此,正如您的描述,有 50 多个列,需要时间来解析数据并将数据发送到客户端。更清楚地说,添加索引只会帮助您的“执行时间”(查找结果的时间),但不会帮助您的“获取时间”(当您运行“lol_data = conn.execute(select_statement).fetchall()”时) 。"deviceId"、"recordTimestamp"、 值。"deviceId"您可以使用索引来更改(比较和发送字符串比使用整数花费更多时间)。| 归档时间: |
|
| 查看次数: |
1845 次 |
| 最近记录: |