I have a SQLAlchemy model representing a delivery; a delivery has a destination, a parcel ID, and a scheduled date:
    class Delivery(Base):
        __tablename__ = 'delivery'  # table name as used in the SQL further down

        delivery_id = Column(Integer, primary_key=True, autoincrement=True)
        parcel_id = Column(ForeignKey('parcels.parcel_id'))
        scheduled_date = Column(DateTime)
        destination_id = Column(ForeignKey('location.location_id'))
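Now, the origin of a delivery is the destination of the previous delivery of the same parcel. Rather than denormalizing that information by maintaining a pointer-based linked list, I use the scheduled date to order the deliveries, which currently looks like this: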
    def origin(delivery):
        # Most recent earlier delivery of the same parcel, if any.
        prior = (
            session.query(Delivery)
            .filter(
                Delivery.parcel_id == delivery.parcel_id,
                Delivery.scheduled_date < delivery.scheduled_date,
            )
            .order_by(Delivery.scheduled_date.desc())
            .first()
        )
        # The origin is the destination of that prior delivery.
        return prior.destination_id if prior else None
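In plain SQL, I could turn this separate query into a simple subquery plus join that I include when loading deliveries. I have gotten far enough that I can load all the related deliveries that occurred before the current one: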
    _prior_delivery = (
        select([Delivery.parcel_id, Delivery.scheduled_date, Location])
        .where(Location.location_id == remote(Delivery.destination_id))
        .order_by(Delivery.scheduled_date.desc())
        .alias("prior_delivery")
    )

    Delivery.origin = relationship(
        Location,
        primaryjoin=and_(
            _prior_delivery.c.parcel_id == foreign(Delivery.parcel_id),
            _prior_delivery.c.scheduled_date < foreign(Delivery.scheduled_date),
        ),
        secondary=_prior_delivery,
        secondaryjoin=_prior_delivery.c.location_id == foreign(Location.location_id),
        uselist=False,
        viewonly=True,
    )
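Because of uselist=False, this actually works; but under the hood it returns every delivery that occurred before the current one. SQLAlchemy prints a warning, and the result set is far larger than it needs to be.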
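To make the size problem concrete, here is a minimal sketch in plain SQL, run through Python's built-in sqlite3 module against an in-memory database. The sample rows and the `name` column on `location` are assumptions made up for illustration; only the `delivery` columns come from the model above. It counts how many prior rows an unlimited join matches per delivery.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few made-up deliveries."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE location (location_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE delivery (
            delivery_id INTEGER PRIMARY KEY,
            parcel_id INTEGER,
            scheduled_date TEXT,
            destination_id INTEGER REFERENCES location (location_id)
        );
        INSERT INTO location VALUES (1, 'Warehouse'), (2, 'Hub'), (3, 'Customer');
        INSERT INTO delivery VALUES
            (1, 100, '2024-01-01', 1),
            (2, 100, '2024-01-02', 2),
            (3, 100, '2024-01-03', 3),
            (4, 200, '2024-01-01', 2);
    """)
    return conn


# Join every delivery to ALL earlier deliveries of the same parcel (no LIMIT).
PRIOR_COUNT_QUERY = """
    SELECT delivery.delivery_id, COUNT(prior.delivery_id)
    FROM delivery
    LEFT JOIN delivery AS prior
      ON prior.parcel_id = delivery.parcel_id
     AND prior.scheduled_date < delivery.scheduled_date
    GROUP BY delivery.delivery_id
    ORDER BY delivery.delivery_id
"""


def count_prior_rows(conn):
    """Return (delivery_id, number of matching prior deliveries) pairs."""
    return conn.execute(PRIOR_COUNT_QUERY).fetchall()


if __name__ == "__main__":
    for row in count_prior_rows(build_demo_db()):
        print(row)
```

With this sample data, delivery 3 matches two prior rows even though only the most recent one is needed, and the number grows with every further leg of a parcel's journey.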
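My question: how do I apply a limit(1) to this read-only relationship?
The reason this is difficult is that a relationship has to be joinable into the main query. SQLAlchemy needs to be able to load the relationship in the same query in order to support eager loading. So the question becomes: how do you write a single query that loads a list of Delivery objects together with the origin of each one?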
    SELECT delivery.*, location.* FROM delivery
    LEFT JOIN location ON location.location_id = (
        SELECT destination_id FROM delivery prior
        WHERE delivery.parcel_id = prior.parcel_id
          AND prior.scheduled_date < delivery.scheduled_date
        ORDER BY prior.scheduled_date DESC
        LIMIT 1
    );
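The correlated-subquery join can be tried out without SQLAlchemy. The following is a minimal sketch using Python's built-in sqlite3 module; the sample rows and the `name` column on `location` are assumptions for illustration only. Note that the subquery here includes the `scheduled_date <` filter from the question's origin() function, so that only earlier deliveries of the same parcel are considered.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few made-up deliveries."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE location (location_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE delivery (
            delivery_id INTEGER PRIMARY KEY,
            parcel_id INTEGER,
            scheduled_date TEXT,
            destination_id INTEGER REFERENCES location (location_id)
        );
        INSERT INTO location VALUES (1, 'Warehouse'), (2, 'Hub'), (3, 'Customer');
        INSERT INTO delivery VALUES
            (1, 100, '2024-01-01', 1),
            (2, 100, '2024-01-02', 2),
            (3, 100, '2024-01-03', 3),
            (4, 200, '2024-01-01', 2);
    """)
    return conn


# The scalar subquery acts as a computed foreign key into location.
ORIGIN_QUERY = """
    SELECT delivery.delivery_id, location.name
    FROM delivery
    LEFT JOIN location ON location.location_id = (
        SELECT prior.destination_id FROM delivery AS prior
        WHERE prior.parcel_id = delivery.parcel_id
          AND prior.scheduled_date < delivery.scheduled_date
        ORDER BY prior.scheduled_date DESC
        LIMIT 1
    )
    ORDER BY delivery.delivery_id
"""


def load_origins(conn):
    """Return (delivery_id, origin name) pairs; origin is None for a first leg."""
    return conn.execute(ORIGIN_QUERY).fetchall()


if __name__ == "__main__":
    for row in load_origins(build_demo_db()):
        print(row)
```

Each delivery yields exactly one row: the first delivery of a parcel has no origin, and every later one gets the destination of the delivery immediately before it.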
In effect, the correlated subquery
    SELECT destination_id FROM delivery prior
    WHERE delivery.parcel_id = prior.parcel_id
      AND prior.scheduled_date < delivery.scheduled_date
    ORDER BY prior.scheduled_date DESC
    LIMIT 1
becomes a computed foreign key, origin_id, through which you join to the location table. Translated to SQLAlchemy, it would look something like this:
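    from sqlalchemy import alias, select
    from sqlalchemy.orm import relationship

    delivery = Delivery.__table__
    location = Location.__table__
    prior = alias(delivery, "prior")

    # Correlated subquery: destination of the most recent earlier delivery.
    _origin_id = (
        select([prior.c.destination_id])
        .where(delivery.c.parcel_id == prior.c.parcel_id)
        .where(prior.c.scheduled_date < delivery.c.scheduled_date)
        .order_by(prior.c.scheduled_date.desc())
        .limit(1)
    )

    Delivery.origin = relationship(
        Location,
        primaryjoin=_origin_id == location.c.location_id,
        viewonly=True,
    )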
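Unfortunately, this does not seem to work with any combination of remote and foreign annotations that I tried.
SELECT with a correlated subquery as secondary
The next best solution is to use a fake secondary table: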
    SELECT delivery.*, location.* FROM delivery
    LEFT JOIN (
        SELECT delivery.delivery_id, (
            SELECT destination_id FROM delivery prior
            WHERE delivery.parcel_id = prior.parcel_id
              AND prior.scheduled_date < delivery.scheduled_date
            ORDER BY prior.scheduled_date DESC
            LIMIT 1
        ) AS origin_id FROM delivery
    ) delivery_origin ON delivery.delivery_id = delivery_origin.delivery_id
    LEFT JOIN location ON delivery_origin.origin_id = location.location_id;
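As a quick sanity check of the fake-secondary-table shape, here is a minimal sketch that runs the same kind of query through Python's built-in sqlite3 module. The sample rows, the `name` column on `location`, and the inner alias `cur` are assumptions for illustration; the subquery also carries the `scheduled_date <` filter from the question's origin() function.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few made-up deliveries."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE location (location_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE delivery (
            delivery_id INTEGER PRIMARY KEY,
            parcel_id INTEGER,
            scheduled_date TEXT,
            destination_id INTEGER REFERENCES location (location_id)
        );
        INSERT INTO location VALUES (1, 'Warehouse'), (2, 'Hub'), (3, 'Customer');
        INSERT INTO delivery VALUES
            (1, 100, '2024-01-01', 1),
            (2, 100, '2024-01-02', 2),
            (3, 100, '2024-01-03', 3),
            (4, 200, '2024-01-01', 2);
    """)
    return conn


# delivery_origin plays the role of an association table:
# one row per delivery, mapping delivery_id -> origin_id.
SECONDARY_QUERY = """
    SELECT delivery.delivery_id, location.name
    FROM delivery
    LEFT JOIN (
        SELECT cur.delivery_id, (
            SELECT prior.destination_id FROM delivery AS prior
            WHERE prior.parcel_id = cur.parcel_id
              AND prior.scheduled_date < cur.scheduled_date
            ORDER BY prior.scheduled_date DESC
            LIMIT 1
        ) AS origin_id
        FROM delivery AS cur
    ) AS delivery_origin ON delivery.delivery_id = delivery_origin.delivery_id
    LEFT JOIN location ON delivery_origin.origin_id = location.location_id
    ORDER BY delivery.delivery_id
"""


def load_origins_via_secondary(conn):
    """Return (delivery_id, origin name) pairs using the derived table."""
    return conn.execute(SECONDARY_QUERY).fetchall()


if __name__ == "__main__":
    for row in load_origins_via_secondary(build_demo_db()):
        print(row)
```

The derived table has the two-column shape of a many-to-many association table, which is why it can stand in as the secondary of a relationship.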
In SQLAlchemy, this is:
    from sqlalchemy import alias, select
    from sqlalchemy.orm import foreign, relationship

    delivery = Delivery.__table__
    location = Location.__table__
    current = alias(delivery, "current")
    prior = alias(delivery, "prior")

    # Correlated scalar subquery: destination of the most recent earlier delivery.
    _origin_id = (
        select([prior.c.destination_id])
        .where(current.c.parcel_id == prior.c.parcel_id)
        .where(prior.c.scheduled_date < current.c.scheduled_date)
        .order_by(prior.c.scheduled_date.desc())
        .limit(1)
        .label("origin_id")
    )

    # Fake secondary table: one (delivery_id, origin_id) row per delivery.
    delivery_origin = select([
        current.c.delivery_id,
        _origin_id,
    ]).select_from(current)

    Delivery.origin = relationship(
        Location,
        primaryjoin=delivery.c.delivery_id == foreign(delivery_origin.c.delivery_id),
        secondaryjoin=foreign(delivery_origin.c.origin_id) == location.c.location_id,
        secondary=delivery_origin,
        viewonly=True,
        uselist=False,
    )
Unfortunately, there seems to be a bug (possibly related to this issue) that causes SQLAlchemy to emit an incorrect join, so we need to apply a small hack:
    from sqlalchemy import alias, select
    from sqlalchemy.orm import foreign, relationship
    from sqlalchemy.sql.expression import UnaryExpression
    from sqlalchemy.sql.operators import custom_op

    delivery = Delivery.__table__
    location = Location.__table__
    current = alias(delivery, "current")
    prior = alias(delivery, "prior")

    # HACK: wrap delivery_id in an empty unary operator
    _delivery_id = UnaryExpression(
        current.c.delivery_id, operator=custom_op("")
    ).label("delivery_id")
    # /HACK

    # Correlated scalar subquery: destination of the most recent earlier delivery.
    _origin_id = (
        select([prior.c.destination_id])
        .where(current.c.parcel_id == prior.c.parcel_id)
        .where(prior.c.scheduled_date < current.c.scheduled_date)
        .order_by(prior.c.scheduled_date.desc())
        .limit(1)
        .label("origin_id")
    )

    delivery_origin = select([
        _delivery_id,
        _origin_id,
    ]).select_from(current)

    Delivery.origin = relationship(
        Location,
        primaryjoin=delivery.c.delivery_id == foreign(delivery_origin.c.delivery_id),
        secondaryjoin=foreign(delivery_origin.c.origin_id) == location.c.location_id,
        secondary=delivery_origin,
        viewonly=True,
        uselist=False,
    )
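SELECT with a window function as secondary
Another implementation, which may have better performance characteristics, is to use a window function: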
    SELECT delivery.*, location.* FROM delivery
    LEFT JOIN (
        SELECT
            delivery.delivery_id,
            lag(delivery.destination_id) OVER (
                PARTITION BY delivery.parcel_id ORDER BY delivery.scheduled_date
            ) AS origin_id
        FROM delivery
    ) delivery_origin ON delivery.delivery_id = delivery_origin.delivery_id
    LEFT JOIN location ON delivery_origin.origin_id = location.location_id;
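The window-function variant can be checked the same way. This is a minimal sketch using Python's built-in sqlite3 module, assuming the bundled SQLite is version 3.25 or newer (the first release with window functions). The sample rows, the `name` column on `location`, and the alias `cur` are assumptions for illustration. The sketch lags `destination_id`, so that `origin_id` is a location id that can be joined to `location`.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few made-up deliveries."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE location (location_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE delivery (
            delivery_id INTEGER PRIMARY KEY,
            parcel_id INTEGER,
            scheduled_date TEXT,
            destination_id INTEGER REFERENCES location (location_id)
        );
        INSERT INTO location VALUES (1, 'Warehouse'), (2, 'Hub'), (3, 'Customer');
        INSERT INTO delivery VALUES
            (1, 100, '2024-01-01', 1),
            (2, 100, '2024-01-02', 2),
            (3, 100, '2024-01-03', 3),
            (4, 200, '2024-01-01', 2);
    """)
    return conn


# lag() picks the destination of the previous row within each parcel,
# ordered by scheduled_date; the first row of a partition gets NULL.
LAG_QUERY = """
    SELECT delivery.delivery_id, location.name
    FROM delivery
    LEFT JOIN (
        SELECT
            cur.delivery_id,
            lag(cur.destination_id) OVER (
                PARTITION BY cur.parcel_id ORDER BY cur.scheduled_date
            ) AS origin_id
        FROM delivery AS cur
    ) AS delivery_origin ON delivery.delivery_id = delivery_origin.delivery_id
    LEFT JOIN location ON delivery_origin.origin_id = location.location_id
    ORDER BY delivery.delivery_id
"""


def load_origins_with_lag(conn):
    """Return (delivery_id, origin name) pairs using lag()."""
    return conn.execute(LAG_QUERY).fetchall()


if __name__ == "__main__":
    for row in load_origins_with_lag(build_demo_db()):
        print(row)
```

One difference worth keeping in mind: lag() always takes the previous row in the ordering, whereas a strict `scheduled_date <` comparison skips rows with an identical date, so the two approaches can disagree when a parcel has several deliveries scheduled at exactly the same time.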
As before, we need to apply a similar hack to get SQLAlchemy to generate the correct SQL:
    from sqlalchemy import alias, func, select
    from sqlalchemy.orm import foreign, relationship
    from sqlalchemy.sql.expression import UnaryExpression
    from sqlalchemy.sql.operators import custom_op

    delivery = Delivery.__table__
    location = Location.__table__
    current = alias(delivery, "current")

    # HACK: wrap delivery_id in an empty unary operator
    _delivery_id = UnaryExpression(
        current.c.delivery_id, operator=custom_op("")
    ).label("delivery_id")
    # /HACK

    # Destination of the previous delivery within the same parcel.
    _origin_id = func.lag(current.c.destination_id).over(
        partition_by=current.c.parcel_id,
        order_by=current.c.scheduled_date,
    ).label("origin_id")

    delivery_origin = select([
        _delivery_id,
        _origin_id,
    ]).select_from(current)

    Delivery.origin = relationship(
        Location,
        primaryjoin=delivery.c.delivery_id == foreign(delivery_origin.c.delivery_id),
        secondaryjoin=foreign(delivery_origin.c.origin_id) == location.c.location_id,
        secondary=delivery_origin,
        viewonly=True,
        uselist=False,
    )