Saving data to a database with SQLAlchemy: object is not subscriptable

Too*_*iry 4 python sqlalchemy python-3.x

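I'm trying to insert some data into a database, but the insert fails and nothing is saved; I suspect my data structure is wrong. Just before the save attempt, print(title, link, date) in process_item prints the data correctly (one title, link, and date per object), but the save itself fails. title, link, and date each hold a single string.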

Thanks for your help.


Error:

"Traceback (most recent call last):
  File "spider.py", line 63, in <module>
    presstv = spider_html(presstv_url, presstv_extract_item, presstv_xpath, presstv_pipeline)
  File "spider.py", line 58, in spider_html
    pipeline.process_item(extract_function(element), None)
  File "/Users/dav/Projects/python/news/pipeline.py", line 76, in process_item
    if session.query(Presstv).filter_by(link=item['link']) == None:
TypeError: 'Presstv' object is not subscriptable"



Code:

from sqlalchemy.orm import sessionmaker
from sqlalchemy import create_engine
from models import Nordfront, Presstv, db_connect, create_presstv_table
import json



class PresstvPipeline(object):
    """Pipeline for storing scraped items in the database"""
    def __init__(self):
        """
        Initializes database connection and sessionmaker.
        Creates deals table.
        """
        engine = db_connect()
        create_presstv_table(engine)
        self.Session = sessionmaker(bind=engine)


    def process_item(self, items, spider):

        session = self.Session()


        for title, link, date in zip(items['title'], items['link'], items['date']):

            print(title, link, date)
            item = Presstv(title = title, link = link, date = date)


            if session.query(Presstv).filter_by(link=item['link']) == None:
                try:
                    session.add(item)
                    session.commit()
                    logger.info('Item saved')
                except:
                    session.rollback()
                    raise
                finally:
                    session.close()

                return item



Model:

class Presstv(DeclarativeBase):
    """Sqlalchemy deals model"""
    __tablename__ = "presstv"

    id = Column(Integer, primary_key=True)
    title = Column('title', String)
    description = Column('description', String, nullable=True)
    link = Column('link', String, unique=True)
    date = Column('date', String, nullable=True)
    created_at = Column('created_at', DateTime, default=_get_date)

Sim*_*ser 8

You should use:

if session.query(Presstv).filter_by(link=item.link) == None:

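because item is now a SQLAlchemy model object. The mix-up probably comes from using items['link'] a few lines earlier; item, however, is an instance of the Presstv class, so you should read the attribute with .link.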
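
One further point that may be worth checking: session.query(...).filter_by(...) returns a Query object, which is never equal to None, so even with .link the condition would likely stay false and nothing would be added. A minimal sketch of the duplicate check using .first() follows. It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database; the helper name save_if_new and the trimmed-down model are illustrative stand-ins, not the asker's actual code.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Presstv(Base):
    """Trimmed-down stand-in for the model in the question."""
    __tablename__ = "presstv"

    id = Column(Integer, primary_key=True)
    title = Column(String)
    link = Column(String, unique=True)
    date = Column(String, nullable=True)


def save_if_new(session, title, link, date):
    """Hypothetical helper: insert a row unless the link already exists.

    Returns True if a row was inserted, False otherwise.
    """
    # .first() runs the query and returns a Presstv instance or None.
    # The bare Query object is never None, so `query == None` is always False.
    existing = session.query(Presstv).filter_by(link=link).first()
    if existing is not None:
        return False

    # Model fields are attributes (item.link), not keys (item['link']).
    item = Presstv(title=title, link=link, date=date)
    session.add(item)
    try:
        session.commit()
    except Exception:
        session.rollback()
        raise
    return True


# Assumption: in-memory SQLite, used only to keep the demo self-contained.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

session = Session()
first = save_if_new(session, "Headline", "http://example.com/a", "2019-01-01")
second = save_if_new(session, "Headline", "http://example.com/a", "2019-01-01")
count = session.query(Presstv).count()
session.close()

print(first, second, count)  # prints: True False 1
```

The second call finds the existing row and skips the insert, so the unique constraint on link is never violated. The session is closed once after all items are processed rather than inside the loop, so later items can still be saved.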