I am learning Scrapy along with SQLAlchemy, and I have run into one problem: I want to replace the current stock data every time the website is scraped. I come from a Django background, where update_or_create meant the data was replaced on every new update (a quick sketch of that pattern is included below, after the pipeline code). I have set up pipelines.py in Scrapy as follows:

from sqlalchemy.orm import sessionmaker

from .models import NepseDB, create_table, db_connect


class ScrapySpiderPipeline:
    def __init__(self):
        """
        Initializes the database connection and sessionmaker.
        Creates the nepse_data table if it does not exist.
        """
        engine = db_connect()
        create_table(engine)
        self.Session = sessionmaker(bind=engine)

    def process_item(self, item, spider):
        """Save the scraped index data in the database.

        This method is called for every item pipeline component.
        """
        session = self.Session()
        nepse = NepseDB(**item)

        try:
            session.add(nepse)
            session.commit()
        except Exception:
            session.rollback()
            raise
        finally:
            session.close()

        return item
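
To show what I mean by the Django behaviour, this is roughly how I would have done it there (the StockIndex model, its fields, and the app path are hypothetical names, only to illustrate the pattern I am after):

from myapp.models import StockIndex  # hypothetical app and model

# Django ORM: updates the row matching the lookup kwargs, or creates it if missing.
obj, created = StockIndex.objects.update_or_create(
    symbol="NEPSE",
    defaults={"index": "2,085.43"},
)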

My models.py is as follows:

from scrapy.utils.project import get_project_settings
from sqlalchemy import Column, Integer, Text, create_engine
from sqlalchemy.ext.declarative import declarative_base

DeclarativeBase = declarative_base()


def db_connect():
    """
    Performs the database connection using the database settings from settings.py.
    Returns an SQLAlchemy engine instance.
    """
    return create_engine(get_project_settings().get("CONNECTION_STRING"))


def create_table(engine):
    """Creates all tables defined on DeclarativeBase (here: nepse_data)."""
    DeclarativeBase.metadata.create_all(engine)


class NepseDB(DeclarativeBase):
    __tablename__ = "nepse_data"

    id = Column(Integer, primary_key=True)
    index = Column('index', Text())
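
For completeness, the pipeline and connection string are wired up in settings.py roughly like this (the module path and the SQLite URL are placeholders for my actual project values):

# settings.py -- module path and connection string are placeholders
ITEM_PIPELINES = {
    "scrapy_spider.pipelines.ScrapySpiderPipeline": 300,
}
CONNECTION_STRING = "sqlite:///nepse.db"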

How do I create or update my index data on the next scrape? Conceptually I want process_item to behave like the sketch below, but I am not sure of the right way to do this with SQLAlchemy. Any hint would be appreciated.
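
Here is the rough sketch of that behaviour (not tested; it assumes there is only a single index row to keep up to date, since all I really want is to overwrite the stored value on each scrape):

# Not tested -- just the create-or-update behaviour I am trying to get:
# reuse the existing row if there is one, otherwise insert a new one.
existing = session.query(NepseDB).first()
if existing:
    existing.index = item["index"]      # row already exists -> overwrite it
else:
    session.add(NepseDB(**item))        # first run -> insert a new row
session.commit()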
