I am learning Scrapy along with SQLAlchemy, and I have run into one problem: I want to replace the current stock data every time the website is scraped. I come from a Django background, where the `update_or_create` method meant the data was replaced on every new update.
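In Django I would have written something like this (`NepseData` and `symbol` are just illustrative names here, not my actual model):

```python
# Django: update the matching row if it exists, otherwise create it
NepseData.objects.update_or_create(
    symbol=item["symbol"],              # illustrative lookup field
    defaults={"index": item["index"]},  # fields to refresh on every scrape
)
```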
I have set up `pipelines.py` in Scrapy as follows:
```python
from sqlalchemy.orm import sessionmaker

# adjust this import path to match your project layout
from .models import NepseDB, create_table, db_connect


class ScrapySpiderPipeline:
    def __init__(self):
        """
        Initializes database connection and sessionmaker.
        Creates the nepse_data table.
        """
        engine = db_connect()
        create_table(engine)
        self.Session = sessionmaker(bind=engine)

    def process_item(self, item, spider):
        """Save scraped data in the database.

        This method is called for every item pipeline component.
        """
        session = self.Session()
        nepse = NepseDB(**item)
        try:
            session.add(nepse)
            session.commit()
        except Exception:
            session.rollback()
            raise
        finally:
            session.close()
        return item
```
and my `models.py` is as follows:
```python
from scrapy.utils.project import get_project_settings
from sqlalchemy import Column, Integer, Text, create_engine
from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+

DeclarativeBase = declarative_base()


def db_connect():
    """
    Performs database connection using database settings from settings.py.
    Returns a SQLAlchemy engine instance.
    """
    return create_engine(get_project_settings().get("CONNECTION_STRING"))


def create_table(engine):
    DeclarativeBase.metadata.create_all(engine)


class NepseDB(DeclarativeBase):
    __tablename__ = "nepse_data"

    id = Column(Integer, primary_key=True)
    index = Column('index', Text())
```
How do I create or update my index data on the next scrape? Any hint would be appreciated.
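My best guess so far is to query for an existing row and update it, falling back to an insert. A rough sketch of what I mean, assuming `index` uniquely identifies a row and with a hypothetical `price` column standing in for the fields I actually want refreshed:

```python
def process_item(self, item, spider):
    """Update the row matching this item's index, or insert a new one."""
    session = self.Session()
    try:
        existing = (
            session.query(NepseDB)
            .filter_by(index=item["index"])  # assumes 'index' identifies a row
            .first()
        )
        if existing is not None:
            existing.price = item["price"]   # 'price' is a hypothetical column
        else:
            session.add(NepseDB(**item))
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()
    return item
```

Or is `session.merge()` the intended tool here? As far as I understand, it matches on the primary key, which my scraped items don't carry.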