
I have a SQL table with up to 344 columns of financial data on stocks. I have written code that generates pandas DataFrames of financial data in order to update the SQL table in batches/groups. When I run my Python code to get the data for each group of stocks, I get a different DataFrame with a potentially different number of columns each time, since not all stocks have data for all 344 columns. To transfer the collected data from the DataFrames to SQL (I use sqlite3), I want to iterate through each row of each DataFrame and insert the 344 values, or whatever values/columns were actually collected, yet I get an error because some columns are sometimes missing and I might have only 340 columns instead of the whole 344.

How can I bypass this issue? In other words, if a value/column is not available in the DataFrame, how can I ask SQL to skip it but still write all the other available columns to the corresponding row in the table? Thank you very much for your help, everyone.
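One way to handle this (a minimal sketch, assuming the DataFrame column names match the SQL column names exactly; the table and data below are made up for illustration) is to build the INSERT statement dynamically from whatever columns the DataFrame actually contains. Columns absent from the batch are simply left out of the statement, so SQLite stores NULL for them:

```python
import sqlite3
import pandas as pd

# Hypothetical batch: only a subset of the table's columns was collected.
df = pd.DataFrame({
    "symbol": ["ABC", "XYZ"],
    "AccountsPayable": [100.0, 200.0],
    # AccountsReceivable is missing from this batch
})

conn = sqlite3.connect(":memory:")  # your database file path in practice
conn.execute("""CREATE TABLE All_financial_data_a(
    symbol TEXT, AccountsPayable REAL, AccountsReceivable REAL)""")

# Build the column list and placeholders from the columns actually present.
cols = list(df.columns)
sql = "INSERT INTO All_financial_data_a({}) VALUES({})".format(
    ",".join(cols), ",".join("?" for _ in cols))

# executemany inserts every row; the absent column defaults to NULL.
conn.executemany(sql, df.itertuples(index=False, name=None))
conn.commit()
```

Note that only the column *names* are interpolated into the SQL string (the values still go through `?` placeholders), so this is safe as long as the DataFrame column names come from your own code rather than untrusted input.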

An example of my code is listed below:

```python
import sqlite3
import pandas as pd

for group in list_of_stock_lists:
    df_all_financial_data_a_list = []
    for stock in group:
        # Get a DataFrame for each stock and append it to a list:
        df = blablabla
        df_all_financial_data_a_list.append(df)

    # Concatenate the list into one DataFrame:
    df_all_financial_data_a = pd.concat(
        df_all_financial_data_a_list, axis=0, ignore_index=True)

    # Insert each new row one at a time
    try:
        conn = sqlite3.connect("/Users/ralph/Biotech/BiotechDatabase.db")
        for index, row in df_all_financial_data_a.iterrows():
            try:
                cur = conn.cursor()
                values = (
                    row.loc['symbol'],
                    row.loc['asOfDate'].isoformat(),
                    row.loc['periodType'],
                    row.loc['currencyCode'],
                    row.loc['AccountsPayable'],
                    row.loc['AccountsReceivable'],
                    row.loc['AccruedInterestReceivable'],
                    # ... (344 potential values, but sometimes fewer are available)
                )
                # Note: INSERT OR IGNORE is the SQLite equivalent of MySQL's
                # no-op ON DUPLICATE KEY UPDATE, which SQLite does not support.
                cur.execute("""
                    INSERT OR IGNORE INTO All_financial_data_a(
                        symbol,
                        asOfDate,
                        periodType,
                        currencyCode,
                        AccountsPayable,
                        AccountsReceivable,
                        AccruedInterestReceivable
                        -- ... (344 columns in the SQL table)
                    )
                    VALUES(?,?,?,?,?,?,?)  -- ... (344 placeholders potentially)
                    """,
                    values)
                conn.commit()
                cur.close()
            except Exception as error:
                print("Failed to add row", error)
    except Exception as error:
        print("Failed to connect", error)
```
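An alternative that keeps a fixed 344-placeholder INSERT intact is to reindex each concatenated DataFrame against the full column list before the insert loop, so every expected column exists and any missing ones are filled with NULL. This is a sketch with a hypothetical short column list; `all_columns` stands in for the real list of 344 names in table order:

```python
import pandas as pd

# Hypothetical full column list (in reality the 344 names, in SQL table order).
all_columns = ["symbol", "asOfDate", "periodType",
               "AccountsPayable", "AccountsReceivable"]

# A batch that came back with only two of the expected columns.
df = pd.DataFrame({"symbol": ["ABC"], "AccountsPayable": [100.0]})

# reindex adds every missing column (filled with NaN) and fixes the order,
# so a fixed-length INSERT always receives a complete tuple per row.
df_full = df.reindex(columns=all_columns)

# Convert NaN to None so sqlite3 stores NULL rather than a float NaN.
df_full = df_full.astype(object).where(df_full.notna(), None)
```

Each `df_full` row now has exactly `len(all_columns)` values, ready to pass to the existing parameterized INSERT.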


P.S. Increasing the number of stocks in each group to try to capture all 344 columns does not work.

Thank you much



Ralph
  • You could query the columns of the table (see https://stackoverflow.com/questions/947215/how-to-get-a-list-of-column-names-on-sqlite3-database) and build the values and the statement by looping over the columns. – Turo May 02 '22 at 06:44
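The comment's suggestion can be sketched like this (hypothetical three-column table for illustration): `PRAGMA table_info` returns one row per column with the name in the second field, and intersecting those names with the DataFrame's columns yields a statement that only touches columns present on both sides:

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE All_financial_data_a(
    symbol TEXT, AccountsPayable REAL, AccountsReceivable REAL)""")

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
table_cols = [r[1] for r in conn.execute("PRAGMA table_info(All_financial_data_a)")]

# A batch with one column the table doesn't have and one table column missing.
df = pd.DataFrame({"symbol": ["ABC"], "AccountsPayable": [50.0], "ExtraCol": [1.0]})

# Keep only the DataFrame columns that actually exist in the table.
usable = [c for c in df.columns if c in table_cols]
sql = "INSERT INTO All_financial_data_a({}) VALUES({})".format(
    ",".join(usable), ",".join("?" for _ in usable))
conn.executemany(sql, df[usable].itertuples(index=False, name=None))
conn.commit()
```

This also silently drops DataFrame columns that the table does not define, which may or may not be what you want; logging the dropped names is an easy addition.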

0 Answers