import pandas as pd
import xlrd
import openpyxl
from io import StringIO
import boto3

def lambda_handler(event, context):
    df = pd.read_excel('s3://my-bucket/XL/test-xls.xlsx', engine='openpyxl')
    bucket = 'my-bucket'
    csv_buffer = StringIO()
    df.to_csv(csv_buffer)
    s3_resource = boto3.resource('s3')

    # write the data back as a CSV
    s3_resource.Object(bucket, 'XL/test-csv.csv').put(Body=csv_buffer.getvalue())
  1. The code above works for a single Excel sheet, but I am looking for a solution that can read an XLSX file with multiple tabs.
  2. If the XLSX file has 3 tabs, each tab should be converted to its own CSV and saved to the bucket as tabname.csv.
Jeremy
Snownew
    For reading multiple sheets from the same workbook (2), see if [this post](https://stackoverflow.com/questions/26521266/using-pandas-to-pd-read-excel-for-multiple-worksheets-of-the-same-workbook) helps – Jeremy Apr 23 '22 at 17:48
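
For requirement 2, here is a minimal sketch along the lines of that comment. It relies on `pd.read_excel(..., sheet_name=None)`, which returns a dict mapping each sheet name to a DataFrame. The helper name `sheets_to_csv` is hypothetical, and the bucket and keys are the ones from the question. Two deliberate differences from the original code, both assumptions: the workbook is downloaded with boto3 rather than read from an `s3://` URL, and `index=False` drops the row-index column from the CSV output.

```python
from io import BytesIO, StringIO


def sheets_to_csv(sheets):
    """Return {sheet name: CSV text} for a dict of {sheet name: DataFrame}."""
    csv_by_sheet = {}
    for name, df in sheets.items():
        buffer = StringIO()
        df.to_csv(buffer, index=False)  # assumption: row index not wanted in the CSV
        csv_by_sheet[name] = buffer.getvalue()
    return csv_by_sheet


def lambda_handler(event, context):
    # Third-party imports are kept local so sheets_to_csv can be used on its own.
    import boto3
    import pandas as pd

    bucket = "my-bucket"             # bucket and keys as in the question
    source_key = "XL/test-xls.xlsx"
    prefix = "XL/"

    s3_resource = boto3.resource("s3")
    workbook_bytes = s3_resource.Object(bucket, source_key).get()["Body"].read()

    # sheet_name=None loads every sheet as {sheet name: DataFrame}.
    sheets = pd.read_excel(BytesIO(workbook_bytes), sheet_name=None, engine="openpyxl")

    # Write one CSV object per sheet, named after the tab.
    for name, csv_text in sheets_to_csv(sheets).items():
        s3_resource.Object(bucket, f"{prefix}{name}.csv").put(Body=csv_text)
```

With a workbook that has three tabs, this should write three objects under `XL/`, one per tab name. It has not been tested against a real bucket, so treat it as a starting point.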

1 Answer


You can try xlsx2csv instead of pandas. Its -n option, which selects a sheet by name, might work.

xlsx2csv also has more options for handling sheet tabs, so you can choose whichever suits you.

sauraj