
I am trying to run several Spark SQL jobs concurrently with asyncio; here is my code:

async def count_number(table_name):
    print(table_name+' is running')
    sql_count = 'select count(1) as volume from ' + table_name
    df_count=await spark.sql(sql_count)
    print(table_name+' is finished')
    return df_count

async def main():
    task1=asyncio.create_task(count_number('table1'))
    task2=asyncio.create_task(count_number('table2'))
    value1=await task1
    value2=await task2
    print(value1,value2)
asyncio.run(main())

and I got this error: TypeError: object DataFrame can't be used in 'await' expression

How should I structure the async jobs so they wait for Spark to return the results?
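The error occurs because `spark.sql(...)` is a blocking call that returns a plain `DataFrame`, not a coroutine, so it cannot be used with `await`. One way to keep the asyncio structure is to run the blocking call in a worker thread via `asyncio.to_thread` (Python 3.9+). Below is a minimal sketch of that pattern; since it can't assume a live SparkSession, `run_query` is a hypothetical stand-in for the blocking Spark call (with real Spark it would be something like `spark.sql(sql_count).collect()`):

```python
import asyncio

def run_query(table_name):
    # Hypothetical stand-in for a blocking Spark call such as
    # spark.sql('select count(1) as volume from ' + table_name).collect().
    # Here it just returns the SQL string so the example is runnable anywhere.
    return 'select count(1) as volume from ' + table_name

async def count_number(table_name):
    print(table_name + ' is running')
    # asyncio.to_thread runs the blocking function in a worker thread
    # and gives back an awaitable, so other tasks can run meanwhile.
    result = await asyncio.to_thread(run_query, table_name)
    print(table_name + ' is finished')
    return result

async def main():
    task1 = asyncio.create_task(count_number('table1'))
    task2 = asyncio.create_task(count_number('table2'))
    value1 = await task1
    value2 = await task2
    return value1, value2

results = asyncio.run(main())
print(results)
```

Note that threads only give real concurrency here because the heavy work happens inside the Spark JVM while the Python thread waits; the queries are then scheduled in parallel on the cluster subject to Spark's own scheduler settings.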

kiritowow
  • Have a look here. https://stackoverflow.com/questions/38048068/how-to-run-independent-transformations-in-parallel-using-pyspark – thebluephantom Oct 06 '21 at 08:21

0 Answers