I'm trying to execute a SQL query partitioned by date with the Spark JDBC reader. I've seen many examples with single tables, but how can I do this with queries that have subselects with filters?
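For reference, this is the kind of single-table pattern I have seen, where the partition column is a column of the table being read directly (the connection string, the credentials, the table name table2 and the column name date_col below are just placeholders):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

Dataset<Row> single = sparkSession.read()
    .format("jdbc")
    .option("url", "jdbc:oracle:thin:@//dbhost:1521/service")  // placeholder connection string
    .option("user", "usr")                                     // placeholder credentials
    .option("password", "pwd")
    .option("dbtable", "table2")                                // a plain table, no subselect
    .option("partitionColumn", "date_col")                      // date column on table2 itself
    .option("lowerBound", "2021-01-01")
    .option("upperBound", "2021-02-04")
    .option("numPartitions", 4)
    .option("oracle.jdbc.mapDateToTimestamp", "false")
    .option("sessionInitStatement", "ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'")
    .load();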
Query example:
select
    t1.col1, t1.col2
from table1 t1
inner join (
    select col1, col2
    from table2
    where {partitionColumn} > ? and {partitionColumn} < ?
) table2ToBeFiltered
    on t1.col1 = table2ToBeFiltered.col1 -- placeholder join condition
Java code example:
this.sparkSession.read()
    .format("jdbc")
    // what could I put here?
    .option("partitionColumn", "name of col in subselect")
    .option("lowerBound", "2021-01-01")
    .option("upperBound", "2021-02-04")
    .option("numPartitions", 4)
    .option("oracle.jdbc.mapDateToTimestamp", "false")
    .option("sessionInitStatement", "ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'")
    // what could I put here?
    .option("dbtable", "how to use select with subselect and joins and partition by specific col")
    .load();
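My best guess so far is to pass the whole statement to dbtable as a parenthesized inline view with an alias, expose the date column of table2 in its select list, and point partitionColumn at that column, something like the sketch below (table1, table2, date_col and the join condition are placeholders). But I don't know whether Spark would then push the date range down into the subselect on table2 or only apply it to the joined result, so I'm not sure this is the right approach:

Dataset<Row> df = this.sparkSession.read()
    .format("jdbc")
    .option("url", "jdbc:oracle:thin:@//dbhost:1521/service")  // placeholder connection string
    .option("dbtable",
        "(select t2.date_col, t1.col1, t1.col2 "
        + "from table1 t1 "
        + "inner join table2 t2 on t1.col1 = t2.col1) q")       // whole query as an inline view
    .option("partitionColumn", "date_col")                      // column exposed by the inline view
    .option("lowerBound", "2021-01-01")
    .option("upperBound", "2021-02-04")
    .option("numPartitions", 4)
    .option("oracle.jdbc.mapDateToTimestamp", "false")
    .option("sessionInitStatement", "ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'")
    .load();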
Is there any way to do this?