I'm trying to execute a SQL query partitioned by date with the Spark JDBC reader. I've seen many examples with single tables, but how can I do this with queries that have subselects with filters?
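For reference, this is the kind of single-table pattern I have seen, where the partition column is a column of the table being read directly (the connection string, the credentials, the table name table2 and the column name date_col below are just placeholders):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

Dataset<Row> single = sparkSession.read()
    .format("jdbc")
    .option("url", "jdbc:oracle:thin:@//dbhost:1521/service")  // placeholder connection string
    .option("user", "usr")                                     // placeholder credentials
    .option("password", "pwd")
    .option("dbtable", "table2")                                // a plain table, no subselect
    .option("partitionColumn", "date_col")                      // date column on table2 itself
    .option("lowerBound", "2021-01-01")
    .option("upperBound", "2021-02-04")
    .option("numPartitions", 4)
    .option("oracle.jdbc.mapDateToTimestamp", "false")
    .option("sessionInitStatement", "ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'")
    .load();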
Query example:
select
    t1.col1, t1.col2
from table1 t1
inner join (
    select col1, col2
    from table2
    where {partitionColumn} > ? and {partitionColumn} < ?
) table2ToBeFiltered
    on t1.col1 = table2ToBeFiltered.col1 -- placeholder join condition
Java code example:
this.sparkSession.read()
    .format("jdbc")
    // what could I put here?
    .option("partitionColumn", "name of col in subselect")
    .option("lowerBound", "2021-01-01")
    .option("upperBound", "2021-02-04")
    .option("numPartitions", 4)
    .option("oracle.jdbc.mapDateToTimestamp", "false")
    .option("sessionInitStatement", "ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'")
    // what could I put here?
    .option("dbtable", "how to use select with subselect and joins and partition by specific col")
    .load();
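My best guess so far is to pass the whole statement to dbtable as a parenthesized inline view with an alias, expose the date column of table2 in its select list, and point partitionColumn at that column, something like the sketch below (table1, table2, date_col and the join condition are placeholders). But I don't know whether Spark would then push the date range down into the subselect on table2 or only apply it to the joined result, so I'm not sure this is the right approach:

Dataset<Row> df = this.sparkSession.read()
    .format("jdbc")
    .option("url", "jdbc:oracle:thin:@//dbhost:1521/service")  // placeholder connection string
    .option("dbtable",
        "(select t2.date_col, t1.col1, t1.col2 "
        + "from table1 t1 "
        + "inner join table2 t2 on t1.col1 = t2.col1) q")       // whole query as an inline view
    .option("partitionColumn", "date_col")                      // column exposed by the inline view
    .option("lowerBound", "2021-01-01")
    .option("upperBound", "2021-02-04")
    .option("numPartitions", 4)
    .option("oracle.jdbc.mapDateToTimestamp", "false")
    .option("sessionInitStatement", "ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'")
    .load();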
Is there any way to do this?