What is the use of monotonically_increasing_id in PySpark?

Asked Oct 19 '18 at 02:24

Active Oct 19 '18 at 10:36


I am trying to understand the use of monotonically_increasing_id in Spark SQL.

Can anyone explain, with an example, why we need monotonically increasing IDs for DataFrames?

edited Oct 19 '18 at 02:32

OneCricketeer

asked Oct 19 '18 at 02:24

Nikhil Mishra

It's semantically equivalent to an AUTOINCREMENT key in an RDBMS table – OneCricketeer Oct 19 '18 at 02:32
@cricket_007 It's not quite the same because it does not generate a consecutive sequence of numbers. – Terry Dactyl Oct 19 '18 at 05:46
@TerryDactyl If you delete rows after an RDBMS has assigned their auto-incremented IDs, the remaining IDs are not always consecutive either, but the values are still always increasing. I get your point though. – OneCricketeer Oct 19 '18 at 14:00
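
To make the point in the comments above concrete, here is a minimal pure-Python sketch of why the generated IDs are unique and increasing but not consecutive. It models the layout described in the Spark documentation for the current implementation (partition index in the upper 31 bits, record number within the partition in the lower 33 bits). `simulate_ids` is a hypothetical helper written for illustration, not part of PySpark, and the partition sizes are made up.

```python
# Pure-Python model of the ID layout documented for Spark's
# monotonically_increasing_id(): partition index in the upper 31 bits,
# record number within the partition in the lower 33 bits.
# simulate_ids is a hypothetical helper, not a PySpark function.

def simulate_ids(partition_sizes):
    """Return the IDs rows would receive, partition by partition."""
    ids = []
    for partition_index, size in enumerate(partition_sizes):
        base = partition_index << 33  # upper bits hold the partition index
        ids.extend(base + row for row in range(size))
    return ids


ids = simulate_ids([3, 2])  # assumed example: 3 rows in partition 0, 2 in partition 1
print(ids)  # [0, 1, 2, 8589934592, 8589934593]
```

In PySpark itself the column would be added with something like `df.withColumn("id", monotonically_increasing_id())`, importing the function from `pyspark.sql.functions`. Because each partition can number its own rows independently, this gives every row a unique key without coordinating across partitions, which is the usual reason to prefer it over a consecutive row number when gaps in the sequence do not matter.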

0 Answers