0

I'd like to use the ON DUPLICATE KEY UPDATE optionality provided by SQLAlchemy to upsert a bunch of records.

These records have been sucessfully inserted with python using the following (where connection is engine.connect() object and table is a Table object)

record_list = [{'col1': 'name1', 'col2': '2015-01-31', 'col3': 27.2},
               {'col1': 'name1', 'col2': '2016-01-31', 'col3': 25.2}]
query = insert(table)
results = connection.execute(query, record_list)

Looking at the docs at https://docs.sqlalchemy.org/en/13/dialects/mysql.html#insert-on-duplicate-key-update-upsert as well as a number of SO questions (including the suggestion it's possible under the comments on SQLAlchemy ON DUPLICATE KEY UPDATE ) I've tried a number of different examples, but there were none that I could see that address multiple records with the upsert statement using this method.

I'm trying along the lines of

query = insert(table).values(record_list)
upsert_query = query.on_duplicate_key_update()
results = connection.execute(upsert_query)

but either get the issue that the .on_duplicate_key_update() requires cant be empty or that the SQL syntax is wrong.

If anyone has sucessfully managed and could help me with the code structure here I'd really appreciate it.

kowpow
  • 75
  • 8

2 Answers2

5

I just ran into a similar problem and creating a dictionary out of query.inserted solved it for me.

query = insert(table).values(record_list)
update_dict = {x.name: x for x in insert_stmt.inserted}
upsert_query = query.on_duplicate_key_update(update_dict)
0

Your on_duplicate_key_update function requires arguments that define the data to be inserted in the update. Please have a look at the example in the documentation that you have already found.

insert().on_duplicate_key_update({"key": "value"})
julian
  • 455
  • 2
  • 7
  • Thanks for your reply. I tried passing in the record_list again (which is a list of dictionaries) but this didnt work - I think as a dictionary is expected. Any idea what would be required here where multiple records are being updated? Thx – kowpow Dec 11 '19 at 18:12
  • just to clarify the above, I tried insert.on_duplicate_key_update(record_list) – kowpow Dec 11 '19 at 18:23
  • The documentation gives an example of multiple updates, using a dict as well as a list: `on_duplicate_key_stmt = insert_stmt.on_duplicate_key_update( {"data": "some data", "updated_at": func.current_timestamp()}, )` or `on_duplicate_key_stmt = insert_stmt.on_duplicate_key_update( [ ("data", "some data"), ("updated_at", func.current_timestamp()), ], )` – julian Dec 11 '19 at 18:24
  • record_list is a list with a single dict as it's only entry, not a dict. you can inspect `type(record_list)` – julian Dec 11 '19 at 18:26
  • Perhaps I'm misunderstanding, but I see those examples in the docs as updating a single record where multiple columns are identified by the dict key and the record/row is the dict value. In terms of my variable, record_list, it's a list of 2 dictionaries, each of which represents a different record - I'm actually trying to upsert a lot more, but for clarity just limited to 2 in my example. – kowpow Dec 11 '19 at 18:58