Optimizing update/insert mongo query

Asked Sep 22 '21 at 16:46

Active Sep 26 '21 at 10:50

Viewed 39 times

I have a collection that is going to receive approximately 30K documents. The collection starts empty and is populated with inserts. When I run my script a second time, the collection doesn't need to be updated entirely; only the products that don't exist yet need to be inserted. What is the best way to do this?

I'm using the following code, based on this answer:

    # Insert the product only if no document with this Sku exists yet.
    if db.Catalog.count_documents({'Sku': prodCatalog["Sku"]}, limit=1) == 0:
        db.Catalog.insert_one(prodCatalog)
    # db.Catalog.create_index([("Sku", pymongo.DESCENDING)])

The commented-out line is my attempt to speed up the insert process with an index. Testing the code above, I got the following results:

Without INDEXING = 15m7s
TEXTUAL INDEXING "Sku" = 14m40s
ASCENDING INDEXING "Sku" = 13m45s
DESCENDING INDEXING "Sku" = 14m

Sku is a string identifier (digits only) for a product. It is a unique value for each product.
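Since Sku is unique per product, one option is a unique ascending index created once, before any products are processed. The following is only a minimal sketch: the helper name `ensure_sku_index` is hypothetical, and `collection` is assumed to be a PyMongo `Collection` such as `db.Catalog`.

```python
def ensure_sku_index(collection):
    """Create a unique ascending index on "Sku" and return its name.

    `collection` is assumed to be a PyMongo Collection (e.g. db.Catalog).
    Intended to be called once, before the products are processed.
    """
    # 1 is the value of pymongo.ASCENDING; unique=True makes MongoDB
    # reject a second document with the same Sku.
    return collection.create_index([("Sku", 1)], unique=True)
```

Note that building a unique index fails if the collection already contains duplicate Sku values.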

I've tried using an upsert, but the results were not satisfactory.

Is there a way to reduce the insert/update execution time to 10 min or less?
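One approach that avoids a separate query per product is to read the existing Sku values once, filter the new products in memory, and insert the missing ones in batches. The sketch below is an illustration under assumptions, not a measured solution: the function name `insert_missing` and the `batch_size` parameter are hypothetical, `collection` is assumed to behave like a PyMongo `Collection` (`find` and `insert_many`), and the set of roughly 30K Sku strings is assumed to fit in memory.

```python
def insert_missing(collection, products, batch_size=1000):
    """Insert only the products whose Sku is not in the collection yet.

    `collection` is assumed to behave like a PyMongo Collection;
    `products` is an iterable of dicts that each have a "Sku" key.
    Returns the number of documents inserted.
    """
    # One query for all existing Sku values instead of one count per product.
    cursor = collection.find({"Sku": {"$exists": True}}, {"Sku": 1, "_id": 0})
    existing = {doc["Sku"] for doc in cursor}

    inserted = 0
    batch = []
    for product in products:
        sku = product["Sku"]
        if sku in existing:
            continue
        existing.add(sku)  # also skips duplicates within the same run
        batch.append(product)
        if len(batch) >= batch_size:
            # ordered=False lets the server continue past a failed document.
            collection.insert_many(batch, ordered=False)
            inserted += len(batch)
            batch = []
    if batch:
        collection.insert_many(batch, ordered=False)
        inserted += len(batch)
    return inserted
```

A call would look like `insert_missing(db.Catalog, all_products)`, where `all_products` is a hypothetical list holding the product dicts built by the script. Whether this brings the run under 10 minutes depends on the data and the deployment and would need to be measured.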

edited Sep 26 '21 at 10:50

marc_s

asked Sep 22 '21 at 16:46

OdiumPura

0 Answers