22

I am using a simple model with an attribute that stores all the data for that object in a JSONField. Think of it as way to transfer NoSQL data to my PostgreSQL database. Kinda like this:

from django.contrib.postgres.fields import JSONField   

class Document(models.Model):
    content = JSONField()

Each Document object has (more or less) the same keys in its content field, so I am querying and ordering those documents using those keys. For the querying and ordering, I am using Django's annotate() function. I recently came across this:

https://docs.djangoproject.com/en/2.1/ref/contrib/postgres/indexes/

I also know that PostgreSQL using JSONB, which is apparently indexible. So my question is this: Can I index my content field somehow to make my read operations faster for complex queries? And if so, then how do I do it? The documentation page I linked has no examples.

darkhorse
  • 7,008
  • 15
  • 53
  • 120
  • 1
    I think there are similar questions related to this - https://stackoverflow.com/a/49358119/4116955 and https://stackoverflow.com/questions/17807030/how-to-create-index-on-json-field-in-postgres-9-3 Please migrate your json data to a postgresql table using its JSONB type and then try indexing the whole column or a specific field in the json payload. – Mahesh H Viraktamath Feb 22 '19 at 10:20
  • 1
    Thats interesting. However, I was looking for a way to do this from within Django, as that would make my codebase much more easier to manage. – darkhorse Feb 22 '19 at 10:26
  • 7
    Something like this should work - `class Doc(models.Model): data = JSONField() class Meta: indexes = [ GinIndex( fields=['data'], name='data_gin', ), ]` Sorry for the bad formatting. – Mahesh H Viraktamath Feb 22 '19 at 10:38
  • Wouldn't this index the entire column? Is that a good idea considering there might be all sorts of data types within that JSONField? – darkhorse Feb 22 '19 at 10:57
  • I think this index will work when you query for rows in a table where the key in a JSONB field maps to a particular value – Mahesh H Viraktamath Feb 22 '19 at 11:03
  • And also for the best performance, you have to rely on postgresql indices, because it is more a powerful way than index definitions in your python code. For DB performance, you must tune the DB itself rather than your code. – Mahesh H Viraktamath Feb 22 '19 at 11:05

2 Answers2

15

For those that want to index a particular key, create a raw sql migration:

  1. Run ./manage.py makemigrations --empty yourApp where yourApp is the app of the model you want to change indexes for.

  2. Edit the migration i.e.

operations = [
    migrations.RunSQL("CREATE INDEX idx_name ON your_table((json_field->'json_key'));")
]

Where idx_name is the name of the index, your_table is your table, json_field is your JSONField, and json_key in this case is the key you want to index.

Notice the -> operator in the index creation code line, and not ->> as mentioned here.

That should do it, but to verify all went well run the following sql:

SELECT
    indexname,
    indexdef
FROM
    pg_indexes
WHERE
    tablename = '<your-table>';

and see if your index is there.

jfs
  • 33
  • 1
  • 6
radtek
  • 30,748
  • 10
  • 135
  • 106
  • 2
    I think native support was added in 3.2 as part of this PR: https://github.com/django/django/pull/11929 But I'm not clear on how to use. All the Django tickets requesting indexes on JSONB fields were closed based on that PR. – Tim Tisdall Oct 28 '21 at 12:38
1

There is a bit more universal and Django native way. You can use following custom Migration Operation:

class CreateJsonbObjectKeyIndex(Operation):

    reversible = True

    def __init__(self, model_name, field, key, index_type='btree', concurrently=False, name=None):
        self.model_name = model_name
        self.field = field
        self.key = key
        self.index_type = index_type
        self.concurrently = concurrently
        self.name = name

    def state_forwards(self, app_label, state):
        pass

    def get_names(self, app_label, schema_editor, from_state, to_state):
        table_name = from_state.apps.get_model(app_label, self.model_name)._meta.db_table
        index_name = schema_editor.quote_name(
            self.name or schema_editor._create_index_name(table_name, [f'{self.field}__{self.key}'])
        )
        return table_name, index_name

    def database_forwards(self, app_label, schema_editor, from_state, to_state):
        table_name, index_name = self.get_names(app_label, schema_editor, from_state, to_state)
        schema_editor.execute(f"""
            CREATE INDEX {'CONCURRENTLY' if self.concurrently else ''} {index_name} 
            ON {table_name}
            USING {self.index_type}
            (({self.field}->'{self.key}'));
        """)

    def database_backwards(self, app_label, schema_editor, from_state, to_state):
        _, index_name = self.get_names(app_label, schema_editor, from_state, to_state)
        schema_editor.execute(f"DROP INDEX {index_name};")

    def describe(self):
        return f'Creates index for JSONB object field {self.field}->{self.key} of {self.model_name} model'

    @property
    def migration_name_fragment(self):
        return f'create_index_{self.model_name}_{self.field}_{self.key}'

Usage example:

from django.db import migrations

from util.migration import CreateJsonbObjectKeyIndex


class Migration(migrations.Migration):
    atomic = False  # Required if concurrently=True for 0 downtime background index creation

    dependencies = [
        ('app_label', '00XX_prev_migration'),
    ]

    operations = [
        migrations.SeparateDatabaseAndState(
            database_operations=[
                # Operation to run custom SQL command. Check the output of `sqlmigrate` to see the auto-generated SQL
                CreateJsonbObjectKeyIndex(
                    model_name='User', field='meta', key='adid', index_type='HASH',
                    concurrently=True,
                )
            ],
        )
    ]

Tested with Django-2.2 and and AWS Postgres RDS, but should be compatible with other Django

dtatarkin
  • 951
  • 6
  • 6