0

I am thinking about what's the meaning of NULL in my database systems. In some columns it mean unknown / missing data. In some columns it mean a not-applicable field. In some columns it mean the entity represented does really not have the property.

For example, imagine a database which stores a field to identify if the user is opted-in to certain kind of promotion channel which uses an ad ID. The promotion channel will send ad to users using ad ID. If the user does not have an ad ID (i.e. the ad ID is NULL) ads will not be sent.

However, the business now want to improve ad coverage so it starts to think about what the meaning NULL is apart from not having a valid ad ID stored in the system. The business logic is that, eligible users can type in their ad ID to opt-in, however the eligibility is determined by some other business constraint (such as age) with those non-eligible users can't get one. Also we want to know if the user has explicitly said "no" to the question v.s. if the user hasn't been asked at all.

So we now have 3 reasons of NULL: the user hasn't been asked, the user has explicitly said no, the field is not applicable to the user, and these are all stored as NULL in the ad ID column which result in ad not being set. We also want to avoid any special handling (sentinel values) in the code to send ad as it is very simple - it just send ad to the list of ad ID without any exceptions.

So what is the common practice to denote those? If NULL can only mean one thing for a certain column, it is simple.

If NULL can mean either "missing data" or "known to have none", it is also simple - refactor the row to another table where the existence of a NULL row means known to have none, and a missing row means "missing data".

However, now there is the third meaning of "inapplicable to the row", what is everyone's practice to handle this? Is it a common practice adding constraints that trying to set an ad ID on the row for an ineligible user will cause error?

Also, has anyone encountered the case where a NULL column can have even more meanings on the NULL field?

  • "So what is the common practice to denote those" - documentation. I typically use column comments for that comment on column foo.id is 'NULL if not applicable'; or comment on column bar.end_date is 'NULL means open ended/infinity'; –  Apr 17 '20 at 13:41
  • It sounds like your table needs more columns to properly store the extra attributes. – Michael Kutz Apr 17 '20 at 14:55

0 Answers0