top | item 12409784

(no title)

cstavish | 9 years ago

I use JSONB columns for similar use-cases, but to play devil's advocate, you can accomplish that a fourth way, which is almost certainly better than 1 or 2. A table for fields, one row per field. A table for forms, which has a many-to-many relation to fields. Entries in a link table compose a form of arbitrary fields. Answers can be stored in a separate responses table, indexed by form_id and column_id. I don't know enough about database implementation to speculate on how that would perform at scale, but conceptually that's how I think of the problem.

discuss

order

malisper|9 years ago

This is roughly EAV. EAV is where you have a with a schema:

    entity_id | attribute_id | value
EAV is typically considered to be an anti-pattern for several reasons: it becomes very expensive to rematerialize a entity with all of its attributes/values, it becomes difficult to have any kind of constraints on the values (this is also a problem with jsonb), and it's hard for the database to maintain good statistics on the values of different attributes as most databases don't keep multi-column statistics. Don't worry, I've had similar ideas before.

strommen|9 years ago

EAV is an anti-pattern?

If you don't have JSONB, EAV is the only remotely-reasonable way to implement user-defined fields (e.g. product-specific attributes in e-commerce).

BillinghamJ|9 years ago

I would expect that to perform poorly at scale.

There is also a fifth option, though it is not very space efficient. That is to store every field in the JSONB column. Personally, that is what I would do.

This approach is known to be used successfully by many high scale companies. It ensures the highest degree of flexibility and still allows for full indexing of the fields. The schema would need to be enforced by your application, but that should happen anyway.