prostodata | 5 years ago
comp = salary / (age - 18)
is, strictly speaking, not part of relational algebra because it does not use set operations (like join or union). It was added to the relational model because we can hardly process data without such expressions.
A better way to formally describe such calculated columns is to introduce functions and treat them as first-class elements of the data model. In particular, function operations are better than set operations in these cases [1]:
- Calculating data using calculated columns. We describe a calculated column as a function without generating new relations (as opposed to select-from).
- Aggregating data. We can add an aggregate column without generating new relations (as opposed to groupby).
- Linking data. We can add derived link columns without generating new relations (as opposed to join).
This function-based approach to data modeling and data processing was implemented in Prosto [2], which demonstrates how many typical relational tasks can be solved using functions.
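To make the three cases concrete, here is a minimal sketch in plain Python. It does not use Prosto's API; the sample tables and the `add_column` helper are hypothetical and only illustrate the idea that each column is defined by a function attached to an existing table, so no new relation is produced.

```python
# Hypothetical sample data: a table is a list of rows (dicts).
employees = [
    {"name": "Ann", "age": 38, "salary": 60000, "dept": "R&D"},
    {"name": "Bob", "age": 28, "salary": 25000, "dept": "R&D"},
    {"name": "Eve", "age": 48, "salary": 90000, "dept": "Sales"},
]
departments = [{"dept": "R&D"}, {"dept": "Sales"}]


def add_column(table, name, func):
    """Hypothetical helper: define a column by a function of the row.

    The table is extended in place; no new table (relation) is created.
    """
    for row in table:
        row[name] = func(row)


# 1. Calculated column (instead of select-from).
#    Assumes age != 18, otherwise the division fails.
add_column(employees, "comp", lambda r: r["salary"] / (r["age"] - 18))

# 2. Link column (instead of join): each employee row references
#    its department row directly.
dept_by_name = {d["dept"]: d for d in departments}
add_column(employees, "dept_ref", lambda r: dept_by_name[r["dept"]])

# 3. Aggregate column (instead of groupby): each department sums the
#    salaries of the employee rows that link to it.
add_column(
    departments,
    "total_salary",
    lambda d: sum(e["salary"] for e in employees if e["dept_ref"] is d),
)

for e in employees:
    print(e["name"], e["comp"], e["dept_ref"]["dept"])
for d in departments:
    print(d["dept"], d["total_salary"])
```

In all three steps the number of tables stays at two; only their sets of columns grow.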
[1] Why functions and column-orientation? https://prosto.readthedocs.io/en/latest/text/why.html
[2] Functions matter! No join-groupby, No map-reduce. https://github.com/prostodata/prosto