yudlejoza | 4 years ago
# cat data.csv | process-1 | process-2 | ... | process-n > final-output
Imagine you have this pipeline and it already works for data.csv. Now you get data2.csv, which differs in some way (e.g., some values are null, while the original data.csv had no null values).
Monads are an approach to making the existing pipeline work (with minimal changes) while still being able to handle both data.csv and data2.csv. The minimal changes follow a strict rule as follows (this is not a valid shell command anymore):
# wrap(cat data.csv) ] process-1 ] process-2 ] ... ] process-n > final-output
In other words, only two kinds of changes are allowed:
- You can bring in a wrap function that converts the entries of the given csv data into a new form.
- You can bring in a new kind of pipe ']' to use in place of '|'.
The idea is that the wrap function takes in the original data stream and, for each "unit" (a line in the csv file, called a value), produces a new kind of data-unit (called a monadic-value). The new pipe ']' then has some additional functionality: it is aware of the new kind of data-unit and can, e.g., handle the null values while passing the non-null values on to the next process unchanged.
Note, you didn't have to modify any of the process-1 through process-n commands.
BTW, the null value handling monad is called the 'maybe monad' (and of course there are other kinds of monads).
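A minimal Python sketch of the null-handling version may make this concrete. Everything here is hypothetical and only for illustration: the `Maybe` class, the `wrap` and `pipe` names (standing in for the wrap function and the ']' pipe), the two toy stages, and the choice to treat `None` as the null value. Strictly speaking, a monad's bind expects each stage to return a wrapped value itself; this `pipe` re-wraps the result so the stages can stay untouched, as in the description above.

```python
class Maybe:
    """A monadic-value: either holds a plain value or marks it as null."""
    def __init__(self, value=None, present=True):
        self.value = value
        self.present = present

NOTHING = Maybe(present=False)

def wrap(line):
    """Turn a plain value into a monadic-value (assumption: None means null)."""
    return NOTHING if line is None else Maybe(line)

def pipe(m, process):
    """The new ']' pipe: skip null values, otherwise run the unchanged stage."""
    if not m.present:
        return m
    return wrap(process(m.value))

# The existing stages, written for plain values and left unmodified.
def process_1(line):
    return line.strip()

def process_2(line):
    return line.upper()

def run(lines):
    results = []
    for line in lines:
        m = wrap(line)
        for stage in (process_1, process_2):
            m = pipe(m, stage)
        results.append(m)
    return results

results = run([" a,1 ", None, "b,2"])
print([r.value if r.present else None for r in results])
# prints ['A,1', None, 'B,2']
```

Without `pipe`, the `None` entry would crash `process_1` (there is no `.strip()` on `None`); with it, the null simply flows through to the end.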
If you make the existing pipeline work in this way, you have essentially created a monad to solve your problem (the monad here is the new mechanism consisting of the new kind of value plus the two changes: the wrap function and the new pipe).
edit: There may be a need to also modify the '>' mechanism. But I think that is not essential to the idea of a monad, since you could replace ">" with "] process-n+1 >" (i.e., you created a new outermost function 'process-n+1' that simply converts the monadic-values back to regular values).
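As a rough sketch of that final step, assuming the same kind of hypothetical `Maybe` wrapper for monadic-values: `unwrap` plays the role of 'process-n+1', and the `"NULL"` placeholder for missing rows is an arbitrary choice for illustration.

```python
class Maybe:
    """A monadic-value: either holds a plain value or marks it as null."""
    def __init__(self, value=None, present=True):
        self.value = value
        self.present = present

def unwrap(m, default=""):
    """Hypothetical 'process-n+1': convert a monadic-value back to a plain one."""
    return m.value if m.present else default

# Pretend this is what comes out of the last ']' in the pipeline.
monadic_values = [Maybe("A,1"), Maybe(present=False), Maybe("B,2")]

final_output = [unwrap(m, default="NULL") for m in monadic_values]
print("\n".join(final_output))
```

After this step the ordinary '>' works as before, because it only ever sees regular values.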
edit 2: If instead of handling null-values, the purpose is to "create side-effects" e.g., at every pipe invocation, dump all/some contents of the data into a log file, then the kind of monad you end up creating would be something like an "I/O monad".
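A small sketch of the logging variant, under some loud assumptions: the monadic-value is a hypothetical `(value, log)` pair, and the log is collected in memory rather than written to a file, which makes this closer to what is usually called a Writer monad than to a real I/O monad. The `wrap`, `pipe` and stage names are illustrative only.

```python
def wrap(line):
    """Pair a plain value with an empty log."""
    return (line, [])

def pipe(m, process):
    """The new ']' pipe: run the unchanged stage and record what it did."""
    value, log = m
    result = process(value)
    entry = f"{process.__name__}: {value!r} -> {result!r}"
    return (result, log + [entry])

# The existing stages, written for plain values and left unmodified.
def process_1(line):
    return line.strip()

def process_2(line):
    return line.upper()

m = wrap(" a,1 ")
for stage in (process_1, process_2):
    m = pipe(m, stage)

value, log = m
print(value)
for entry in log:
    print(entry)
```

The stages never see the log; only `wrap` and `pipe` know about it, which is the same division of labour as in the null-handling case.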
girishso|4 years ago