(no title)
bmcfeeley | 5 years ago
- a language for specifying data transformations, and
- an engine to compile programs written in that language into mapreduce jobs to execute on a hadoop cluster
it was designed to easily map some common functional and SQL idioms (e.g. filter, group by w/ aggregation functions) to parallel execution for processing huge amounts of data.
Impala is another big data project that is an engine for planning and executing SQL on data stored in a hadoop cluster.
Zookeeper is... black magic??
No comments yet.