fs111 | 1 year ago

I have no idea what any of the Google tech has to do with anything here.

Quoting from the original Spark paper:

> Spark is built on top of Mesos [16, 15], a “cluster operating system” that lets multiple parallel applications share a cluster in a fine-grained manner and provides an API for applications to launch tasks on a cluster

https://people.csail.mit.edu/matei/papers/2010/hotcloud_spar...

Note that Matei Zaharia, the creator of Spark, is also an author of the Mesos paper:

https://people.eecs.berkeley.edu/~alig/papers/mesos.pdf

dekhn | 1 year ago

The RAD Lab folks who built Mesos were aware of Borg and how it approached the problem of scheduling a bunch of different jobs on a collection of disparate hardware. Prior to Borg, most large-scale clusters were managed with batch queue software, while Borg and Mesos come more from the "service management" side: a collection of jobs that run concurrently, with priority levels used to preempt lower-priority jobs so that higher-priority jobs can schedule and run "immediately".
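To make the preemption idea concrete, here is a toy sketch in Python. It is not how Borg or Mesos actually work; `Job`, `Cluster`, and `schedule` are hypothetical names, the only resource modeled is a CPU count, and "higher number means higher priority" is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    priority: int  # assumption: a higher number means a more important job
    cpus: int


@dataclass
class Cluster:
    total_cpus: int
    running: list = field(default_factory=list)

    def free_cpus(self):
        return self.total_cpus - sum(j.cpus for j in self.running)

    def schedule(self, job):
        """Start `job` now and return the jobs preempted to make room.

        Raises RuntimeError if the job cannot fit even after evicting
        every strictly lower-priority job.
        """
        # Candidate victims: lower-priority jobs, lowest priority first.
        victims = sorted(
            (j for j in self.running if j.priority < job.priority),
            key=lambda j: j.priority,
        )
        reclaimable = sum(j.cpus for j in victims)
        if self.free_cpus() + reclaimable < job.cpus:
            raise RuntimeError(f"cannot place {job.name}")

        preempted = []
        for victim in victims:
            if self.free_cpus() >= job.cpus:
                break  # enough room already, stop evicting
            self.running.remove(victim)
            preempted.append(victim)
        self.running.append(job)
        return preempted
```

With an 8-CPU cluster already full of two 4-CPU batch jobs, scheduling a 4-CPU job at a higher priority evicts the lowest-priority batch job and starts right away, while a new job at the lowest priority is refused.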

The need for this pops up in nearly every large-scale data processing enterprise, with k8s replacing Mesos, YARN, and other systems as the cluster scheduler du jour.

One of the big advantages of a service scheduler versus a batch queue is that you can implement a batch queue on top of a service scheduler much more easily than you can implement a service scheduler on top of a batch queue.
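A rough sketch of why that direction is the easy one, in Python. `ServiceScheduler` is a hypothetical stand-in for any scheduler that either starts a job immediately or refuses; `BatchQueue` is then just a FIFO of pending work plus a retry whenever something finishes. All names and the CPU-only resource model are assumptions for illustration.

```python
from collections import deque


class ServiceScheduler:
    """Hypothetical stand-in: starts a job immediately or refuses."""

    def __init__(self, total_cpus):
        self.total_cpus = total_cpus
        self.running = {}  # job name -> cpus in use

    def try_start(self, name, cpus):
        if sum(self.running.values()) + cpus > self.total_cpus:
            return False
        self.running[name] = cpus
        return True

    def finish(self, name):
        del self.running[name]


class BatchQueue:
    """FIFO batch queue layered on top of the scheduler above."""

    def __init__(self, scheduler):
        self.scheduler = scheduler
        self.pending = deque()

    def submit(self, name, cpus):
        self.pending.append((name, cpus))
        self.drain()

    def drain(self):
        """Start queued jobs in order; return the names that started."""
        started = []
        while self.pending:
            name, cpus = self.pending[0]
            # Strict FIFO: stop at the first job that does not fit.
            if not self.scheduler.try_start(name, cpus):
                break
            self.pending.popleft()
            started.append(name)
        return started

    def job_finished(self, name):
        self.scheduler.finish(name)
        return self.drain()
```

The queue needs nothing from the scheduler beyond "start this now or tell me no". Going the other way, a batch queue gives you no primitive for starting something immediately or evicting running work, so a service scheduler has little to build on.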