jnaour | 10 years ago

Has anybody worked with Mesos?

How does it compare to Yarn, particularly from a production perspective? Is it stable and easy to integrate?

We are currently thinking of switching from a Kafka-Spark (on Yarn)-Mongo stack to a SMACK stack (Spark, Mesos, Akka, Cassandra, Kafka)[1]. There seems to be good integration between these projects. Also, you can run Docker on Mesos using Marathon[2], so not only our data-driven stack but the full stack could run on Mesos.

[1] https://mesosphere.com/blog/2015/07/24/learn-everything-you-...

[2] https://github.com/mesosphere/marathon
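For readers wondering what "Docker on Mesos using Marathon" looks like in practice, here is a minimal sketch of a Marathon app definition built in Python. The helper name `marathon_docker_app`, the app id `/web`, the image `nginx:1.9`, and the resource numbers are all made up for illustration; the field layout follows Marathon's JSON app format of that era as I understand it, so check it against the docs for your Marathon version.

```python
import json


def marathon_docker_app(app_id, image, container_port, instances=1, cpus=0.5, mem=256):
    """Build a Marathon app definition (a plain dict) that runs a Docker image.

    Hypothetical helper: names and defaults are illustrative, not from the thread.
    """
    return {
        "id": app_id,
        "instances": instances,  # how many copies Marathon should keep running
        "cpus": cpus,            # CPU share requested from Mesos per instance
        "mem": mem,              # memory in MB requested per instance
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                # hostPort 0 asks Marathon to pick a free port on the agent
                "portMappings": [{"containerPort": container_port, "hostPort": 0}],
            },
        },
    }


app = marathon_docker_app("/web", "nginx:1.9", 80, instances=2)
payload = json.dumps(app, indent=2)
print(payload)
```

The resulting JSON is what you would POST to Marathon's `/v2/apps` endpoint; the HTTP call is left out here so the sketch runs without a cluster.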


SEJeff|10 years ago

Mesos is a superset of Yarn and also a bit lower level. In fact, there is a project that allows you to run Yarn on top of Mesos, thereby getting the best of both worlds. Kind of by design, you couldn't really run Mesos on top of Yarn.

http://www.apache-myriad.org

I use Mesos in production and it is great stuff. It also helps to know that 100% of the backend of Siri, 100% of Twitter, and much of eBay, PayPal, Airbnb, Uber, etc. run on Mesos. It is simply battle tested.

petard|10 years ago

We have been using Mesos+Marathon+Docker in production for more than a year. Mesos has been very reliable, but there are a lot of moving parts, so it does take some effort to get everything going.

Caveat: We only use it to run stateless applications. Applications like Kafka that require data persistence are run outside Mesos, as data persistence is not yet handled reliably with Marathon.

[1] http://datajet.io/One-year-with-Apache-Mesos-The-Good-The-Ba...

KirinDave|10 years ago

It seems like Marathon is a really weak link in the chain between Mesos and Docker.

Have you considered using some other tool there? I know there is a Kubernetes target for it. Of course, any non-GCE port of Kubernetes is likely in need of sponsorship/love. And of course, Kubernetes currently introduces some additional network overhead.

Still, it might be helpful there for handling problems like getting persistent storage mixed in.

SEJeff|10 years ago

Note that there is actually a very nice Kafka Mesos framework for this sort of thing:

https://github.com/mesos/kafka

Mesos supports persistent volumes and what they call "dynamic reservations", both of which this framework supports. If the application is killed and the host is still up, it will be restarted on the same host and re-attached to the volume that holds the data; otherwise, Kafka will just replicate to the new broker (as it does).
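To make that recovery behaviour concrete, here is a toy sketch of the decision described above. It is not the framework's actual code: `Broker`, `plan_recovery`, and the host and volume names are invented for illustration, and real schedulers work from Mesos resource offers rather than a simple set of live hosts.

```python
from dataclasses import dataclass


@dataclass
class Broker:
    broker_id: int
    host: str
    volume_id: str  # hypothetical identifier for the persistent volume


def plan_recovery(broker, live_hosts, spare_host):
    """Decide how to bring back a killed broker (illustrative only)."""
    if broker.host in live_hosts:
        # Host survived: restart in place and re-attach the existing volume,
        # so the broker comes back with its data.
        return {"action": "restart_in_place", "host": broker.host, "volume": broker.volume_id}
    # Host is gone: start a fresh broker elsewhere and let Kafka's own
    # replication copy the data over.
    return {"action": "relaunch_and_replicate", "host": spare_host, "volume": None}


broker = Broker(broker_id=0, host="agent-1", volume_id="vol-0")
same_host = plan_recovery(broker, live_hosts={"agent-1", "agent-2"}, spare_host="agent-2")
other_host = plan_recovery(broker, live_hosts={"agent-2"}, spare_host="agent-2")
print(same_host["action"], other_host["action"])
```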

The initial code for dynamic reservations was added last November, as it is a relatively new feature in Mesos:

https://github.com/mesos/kafka/commit/455b20f94b9166b026ea

shahbazac|10 years ago

I tried to use Mesos to set up a test Spark cluster. I figured I would install Mesos, then use it to deploy Spark, Cassandra, etc. In practice, using Mesos turned out to be a major pain. Several blogs online made it seem trivial; however, I kept finding out that much of what I thought was Mesos was actually a commercial product from Mesosphere.

We ended up abandoning Mesos just to get our prototype going.