ariskk's comments

ariskk | 9 years ago | on: Drivetribe’s Modern Take on CQRS with Apache Flink

This is exactly what we did initially. Really early on (before any rate limiting was in place), a few spam accounts followed 100K people, created lots of spam content, etc. Encapsulating those deletions in a single event started yielding messages bigger than the default max Kafka message size (1MB). This method also had a few side effects on the downstream processors. We could of course increase the limit, but we decided to deal with the problem at its core.
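One way around the limit (a minimal illustrative sketch in Python, not Drivetribe's actual code — the function name and JSON encoding are my assumptions) is to split a large deletion event into chunks, each of which serializes to under the broker's message-size limit:

```python
import json

def chunk_deletions(ids, max_bytes=1_000_000):
    """Split a large list of deleted IDs into chunks that each
    serialize (as a compact JSON array) to under max_bytes.
    Kafka's default broker-side limit is roughly 1 MB."""
    chunks, current, size = [], [], 2  # 2 bytes for the array brackets
    for id_ in ids:
        # +1 for the comma separator (slight overcount on the first item,
        # which keeps the estimate safely conservative)
        encoded = len(json.dumps(id_)) + 1
        if current and size + encoded > max_bytes:
            chunks.append(current)
            current, size = [], 2
        current.append(id_)
        size += encoded
    if current:
        chunks.append(current)
    return chunks
```

Each chunk then becomes its own Kafka message, so no single record exceeds the limit regardless of how many follows or posts a spam account accumulated.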

ariskk | 9 years ago | on: Drivetribe’s Modern Take on CQRS with Apache Flink

You mention the word "batch" when talking about models, as well as "BI/Analytics". Since Django/Rails applications support neither, another sort of system would be needed. This is the point where, having built everything on Django with no foresight whatsoever about future requirements, we would have ended up creating DataFrames from SQL tables in Spark. Our BI guys have no experience with Spark, so we would need to load data into a DW-like solution, like BigQuery/Redshift/Impala/Presto/you-name-it. Instead of another sink in Flink, we would need to implement and schedule ETL jobs.

Even at our current load, computing counters (e.g. likes) at read time would be slow and inefficient, which means we would need a way to pre-aggregate them. Maybe another service, possibly behind a queue? You can see where I am going. As requirements evolve, systems evolve, and with no planning beforehand, people end up with spaghetti architectures. We knew we were funded well enough to run for a couple of years. We knew the site would have traffic. We were tasked with delivering an algorithmically driven product, and this is the solution we came up with.
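To make the counter point concrete, here is a minimal sketch (illustrative Python; the names and event shapes are mine, not Drivetribe's) contrasting read-time counting with pre-aggregation. The pre-aggregated version folds each event into a running count as it arrives — the way a keyed streaming job would — so a read is an O(1) lookup instead of a scan over history:

```python
from collections import defaultdict

def count_likes_at_read(events, post_id):
    """Read-time counting: scan the full event history on every
    request. Gets slower as the event log grows."""
    return sum(
        (1 if e["type"] == "like" else -1)
        for e in events
        if e["post"] == post_id and e["type"] in ("like", "unlike")
    )

class LikeCounter:
    """Pre-aggregation: fold events into per-post counters as they
    arrive, roughly what keyed state in a streaming job does."""

    def __init__(self):
        self.counts = defaultdict(int)

    def on_event(self, event):
        if event["type"] == "like":
            self.counts[event["post"]] += 1
        elif event["type"] == "unlike":
            self.counts[event["post"]] -= 1

    def get(self, post_id):
        # O(1) read, independent of how many events exist
        return self.counts[post_id]
```

Both return the same numbers; the difference is where the work happens — per read versus per write.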

I really do not understand how such a strong set of conclusions can be drawn out of so little information.

ariskk | 9 years ago | on: Drivetribe’s Modern Take on CQRS with Apache Flink

Unfortunately, I am not allowed to. The problem is that you cannot predict the volumes beforehand. 1K requests per second? 10K per second? Maybe 50K per second on special occasions? It is difficult to tell, especially when high-profile personalities are involved.

PS: we do have lots of load

ariskk | 9 years ago | on: Drivetribe’s Modern Take on CQRS with Apache Flink

Hi. I am the author of the article. Thank you for taking the time to read it. The combined reach of the co-founders is very large, so being able to provably handle scale was an essential prerequisite. Additionally, the requirements of the platform extend way beyond a simple content server. Content performance is tracked in real time and fed to multiple ranking and recommendation models. Those change frequently, so we need a way to retroactively process our data. Flexibility is key when trying to build an intelligent platform, so we decided early on to invest time in the ability to quickly iterate and experiment on algorithms, in real time over live data. You are right that the API fleet could be implemented using the aforementioned technologies; we use Scala and thus decided to use Akka HTTP instead. The challenging part is how you manage the state behind that.
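The "retroactively process our data" part is the essence of CQRS with an event log. A minimal sketch (illustrative Python; event types and field names are my own assumptions, not Drivetribe's schema): writes append immutable events to a log, and a read model is just a fold over that log — so when a ranking model changes, you rebuild the view by replaying the full history through the new logic:

```python
def apply(view, event):
    """Fold a single event into the read model (a dict of posts)."""
    if event["type"] == "post_created":
        view[event["id"]] = {"title": event["title"], "score": 0}
    elif event["type"] == "like":
        view[event["id"]]["score"] += 1
    return view

def rebuild(log):
    """Replay the entire event log to (re)build the read model.
    Running this with a changed apply() retroactively reprocesses
    all historical data under the new rules."""
    view = {}
    for event in log:
        view = apply(view, event)
    return view
```

In a streaming setup the same fold runs continuously over live events, and a replay from the start of the log regenerates every derived view when the logic changes.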