Sort of interesting just to hear about the ups and downs of companies like Dubsmash. They were often cited as an example of Berlin's future as a startup city [1]. They went from 35+ employees to 27 [2] and now to 12, as stated in this post. They also moved from Berlin to New York, which seems to imply they felt the city couldn't offer what they currently need. It looks like they didn't take many of the employees with them in the move (maybe this was also a way out of strict German employment rules?). Seems like a bit of an attempted restart (co-founder Roland Grenke appears to be gone, etc.).
Looking at their rank history on App Annie, they were doing really well in 2015, but it's been downhill from there (from the top 10 to below 500 in all the major app store charts). How they were able to go from 140M to 350M downloads in the last year (compare this article with the TechCrunch one) is a complete mystery. Also, stating your number of users without any qualifier (e.g. MAU) in a tech article is a bit of a red flag; in my experience that usually means it's a vanity number (yearly actives? Who knows).
It also sounds odd that they have 3 engineers out of 12 employees. What do the other people do?
And hopefully they had more than 3 engineers back when they had 35 employees... but even then, why would they choose to let engineers go and end up with that tech-to-non-tech ratio?
Interesting: they couldn't even manage a basic signup process. The app says my email is on the user list, but when I try to log in it says no such email exists in their system. The forgot-password option gives the same result. Hire someone to manage your ACLs, guys ,_,
Thought the same. My wild guess is that those engineers are at their career peak in terms of energy and ability to deliver glue code, but a few years away from being well-rounded engineers who can live and work sustainably.
Three engineers maintain code in Java, Swift (previously Objective-C), Go, Python (both Django and Flask), and Node.js, are considering Kotlin, and additionally make use of Celery, RabbitMQ, React, Redux, Apollo, GraphQL, Postgres, Heroku, AWS, Jenkins, Kubernetes, Redis, DynamoDB, Elasticsearch, Algolia, Memcached, and more.
I might be an inexperienced engineer by comparison, but I'll be honest, that sounds absolutely fucking insane. These three people must be geniuses to be able to use all of that with sufficient mastery to effectively handle 200M users.
Sometimes I wonder if there are any internet companies (startup or otherwise) that do customer support. With numbers like that, it's hard to imagine one of those users getting even one second of attention with any problems they might have.
You can only really do customer support if it makes financial sense, which it won't unless you make a significant amount of money on your average customer. Tech companies that don't have sales, but instead take their revenue through ads or through selling data, are making cents per customer. With average profit that low, even 1 in 1,000 customers using your support for 5 minutes would destroy any chance of profit.
> We since have moved to a multi-way handshake-like upload process that uses signed URLs vendored to the clients upon request so they can upload the files directly to S3.
How does this work in practice / where can one learn more about this?
I want to make sure that I understand the security aspect of this.
You can argue that the user can upload anything using the original API anyway. But in the original setup you can do server-side validation before the upload is proxied. I'm thinking of domain-specific rules, like only allowing videos that are 6 seconds long, or something similar.
You can move the validation to the client, but the client can easily be modified. An actual user might not do this, but someone trying to steal your storage space (for serving malware or something) might.
These signed URLs also seem to expire based on time, so you could potentially save a URL and upload again later if you allow a generous expiration. (Again, not something I see being a huge problem.)
But I guess these aren't really serious issues compared to the cost savings. Am I missing other ways this can be exploited?
Not 100% sure what they mean by _vendored_ here, but I'm guessing they make a request to one of their backends to generate the URL and return it to the client for use.
One thing to keep in mind: users should be able to upload (to the specific signed URL), but they should not be able to download from that location. Don't make the files users upload publicly downloadable, otherwise you can be used to host malware. After the video/image is uploaded, you need to download and process it [1], then upload it to an S3 bucket that allows download (e.g., via a CDN).
[1] Use caution when processing user content. It is best to process media in a sandbox that can protect you against exploits in the media processing libraries.
The client makes a request to the server, passing its auth token; the server verifies the token and uses the S3 library to generate a unique, time-limited URL authorizing an upload, which it returns to the client. The client then makes a PUT request to the S3 URL. Once the expiration passes, the URL is no longer valid.
Multipart signed upload is much harder and requires signing every chunk.
Just google "s3 signed upload"; there are a few tutorials from Amazon.
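To make the handshake concrete, here is a toy sketch of the signed-URL idea using stdlib HMAC. This is not Amazon's actual SigV4 scheme (in practice you would call something like boto3's `generate_presigned_url`), and the host name, bucket, and secret are made up for illustration:

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SECRET = b"server-side-secret"  # lives only on the backend, never shipped to clients

def sign_upload_url(bucket: str, key: str, expires_in: int = 300) -> str:
    """Backend: mint a time-limited URL authorizing one PUT of one object key."""
    expires = int(time.time()) + expires_in
    payload = f"PUT\n{bucket}\n{key}\n{expires}".encode()
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires, "signature": signature})
    return f"https://{bucket}.storage.example.com/{key}?{query}"

def verify_upload(bucket: str, key: str, expires: int, signature: str) -> bool:
    """Storage side: accept the PUT only if the signature matches and is fresh."""
    if int(time.time()) > expires:
        return False  # URL has expired
    payload = f"PUT\n{bucket}\n{key}\n{expires}".encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Because the signature covers the method, bucket, key, and expiry, a client can't reuse the token for a different key or after the expiration, yet the upload itself goes straight to storage without passing through the backend.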
> However, we discovered after some time that the custom Python implementation for those workers was dropping up to 5% of the events. This was mostly due to the nature of how reading happens with Kinesis: every stream has multiple shards (ours up to 50!) and each reading client would use a so-called shard iterator to keep track of where it was reading last. Since the used machines could always crash, be recycled, or scaled down, we needed to save those shard iterators in some serialized format to Redis and share them across machines and process boundaries. Since we had so many shards, every once in awhile we would skip events and hence lose them.
I've never worked with Kinesis, but in Kafka you'd store offsets specifically to solve this issue. When one of the members of a consumer group drops out, the partition (read: shard) is automatically reassigned to another member. This gives an at-least-once delivery guarantee, which combined with idempotent actions gives effectively-once semantics. No need to lose any messages. What was the issue that the Dubsmash engineers were solving here?
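The offset-checkpoint pattern described above can be shown with a toy, in-memory sketch (not real Kafka client code): the consumer commits its offset only after handling an event, so a crash can cause a replay, and deduplication by event ID absorbs that replay. All names here are illustrative:

```python
class Shard:
    """Stand-in for a Kafka partition / Kinesis shard: an append-only event log."""
    def __init__(self, events):
        self.events = events

    def read_from(self, offset):
        return list(enumerate(self.events))[offset:]

class Consumer:
    def __init__(self):
        self.committed = 0   # durable checkpoint (Kafka keeps this in __consumer_offsets)
        self.seen = set()    # idempotency keys of already-applied events
        self.processed = []

    def run(self, shard, crash_before_commit_at=None):
        for offset, event in shard.read_from(self.committed):
            if event["id"] not in self.seen:   # idempotent apply: duplicates are no-ops
                self.processed.append(event)
                self.seen.add(event["id"])
            if offset == crash_before_commit_at:
                return                          # crashed after applying, before committing
            self.committed = offset + 1         # commit only once the event is applied
```

Restarting after a simulated crash re-reads the uncommitted event (at-least-once delivery), but the `seen` set keeps the result correct (effectively once). In production, of course, the dedupe state would itself have to be durable and shared, just like the checkpoint.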
Home-rolling a checkpoint-free event pipeline is a rookie mistake; it's a pity they didn't come across our Snowplow project (Apache 2.0 event pipeline running on Kinesis, Kafka and NSQ, https://github.com/snowplow/snowplow/).
> Although we were using Elasticsearch in the beginning to power our in-app search, we moved this part of our processing over to Algolia a couple of months ago;
I am genuinely curious about the trade-offs, as the bad and the ugly are not mentioned. Realistically, there are already a lot of moving pieces there, and yet the team of 3 keeps experimenting?
[1] http://www.wired.co.uk/article/european-startups-2016-berlin
[2] https://techcrunch.com/2016/11/30/dubsmash-9m/
Would love to read more about whether they started with microservices or had an MVP monolith that they then cut parts off of.
It works like this:
User tells the backend, “I want to upload picture.jpeg!”
Backend tells the user, “Alright you have my permission but ONLY for that filename with that extension. Here’s a token, enjoy.”
User uses that signed token and pushes the file to your S3 bucket.
Here's how you do it in Phoenix: https://sergiotapia.me/phoenix-framework-uploading-to-amazon...
I am looking into the GCS version, not S3, if that matters: https://cloud.google.com/storage/docs/access-control/signed-...
https://devcenter.heroku.com/articles/s3#file-uploads
http://lemoinefirm.com/parody-fair-use-or-copyright-infringe...
How many records are you storing on Algolia?
Neither does Facebook login work.