
Show HN: Simperium, a realtime data layer

208 points| cloudmike | 14 years ago |simperium.com | reply

87 comments

[+] dtran|14 years ago|reply
"With Simperium, you own your data - that's important!"

Not to take away from some of the great projects featured on HN recently, but statements by Simperium like this make me much happier as a developer and business owner: "We believe the best apps have both a great user experience and unique backend services."

From https://simperium.com/overview/: It is not a backend-as-a-service. We believe the best apps have both a great user experience and unique backend services. Our focus is to provide a great data layer between your frontend and backend while integrating with other providers of tools, hosting, and services.

[+] erichocean|14 years ago|reply
I totally agree with the idea of owning your own data. I think that's the big thing lost with "web apps", and the cloud in general.

That said, the data needs to be on the cloud to be useful. So...what to do?

My sense is that a dedicated, trustworthy company needs to store the data and only the data. In particular, they need to not be in any other business, e.g. advertising, or selling stuff to developers, etc.

They need to be owned by the people entrusting them with their data, full stop. That means they have to charge for it.

BTW, I wrote code to do this very thing back in 2009, called HubSync. It's great to see all of these other companies making a go at it.

[+] onetwothreefour|14 years ago|reply
This is definitely a cool and interesting service, sure, but how exactly do you "own your data"?

It's sitting in Simperium's database and you're locked into their platform once you decide to use it. Maybe there's an export, but that's irrelevant because all your apps in the wild and all your users are relying on this service to move data around.

If Simperium gets hacked, you get hacked. If Simperium goes down, you go down. If AWS goes down, Simperium goes down, and then you go down.

Say your app gets huge, and Simperium can't scale (and really, "half a million users" is nothing for a free service; that's not any meaningful scale and not a number that can really be cited as a signifier of ability to scale) -- what happens then?

Say you outgrow Simperium, and it's now costing you too much. What happens then?

The above isn't really about Simperium, but really about all backend-as-a-service services. They're great for starting out, but if you actually have a successful product you're going to need to be running your own backend at some point.

If you get too big/successful/etc and things start dying and you end up like Friendster before Facebook decides to buy your company, you're going to look back and wish you built your own backend and knew what was going on behind the scenes.

[+] alanh|14 years ago|reply
Please remove the leading space from your quote.
[+] Void_|14 years ago|reply
This one is the best so far:

1. Works with any JavaScript library

2. You can write server-side logic

3. Operational Transformation baked in

Seriously I cannot find anything bad about this project. (I really wanted to.)

My questions:

1. Is the server side written in Python?

2. Is there offline support for JavaScript apps too?

(I tried this with my own library and I think it was a wrong direction -- it only made it more complicated than it should be)

3. How about relationships in JavaScript objects? How does that work? (I suppose it's all automatic in Core Data objects, but how about JavaScript?)

Thanks!

[+] ecksor|14 years ago|reply
1. Server side is all Python (a combination of Tornado, gevent, and ZeroMQ).

2. We've experimented with it. We'll likely add it as an option in the future, since there may be security concerns depending on the app.

[+] erichocean|14 years ago|reply
I can see a few missing things (correct me if I'm wrong):

1. peer-to-peer sync (no intermediate server)

2. cloud server isn't passive and stateless

No. 2 is very important for scalability. 1 is mostly a nice to have, especially on the WAN, where you can sync directly from, e.g. an iPad to your iPhone.

Both are doable (I did both in HubSync back in 2009).

[+] cloudmike|14 years ago|reply
I'll take number 3. Core Data one-to-many relationships are indeed handled automatically. On the JavaScript side, you'll see these relationships reflected in both directions using unique identifiers, but you need to interpret them accordingly. This is pretty straightforward using collections in libraries like Backbone.js.
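To make the non-Core Data side concrete, here is a minimal sketch (in Python, for brevity) of a one-to-many relationship represented with unique identifiers in both directions. The record shapes, the field names `notes` and `folder`, and the `resolve` helper are assumptions for illustration, not Simperium's actual data format.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]

# Hypothetical synced data: each side stores only the ids of related objects.
folders: Dict[str, Record] = {
    "f1": {"name": "Home", "notes": ["n1", "n2"]},  # "one" side lists its children
}
notes: Dict[str, Record] = {
    "n1": {"title": "Groceries", "folder": "f1"},   # "many" side points at its parent
    "n2": {"title": "Chores", "folder": "f1"},
}


def resolve(ids: List[str], collection: Dict[str, Record]) -> List[Record]:
    """Turn identifiers into records, skipping ids that are not present locally."""
    return [collection[i] for i in ids if i in collection]


# Interpreting the relationship in each direction is a lookup by id.
titles_in_home = [n["title"] for n in resolve(folders["f1"]["notes"], notes)]
parent_of_n1 = folders[notes["n1"]["folder"]]["name"]
```

A collection library such as Backbone.js would play the role of the plain dictionaries here.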
[+] kt9|14 years ago|reply
I missed the part about writing server-side logic. How does this service allow you to write server-side logic?
[+] equark|14 years ago|reply
We desperately need a high quality open-source Firebase/Simperium/Spire. A simple, robust, realtime data synchronization layer with libraries for major platforms. These projects all look great, but outsourcing this infrastructure is simply not feasible for lots of projects.
[+] tarr11|14 years ago|reply
Are there any operational transform libs that are transport independent? Socket.io and SocketStream can handle communication.
[+] brianr|14 years ago|reply
Pricing is free during the beta period.

Simperium team: can you give any guidance on when you'll announce pricing, or roughly what pricing is going to be?

[+] cloudmike|14 years ago|reply
Hey Brian, we know this is really important. Our beta label is mostly for the lack of pricing since the technology itself is production-ready. We're looking for more feedback and we aim to announce pricing soon.

What do you think about Urban Airship's model based on active monthly users? What we like is that the costs are obvious and map clearly to your business.

[+] jazzychad|14 years ago|reply
This looks incredible. My one burning question: For integrations on mobile devices, how does this affect the battery life? I'm concerned that there is some open-ended connection from the client to server listening for data changes which will drain the battery while the app is open.
[+] cloudmike|14 years ago|reply
Good question. We haven't heard any complaints about battery life so far. If you'd like you can check out Simplenote to judge for yourself: http://simplenoteapp.com

You do have control over the granularity of your updates. For example, we spoke to a game developer who would want to disable Simperium while a game is being played, and then enable it again at the end of levels. These coarse changes are supported since they'll resolve automatically when the client comes back online.

[+] JaviSoto|14 years ago|reply
Absolutely priceless! This is an amazing product, and I can not start to imagine what developers are going to be able to do with this. Very exciting!
[+] cbsmith|14 years ago|reply
I'm missing something. This is just the MVC-style observer pattern. I get that it is well packaged/productized, but this is the kind of thing that developers have been doing for decades, with standard libraries for most of that time. Doing it with a browser is a bit newer, but ever since WebSockets it has been commonplace.

A nice product, yes, but I wouldn't expect it to change what developers are going to be able to do.

[+] oacgnol|14 years ago|reply
This almost seems unreal. What kind of latency would you be dealing with when your application(s) deal with many concurrent connections? For example, say a cross-platform multiplayer game with a persistent, shared world?
[+] cloudmike|14 years ago|reply
Latency is good right now (low hundreds of milliseconds or less), but for launch we focused on data integrity and reliability.

That leaves room for improvement, including some low hanging fruit like eventually giving you the ability to disable versioning for certain kinds of data (like multiplayer updates).

[+] saurik|14 years ago|reply
The website mentions that you are using google-diff-match-patch in the JavaScript client to merge changes, and it seems that the API does not require the developer to actually specify things like "the user added an A at this position" as opposed to just "commit the changes to this entire object". Is there a reason other than simplicity for this API (I guess maybe because CoreData doesn't have that abstraction, and I presume your timeline was to start with figuring out how to sync CoreData), and on iOS are you also using google-diff-match-patch (there was not the same explicit mention of it in the documentation there)? (edit: Actually, I guess the comment from zbowling about DiffMatchPatch answers the second half of that. ;P)

BTW, this is generally really awesome: I am (right now, as in I'm sitting there right now ;P) helping teach a class on cloud computing at UCSB that happens to currently be discussing database synchronization and replication; after spending a bunch of time today discussing "how PostgreSQL is implemented and the basis of different isolation levels in the SQL standard and in MVCC" I took the time to tell everyone about Simperium (which probably makes more sense if I mention that I've looked into building something similar before for my projects; I'm glad someone else finally seems to be coming at it from the correct mindset). Everyone here seems to agree: this is going in a great direction.

[+] ecksor|14 years ago|reply
Thanks saurik! Yeah, that's right, we've tried to make it as easy as we can for developers to use. One of our goals has been to let developers work with data as they normally would if it were just local, while we handle figuring out what's changed.
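As a rough illustration of "commit the whole object and let the library work out what changed", here is a minimal Python sketch that compares two snapshots of an object and derives a per-key change set. The function names and the `op`/`value` change format are assumptions made for this example. It is not Simperium's wire format, and it does not do the character-level text diffing that google-diff-match-patch provides.

```python
from typing import Any, Dict

Changes = Dict[str, Dict[str, Any]]


def diff_objects(old: Dict[str, Any], new: Dict[str, Any]) -> Changes:
    """Derive the per-key changes that turn `old` into `new`."""
    changes: Changes = {}
    for key in old.keys() | new.keys():
        if key not in new:
            changes[key] = {"op": "remove"}
        elif key not in old:
            changes[key] = {"op": "add", "value": new[key]}
        elif old[key] != new[key]:
            changes[key] = {"op": "replace", "value": new[key]}
    return changes


def apply_changes(obj: Dict[str, Any], changes: Changes) -> Dict[str, Any]:
    """Apply a change set to a copy of `obj` and return the result."""
    result = dict(obj)
    for key, change in changes.items():
        if change["op"] == "remove":
            result.pop(key, None)
        else:
            result[key] = change["value"]
    return result
```

The developer only ever edits the local object. The sync layer keeps the last known snapshot, calls something like `diff_objects` on save, and sends the change set instead of the whole object.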
[+] cloudmike|14 years ago|reply
Simperium dev here. We're launching our beta to gather feedback. Let us know what you think.
[+] bmelton|14 years ago|reply
Jaw-dropping.

Perhaps it's too late to predict that 2012 will be the year of realtime interaction, but between this, Meteor, Firebase, etc., not only are all the tools converging in that direction, but they all appear to be drop-dead easy to use.

Thank you for this. Is there an IRC channel or Google Group for questions?

[+] akrymski|14 years ago|reply
Guys congrats on the launch! I was wondering what was taking you so long ;) Very exciting stuff!

I'm amazed that we're on the same wavelength - we've had to build very similar infrastructure for ourselves for Unipost (www.post.fm). Can't believe we didn't collaborate on this, we'd happily be your first customer :(

A few interesting differences:

- Our approach is more like Meteor - web only, no iOS support

- The backend is a Python Tornado app that handles validation and conflict resolution before saving stuff to DynamoDB

- We have a JS datastore backed by WebSQL/IndexedDB/memory that syncs with our backend datastore

- We have "live" Backbone collections that update themselves when datastore queries return different results

- We have a Backbone sync adapter that uses the datastore to persist data locally and kick off synchronization

- We sync a subset of the data (e.g. 3 months of mail) - that's a core requirement for us

- We sync all of the tables at once, not per bucket, cause queries are joining tables so the datastore has to be consistent at all times

- No operational transforms cause it doesn't seem to apply to us - pretty "notepad" specific I think

- No versioning as we didn't see benefits for us

- We'll probably open source this stuff when we're done

What do you use for storage?

[+] cloudmike|14 years ago|reply
"I'm amazed that we're on the same wavelength - we've had to build very similar infrastructure for ourselves"

Right, we hear that quite a bit. A few comments:

- Here's our Backbone sync adapter: https://github.com/Simperium/simperium/blob/master/javascrip...

- Dealing with subsets is a priority for us

- OT and versioning are generally helpful for managing changes/deltas

- We're using MongoDB for storage

[+] zbowling|14 years ago|reply
This is interesting. We built something very much like this internally at SeatMe. On the iOS side, it's almost identical. It's how we keep our iPads up to date. Might not of built something internally ourselves a year ago if there was a platform like this.

The biggest difference is that since we are not a general platform, we can make assumptions about the model and how we version each release, and we can build in some constraints and unique security models.

We took a lot of cues from the OData spec and Microsoft's reference design.

The rest of this comment is mostly targeted at the creators of Simperium.

You made very similar design decisions to us in a lot of ways. I see a lot of problems that you will face on this path, though, so I have a few tips for you.

* It sucks the iOS client isn't open source. I get scared of linking in third party libs into iOS projects because I have to account for anything you do when I go to Apple to submit my app.

* You've really got to brush up on the Objective-C naming conventions. Not to be harsh but `-(id)getCustomObjectForKey:(NSString * )key;` makes me cringe.

* Don't require me to have to know about your categories.

* Namespace your categories so you don't smash mine (something like "SP_encodeBase64WithString" instead of "encodeBase64WithString")

* If you include third party libs, you MUST rename them and prefix with your prefix. I see you use ASIHTTPRequest, DDLogger, SocketIoClient, AsyncSocket, Reachability, and a few others. You will smash everyone else's implementations if they already had them included.

* Don't use ASIHTTPRequest internally (it's old and unmaintained and doesn't play nicely with ARC)

* PREFIX ALL YOUR CODE. We don't have any real namespacing in Objective-C. As a framework developer, you have to be aware of that more than anyone else. I shouldn't be seeing DiffMatchPatch and SocketIoClient show up in my symbol list after linking your lib

* Your addDelegate/removeDelegate is funny. After you exhausted the need for one delegate, switch to NSNotificationCenter.

* DON'T USE XIB/NIBs. Interface Builder for iOS was an afterthought and it's only a 90% solution (unlike with Cocoa, where Interface Builder was a first-class product built side by side with AppKit). Especially don't make me have to include your XIBs in my bundle. At the very least give me a bundle with it in it.

* Separate your GIT repo. If I want to include your library as a submodule I have to take all the client libraries as well.

Now when it comes to the actual sync layer and how you generate JSON dictionaries and apply "patches" this is fine code.

Here are some feature requests:

* Instead of having to give you a single NSManagedObjectContext let us register them. We have a few (some use different concurrency types).

* Let us override what gets generated or if we want to ignore a field with userInfo keys in the core data model.

* Let us get an idea of your backend sync processes so we can suspend and start them when we need to and know if anything is pending. Give us callbacks for when you still have things queued and for when syncing is done, so we can set up UIBackgroundTasks on iOS 4+ at our leisure.

On an unrelated note, why not create this as an NSIncrementalStore and just put your code behind the persistent store coordinator instead of monitoring it? We are doing the same as you at the high level because we wrote our code pre iOS 5.0, but iOS 5 gives you an awesome new toy there.

[+] cloudmike|14 years ago|reply
"Might not of built something internally ourselves a year ago if there was a platform like this."

A great quote!

There's a lot of good feedback here, particularly regarding playing more nicely with other code. We'll do a pass, thanks.

* You can add overrides in your model's userInfo but this isn't documented yet. We'll do that after cleaning up the naming a bit more.

* The next major release uses NSNotificationCenter for the reason you mentioned.

* We chose a single repo for now since our samples tend to span languages/devices. We'll revisit this eventually. Your point about submodules is a good one.

* We're not currently using NSIncrementalStore for the same reason as you (needed < iOS 5 support).

[+] Void_|14 years ago|reply
Any argument against using CocoaPods for managing ASIHTTPRequest, Socket.IO, etc.?

Renaming seems like an ugly solution.

[+] saurik|14 years ago|reply
One thing that came up in the discussion about some other related offerings (such as Firebase) is the question: "what is your durability story"? When I'm spending the time to store things for my company, I make certain to ask such questions as "is this data important enough to keep on multiple servers, or is having a backup sufficient".

When storing data in services provided and managed by other companies, it becomes more difficult to be certain that the risk tradeoffs are reasonable, as I'm in essence trusting that you aren't just storing the data on a single server in RAM on memcache or something ;P. Some more details would thereby be much appreciated.

(I understand, btw, that I can also have a server keeping a mirror of all of my data, which is definitely awesome, but I also would then want to get a better understanding of whether I would be a fool not to be doing that, or whether I can feel comfortable with having data stored on your servers ;P.)

[+] andy_gayton|14 years ago|reply
Important question, saurik! Our primary store is MongoDB. All data is replicated across different availability zones, with at least one additional replica kept in a different EC2 region. A running snapshot backup is also taken at least every 30 minutes. Simperium isn't a messaging service: every change is persisted and versioned. So for the Simplesmash demo, you could have a slider that would allow you to go back in time and replay movements. A note on versions: they are stored in an RRD-like fashion. At first every version is kept, and as they accumulate, older versions are compacted and become less granular. We plan on making the level of granularity that is kept an adjustable option in the future.
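For readers unfamiliar with RRD-style retention, the following Python sketch shows one way such compaction could work: recent versions are all kept, while older ones are thinned to a single version per time bucket. The function name, the one-hour window, and the one-day bucket size are assumptions chosen for illustration. The comment above does not describe Simperium's actual thresholds or algorithm.

```python
from typing import Any, Dict, List, Tuple

Version = Tuple[float, Dict[str, Any]]  # (unix timestamp, snapshot)


def compact_versions(
    versions: List[Version],
    now: float,
    keep_all_within: float = 3600.0,   # assumed: keep everything from the last hour
    bucket_seconds: float = 86400.0,   # assumed: older history keeps one version per day
) -> List[Version]:
    """Thin out old versions; `versions` must be sorted oldest first."""
    kept: List[Version] = []
    last_bucket = None
    for ts, snapshot in versions:
        if now - ts <= keep_all_within:
            kept.append((ts, snapshot))       # recent: full granularity
            continue
        bucket = int(ts // bucket_seconds)
        if bucket == last_bucket:
            kept[-1] = (ts, snapshot)         # same bucket: keep only the newest
        else:
            kept.append((ts, snapshot))
            last_bucket = bucket
    return kept
```

Running this periodically makes history coarser the further back it goes, which bounds storage while still allowing a "go back in time" slider over recent activity.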
[+] arturadib|14 years ago|reply
I've been a SimpleNote user for a while, great to see the platform layer finally coming out.

I have a few questions:

- As a potential customer, I'm curious as to how sustainable the company is. I understand there was a YC seed investment in '10, but is the company well funded for the next couple of years at least?

- The platform seems to be ahead of Meteor, Firebase, etc, in that it already seems to have implemented a basic login and security model based on expiring tokens, but from the docs it seems like "finer grained control of these permissions are under development". Does this mean that presently any user can erase/modify data from any bucket, such as global data, data from other users, etc? If so, that's a big deterrent for me.

- Are there any plans to allow querying the data? Key-value stores are fine for simple games, to-do lists, etc, but for any non-trivial app querying arbitrary fields is a basic requirement.

Thanks again, those are some great strides in the right direction.

[+] cloudmike|14 years ago|reply
By "finer grained control" we mean control over expirations and read/write permissions for a particular token. It's definitely not possible to erase/modify data from other users.

An example of where read-only permissions are useful is the live dashboard you see at simperium.com after you sign in. The "number of syncs" and alerts at the top are all pulled live from Simperium, but the token used on that page is a read-only token. We just need to expose the ability to create these read-only tokens to developers.

Actually, as a Simplenote user, you might be interested to know that our alerts and blog posts are pulled from Simplenote via Simperium. When we tag a note as "Alert" or "Published" it instantly appears on the dashboard.

Regarding querying, we're working on something for apps that can't or don't want to keep all data locally. In the meantime you can locally query however you'd like in your database of choice.
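The token permissions described earlier in this comment (expiration plus read/write scope, with no access to other users' data) can be sketched as a simple check. This is a hypothetical Python model: the `Token` fields and the `is_allowed` function are assumptions for illustration and do not describe Simperium's auth implementation.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    user: str          # the user whose data this token grants access to
    can_write: bool    # False models a read-only token
    expires_at: float  # Unix timestamp


def is_allowed(token: Token, owner: str, action: str,
               now: Optional[float] = None) -> bool:
    """Decide whether `token` may perform `action` on data owned by `owner`."""
    now = time.time() if now is None else now
    if now >= token.expires_at:
        return False            # expired tokens grant nothing
    if token.user != owner:
        return False            # never touch another user's data
    if action == "read":
        return True
    if action == "write":
        return token.can_write  # read-only tokens stop here
    return False                # unknown actions are denied
```

A dashboard like the one described would be handed a token with `can_write=False`, so it can display live data but cannot modify it.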

[+] saurik|14 years ago|reply
Is every object on a separate timeline for purposes of operational transformation and synchronization? As an example where this would be noticeable, if I make a change to one object, and then a different one, am I guaranteed that everyone will see the first change before the second? Put from the other side, is it possible for me to not have seen changes on one object yet, but end up downloading a fresh copy of a different object that is from a later point in time? Finally, related to that latter point, is it possible to use the data before it has fully finished downloading, in that I might only initially need a subset of the objects (and is it even Simperium's goal to keep the entire data store synchronized, or just the active subset)?
[+] ecksor|14 years ago|reply
Correct, each object is on a separate timeline for purposes of operational transform. If both clients are connected at the time the changes are made, then it shouldn't be possible to see the second change before the first. We keep an ordered history of all changes to all objects, though for efficiency, if a client that wasn't present when changes were initially made syncs, changes to multiple objects may appear in a different order. For example, a modification to object A, then object B, then object A again, would appear in the correct order for all clients when those changes are made. If a different client then syncs and asks for what's changed, we may group the two changes to object A such that there would appear to be two changes, a change to B then a change to A.

There is currently limited support for keeping a subset of data synchronized (we have plans to improve this) - if you're using the client libraries in an app, we've focused on supporting the common case of keeping all the data per bucket for a user synced. In the case of keeping all data mirrored to a backend, we provide an endpoint per bucket that you can listen to all changes for all users so you can keep the entire data store synchronized.
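The A, B, A example above (where a late-syncing client sees a change to B and then a single change to A) can be sketched as a small coalescing step over the ordered history. This Python sketch assumes each change is a dictionary of overwritten fields, which is a simplification. The `coalesce` function is hypothetical and says nothing about how Simperium actually merges deltas.

```python
from typing import Any, Dict, List, Tuple

Change = Tuple[str, Dict[str, Any]]  # (object id, fields that changed)


def coalesce(history: List[Change]) -> List[Change]:
    """Group changes per object, ordered by each object's most recent change."""
    merged: Dict[str, Dict[str, Any]] = {}
    for obj_id, delta in history:
        # Popping and re-inserting moves the object to the end of the dict,
        # so the final order reflects when each object was last changed.
        combined = merged.pop(obj_id, {})
        combined.update(delta)
        merged[obj_id] = combined
    return list(merged.items())


history = [("A", {"x": 1}), ("B", {"y": 2}), ("A", {"x": 3, "z": 4})]
grouped = coalesce(history)
```

Connected clients still receive all three changes in order as they happen. Only a client catching up after the fact gets the grouped form.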

[+] monatron|14 years ago|reply
Excellent product and superb demo.

Question that may be a bit off-topic but one I'd really like to know: what editor are you using there when you're editing the python service? Thanks and congratulations on your launch.

[+] zmmmmm|14 years ago|reply
Something I don't understand: when your fundamental value proposition is moving data between platforms ("everywhere it's needed"), why didn't you consider your MVP to include an Android library?
[+] cloudmike|14 years ago|reply
We agree Android is important, but we found enough developers who thought iOS/OSX, JavaScript and Python were sufficiently compelling, so we focused on those for launch.

Releasing more libraries is a priority.

[+] emersonmalca|14 years ago|reply
That is awesome and looks 300 times easier to implement than iCloud! I wish inClass had Simperium to sync across any device and the web.
[+] gmaster1440|14 years ago|reply
Looks awesome. Quick question: is it possible to mutate data from a JavaScript client? In the API reference, all I see for JavaScript are event listeners. Also, how would a JavaScript client receive an access token? https://auth.simperium.com/ does not allow cross origin requests.
[+] ecksor|14 years ago|reply
Yes definitely, you can update directly using the bucket.update() method, and pass in an id and the data object. The better way would be to implement the 'local' callback to return the data, then just call bucket.update(id), and the library will call your 'local' method to retrieve the current state for that id. You can read about the 'local' callback here: https://simperium.com/docs/reference/js/#local

For auth, we'll be adding cross-origin support for https://auth.simperium.com. Up till now we've been focused on supporting apps with existing backends, which generally use the HTTP API to generate auth tokens from their server.
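The two update styles described above (pass the data explicitly, or register a 'local' callback and pass only the id) follow a common pattern that can be sketched independently of the JavaScript library. The `Bucket` class below is a hypothetical Python stand-in written for this illustration. It is not the Simperium client, and its method names other than `update` are assumptions.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


class Bucket:
    """Hypothetical stand-in that records what would be sent to the server."""

    def __init__(self) -> None:
        self._local: Optional[Callable[[str], Dict[str, Any]]] = None
        self.sent: List[Tuple[str, Dict[str, Any]]] = []

    def on_local(self, callback: Callable[[str], Dict[str, Any]]) -> None:
        """Register a callback that returns the current local state for an id."""
        self._local = callback

    def update(self, obj_id: str, data: Optional[Dict[str, Any]] = None) -> None:
        """Queue an update, pulling state from the 'local' callback if needed."""
        if data is None:
            if self._local is None:
                raise ValueError("no data passed and no 'local' callback registered")
            data = self._local(obj_id)
        self.sent.append((obj_id, dict(data)))


# The app keeps its own state; the bucket asks for it when told an id changed.
app_state = {"note1": {"text": "hello"}}
bucket = Bucket()
bucket.on_local(lambda obj_id: app_state[obj_id])

bucket.update("note2", {"text": "explicit"})  # style 1: pass the data directly
app_state["note1"]["text"] = "hello world"
bucket.update("note1")                         # style 2: library pulls via callback
```

The callback style keeps a single source of truth in the app: the sync layer always reads the latest state at send time instead of a copy captured earlier.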

[+] dbfreq|14 years ago|reply
Are there any recommendations for the maximum size of a data set used with Simperium? It looks like the perfect solution for a back office app for a small business, but I'd like to know if I will be running into a brick wall at some point.

Clients will be iOS and Mac OS X only, using Core Data.

Thanks.

Brad