Ask HN: Block based cloud storage application. Promising - or not?
It works by implementing a massive virtual disk volume (up to 256 Terabytes) formatted with the platform filesystem-of-choice - NTFS/Windows only at this stage.
Block updates are encrypted and compressed on the fly then marshalled and snapshotted before exporting to Amazon S3 (but could be any cloud storage provider).
From the user's perspective, they've just added a massive internal hard disk accessible via the standard filesystem access APIs and tools. One way of thinking about it is as a cloud-backed TrueCrypt, but with full-volume versioning, thin provisioning and compression added.
Its primary benefits are:
* data security (offsite vs local, cloud storage robustness vs hard disk robustness)
* the ability to map and manage storage capacity many orders of magnitude larger than the local storage capacity of the installation platform, i.e. hundreds of terabytes of storage available on a tablet.
* it offers a transparent cloud storage gateway - users can leverage cloud storage via the familiar storage disk model
As opposed to the folder/file representation implemented by the WebDAV/FUSE offerings, the block-based model retains all of the native filesystem attributes associated with the user's files (permissions, encryption, compression) as well as other filesystem metadata (journalling, quotas etc.).
Bandwidth and storage overheads are surprisingly low: an empty 256 terabyte NTFS volume still requires a 600 MB filesystem metadata overhead, yet that formatted volume can be represented using just 150 KB of bandwidth/cloud storage initially, while subsequent snapshots to the same volume only incur ~40 KB each.
So it can readily and efficiently perform regular fine-grained snapshots on users' data, consuming bandwidth and storage in the same ballpark as file-based technologies.
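To illustrate why the snapshots can stay small, here is a minimal Python sketch of a thin-provisioned block store that only exports blocks written since the previous snapshot. It is not the actual implementation: the class name `SparseVolume`, the 4 KB block size and the in-memory dictionaries are all assumptions made for illustration, and the encryption and S3 upload steps are left out entirely.

```python
import zlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for this sketch


class SparseVolume:
    """Thin-provisioned volume: only blocks that were written take up space."""

    def __init__(self, size_bytes):
        self.num_blocks = size_bytes // BLOCK_SIZE
        self.dirty = {}      # block index -> raw bytes written since last snapshot
        self.snapshots = []  # each snapshot: block index -> compressed bytes

    def write_block(self, index, data):
        if not 0 <= index < self.num_blocks:
            raise IndexError("block index outside the volume")
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self.dirty[index] = data

    def snapshot(self):
        """Compress the dirty blocks, store them, and return the bytes stored."""
        snap = {i: zlib.compress(d) for i, d in self.dirty.items()}
        self.snapshots.append(snap)
        self.dirty = {}
        return sum(len(c) for c in snap.values())

    def read_block(self, index):
        """Return the newest version of a block; unwritten blocks read as zeros."""
        if index in self.dirty:
            return self.dirty[index]
        for snap in reversed(self.snapshots):
            if index in snap:
                return zlib.decompress(snap[index])
        return bytes(BLOCK_SIZE)


if __name__ == "__main__":
    vol = SparseVolume(256 * 2**40)           # 256 TiB of address space
    vol.write_block(0, b"\x00" * 4095 + b"\x01")
    stored = vol.snapshot()
    print(f"bytes stored for this snapshot: {stored}")
```

The address space costs nothing until a block is written, and each snapshot carries only the changed blocks, which is the general idea behind the small per-snapshot sizes quoted above.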
So that's what it does - here's where I am at the moment:
When I first started on it (quite some time back), my initial thoughts were to release it as a prosumer product or as a freemium service.
What I have come to realise is that even though tools like DropBox and ZumoDrive are not focused primarily on data security (i.e. they deliver sharing and collaboration), the level of mindshare these services enjoy makes marketing any sort of cloud storage offering an uphill task, especially for a startup.
So I have had cause to stop and re-evaluate whether I should push on or drop it.
Consequently, I'm after feedback:
Is this approach/technology/product promising or a dog?
Are the benefits of full-fidelity data compelling enough to differentiate it?
Is it just a solution looking for a problem?
TuaAmin13|14 years ago
It sounds like your "data security" isn't my data security.
- "robust, efficient, zero-maintenance offsite storage"
- "offsite vs local, cloud storage robustness vs hard disk robustness"
When I say data security I mean that my data is encrypted and locked down six ways from Sunday. "Robustness" isn't security. "Offsite" doesn't mean security. I've got turnstiles, card readers, locked cages, and passwords standing between you and my local data. "Cloud storage robustness". That's security from failures, not necessarily security from hackers, which is what data security means to me. You say it's encrypted but there's no mention of whether this is encrypted storage or encrypted information transfer.
Other than you needing to re-evaluate your buzzwords, it sounds like you have a cloud SAN rather than a cloud NAS. That does sound different enough that you could definitely have a niche to pursue. There are programs that work better with SAN storage than NAS storage, but at some level I'm wondering: If I'm running my own MSSQL server (or something else I'd host in house that needs SAN storage), why would I pay for a local server but remote storage? Why wouldn't I just have a remote server with remote storage or a local server with local storage? I'm not immediately seeing the use case; perhaps you can paint me a word picture. Sure, 256TB is great, but if you need 256TB you probably have more than enough money to buy your own storage, or you have some crazy financial regulations or something that would require you to keep it in house. On the smaller scale, I run into the "Why am I paying to keep a local file server around to distribute this remote block device?"
Again, a solid use-case or pain point may turn me into a believer of why I need this product. From what you've described I'm not seeing it, but I like the technology potential.
ahaslam|14 years ago
Yes - it is a cloud SAN. And it could easily be bundled with a software iSCSI stack in a virtual machine image as a virtual cloud storage gateway appliance for private datacenters. But for enterprise, I don't see the use case as primary storage, i.e. as the underlying storage for a DB. More likely as 'near-line' or archival storage where it is retained in a form that is readily mounted and accessible using standard FS APIs for indexing, data mining etc. As I've been focused on consumer, I haven't done enough research into the comparable cost against other archival mechanisms like tape.
When I built it, I guess I was inspired by addressing my own pain points: I didn't have a flexible, easily interfaced data protection system that automatically made sure I had a copy of my files offsite. Like having the equivalent of a massive offsite external USB drive. I also wanted a solution that kept my files 'live' - rather than locked up in some arcane backup application image.
* Thanks for the feedback on the terms around security - will take this into account.
* Encryption is AES-256 on-the-fly - so it's encrypted before it hits local storage or remote storage - and it uses SSL for the transfer.
* The 256 TB capacity is really just a way of moving the upper limit on capacity so that users don't have to worry about exhausting the available space in the volume - something that file-based solutions don't need to worry about. Users can choose whatever size volumes they want.
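For context on where a 256 TB ceiling can come from, here is a small Python calculation. It assumes the figure corresponds to the commonly documented NTFS implementation limit of 2^32 - 1 clusters at a 64 KB cluster size. The thread does not state the cluster size, so treat both constants as assumptions.

```python
CLUSTER_SIZE = 64 * 1024   # assumed 64 KB NTFS cluster size
MAX_CLUSTERS = 2**32 - 1   # assumed NTFS implementation limit on cluster count
TIB = 2**40                # bytes in one tebibyte


def max_volume_bytes(cluster_size=CLUSTER_SIZE, max_clusters=MAX_CLUSTERS):
    """Largest volume addressable with the given cluster size and count."""
    return cluster_size * max_clusters


if __name__ == "__main__":
    size = max_volume_bytes()
    print(f"max volume: {size} bytes, about {size / TIB:.6f} TiB")
```

Under those assumptions the maximum works out to 256 TiB minus one 64 KB cluster, and smaller cluster sizes lower the ceiling proportionally.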
amock|14 years ago
ahaslam|14 years ago
I believe this class of cloud-backed SAN/NAS products falls under the umbrella of 'hybrid' storage appliances, acting as gateways to easily distribute your data from a private cloud to the public cloud - as well as accelerators by virtue of their caching capabilities.
persona|14 years ago