In all seriousness, though, I wanted this so badly that I started (and failed) a startup with 12 employees nine years ago to build it. It was conceived for use in working with "big data" but the system essentially provided Etherpad-like scrubbable versioning of all common office document formats as a side-effect. All of it was structured in an environment more similar to the social aspects of Github than Dropbox, but you could sync up to your filesystem via a FUSE wrapper. That is, people could easily follow or fork your work in progress. If we'd continued, you'd have been able to accept the equivalent of PRs on your Word docs.
It was so awesome that we couldn't find anyone to pay for it, sadly. Armchair quarterbacks would fairly accuse us of failing to do proper customer development.
I can't speak to the technical limitations of Dropbox's versioning implementation, but given that they already have both viewing AND versioning running for a decade, I honestly can't believe it would take more than a few months for a small team to implement Etherpad-like editing functionality for the office suite document formats.
That sounds pretty cool, though I'd have to think about how such a thing would function with the workflow of my team. I think it could be done though! It sounds like a more reliable way to do shared work on office docs.
I see some issues with it. No matter how smooth you make this, any software with pull requests is going to be considered "technical". Christ, people think basic Excel skills are "technical". So you have to get over that hurdle. But personally, if I've already gotten people over that hurdle, I might as well just use git and LaTeX documents.
I don't know, I think it sounds awesome, but I also think it might be tough to sell.
> but you could sync up to your filesystem via a FUSE wrapper. ... follow or fork ... accept the equivalent of PRs on your Word docs
> It was so awesome that we couldn't find anyone to pay for it, sadly. Armchair quarterbacks would fairly accuse us of failing to do proper customer development.
I get the impression that it was extremely complicated and didn't fit into anyone's workflow. I.e., if you approached someone using Dropbox, they'd have to change far too many habits just to switch to you.
The irony here is that, in their famous YC application, Drew jokes about the ridiculous file-name versioning conventions that Dropbox was going to solve with their product:
> Please tell us something surprising or amusing that one of you has discovered. (The answer need not be related to your project.)
> The ridiculous things people name their documents to do versioning, like "proposal v2 good revised NEW 11-15-06.doc", continue to crack me up.
And yet here we are a decade and change later and Dropbox, while having solved "a" problem, sits like a ridiculous behemoth leaving its users hungry for so many other pain points to be addressed by another savior, including especially this one problem they said they're gonna solve.
Dropbox still hasn't solved many of its core issues but it has been investing in Paper (which I personally have never seen anyone using) and all that design crap from a couple of years ago.
I introduced a lot of people to Dropbox like 8-9 years ago and after using it to share files with other people I found out the hard way it's a terrible tool for that. I then used it for a couple more years to share files between my machines but they have been introducing so much crap in their desktop app that I moved to sync.com.
Former dbx employee here: they always wanted to do it, but it is technically challenging to build a fully functional product here, accounting for things like formatting/comments/etc. When you have such a large enterprise user base there are often trade-offs: ship a basic prototype and risk customer confusion/complaints, or invest lots of resources and draw away from other projects.
This is such a non-argument. They could've easily just started with text-diffs, then photo diffs. And later do doc diffs. Perhaps with some disclaimers. Heck, they don't do that for their main product, so why would they even do that for such a product. It's probably in the terms somewhere.
The reason they're not doing it is because they want a piece of the productivity pie.
It is challenging, very much so, but it can be done. I built a prototype for Word files based on Git (can also use the GitHub API, so making it work with the Dropbox API should be doable). I implemented sort of a blame function as well: Jump to the previous version of a paragraph with just one click.
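To make the "blame" idea concrete, here is a minimal sketch. It is not the prototype's actual implementation: it assumes each saved version is already available as a list of paragraph strings, and the function name `previous_paragraph` is made up for illustration. It walks backwards through the versions and follows a paragraph through unchanged regions until it hits the version where the text was different.

```python
import difflib
from typing import List, Optional, Tuple


def previous_paragraph(
    versions: List[List[str]], index: int
) -> Optional[Tuple[int, str]]:
    """Find the most recent earlier form of a paragraph.

    `versions` is ordered oldest to newest; `index` points into the newest
    version. Returns (version_number, old_text), or None if the paragraph
    was never edited after it first appeared.
    """
    current = len(versions) - 1
    while current > 0:
        older, newer = versions[current - 1], versions[current]
        matcher = difflib.SequenceMatcher(a=older, b=newer, autojunk=False)
        for tag, a1, a2, b1, b2 in matcher.get_opcodes():
            if not (b1 <= index < b2):
                continue
            if tag == "equal":
                # Unchanged in this step: follow it into the older version.
                index = a1 + (index - b1)
                break
            if tag == "replace":
                # Edited in this step: pair it positionally with the old block.
                offset = min(index - b1, a2 - a1 - 1)
                return current - 1, older[a1 + offset]
            # 'insert': the paragraph first appeared in this version.
            return None
        current -= 1
    return None


versions = [
    ["Intro", "Old terms", "End"],
    ["Intro", "New terms", "End"],
    ["Title", "Intro", "New terms", "End"],
]
print(previous_paragraph(versions, 2))  # the paragraph "New terms"
```

Matching whole paragraphs exactly is the crudest possible choice; a real tool would probably want fuzzy matching so that a one-word edit and a moved paragraph are both tracked sensibly.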
As OP said, it took a lot of effort to get the UI ok. Probably takes even more effort to create a great UI, but I guess Dropbox has some resources, right? Shameless plug: Landing page at https://julesdocs.com
If anyone is interested in pushing this forward, I'd love to hear from you (mail address on the landing page)!
They could have released it to personal accounts? Given the rate at which they have embraced and catastrophically abandoned other vastly more fundamental features (Packrat? Photos?), this seems far easier to roll out slowly. Seems more like they've lost their way.
And I will never ever forgive y'all for what you did with mailbox! (Like seriously what did they do?)
A feature like this is VERY application-specific for a lot of file types: you can't just take out the rendering engine, so you would usually need a third party, whether open source or proprietary, to make the software that renders to web views. It's not even as if first-party software is allowed to run on a server. Example: PSD rendering to web. AFAIK Photoshop has no server license, and pretty much all services that need to render PSD files use ImageMagick. I looked it up and, iirc, Photoshop's own API is pretty terrible to interact with, and licensing for servers is weird and expensive even if available.
EDIT: This comment is almost a word salad, I need to sleep lol.
While true, as others have said it could be rolled out piece-meal. For me, for example, simple text diffing would scratch a major itch, both for text and code.
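For a sense of how small the plain-text piece is, here is a minimal sketch using Python's standard `difflib`. It assumes the two versions are already in hand as strings; the helper name `text_diff` and the file labels are made up for illustration, and nothing here reflects how Dropbox stores or serves versions.

```python
import difflib


def text_diff(old: str, new: str, name: str = "document.txt") -> str:
    """Return a unified diff between two versions of a text file."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile=f"{name} (previous)",
        tofile=f"{name} (current)",
    )
    return "".join(diff)


print(text_diff("alpha\nbeta\n", "alpha\ngamma\n"))
```

The hard part of the feature is everything around this: picking which versions to compare and presenting the result in a UI.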
Am I missing something? This already exists in Google Docs. Much easier to implement when the doc is database-backed (recording every keystroke for OT) vs file-based.
The Docs version history tooling is pretty weak, though: you have to repeatedly select a timestamp. You can’t scrub, and you can’t even click on text and get “who put this here, when?”
Yeah, the thing is, you can't do this in a file-format-agnostic way (you need to know what a Word doc or an Excel sheet is), which makes the file system layer the wrong level of abstraction to consider.
Word already has a diff view implementation that is pretty robust - it’s very useful for figuring out what changed across manually-versioned documents. This is in addition to the classic track changes feature.
Adobe Acrobat also has a diff (including visual diff) feature that can be used to do advanced comparisons if necessary.
Granted, the author’s suggestion is more user-friendly and integrated.
My solution is to use pandoc to generate the diffs. It combines the benefits of Word formatting with being able to see the changes in git. (I use it mainly for my resume.)
Have you seen any options take advantage of the fact that docx files are just zipped xml files? I can see the git repo ballooning if you have a few images and you commit frequently!
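On the zipped-XML point, here is a minimal sketch of pulling paragraph text out of a .docx with only the Python standard library, so that ordinary text tools can diff it. It assumes the main body lives at `word/document.xml` inside the archive, which is the usual layout for Word files; formatting, comments, images, headers and footnotes are all ignored. `make_minimal_docx` is a hypothetical helper that exists only to keep the demo self-contained, and what it produces is not a file Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List
from xml.sax.saxutils import escape

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(data: bytes) -> List[str]:
    """Return the plain text of each paragraph (w:p) in a .docx file."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join every w:t inside it.
        text = "".join(node.text or "" for node in para.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs


def make_minimal_docx(paragraphs: List[str]) -> bytes:
    """Hypothetical demo helper: a zip holding only word/document.xml."""
    body = "".join(
        f"<w:p><w:r><w:t>{escape(p)}</w:t></w:r></w:p>" for p in paragraphs
    )
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    return buffer.getvalue()


sample = make_minimal_docx(["Summary", "Ten years of R&D"])
print(docx_paragraphs(sample))
```

Extracting text this way helps with readable diffs, but it does nothing about repository size: the embedded images are still stored in the zip on every commit.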
> ...but this is useless. Timestamps??? Tell me what changed! Let me see the changes over time. Word has a change tracking feature, but my PhD in computer science isn't enough for me to figure it out.
> But but but Austin, you should be using a proper version control system! Just use Git and GitHub!
Found that aside curious, as track changes in Word is a first class versioning implementation with word processing and editors savvy, just as Git is a first class versioning implementation that's code lines and commits savvy.
Surely headspace around track changes is less "PhD" than git.
I can see a company like GitHub or Dropbox developing visual versioning and promoting it to make users dependent upon it. It would be an extremely sticky feature that made it hard for users to like competing products.
Imagine how GitHub could push for MS Office integration and become a versioning powerhouse for non-code-stuff.
But I can't see it standing as a stand-alone product that people would really pay for. It has to be part of something else.
Version handling is built into Office 365, and many comments here indicate it's even in the relatively crappy Google Docs, but I'm sure there's a market for pretending it isn't and selling incredibly shitty half-baked attempts as a B2B SaaS offering (this is not a sarcastic "I'm sure", I know about this market space and it disgusts me on a deep level)
I’m a lawyer who uses Word’s track changes as an integral component of my work. I haven’t seen a single meaningful improvement in that feature in 20 years. Right down to the fact that I have to open “compare...” from within a document, but then have to go hunt the same document down in the file system to set it as the original. Don’t get me started on every other reason that Word has failed to innovate on this front.
The solution is to dump Office and use text files, if you can get away with it.
We use Office 365 at my workplace and I have found the version handling to be lacking. AFAIK it can’t diff two versions of a document in a convenient way. Another gripe I have is that it is not, AFAIK, possible to tag/name versions.
This looks similar to redline and blackline document comparisons[1]. We do this on our site[2] where we display large financial documents that average 100 pages. Identifying what text and tables were removed, added and changed from one year to another is useful information for predicting future company earnings[3]
Modern MS document files are zipped XML. To do this comparison they would need to unzip each file, run it through a rendering engine and hold it in memory, and then do version comparison. For this to be feasible you would need to use a file format that supports this sort of comparison in a way that isn't very resource intensive.
It's not that; it's not as if 100% of your users will be diffing documents 100% of the time. The real reason is that office formats are super, super complex and diffing them is a hard problem, even more so for the proprietary Microsoft formats.
The "zipped XMLs" you mention are basically XML dumps of the former binary format that evolved organically from the 1980s, when resources were scarce and they had to hack together a working office solution.
Not all of them. I believe Microsoft uses a special format for Office documents in OneDrive. (These files are converted to xml when you access them with non-Microsoft software)
I’d also like to add on a different note, I don’t really get why git can’t support docx, pptx, and xlsx. They’re open standards not binary blobs. Basically just zipped xml.
Cloud word processors like Zoho Writer & Google Docs already have version comparison features. But this idea of a sliding time traveler for documents is very intuitive!
Also Zoho Writer has a combine feature, that lets you upload a docx and combine it with another docx - with the changes highlighted as tracked-changes. Pretty handy for comparing docx files.
[+] [-] peteforde|6 years ago|reply
[+] [-] haddr|6 years ago|reply
[+] [-] Enginerrrd|6 years ago|reply
[+] [-] gwbas1c|6 years ago|reply
[+] [-] fragmede|6 years ago|reply
[+] [-] ramraj07|6 years ago|reply
https://www.ycombinator.com/apply/dropbox
[+] [-] pier25|6 years ago|reply
[+] [-] anvisha|6 years ago|reply
[+] [-] jbverschoor|6 years ago|reply
The reason they're not doing it is because they want a piece of the productivity pie.
They're not getting it from me. Ever.
[+] [-] jandinter|6 years ago|reply
[+] [-] ramraj07|6 years ago|reply
[+] [-] gwbas1c|6 years ago|reply
Sadly, we never developed it.
But, yes, I totally agree with your rationale. This is the kind of rabbit hole that can quickly turn into a distraction.
[+] [-] BLanen|6 years ago|reply
[+] [-] mercer|6 years ago|reply
[+] [-] rsync|6 years ago|reply
[1] https://www.rsync.net/resources/howto/remote_commands.html
[+] [-] fragmede|6 years ago|reply
[+] [-] narak|6 years ago|reply
[+] [-] aaronharnly|6 years ago|reply
[+] [-] cwyers|6 years ago|reply
[+] [-] typicalhn|6 years ago|reply
[deleted]
[+] [-] Shorel|6 years ago|reply
Dropbox could just buy one of these companies and work on integrating the solution with its platform.
All arguments about the complexity of this feature are bogus when it has been solved several times by different vendors over the last decades.
One such tool found via DDG: https://draftable.com/compare
[+] [-] cpach|6 years ago|reply
[+] [-] drglitch|6 years ago|reply
[+] [-] cpach|6 years ago|reply
It does? Didn’t know that. How does one activate it?
[+] [-] vivekkalyan|6 years ago|reply
I wrote up about it here for the curious: https://www.vivekkalyan.com/using-git-for-word
[+] [-] ramraj07|6 years ago|reply
[+] [-] woozyolliew|6 years ago|reply
[+] [-] Terretta|6 years ago|reply
[+] [-] willvarfar|6 years ago|reply
[+] [-] cpach|6 years ago|reply
IMHO, if it could be smoothly integrated with e.g. Git, then there would probably be quite a few companies that would pay good money for it.
[+] [-] juped|6 years ago|reply
[+] [-] nihonde|6 years ago|reply
[+] [-] cpach|6 years ago|reply
[+] [-] bnj|6 years ago|reply
[0]: https://etherpad.org
[+] [-] usaar333|6 years ago|reply
[+] [-] hbcondo714|6 years ago|reply
[1] https://en.wikipedia.org/wiki/Document_comparison
[2] https://Last10K.com/compare.gif
[3] https://www.bloomberg.com/opinion/articles/2018-05-22/10-k-c...
Update: The site in reference is https://Last10K.com
[+] [-] cpach|6 years ago|reply
[+] [-] Lanrei|6 years ago|reply
[+] [-] oblio|6 years ago|reply
https://www.joelonsoftware.com/2008/02/19/why-are-the-micros...
[+] [-] bambax|6 years ago|reply
If you would simply render each version to plain text and compare them (which is a solved problem), it would already be very useful.
[+] [-] jannes|6 years ago|reply
[+] [-] vxNsr|6 years ago|reply
It’s obvious UI on the level of pinch to zoom and mouse input. Hard to come up with but obviously the right choice once suggested.
[+] [-] vxNsr|6 years ago|reply
[+] [-] forrestthewoods|6 years ago|reply
[+] [-] lewisjoe|6 years ago|reply
https://writer.zoho.com
[+] [-] Pfhreak|6 years ago|reply
I don't think I'd want a scrub bar like that though, maybe? I suppose I've never tried it.
[+] [-] kurthr|6 years ago|reply
[+] [-] dexen|6 years ago|reply
OHTF Vg'f uneq gb hfr guvf pbzznaq jvgubhg fvatvat.
--
[1] http://man.cat-v.org/plan_9/1/yesterday