Thanks for listing out the points you disagree with! The project is still in early alpha, and so is the README. Anything that is ambiguous or unclear is worth addressing.
This is great, well done. While at (now defunct) Dotscience we did a lot of work on Dotmesh which you might find interesting: https://github.com/dotmesh-io/dotmesh
Programmers, game makers and 2D/3D artists are very different target audiences with very different needs. To name one example, the commit hash integrity that is a foundation of Git is a must-have for software projects, but might not be useful in environments like VFX or CG productions.
JavaScript[0] isn't what I would call a fast storage repository, but I guess it works out for prototyping.
Indeed, TS/JS is great for its quick turnaround times when prototyping. But the I/O itself is handled by the underlying C/C++ layer, and for the rest TS/JS is fast enough. A full C++ backport is still on the horizon.
lucideer|5 years ago
> Why not Git/Git-LFS, libgit2, or SVN?
> Disadvantages:
> (Without Git-LFS): Heavy cost with zipping, packing, and delta-compression for larger files
Given the caveat (without Git-LFS) it seems odd to include this in the list
> If not properly tracked, binaries become accidentally part of "base" history
That's a big "if", and not an inherent problem. This could easily be resolved by any good design-focused UI (e.g. SnowTrack), so this seems a poor argument against using Git as a backend.
> Removing older commits is cumbersome due to Git's commit hashing integrity
This (like the first bullet point) does not apply to Git-LFS.
> Complicated rewriting history procedure
What?
> Issues with binaries >4GB on Windows
A known bug in Git-LFS that they're working to fix. There are workarounds provided in the linked tickets (that could be leveraged by a UI / abstraction layer like SnowTrack).
This is the first item in the bullet list that is a real disadvantage of Git LFS, but the workaround for it seems much less effort than developing a new VCS backend from scratch.
> Slow in binary modification detection
I'm not sure if this applies to Git or Git LFS; there's little detail provided. But if it's significant, this is probably the only really compelling disadvantage listed.
> Git uses a restrictive license
And finally we see the real reason for not using Git.
---
NOTE: I don't mean to make out that building an alternative VCS to Git is not worth pursuing. Nor that it needs any specific justification. Just that listing a justification that seems (to me) mostly disingenuous is worth pointing out.
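On the "slow in binary modification detection" bullet, here is a minimal Python sketch of the two usual ways a tool can decide whether a large file has changed: a cheap check on size and modification time, and a full content hash. It is not taken from Git, Git LFS, or SnowFS; the function names are made up for this example, and SHA-256 is just an assumed choice of hash.

```python
import hashlib
import os


def stat_fingerprint(path):
    """Cheap check: size and modification time only, no file content is read."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def content_fingerprint(path, chunk_size=1 << 20):
    """Expensive check: hash every byte, reading the file in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_modified(path, recorded_stat, recorded_hash):
    """Hypothetical helper: only hash the file when its stat data differs."""
    if stat_fingerprint(path) == recorded_stat:
        return False  # assume unchanged and skip the full read
    return content_fingerprint(path) != recorded_hash
```

The cost of the second function grows with file size, which is why this bullet matters more for multi-gigabyte assets than for source code.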
sebastian_io|5 years ago
The main requirement is performance (a missing point in your list). If Git were a good candidate as a versioning system for DCC software packages, it would have been picked up by now, but that didn't happen, among other things because of the reasons listed above. Git addresses a completely different target audience and lifecycle than SnowFS. The commit hash integrity is a problem in CG/VFX productions, as is the 4GB limitation and the I/O performance for large binary files. The fact that these issues are still there is fully understandable, given the responsibility and dependencies of the Git project. That's why SnowFS tries to address these niche requirements with its light implementation.
In terms of the license, this is intentionally the weakest argument of all. The GPL doesn't prevent anyone from shipping Git as an external program alongside commercial software, and the same goes for libgit2 with its linking exception. So there is not even a real benefit here. But the chosen MIT license is an open invitation for everyone.
P.S. Certain features and technical solutions will be proposed as features to libgit2.
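To illustrate why commit hash integrity gets in the way when old commits have to go: in Git, each commit ID covers its parent's ID, so dropping an old commit changes the ID of everything after it. Below is a toy sketch, assuming a linear history and a made-up `commit_hash` helper. Real Git hashes a full commit object (tree, parents, author, message), not a bare string.

```python
import hashlib


def commit_hash(parent_hash, content):
    """Hash the content together with the parent ID, so each ID depends on all earlier history."""
    payload = (parent_hash or "") + "\n" + content
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def build_chain(contents):
    """Return the list of commit IDs for a linear history."""
    ids = []
    parent = None
    for content in contents:
        parent = commit_hash(parent, content)
        ids.append(parent)
    return ids


# Hypothetical three-commit history with one very large asset in the middle.
history = ["add scene.blend v1", "add 6GB cache", "tweak lighting"]
original = build_chain(history)

# Drop the middle commit (e.g. to reclaim disk space) and rebuild the chain.
pruned = build_chain([history[0], history[2]])

print(original[0] == pruned[0])  # first commit is untouched
print(original[2] == pruned[1])  # "tweak lighting" now has a different ID
```

The last commit has the same content in both histories, yet its ID differs because its parent changed. That is the property software projects rely on and the one that makes pruning old binary revisions awkward.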
mrmrcoleman|5 years ago
I would also look at data science/ML as a potential use for the tool as there are real issues with using Git for training data.
Last point, which is more of a tip: show, don't tell. If you did some side-by-side workflow walkthroughs showing the difficulties with other tools, it would make it easier for people to see that this problem is real (which it definitely is).
sebastian_io|5 years ago
jarym|5 years ago
sebastian_io|5 years ago
amelius|5 years ago
Honestly, I think the effort would have been better spent on an improved version of Git.
lhoff|5 years ago
[0] https://dvc.org/
sebastian_io|5 years ago
iaml|5 years ago
kevlar1818|5 years ago
I'll throw in a shameless plug for my tool in this area, Dud[2]. Dud is to DVC what Flask is to Django.
Are the mentioned benchmarks published somewhere?
[1]: https://dvc.org
[2]: https://github.com/kevin-hanselman/dud
sebastian_io|5 years ago
[git add texture.psd: 20164ms] [snow add texture.psd: 4596ms]
[git rm texture.psd: 575ms] [snow rm texture.psd: 111ms]
[git checkout HEAD~1: 9739ms] [snow checkout HEAD~1: 1ms]
You might get slightly slower speeds on NTFS for 'add' and 'checkout', but it is still very performant.
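For a quick read of those numbers, here is a small Python sketch that turns the timings above into speedup ratios. The dictionary layout and the `speedup` helper are just for this example; the millisecond values are copied from the figures above.

```python
# Timings in milliseconds, copied from the benchmark figures above.
timings_ms = {
    "add texture.psd": {"git": 20164, "snow": 4596},
    "rm texture.psd": {"git": 575, "snow": 111},
    "checkout HEAD~1": {"git": 9739, "snow": 1},
}


def speedup(git_ms, snow_ms):
    """How many times faster the snow command ran compared to git."""
    return git_ms / snow_ms


for operation, t in timings_ms.items():
    print(f"{operation}: {speedup(t['git'], t['snow']):.1f}x")
```

That works out to roughly 4.4x for add and 5.2x for rm. The checkout ratio is less meaningful as an exact number, since a 1ms measurement sits at the resolution of the timer.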
erlend_sh|5 years ago
sebastian_io|5 years ago
pjmlp|5 years ago
In the context of porting to C and C++, or making it execute faster, I can see three options with minor rewrites.
Use AssemblyScript and generate native code via WebAssembly AOT compilers.
Try to adapt the TypeScript-to-C++ compiler from Microsoft's MakeCode project.
Implement your own C++ code generator.
It would be much easier than maintaining multiple code bases in parallel, plus any memory corruption issues would most likely be bugs in the code generator.
[0] - Yes, I know the source code is TypeScript.
sebastian_io|5 years ago