brianwski | 4 years ago
>
> This seems needlessly dismissive. I feel like they definitely know that you can execute the same binary multiple times.
I'm the programmer at Backblaze who made the copies on purpose. I wrote some extra code to do this, and it's meant to help us debug certain things. Yes, they are identical: the installer only ships with one copy of the executable, and the installer then makes the copies on purpose. I get to explain this from time to time. :-)
In Windows when you want to know what is going on behind the scenes, you can bring up Task Manager and look at the different names of the different processes that are running. On the Macintosh this is called Activity Monitor, same sort of thing. The different names for the executables are for different "threads" which have different roles. Backblaze is multi-threaded to get higher performance.
The parent coordination process is called "bztransmit". But when doing the actual transmission it spawns the bztrans_thread01, bztrans_thread02, bztrans_thread03, etc.
So BEFORE I made multiple copies of the executables with different names, a customer would say "bztransmit is hung" or "bztransmit is using up too much memory". There was very little visibility into this. But now that I made multiple copies with different executable names, when the same customer says "bztrans_thread03 is hung" or "bztransmit is using too much memory", we have immediately narrowed down what to look at.
Here is a screenshot showing what "Chrome" looks like to me in Windows, and how it compares to what Backblaze's bztransmit looks like to me in Windows: https://i.imgur.com/KOJHJ9Q.jpg In that screenshot, you can see there is the "main thread", and "worker threads". Meanwhile Chrome is just one big list of processes all named the same thing (see the screenshot). I prefer the Backblaze system, but I understand it upsets some customers that prefer the Chrome experience.
That's it. It's not some huge mystery.
One question asked here was do we know you can launch the same executable twice? Yes, and we do that. The bztrans_thread05 is launched for thread 05, thread 25, thread 45, thread 65, etc. It's THREAD_NUMBER mod 20. Here is what it looks like to hit 500 Mbits/sec upload speeds, this isn't photoshopped, it's a real screenshot on my development computer: https://i.imgur.com/hthLZvZ.gif
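To make the "mod 20" reuse concrete, here is a minimal sketch in Python. It is not the actual Backblaze code: the function name, the constant, and the zero-padded two-digit format are assumptions made for illustration, based only on the executable names mentioned above.

```python
# Hypothetical illustration of "THREAD_NUMBER mod 20" name reuse.
# The constant and function names are made up for this sketch.
WORKER_EXECUTABLE_COUNT = 20


def worker_executable_name(thread_number):
    """Map a logical thread number to one of 20 reusable executable names."""
    slot = thread_number % WORKER_EXECUTABLE_COUNT
    # Assumed naming format: "bztrans_thread" plus a zero-padded slot number.
    return f"bztrans_thread{slot:02d}"


if __name__ == "__main__":
    # Threads 5, 25, 45 and 65 all share the same executable name.
    for thread_number in (5, 25, 45, 65):
        print(thread_number, worker_executable_name(thread_number))
```

In this sketch a thread number that is an exact multiple of 20 lands in slot 00. How the real client names that case is not stated in this thread.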
Another question is: why not use hard links or symbolic links? That's the only real optimization possible here; everything else was on purpose. The answer is not an excuse, it's just an explanation if you are curious. The software we develop at Backblaze is cross platform, so what we like to do is make the most general form first that will always work. Then, if customers complain or we want to refine it, we add special-purpose code per file system or per platform. The most general thing to do is make full copies. We could then go on to make links on the Mac WHEN POSSIBLE and the equivalent on Windows WHEN POSSIBLE, but it never became a large priority. The reason I can't use one technology is that we support several file systems, and not all of them are the same or support the same technology for links.
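For the curious, the "links WHEN POSSIBLE" refinement could look roughly like the Python sketch below: try a hard link, and fall back to the always-works full copy if the file system refuses. This is an illustration under assumptions, not what the Backblaze installer does; `link_or_copy` is a hypothetical helper name.

```python
import os
import shutil


def link_or_copy(source, destination):
    """Create `destination` as a hard link to `source` if possible.

    Falls back to a full copy when linking fails, for example when the
    file system does not support hard links or the two paths are on
    different volumes. Returns "link" or "copy" to show which path ran.
    """
    try:
        os.link(source, destination)
        return "link"
    except OSError:
        # The general form that always works: a full copy of the file.
        shutil.copy2(source, destination)
        return "copy"
```

The copy stays as the fallback, so the general case still works everywhere and the link is only a space-saving refinement where the platform allows it.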
Every feature we have is the result of prioritizing it over working on other things. Until recently, we did not have a lot of funding or an infinite supply of programmers, so we had to choose what order to implement each feature in. I'm not saying we got all the priorities correct, or that we did things in the correct order. I don't really even think there is one correct order. For example, some individual home user customers prefer saving 180 MBytes of their valuable boot SSD space over me implementing single sign on for our corporate customers. On the other hand some corporate customers DEMANDED single sign on or they wouldn't purchase the product at all. They are both correct, but there is only one of me, so we made some judgement calls and left the multiple copies and worked on single sign on. Some customers were happy, some are miserable.
We do have open client recs for both Windows and Mac programmers, so if you would like to make a good salary, full benefits, and help us out, come join us! :-)
rgovostes | 4 years ago
Consider also that having multiple distinct binaries interferes with (dis)allowing Backblaze to reach the internet with process-based firewalls like Little Snitch, because each copy needs to be configured separately.
Some tools, like Docker, have a function for collecting relevant diagnostics. Perhaps it would be useful to you to migrate towards a solution like that rather than asking the user to identify a misbehaving process on their own.
brianwski | 4 years ago
Thanks, that was super exciting for us. After 14 years, I claim (and this is controversial) that we're no longer a startup and now we're just a mid-sized publicly traded company. :-)
> process-based firewalls like Little Snitch, because each copy needs to be configured separately
Yeah, that was actually a surprise and unfortunate. What the Mac architect (one of my business partners) and I think is that now that it is nice and stable, we might go down to 1 or 2 bztrans_thread executables, and one bztransmit. That seems like a better tradeoff where we waste much MUCH less disk space, and it is only 3 executables to allowlist in Little Snitch, and it achieves basically what we want now that it's stable and working well.
Originally there were 10 threads MAXIMUM, and we made 10 copies. And each copy was linking with shared libraries so it was only 10 MBytes of disk space which nobody noticed. Then Windows lost their friggin' minds with one of their releases and forced us to link statically which bloated it way up to 5 or 10 MBytes per executable. Then we went to 20 threads maximum and the whole thing was silly. When we went to 100 threads maximum we said "enough" and went to mod 20 for re-using executable names.
By the way, ALL OF THIS could be avoided if Microsoft and Apple provided an API to set the name displayed in Task Manager/Activity Monitor. Maybe that's a security issue, I don't know. But frankly wouldn't it be SUPER TOTALLY USEFUL if Chrome displayed the current web page loaded in the process name of each and every Chrome process? Then you would know which one to kill when something goes sideways.