Why is printing “B” dramatically slower than printing “#”? (2014)

[+] nsxwolf|9 years ago|reply

Best comment:

"The real answer is clearly because hashes are faster than b trees."

[+] retro64|9 years ago|reply

This reminds me of a bug I had back in ‘98 or so. I was using a Parallax BASIC stamp and could not understand why when I put it to sleep it would sometimes using only 30 uA, and other times 300uA. I kept narrowing down my code until it was utter nonsense, and finally I made it to where if I removed or added a byte of code it would toggle between the two.

I got on the phone with the engineer responsible for the firmware and sent him my code, but I never did find out what the problem was (sorry for the letdown, maybe someone at Parallax remembers and reads Hacker News?). They acknowledged it as a bug, and made a fix. Unfortunately the fix made was to always draw 300uA at sleep!

Fast forward 18 years (!) and from their website it looks like their latest modal draws 50uA, so it was a happy ending after all. The end.

[+] anjc|9 years ago|reply

I love it when somebody immediately knows the answer to something seemingly obscure.

[+] xeniak|9 years ago|reply

Sometimes I think these guys know something obscure, and then post a question about it just to answer it themselves.

SO actually allows this, Q and A from the same account. Some people use sock-puppets though.

[+] yyhhsj0521|9 years ago|reply

He must had a hard time figuring it out when he first encountered the problem..

[+] p333347|9 years ago|reply

I am amused when someone comes up with obscure problems. I am disappointed when that person doesn't brief about the scenario that led them to such a problem.

[+] mrb|9 years ago|reply

This is a great example of a "10x engineer". Someone less experienced would take 10x or 100x the time to investigate and resolve this problem.

[+] atdt|9 years ago|reply

Heh, reminds me of a bug I ran into a couple of weeks ago. Users on Wikipedia were complaining that typing into the editor <textarea> on certain pages was sluggish. I eventually narrowed it down to the presence or absence of an invisible left-to-right or right-to-left mark control characters elsewhere in the page. See https://bugzilla.mozilla.org/show_bug.cgi?id=1296050 for the gory details.

[+] rwmj|9 years ago|reply

My guess was going to be Unicode, the reason why 'sort' on Linux can be so slow if you're in certain locales.

https://bugs.launchpad.net/ubuntu/+source/coreutils/+bug/846...

[+] stygiansonic|9 years ago|reply

So, in effect, with word-wrapping on in the terminal, this scenario effectively produces a "Shlemiel the painter's algorithm" [0] in that for every character output on the same line (when it exceeds the line length), you have go backtracking in vain to find a place to break - where this is none - so outputting N characters on a line is basically an O(N^2) operation.

0. http://www.joelonsoftware.com/articles/fog0000000319.html

[+] heywire|9 years ago|reply

I wonder if this is also why the default terminal in Ubuntu slows to a crawl if you write a ~4kb long line (I was working on a program which was processing some SDR data, outputting 0/1 to the screen without any line breaks).

[+] nkrisc|9 years ago|reply

Maybe it's just not reflected in the question, but did the asker not immediately try other characters to see if there's a pattern? Given the answer, other letters would probably have had the same result and other characters would not. They could have at least narrowed the problem down to letters.

[+] JohnStrange|9 years ago|reply

While we're talking about obscure speed differences, I recently needed to traverse large directories and collect files with a certain suffix and found out that using "find" as a subprocess and parsing its output line-by-line was orders of magnitudes faster than any direct directory traversal I wrote and any other method I've tried, including other Unix utilities.

I don't know how it works but it's insanely fast.

[+] Lanzaa|9 years ago|reply

The find utility is probably smarter than the other methods. I know two things to watch out for: sorting the directory entries and calling stat on each entry.

You might like to read the rationale section of PEP 471: https://www.python.org/dev/peps/pep-0471/

[+] agumonkey|9 years ago|reply

So printing B isn't slower: terminal rendering of lines of text is just less obvious than thought.

[+] __jal|9 years ago|reply

Any time I'm seeing really strange results involving the terminal, I try to look for terminal-related weirdness. There is a lot of stuff going on that people generally ignore or simply don't know about, because it generally just works.

And when it doesn't, if you're not at least a little familiar with what's going on, it can be a deeply confusing and obscure little world.

[+] pedrow|9 years ago|reply

For what it's worth, the difference still exists when printing with Netbeans 8.1 - 41 seconds for 'B' and 2 seconds for '#'

[+] marxidad|9 years ago|reply

That also has to do with line-wrapping of word characters.

[+] acqq|9 years ago|reply

A lot of comments previously on HN here:

https://news.ycombinator.com/item?id=7294186

[+] giomasce|9 years ago|reply

My first guess was that for most fonts the glyph "B" has splines, while "#" does not (only segments) and in line of principle rendering splines is more expensive than rendering segments. But probably this is nowhere that relevant.

[+] bjterry|9 years ago|reply

According to my understanding, the rendered character is cached by the OS, so this probably shouldn't have any effect.

[+] misterdata|9 years ago|reply

That was my first thought too, but I expect glyphs will probably be cached for a while after being rendered, which should make any difference in rendering speed negligible.

[+] Myrmornis|9 years ago|reply

A good first step for investigating this would have been to redirect output to a file.

[+] baristaGeek|9 years ago|reply

Let's talk about any potential application this could have.

The first thing I can think of is that if you're writing a flood-filling algorithm (finding connected components) you should fill with #s instaed of Bs, even if filling with Bs is more conventional. Specially if you're doing competitive programming and need to overkill the optimization process in order to pass the run time.

As for production coding, I can't think of anything. Any ideas?

BTW, this is probably speculation, but it's still worth discussing in my humble opinion.

[+] Sanddancer|9 years ago|reply

Things like printing out progress bars for long operations stand out here. Lots of programs have an option to do such as a quick and dirty way to give feedback.

[+] macromaniac|9 years ago|reply

While stuck at a meeting today I learned you can type faster than notepad can render by making the text really big. (AltSpace X, Alt O, F, S, 600, Enter, spam buttons) The text lags just slow enough so you can use it as a ghetto marquee.

[+] UlyssesSKrunk|9 years ago|reply

Because # is only straight lines, but for b you gotta do the whole bezier curve to make it look right.

54 comments