Hey all, as some keen-eyed commenters have pointed out, it looks like the Rust program is not actually equivalent to the Go program. The Go program parses the string once, while the Rust program parses it repeatedly inside every loop. It's quite late in Sydney as I write this so I'm not up for a fix right now, but this post is probably Fake News. The perf gains from jemalloc are real, but it's probably not the allocator's fault. I've updated the post with this message as well. The one-two combo of 1) better performance on Linux and 2) jemalloc seeming to fix the issue lured me into believing that the allocator was to blame. I’m not sure what the lesson here is; perhaps more proof of Cunningham’s law? https://en.wikipedia.org/wiki/Ward_Cunningham#Cunningham's_L...
arcticbull|5 years ago
Secondly, you re-implemented "std::cmp::min" at the bottom of the file, and I'm not sure if the stdlib version is more optimized.
Lastly, well, you caught the issue with repeated passes over the string.
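To make the last two points concrete, here is a rough sketch. It is not the code from the post or from the gist; the function names are made up for illustration, and I'm only assuming the repeated passes looked something like calling `chars().nth(i)` inside a loop.

```rust
use std::cmp::min;

// Hypothetical example, not the program from the post.

/// Re-decodes `s` from the start on every iteration, so the loop is
/// quadratic in the number of chars.
fn count_matches_redecoding(s: &str, needle: char) -> usize {
    let n = s.chars().count();
    let mut hits = 0;
    for i in 0..n {
        // `chars().nth(i)` walks the UTF-8 bytes from the beginning each time.
        if s.chars().nth(i) == Some(needle) {
            hits += 1;
        }
    }
    hits
}

/// Decodes `s` once into a Vec<char>, then indexes it in constant time.
fn count_matches_decoded_once(s: &str, needle: char) -> usize {
    let chars: Vec<char> = s.chars().collect();
    let mut hits = 0;
    for i in 0..chars.len() {
        if chars[i] == needle {
            hits += 1;
        }
    }
    hits
}

/// Three-way minimum built on the standard library's `min`
/// instead of a hand-rolled one.
fn min3(a: usize, b: usize, c: usize) -> usize {
    min(a, min(b, c))
}
```

Both counting functions return the same result; only the amount of decoding work differs.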
I've fixed the issues if you're curious: https://gist.github.com/martinmroz/2ff91041416eeff1b81f624ea...
Unrelated, I hate the term "fake news" as it's an intentional attempt to destroy the world public's faith in news media. It's a cancer on civilized society. Somewhere your civics teacher is crying into some whiskey, even though of course you're joking.
[1] http://www.unicode.org/glossary/#unicode_scalar_value
arcticbull|5 years ago
Based on some cursory research, the Go version differs in a more subtle way too. A rune is a code point, which is a superset of the Rust "char" type: a Rust "char" is a Unicode scalar value, while code points also include the surrogate range (U+D800 to U+DFFF).
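A small sketch of the Rust side of that difference, using only `char::from_u32` from the standard library. The helper name `is_scalar_value` is made up for illustration.

```rust
/// Hypothetical helper: returns true if `cp` can be stored in a Rust `char`,
/// i.e. it is a Unicode scalar value.
fn is_scalar_value(cp: u32) -> bool {
    // `char::from_u32` returns None for surrogates (U+D800..=U+DFFF)
    // and for values above U+10FFFF.
    char::from_u32(cp).is_some()
}
```

So `is_scalar_value(0x41)` holds, but `is_scalar_value(0xD800)` does not, even though U+D800 is a code point. A Go rune is an alias for int32, so the type itself does not rule such values out.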
afiori|5 years ago
If we (correctly) rely on the media to bring relevant facts (both criminal and non-criminal) to public attention and to keep a watchful eye on the nation, who then keeps a watchful eye on the media?
Is the model entirely based on there always being enough good journalists to spot the bad ones? How is this affected by the very precarious economics of current internet ads-based, venture-funded media enterprises?
I just blurted out too many questions... What I am trying to say is that, as with the police, there is no easy should-trust/should-not-trust answer (in the US a Supreme Court justice advised people to "not talk to the police").
In that case I guess part of the problem is that the job of the police can be misconstrued as "arresting people". In the same way, the job of a journalist can be misconstrued as "getting clicks".
Overall I don't think we can pass an a priori moral judgement on that term, as it essentially represents a statement that the default safety measures have failed.
(I want to reiterate that I am trying not to mix my point up with whether or not I believe the current use is warranted. I am just saying that, as a concept, it needs to be part of a healthy democracy, the same as some distrust in electoral promises.)
dcow|5 years ago
Common examples:
* Look at this dank "meme".
"Meme" has come to mean "a picture shared on the internet that has words on it".
* Let's [have a] "cheers".
It's a toast. You say "cheers" when you toast.
* You missed Suzie and I's party last night.
It's Suzie and my party. This one is particularly annoying because it's made its way past editors and into writing, screenplays, etc.
masklinn|5 years ago
I don't know that it would be a gain: Rust is pretty good at decoding UTF8 quickly given how absolutely fundamental that operation is, and "caching" the decoded data would increase pressure on the allocator.
Unless you also changed the interface of the levenshtein to hand it preallocated cache, source and destination buffers (or the caller did that).
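Roughly what I mean by handing it preallocated buffers, as a sketch only. `Scratch` and `levenshtein_with` are hypothetical names, and this is a plain single-row Levenshtein, not the implementation from the post.

```rust
/// Hypothetical scratch space owned by the caller and reused across calls,
/// so repeated calls do not have to allocate once the buffers have grown.
#[derive(Default)]
struct Scratch {
    source: Vec<char>,
    target: Vec<char>,
    row: Vec<usize>,
}

/// Levenshtein distance over chars, using caller-provided buffers.
fn levenshtein_with(scratch: &mut Scratch, source: &str, target: &str) -> usize {
    // Decode each string exactly once into the reused buffers.
    scratch.source.clear();
    scratch.source.extend(source.chars());
    scratch.target.clear();
    scratch.target.extend(target.chars());

    let (a, b) = (&scratch.source, &scratch.target);
    let row = &mut scratch.row;

    // row[j] holds the distance between the processed prefix of `a`
    // and the first j chars of `b`.
    row.clear();
    row.extend(0..=b.len());

    for (i, &ca) in a.iter().enumerate() {
        let mut prev_diag = row[0]; // value diagonally up-left of the first cell
        row[0] = i + 1;
        for (j, &cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let current = std::cmp::min(
                std::cmp::min(row[j] + 1, row[j + 1] + 1), // insertion, deletion
                prev_diag + cost,                          // substitution or match
            );
            prev_diag = row[j + 1]; // save the old value before overwriting it
            row[j + 1] = current;
        }
    }
    row[b.len()]
}
```

The caller creates one `Scratch::default()` and passes it to every call. Whether that is measurably faster is a separate question.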
edit: to the downvoter, burntsushi did the legwork of actually looking at this[0] and found caching the decoding to have no effect at best, unless the buffers get lifted out of the function entirely, which matches my comment's expectations.
[0] https://news.ycombinator.com/item?id=23059753
> But yes, I did benchmark this, even after reusing allocations, and I can't tell a difference. The benchmark is fairly noisy.
otterley|5 years ago
It's not Fake News. Fake News is the publication of intentionally false stories. This is just erroneous.
There's a yawning chasm between the two.
arcticbull|5 years ago
will4274|5 years ago
When news organizations take other news organizations' word for it and the story is false, that's fake news. We called it something different back then, but fake news led to the invasion of Iraq. Negligence is sufficient for fake news; malice is not required.
Ar-Curunir|5 years ago