top | item 23059560

chrfrasco | 5 years ago

> The second problem is that at least the Rust code is decoding UTF-8 every iteration of the inner loop instead of decoding once and saving the result, or even better interning the characters and having versions of the inner loop for 32-bit chars and 8-bit and 16-bit interned indexes.

I tried this. Pulling the .chars() call out of the loop & collecting into a Vec made the performance even worse – the following balloons the runtime from ~2.7s to ~5s:

    let target_chars: Vec<char> = target.chars().collect();
    for (i, source_char) in source.chars().enumerate() {
        let mut next_dist = i + 1;

    for (j, target_char) in target_chars.iter().enumerate() {
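For context, the excerpt above is truncated. A complete single-row Wagner-Fischer implementation following the same pattern might look like the sketch below; everything outside the quoted lines (the `distances` buffer, the variable names, the min logic) is my reconstruction, not the post's actual code:

```rust
// Hypothetical reconstruction around the quoted excerpt; names outside
// the quoted lines are assumptions, not the post's actual code.
fn levenshtein(source: &str, target: &str) -> usize {
    let target_chars: Vec<char> = target.chars().collect();
    let m = target_chars.len();
    // One row of the DP table, allocated fresh on every call.
    let mut distances: Vec<usize> = (0..=m).collect();

    for (i, source_char) in source.chars().enumerate() {
        let mut next_dist = i + 1; // D[i+1][0]

        for (j, target_char) in target_chars.iter().enumerate() {
            let prev_dist = distances[j]; // D[i][j] from the previous row
            let substitution = prev_dist + usize::from(source_char != *target_char);
            let deletion = distances[j + 1] + 1; // D[i][j+1] + 1
            let insertion = next_dist + 1;       // D[i+1][j] + 1
            distances[j] = next_dist; // store D[i+1][j] before overwriting
            next_dist = substitution.min(deletion).min(insertion);
        }
        distances[m] = next_dist;
    }
    distances[m]
}
```

Note that this version still performs two heap allocations (`target_chars` and `distances`) on every call, which is relevant to the allocator discussion below.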
> written mindlessly

> incompetence of the person

No challenge there :P I am operating under the assumption that I don't need to understand how compilers work to get good performance from Rust (where "good" is "similar enough to an equivalent Go program").


burntsushi | 5 years ago

Indeed, it looks like doing the UTF-8 decoding up-front is exacerbating the performance difference in the allocator.

I think this is where the GP's first suggestion comes into play. If one were writing this code _and_ cared about performance, then you'd usually find a way to reuse allocations. I submitted a PR to demonstrate this: https://github.com/christianscott/levenshtein-distance-bench...
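One common way to reuse allocations (a hedged sketch of the general technique, not the actual code in the linked PR) is to hoist the two per-call buffers into a struct, so that repeated calls after warm-up allocate nothing:

```rust
// Sketch of allocation reuse; this is an illustration of the general
// technique, not the code from the linked PR.
struct Levenshtein {
    target_chars: Vec<char>,
    distances: Vec<usize>,
}

impl Levenshtein {
    fn new() -> Self {
        Levenshtein { target_chars: Vec::new(), distances: Vec::new() }
    }

    fn distance(&mut self, source: &str, target: &str) -> usize {
        // clear() keeps the existing capacity, so after the first few calls
        // these extend()s reuse the same backing memory.
        self.target_chars.clear();
        self.target_chars.extend(target.chars());
        let m = self.target_chars.len();

        self.distances.clear();
        self.distances.extend(0..=m);

        for (i, source_char) in source.chars().enumerate() {
            let mut next_dist = i + 1;
            for (j, &target_char) in self.target_chars.iter().enumerate() {
                let prev_dist = self.distances[j];
                let substitution = prev_dist + usize::from(source_char != target_char);
                let deletion = self.distances[j + 1] + 1;
                let insertion = next_dist + 1;
                self.distances[j] = next_dist;
                next_dist = substitution.min(deletion).min(insertion);
            }
            self.distances[m] = next_dist;
        }
        self.distances[m]
    }
}
```

The point is that the allocator only gets involved when a target longer than any previous one forces the buffers to grow, which is why this pattern tends to sidestep per-call allocator differences entirely.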

chrfrasco | 5 years ago

Yeah, I can definitely see how that would be a more performant approach.

I suppose I wasn't so interested in making this algorithm as fast as possible as I was in understanding why this particular implementation was slower.

I'm not totally convinced that this difference is down to the string being parsed over and over, though.

> doing the UTF-8 decoding up-front is exacerbating the performance difference in the allocator

This seems to suggest that allocation might be dominating here. WDYT? Either way, I've added a disclaimer to the post.