(no title)
aktiur | 10 years ago
In the Numba case, you're basically modifying the image in place: it means no allocating a new array, no full copying. However, your pure-numpy code basically creates a new array (the result of np.dot) before copying it back entirely in image.
If you write the two functions so that they both return a new numpy array and do not touch the original one, the time difference drops from 4 times faster to 2.5 times faster. That's still an impressive difference, but at the loss of a bit of flexibility.
https://gist.github.com/aktiur/e1cddee8f699ded49824
N.B.: numpy.dot does not use broadcasting, i.e. it does not allocate a temporary array to extend the smaller one. The function handles n-dimensional arrays by summing on the last index of the first array, and on the second last of the second array.
onalark|10 years ago
edit: On reviewing, I think the intent of the original blog post was to modify images in place (or at least to do it as quickly as possible with in-place filtering ok). In that case, I think my comparison is fair, since NumPy doesn't offer a faster way to do the requested operation. I didn't try out einsum, but I think Numba would outperform that as well.