This is good to know, but fairly unsurprising. The interesting case is List<T>. The .Count property is actually a function call, and the value could change during the loop. If you don’t mutate the list, is it smart enough to both inline the function call and hoist the value out as an invariant?
There's no general way for it to prove that you won't modify the list (especially remove elements). The developer would have to express that invariant themselves by, say, iterating over the internal array directly.
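To make that concrete, here's a rough sketch (not benchmarked; `list` is a hypothetical `List<int>` field) of the two shapes being discussed — the property in the condition versus the count hoisted by hand:

```csharp
using System.Collections.Generic;

public class ListSum
{
    private readonly List<int> list = new List<int> { 1, 2, 3, 4, 5 };

    // list.Count appears in the condition, so it is conceptually re-read on
    // every iteration; the JIT can't assume it's invariant, because the list
    // could shrink mid-loop.
    public int SumWithProperty()
    {
        int sum = 0;
        for (int i = 0; i < list.Count; i++)
            sum += list[i];
        return sum;
    }

    // The developer asserts the invariant themselves by caching the count.
    public int SumWithCachedCount()
    {
        int sum = 0;
        int count = list.Count;
        for (int i = 0; i < count; i++)
            sum += list[i];
        return sum;
    }
}
```

Whether the JIT actually inlines and hoists the Count getter in the first version is exactly what a benchmark would have to show.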
Nice benchmarking on the Microsoft .NET CLR. Looks like the JIT compiler is smart enough to recognize that array.Length is an invariant and hoist it out of the loop, which is awesome for the common use cases!
One nitpick about the title: C# runs on more runtimes than the Microsoft .NET CLR, and those may behave very differently. For example: the Mono CLR, or Unity's IL2CPP, which is an ahead-of-time compiler.
Specifically, I'd expect IL2CPP would not hoist length out, because it would not recognize it as an invariant. (Some great examples of IL2CPP cross compilation are here: https://jacksondunstan.com/articles/4749 )
TLDR: the Microsoft JIT compiler makes the local variable unnecessary, but this is a property of the JIT, not of C#. Developers on non-MS platforms shouldn't assume this.
Always accessing array.Length is defensive coding. In the event your array is mutated, always accessing array.Length ensures you won't run into an IndexOutOfRangeException.
Even better is to just avoid accessing the array's length. I almost always use foreach or LINQ.
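For example (a minimal sketch; `array` is assumed to be an `int[]` field, as in the article's benchmarks), both of these avoid touching the length in your own code:

```csharp
using System.Linq;

public class SumExamples
{
    private readonly int[] array = { 1, 2, 3, 4, 5 };

    // foreach never mentions Length; on arrays the compiler lowers it to an
    // indexed loop, so there's no enumerator allocation either.
    public int SumWithForeach()
    {
        int sum = 0;
        foreach (int value in array)
            sum += value;
        return sum;
    }

    // The LINQ equivalent; requires a using System.Linq directive.
    public int SumWithLinq()
    {
        return array.Sum();
    }
}
```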
I'm not a C# developer, but this kind of thing seems to permeate almost all languages in one form or another.
Maybe it is just a style and readability thing; or maybe (as suggested elsewhere as well) it is meant to be reused elsewhere in the system, so it is cached in a variable for later use.
Or, it's possible that at one time - maybe in the early days of .NET - doing it this way was faster, and the habit stuck with developers (perhaps they all read the same knowledge-base article about it?). If that's the case, it's a bit of "premature optimization", but one that apparently doesn't harm anything.
What I do wonder is whether certain other changes could affect the speed.
At least it might be interesting to see in these trivial cases; I admit that in more complex loops it might not be advisable.
But - for instance - what if, rather than iterating through the array from the 0th element to the length of the array, you instead started from the last element and iterated backwards until you hit zero? That way, you wouldn't be checking against the length of the array, but against zero.
The code for such a test might look like:
    public int WithoutVariable() {
        int sum = 0;
        for (int i = array.Length - 1; i > -1; i--) {
            sum += array[i];
        }
        return sum;
    }
I'm not sure that a "with variable" version would make much difference (or sense), but here it is for completeness' sake:
    public int WithVariable() {
        int sum = 0;
        int lastIndex = array.Length - 1;  // note: the last index, not the length
        for (int i = lastIndex; i > -1; i--) {
            sum += array[i];
        }
        return sum;
    }
Again - I'm not a C# developer - maybe my code is wrong above, but hopefully it gets the idea across.
Would this work better? Would it be faster? What would the JIT compiler create? Maybe it wouldn't be any faster or better than the ForEach examples?
I honestly don't know - but if anybody wants to give it a shot, I'd be curious as to the results...
EDIT: I noticed that I said "checking for zero" - but my code actually checks "i > -1" as the boundary; the condition could equally be written "i >= 0;". I'm not sure whether "i >= 0;" or "i > -1;" is faster - another thing to check, I suppose...
"what if rather than iterating through the array from the 0th element to the length of the array, you instead started from the last element and iterated backwards, until you hit zero?"
That used to be common in assembly, as it leads to smaller and faster code on many systems.
I wouldn't naturally think about summing lists in reverse, so it takes more cognitive effort to understand the code. I've seen plenty of termination-condition bugs in reverse iterations, which might back that up.
A related thought is how modern code clean-up tools are doing things like reducing if-nesting e.g. turning this:
    if (open)
    {
        stuff();
        close();
    }
into this, with early returns:
    if (!open) return;
    stuff();
    close();
My feeling is that the first more naturally represents the idea I have of the behaviour, and the second, like your reverse iteration, is an encoded version of that idea. I feel I have to make an extra cognitive step to decode and reassemble it to form an idea of the behaviour.
I suspect the further you stray from the natural idea, the harder the code is to read and validate by eye, and the more likely errors are to creep in. I don't know how subjective this is. I generally don't have a strong feeling about early returns; it's just that I've been noticing the slightly greater cognitive effort they cause me compared to the logical chunking that nested ifs provide.
I feel like I saw something about how the JIT only checks the bounds once when going in reverse, but I can't find anything concrete to back that up, so we'll have to go with the 10-year-old write-up for now saying: no, it is not optimized compared to the Length optimization. https://blogs.msdn.microsoft.com/clrcodegeneration/2009/08/1...
`i > -1` isn't quite right. I'm not sure about how C# does implicit casting, but arrays should be indexed by `usize` and comparing signed w/ unsigned is a bug waiting to happen or UB in other languages.
The difference is that the article is investigating using a property in the boundary check. The boundary check is evaluated with every iteration. Your example is only using it in the initialization expression which, by definition, is evaluated only once. This holds true for virtually any language with a for construct.
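In other words (a sketch; `list` is a hypothetical `List<int>`, and the loop bodies just count iterations to keep the example self-contained):

```csharp
using System.Collections.Generic;

public class ConditionVsInitializer
{
    public static int CountForward(List<int> list)
    {
        int iterations = 0;
        // list.Count sits in the condition, which is evaluated
        // before every iteration.
        for (int i = 0; i < list.Count; i++)
            iterations++;
        return iterations;
    }

    public static int CountBackward(List<int> list)
    {
        int iterations = 0;
        // list.Count sits only in the initializer, which is evaluated
        // exactly once - true in any language with a C-style for loop.
        for (int i = list.Count - 1; i >= 0; i--)
            iterations++;
        return iterations;
    }
}
```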
Avoiding unnecessary work is something all developers should strive for, experienced or not. Maybe in this case it won't make any difference. But, say, a method call in a loop should often be moved outside the loop. Doing it here but not there leads to inconsistent style. Also, depending on the language and compiler, it might actually make a difference sometimes. I'm not going to write code that assumes a certain interpreter/compiler behavior.
If I remember correctly, the runtime keeps the size of the array in the object header, along with the sync block, etc. If you have VS, you can view the object in memory to see the sync block value, array size value, etc.
It is very possible the reason is not speed, but readability. If you simply named it 'length', then sure, there is no point. If it is given a better, more descriptive name, and then gets used in an equation elsewhere in the code, then it may be very useful because it is easier to read.
It will skip the array bounds check as long as myArray.Length is the terminal condition in the for loop, rather than a local variable with the length stored.
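If that's accurate, the two forms below would compile differently - a sketch of both shapes (behaviorally identical; whether the bounds checks actually remain in the second would need verifying with a disassembler):

```csharp
public class BoundsCheckShapes
{
    public static int SumRecognized(int[] myArray)
    {
        int sum = 0;
        // The canonical pattern: myArray.Length in the terminal condition.
        // Per the parent comment, the JIT elides the per-element bounds check here.
        for (int i = 0; i < myArray.Length; i++)
            sum += myArray[i];
        return sum;
    }

    public static int SumWithLocal(int[] myArray)
    {
        int sum = 0;
        int length = myArray.Length;
        // Same behavior, but with the length in a local the pattern may not be
        // recognized, and the per-element bounds checks may remain.
        for (int i = 0; i < length; i++)
            sum += myArray[i];
        return sum;
    }
}
```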
It is odd that C# needs a foreach loop for bounds check elimination.
Java has supported this optimization for a wide range of loop types... since Java 7, I think. Normally I would argue that Java is just ahead of the curve, but Android also gained support for bounds check elimination in 2014-2015.
Either the article does not tell us the whole truth, or the Microsoft JIT is subpar by modern standards.
chrisseaton | 6 years ago:
They shouldn't behave any differently, should they? There's a single language spec.
Someone | 6 years ago:
See https://stackoverflow.com/questions/2823043/is-it-faster-to-..., which also shows how times have changed, with many answers calling this premature optimization.
patsplat | 6 years ago:
It's an interesting analysis and all, but why bother when the language has such an elegant collections API.