Beginner programmers tend to get confused over the convention that
array indexes start with 0 in languages like C/C++/Lisp rather than the more natural
1 (in Pascal or Lua for example).
After programming for a while, the observant programmer will notice that she tends to use the
half-open range [0, n) in languages with a zero-based index and a
closed range [1, n] in languages with a one-based index.
But which is the better way? Zero-based indexes appear to be foisted on us by the use of
offsets as indexes - a poorly thought-out artefact of the compilation process, so to speak. Surely 1-based indexes are better.
On the other hand, the STL uses half-open ranges for everything. Plus, certain algorithms do seem to be more natural when using the half-open range...
I found (what I consider to be) the definitive answer today!
Here is Edsger Dijkstra's article "Why numbering should start at zero" (PDF).
I'll summarize in point form:
- An inclusive lower bound lets a range begin at the smallest natural number, zero, without needing an out-of-range value (like -1) as the bound.
- Making the upper bound exclusive (the half-open range) lets us write the empty range more naturally as [n,n) instead of [n,n-1].
- The length of the range can easily be found by subtracting the ends of the range.
- Adjacent ranges add up nicely, [a,b)+[b,c)=[a,c), without counting the shared endpoint twice.
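
The points above can be checked with a few lines of Python, whose built-in range(lo, hi) happens to be half-open. This is only a small sketch: the helper name length and the values chosen for n, a, b and c are made up for illustration.

```python
def length(lo, hi):
    """Number of integers in the half-open range [lo, hi)."""
    return hi - lo

n = 5
indexes = list(range(0, n))   # [0, n) covers every index of an n-element array
empty = list(range(n, n))     # [n, n) is the empty range, no n-1 required

a, b, c = 2, 5, 9
# [a, b) followed by [b, c): b appears exactly once in the result
joined = list(range(a, b)) + list(range(b, c))

print(indexes)                                # [0, 1, 2, 3, 4]
print(empty)                                  # []
print(joined == list(range(a, c)))            # True
print(length(a, b) == len(range(a, b)))       # True
```

With the closed range [a, b] the same checks would need a +1 for the length and a -1 somewhere when splitting or joining ranges.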
Considering the points above, it seems that several edge cases of algorithms may become simpler to handle. Hence the preference for 0 over 1 in many programming languages.
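
As one example of an edge case getting simpler, here is a rough sketch of a binary search written over a half-open range. It is my own illustration, not something taken from the article; the function name lower_bound is borrowed from the C++ standard library algorithm of the same name, and the input is assumed to be a sorted list.

```python
def lower_bound(items, target):
    """Return the first index whose element is >= target.

    Assumes items is sorted in ascending order. Returns len(items)
    when every element is smaller than target.
    """
    lo, hi = 0, len(items)           # search the half-open range [lo, hi)
    while lo < hi:                   # lo == hi means the range is empty
        mid = lo + (hi - lo) // 2    # hi - lo is the length of the range
        if items[mid] < target:
            lo = mid + 1             # keep searching in [mid + 1, hi)
        else:
            hi = mid                 # keep searching in [lo, mid)
    return lo

print(lower_bound([1, 3, 5, 7], 5))   # 2
print(lower_bound([1, 3, 5, 7], 8))   # 4
print(lower_bound([], 1))             # 0
```

The empty list and the "not found, past the end" case need no special handling: the loop condition lo < hi already covers the empty range, and the result len(items) is a valid upper bound.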
The article is a superb read. I'll end with a quote from it:
"The moral of the story is that we had better regard - after all those centuries! - zero as a most natural number."