But while I'm still excited about the conference and my mind is still processing the thoughts that came to me that day, I'd like to refer to Scott Meyers' talk titled "Support for embedded programming in C++11 and C++14". During his talk Scott presented the auto keyword and constexpr variables and functions.

On one of his slides he showed a typical bug related to assigning the return value of std::vector<T>::size() to an unsigned int, for example:

    std::vector<int> v;
    unsigned int size = v.size();
I've seen such code in C++98/03 code bases plenty of times. Some try to improve the situation by using std::size_t:

    std::vector<int> v;
    std::size_t size = v.size();
Unfortunately, this is still not strictly correct: the actual return type of std::vector<T>::size() is std::vector<T>::size_type, which is only specified to be some unsigned integer type.

Now back to Scott's talk on C++11 and its auto keyword. Scott argued that the auto keyword helps in writing portable (and in some cases more efficient) code, such as:

    std::vector<int> v;
    auto size = v.size();
in which the type of the size variable is deduced by the compiler to be whatever type std::vector<T>::size() returns. I have no doubts: the code looks cleaner and is free of the conversion problem.

But in practice the very first example, which used unsigned int for the type of the size variable, was not that bad. Problems start only when the vector grows beyond UINT_MAX elements. What's the big deal then? How often do we operate on vectors with more than 4 billion elements? Not that often. Hence the vector size example might not convince everyone.

My motivation for this blog post was to show a more common case in which using the wrong type might result in a serious bug. Consider the following code:
    std::string str = "this is my first C++ related post on this blog";
    unsigned int pos = str.find("Java");
    if (pos == std::string::npos)
        std::cout << "Hurray!\n";
    else
        std::cout << "We're out of luck.\n";
As some of you remember, std::string::find() returns std::string::size_type, which for std::string is std::size_t, not unsigned int. The thing is that on some platforms, like 32-bit x86, the two types have the same size, so the code works as expected and prints "Hurray!". However, on x86_64, for example, std::size_t is larger than unsigned int. In that case the if condition in the example evaluates to false regardless of the input and the search strings! This is because std::string::npos is defined to be the greatest possible value of type std::size_t. Assigning it to an unsigned int therefore drops the upper bits of the value, and the truncated result never compares equal to std::string::npos.

The issue becomes even more serious when std::string::find() is put inside a loop to parse a string iteratively:

    std::string str = "this is my first C++ related post on this blog";
    unsigned int pos = str.find(" ");
    while (pos != std::string::npos)
        pos = str.find(" ", pos + 1);
Yes, the code isn't very useful, but the point is that it might result in an endless loop or a crash, and it does so only on certain platforms. I discovered this portability issue some time ago while porting a large code base to a 64-bit platform.
Obviously, as in the case of std::vector<T>::size(), the solution is to use the auto keyword:

    std::string str = "this is my first C++ related post on this blog";
    auto pos = str.find(" ");
    while (pos != std::string::npos)
        pos = str.find(" ", pos + 1);

and the problem is gone.
The conclusion is that, among a few other things, the auto keyword helps in writing more portable code. Now everyone should be convinced of that.

But C++11 is a luxury not everyone can afford... Those who, like me, cannot use C++11 for whatever reason have to spell out the full type, as they must not assume that unsigned int fits everywhere. And the above examples with std::string::find() show that the problem is much more common than one might expect.
very good example!
Excellent example.
Thanks!
+10 for an actual real life example. Send it to Scott so that he can use it the next time instead of the vector one!
I sent it to Scott the day I wrote this article. He appreciated the example very much and promised to use it next time.
But this is only a problem when compiling with a warning level that doesn't scream on assignment to a smaller type, right?
Do you refer to -Wconversion in gcc? Honestly you made me google for it and I just found out about it.
C++11 and 64-bit Issues: http://www.viva64.com/en/b/0253/
nice one!
Note that both gcc and clang catch the npos conversion issue when using -Wconversion see it live on godbolt https://godbolt.org/g/RpcXqJ