Palmík

Valgrind and std::copy_backward

Today, while I was enjoying my free time programming, I came across a weird valgrind report. It's reproducible on my machine using this relatively short sample:

#include <algorithm>
#include <iostream>

using std::cout;
using std::endl;

int main()
{
    const std::size_t size = 10;
    int data[size + 1] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

    // The data should now contain 0, 1, 2, 2, 3, 4, 5, 6, 7, 8, 9
    std::copy_backward(data + 2, data + size, data + size + 1);

    for (std::size_t i = 0; i < size + 1; ++i) {
        cout << data[i] << ' ';
    }
    cout << endl;
}

It does print what we would expect: 0 1 2 2 3 4 5 6 7 8 9. For those not familiar with std::copy_backward, I recommend reading the documentation.

Of great importance is the footnote of the C++ standard Section 25.2.2, which says:

copy_backward should be used instead of copy when last is in the range [result - (last - first), result).

Which is indeed our case.
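To see why the direction matters here, this is a minimal sketch with two hand-written loops. The names naive_forward_copy and naive_backward_copy are made up for illustration and are not part of the standard library; they only mimic the order in which std::copy and std::copy_backward visit the elements.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: copies [first, last) to result, front to back,
// the way std::copy walks its input.
void naive_forward_copy(const int* first, const int* last, int* result)
{
    assert(first <= last);
    while (first != last) {
        *result++ = *first++;
    }
}

// Hypothetical helper: copies [first, last) so that the copy ends at
// result_end, back to front, the way std::copy_backward walks its input.
void naive_backward_copy(const int* first, const int* last, int* result_end)
{
    assert(first <= last);
    while (last != first) {
        *--result_end = *--last;
    }
}
```

Applied to the array from the sample (source data + 2 up to data + size, destination starting at data + 3), the forward loop overwrites each element before it has been read and leaves 0, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2. The backward loop reads every element before it is overwritten and gives 0, 1, 2, 2, 3, 4, 5, 6, 7, 8, 9.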

Yet valgrind finds this code problematic. Here is what it has to say about the program's execution:

Source and destination overlap in memcpy(0x7ff00039c, 0x7ff000398, 32)
   at 0x4C28B46: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x400B18: int* std::__copy_move_backward<false, true, std::random_access_iterator_tag>::__copy_move_b<int>(int const*, int const*, int*) (stl_algobase.h:561)
   by 0x400AB8: int* std::__copy_move_backward_a<false, int*, int*>(int*, int*, int*) (stl_algobase.h:581)
   by 0x400A58: int* std::__copy_move_backward_a2<false, int*, int*>(int*, int*, int*) (stl_algobase.h:590)
   by 0x4009E8: int* std::copy_backward<int*, int*>(int*, int*, int*) (stl_algobase.h:625)
   by 0x400902: main (Test.cpp:13)
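The numbers in the first line of the report can be checked by hand. The sketch below assumes that valgrind prints the arguments in memcpy's own order (destination, source, length), which fits the sample: data + 3 and data + 2 are 4 bytes apart, and 8 ints take 32 bytes. The function overlapping_bytes is a hypothetical helper written for this post, not something valgrind provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: how many bytes do the ranges [dst, dst + len) and
// [src, src + len) have in common? Addresses are passed as plain integers.
std::uint64_t overlapping_bytes(std::uint64_t dst, std::uint64_t src, std::uint64_t len)
{
    // Distance between the two starting addresses, whichever comes first.
    const std::uint64_t distance = dst > src ? dst - src : src - dst;
    // Equal-length ranges overlap only if they start less than len apart.
    return distance < len ? len - distance : 0;
}
```

With the values from the report, overlapping_bytes(0x7ff00039c, 0x7ff000398, 32) gives 28: the destination starts one int after the source, so all but 4 of the 32 bytes are shared.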

Interesting, so what's the problem here?

If memcpy is indeed called, then valgrind is right to point this out, since using memcpy in this case leads to undefined behaviour according to the C Standard, Section 7.21.2.1:

If copying takes place between objects that overlap, the behavior is undefined.

Instead of memcpy you should use memmove in such cases. memmove avoids the undefined behaviour by behaving as if the bytes were first copied into a temporary buffer.

But is memcpy really called? Well, not according to the libstdc++ documentation. I have yet to find the strength to read through that mangled code and confirm it firsthand, but at first glance the documentation seems to be correct.

This is line 561 of stl_algobase.h, the line valgrind refers to:

__builtin_memmove(__result - _Num, __first, sizeof(_Tp) * _Num);
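For plain ints, that line amounts to something like the sketch below. The function copy_backward_ints is a hypothetical stand-in written for this post, not the actual libstdc++ code, and it uses std::memmove from <cstring> rather than the compiler builtin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the quoted line: copies [first, last) so that
// the copy ends at result, and returns a pointer to where the copy begins.
int* copy_backward_ints(const int* first, const int* last, int* result)
{
    assert(first <= last);
    const std::size_t num = static_cast<std::size_t>(last - first);
    if (num != 0) {
        // memmove is specified to handle overlapping source and destination.
        std::memmove(result - num, first, sizeof(int) * num);
    }
    return result - num;
}
```

Called as copy_backward_ints(data + 2, data + size, data + size + 1) on the array from the sample, it returns data + 3 and leaves 0, 1, 2, 2, 3, 4, 5, 6, 7, 8, 9 in the array, the same result as std::copy_backward.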

Reproducibility

I could only reproduce it on my local machine with g++ 4.6.2 and valgrind 3.6.1. I could not reproduce it on a school server with g++ 4.4.5, 4.5.1, 4.5.3, or 4.6.1 and valgrind 3.6.0.

I would be glad if anyone tried this and reported back.

Update

This appears to be connected with these two (1, 2) bug reports.