2 Comments

I think the fix for the bug just happened to work on x86_64, and isn't guaranteed to work on every architecture. In particular, notice that if you had made the second call to getEnd be sequentially consistent instead of setBase, then the assembly would have been unchanged and the bug would have remained, despite the explanation making it sound like that would have fixed it too. The problem is that making an operation sequentially consistent has no effect on operations on different variables that aren't sequentially consistent. I think the best way to fix the problem is just to make both of those functions be sequentially consistent. Then the C++ standard guarantees that they'll remain in order even on other compilers and architectures you haven't thought of, and it shouldn't make your code any slower, since the generated assembly will still be what it is now.

Expand full comment

Thanks. You are right!

I think my preferred solution is probably to allow all of the accesses to be relaxed, but then explicitly add a

std::atomic_thread_fence(std::memory_order_seq_cst);

(https://en.cppreference.com/w/cpp/atomic/atomic_thread_fence)

I may revise the post (or follow up with another), since "It was still wrong, this is hard!" is definitely on message :-)

Expand full comment