
Keith B. answered 07/05/19
Software Engineer and Math Geek
Calling malloc and then memset is a two-step operation, and depending on how memset is implemented, it most likely has to walk all 268,435,456 elements of your array to set each one to 0, which takes time. calloc (or, as I like to think of it, clear allocation) handles the zeroing as part of the allocation itself, and because it knows the value is going to be 0, it can take shortcuts that a general-purpose memset cannot. For example, on many systems a large block arrives from the operating system already zero-filled, so calloc can skip the explicit clearing pass altogether.
A good compiler may be able to optimize this; try building without debug settings and with optimization turned up (for example, -O2 with GCC or Clang), and see whether your test results change.
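If you want to check this on your own machine, here is a rough sketch of the two allocation paths side by side. The helper names (alloc_zeroed_malloc, alloc_zeroed_calloc, all_zero, time_alloc) are made up for illustration, and the int element type is an assumption, since I don't know what your array actually holds.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: allocate n ints with malloc, then clear them with memset.
 * The int element type is an assumption made for this sketch. */
int *alloc_zeroed_malloc(size_t n) {
    if (n > SIZE_MAX / sizeof(int)) {
        return NULL; /* n * sizeof(int) would overflow size_t */
    }
    int *p = malloc(n * sizeof *p);
    if (p != NULL) {
        memset(p, 0, n * sizeof *p); /* second step: explicit pass over the block */
    }
    return p;
}

/* Hypothetical helper: allocate n zeroed ints in a single calloc call. */
int *alloc_zeroed_calloc(size_t n) {
    return calloc(n, sizeof(int));
}

/* Returns 1 if every element of p[0..n-1] is zero, 0 otherwise. */
int all_zero(const int *p, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Returns the CPU time, in seconds, spent inside one call to alloc(n).
 * The block is freed afterwards; a failed allocation is simply timed as-is. */
double time_alloc(int *(*alloc)(size_t), size_t n) {
    assert(alloc != NULL);
    clock_t start = clock();
    int *p = alloc(n);
    clock_t end = clock();
    free(p); /* free(NULL) is a no-op */
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_alloc(alloc_zeroed_malloc, 268435456) and time_alloc(alloc_zeroed_calloc, 268435456) and comparing the two results should show the gap, if there is one on your system. Keep in mind that this only times the allocation call. If calloc is getting pre-zeroed memory from the operating system, some of the cost may simply show up later, when you first touch the memory.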