(A more appropriate but even-less-catchy title might have been ‘accelerating runs from the debugger‘. As elaborated below, these two are not strictly equal).
A common notion is that debug builds can and should carry as much debugging overhead as one can possibly scram in – after all, the point in debug builds is exactly this, debug, and you should never care about their performance. After too many cases of slow-to-the-extent-of-utterly-unworkable builds, I respectfully disagree. In this and the next post, a few techniques to make debug builds run faster are laid out.
Introducing the Windows Debug Heap
As many, many, have already discovered – the WDH is a big deal as far as performance goes, and yet MSDN is unusually terse about it. The HeapSetInformation page says:
When a process is run under any debugger, certain heap debug options are automatically enabled for all heaps in the process. These heap debug options prevent the use of the LFH. To enable the low-fragmentation heap when running under a debugger, set the _NO_DEBUG_HEAP environment variable to 1.
And in some arcane corner of the WinDBG documentation:
Processes that the debugger creates (also known as spawned processes) behave slightly differently than processes that the debugger does not create.
Instead of using the standard heap API, processes that the debugger creates use a special debug heap. You can force a spawned process to use the standard heap instead of the debug heap by using the _NO_DEBUG_HEAP environment variable or the -hd command-line option.
(While the latter was written for windbg, everything except the –hd switch holds equally for VS).
What are these ‘certain heap debug options’? What is the price in performance? Can the WDH be avoided altogether? Stay tuned.
Creating and Avoiding the WDH
The debugger itself calls IDebugClient5::CreateProcess2 which creates a debuggee process with WDH by default. The WDH creation can be bypassed by specifying DEBUG_CREATE_PROCESS_NO_DEBUG_HEAP in the options argument, and the MS debuggers do exactly that when the aforementioned environment variable _NO_DEBUG_HEAP exists and is set to 1.
( I suspect the underlying appartus is that CreateProcess with the DEBUG_PROCESS flag causes windows to check the environment variable _NO_DEBUG_HEAP and decide which process heaps to create, but I didn’t verify).
You can set this environment variable either globally for the machine (as I do) or in a specific debug session via the project properties:
What the WDH Does
- The only documented effect is disabling the LFH – which makes sense, as these are mutually exclusive heap layouts. You do lose some speedups by dropping the LFH but by and large this is a negligible factor compared to the others.
2. On every allocation the memory manager initializes every allocated DWORD to 0xbaadfood, and on every deallocation sets the memory to 0xfeeefeee – in addition to some bookkeeping just after the allocated chunk. Here’s the normal view:
And here’s the view with _NO_DEBUG_HEAP=1:
These magic numbers can help in some debugging scenarios – use of uninitialized heap memory, and usage after free – but truth be told, they rarely do. Here are some more details. Most of the extra time, however, is not spent there.
3. On every memory operation, the WDH walks the heap and checks for integrity! To observe, add some corruption:
Now run again with _NO_DEBUG_HEAP set to 1 – and watch the assertion vanish.
Err, this stuff actually sounds useful. Sure I should I disable it?
For regular C++ applications – beyond a doubt, yes.
the CRT delivers identical functionality, on top of the windows debug heap, with different magic numbers: 0xcdcdcdcd for fresh allocations and 0xdddddddd for freed memory. If you leave the WDH on you’re initializing memory chunks twice – and worse, checking heap integrity – for each allocation. In regular development scenarios WDH is just empty, very expensive overhead.
By ‘regular’ C++ programs I mean those that don’t do anything fancy with the heap and just stick to the built in CRT heap. You can overload new/delete, as long as your overloads eventually call the shipped new/debug/malloc/free, or some dbg/aligned siblings.
One potential argument in favour of leaving the WDH on is that unlike the CRT debug heap the WDH is operational in release builds also, but (1) it is disabled for any launch outside a debugger anyway, (2) in the extremely unlikely case that you’d require memory integrity checks but don’t want to run a debug build, I would suggest just editing your debug configurations to include optimizations. (add /O2).
Oh, and in our applications setting _NO_DEBUG_HEAP=1 accelerated some runs by a factor of 10. Nough said.
Edit (Oct 6 2014):
Remarkably, 3 weeks after initially publishing this post it seems the VC team themselves agree. Beginning with VS “14”, the WDH will be opt-in, not opt-out – as it ought to be.