Accelerating Debug Runs, Part 1: _NO_DEBUG_HEAP

(A more appropriate but even-less-catchy title might have been ‘accelerating runs from the debugger’. As elaborated below, the two are not strictly the same thing.)

A common notion is that debug builds can and should carry as much debugging overhead as one can possibly cram in – after all, the point of debug builds is exactly this, debugging, and you should never care about their performance. After too many cases of builds slow to the point of being utterly unworkable, I respectfully disagree. This post and the next lay out a few techniques to make debug builds run faster.

Introducing the Windows Debug Heap

As many, many people have already discovered – the Windows Debug Heap (WDH) is a big deal as far as performance goes, and yet MSDN is unusually terse about it. The HeapSetInformation page says:

When a process is run under any debugger, certain heap debug options are automatically enabled for all heaps in the process. These heap debug options prevent the use of the LFH. To enable the low-fragmentation heap when running under a debugger, set the _NO_DEBUG_HEAP environment variable to 1.

And in some arcane corner of the WinDBG documentation:

Processes that the debugger creates (also known as spawned processes) behave slightly differently than processes that the debugger does not create.

Instead of using the standard heap API, processes that the debugger creates use a special debug heap. You can force a spawned process to use the standard heap instead of the debug heap by using the _NO_DEBUG_HEAP environment variable or the -hd command-line option.

(While the latter was written for WinDbg, everything except the -hd switch holds equally for VS.)

What are these ‘certain heap debug options’? What is the price in performance? Can the WDH be avoided altogether? Stay tuned.

Creating and Avoiding the WDH

The debugger itself calls IDebugClient5::CreateProcess2 which creates a debuggee process with WDH by default. The WDH creation can be bypassed by specifying DEBUG_CREATE_PROCESS_NO_DEBUG_HEAP in the options argument, and the MS debuggers do exactly that when the aforementioned environment variable _NO_DEBUG_HEAP exists and is set to 1.

(I suspect the underlying apparatus is that CreateProcess with the DEBUG_PROCESS flag causes Windows to check the environment variable _NO_DEBUG_HEAP and decide which process heaps to create, but I didn’t verify.)

You can set this environment variable either globally for the machine (as I do) or for a specific debug session via the project properties (Configuration Properties > Debugging > Environment, with the value _NO_DEBUG_HEAP=1).

What the WDH Does

1. The only documented effect is disabling the LFH – which makes sense, as these are mutually exclusive heap layouts. You do lose some speedups by dropping the LFH but by and large this is a negligible factor compared to the others.

2. On every allocation the memory manager initializes every allocated DWORD to 0xbaadf00d, and on every deallocation sets the memory to 0xfeeefeee – in addition to some bookkeeping just after the allocated chunk. Here’s the normal view:

And here’s the view with _NO_DEBUG_HEAP=1:

These magic numbers can help in some debugging scenarios – use of uninitialized heap memory, and usage after free – but truth be told, they rarely do. Here are some more details. Most of the extra time, however, is not spent there.

3. On every memory operation, the WDH walks the heap and checks for integrity! To observe, add some corruption:

And run:

Now run again with _NO_DEBUG_HEAP set to 1 – and watch the assertion vanish.

Err, this stuff actually sounds useful. Sure I should disable it?

For regular C++ applications – beyond a doubt, yes.

The CRT delivers identical functionality, on top of the Windows debug heap, with different magic numbers: 0xcdcdcdcd for fresh allocations and 0xdddddddd for freed memory. If you leave the WDH on you’re initializing memory chunks twice – and worse, checking heap integrity – for each allocation. In regular development scenarios the WDH is just empty, very expensive overhead.
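For reference when staring at a memory window, the four fill values mentioned so far can be told apart with a trivial lookup. This is only a sketch: DescribeFillPattern is a hypothetical helper, not a Windows or CRT API, and it knows only the patterns named in this post.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of any Windows or CRT API): maps a DWORD seen
// in a memory window to the heap layer that most likely wrote it.
std::string DescribeFillPattern(std::uint32_t value)
{
    switch (value)
    {
    case 0xBAADF00Du: return "WDH: allocated, never written";
    case 0xFEEEFEEEu: return "WDH: freed";
    case 0xCDCDCDCDu: return "CRT debug heap: allocated, never written";
    case 0xDDDDDDDDu: return "CRT debug heap: freed";
    default:          return "no known fill pattern";
    }
}
```

A match is a hint rather than proof, since any of these values can also be legitimate data.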

By ‘regular’ C++ programs I mean those that don’t do anything fancy with the heap and just stick to the built-in CRT heap. You can overload new/delete, as long as your overloads eventually call the shipped new/delete/malloc/free, or some dbg/aligned siblings.

One potential argument in favour of leaving the WDH on is that, unlike the CRT debug heap, the WDH is also operational in release builds. But (1) it is disabled for any launch outside a debugger anyway, and (2) in the extremely unlikely case that you’d require memory integrity checks but don’t want to run a debug build, I would suggest just editing your debug configuration to include optimizations (add /O2).

Oh, and in our applications setting _NO_DEBUG_HEAP=1 accelerated some runs by a factor of 10. Nough said.
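To put a number on this for your own code, a crude allocation micro-benchmark is enough. The sketch below is mine and not taken from any Microsoft sample: TimeSmallAllocationsMs and NoDebugHeapRequested are made-up names, and the default block size is arbitrary.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper: allocates and then frees `count` small blocks and
// returns the elapsed wall-clock time in milliseconds.
double TimeSmallAllocationsMs(std::size_t count, std::size_t blockSize = 64)
{
    using Clock = std::chrono::steady_clock;
    std::vector<char*> blocks(count, nullptr);

    const auto start = Clock::now();
    for (std::size_t i = 0; i < count; ++i)
    {
        blocks[i] = new char[blockSize];
        blocks[i][0] = static_cast<char>(i); // touch the block so the allocation is really used
    }
    for (char* block : blocks)
        delete[] block;
    const auto stop = Clock::now();

    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// True when _NO_DEBUG_HEAP is set to exactly "1" in this process's environment.
// (MSVC may flag getenv with warning C4996; _dupenv_s is its suggested alternative.)
bool NoDebugHeapRequested()
{
    const char* value = std::getenv("_NO_DEBUG_HEAP");
    return value != nullptr && value[0] == '1' && value[1] == '\0';
}
```

Call TimeSmallAllocationsMs(1000000) from main, print it alongside NoDebugHeapRequested(), and launch from the debugger once with the variable set and once without. Absolute numbers will vary wildly between machines; only the ratio is interesting.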

Edit (Oct 6 2014):

Remarkably, 3 weeks after initially publishing this post it seems the VC team themselves agree. Beginning with VS “14”, the WDH will be opt-in, not opt-out – as it ought to be.


Debugging Memory Corruption II

Some years ago I shared a trick that lets you call _CrtCheckMemory from the debugger anywhere, without re-compilation. The updated (as of VS2013) string to type at a watch window is:


Let’s expand on that today, in two steps.

Checking memory on every allocation

The CRT heap accepts a neat little flag called _CRTDBG_CHECK_ALWAYS_DF. Here’s how it is used:

#include <crtdbg.h>

int main()
{
	// Get current flag
	int tmpFlag = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);

	// Turn on corruption-checking bit
	tmpFlag |= _CRTDBG_CHECK_ALWAYS_DF;

	// Set flag to the new value
	_CrtSetDbgFlag(tmpFlag);

	int* p = new int[100]; // allocate,
	p[101] = 1;   // corrupt,    and…

	int* q = new int[100];  // BOOM! alarm fires here
}


Testing for corruption on every allocation can tangibly slow down your program, which is why the CRT allows testing only every N allocations, N being 16, 128 or 1024.  Usage adds half a line of code – pasted from MSDN:

// Get the current bits
int tmp = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);

// Clear the upper 16 bits and OR in the desired frequency
tmp = (tmp & 0x0000FFFF) | _CRTDBG_CHECK_EVERY_16_DF;

// Set the new bits
_CrtSetDbgFlag(tmp);

Note that testing for corruption on every memory allocation is nothing like testing on every memory write – the alarm would not fire at the exact time of the felony, but since your software allocates memory (even indirectly) very often – this will hopefully help narrow down the crime scene quickly.
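The ‘clear the upper 16 bits’ trick works because the check frequency lives in the high word of the flag value. Here’s a small sketch of that layout. The constant values are copied from my reading of crtdbg.h so that the snippet builds on any compiler, and WithCheckFrequency and CheckInterval are hypothetical helpers; in real code include crtdbg.h and use the _CRTDBG_* macros themselves.

```cpp
#include <cstdint>

// Assumed values, mirrored from <crtdbg.h> so the sketch builds without it.
constexpr std::uint32_t kCheckAlways    = 0x00000004u; // _CRTDBG_CHECK_ALWAYS_DF
constexpr std::uint32_t kCheckEvery16   = 0x00100000u; // _CRTDBG_CHECK_EVERY_16_DF
constexpr std::uint32_t kCheckEvery128  = 0x00800000u; // _CRTDBG_CHECK_EVERY_128_DF
constexpr std::uint32_t kCheckEvery1024 = 0x04000000u; // _CRTDBG_CHECK_EVERY_1024_DF

// Same mask-and-OR as the MSDN snippet: keep the low 16 flag bits,
// replace whatever frequency was stored in the high 16 bits.
constexpr std::uint32_t WithCheckFrequency(std::uint32_t flags, std::uint32_t frequencyFlag)
{
    return (flags & 0x0000FFFFu) | frequencyFlag;
}

// The high word holds N itself: 0x0010 = 16, 0x0080 = 128, 0x0400 = 1024.
constexpr std::uint32_t CheckInterval(std::uint32_t flags)
{
    return flags >> 16;
}
```

For instance, WithCheckFrequency(0x00800001u, kCheckEvery16) gives 0x00100001: the old every-128 setting is dropped, the low bit survives, and CheckInterval of the result is 16.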

Checking memory on every allocation – from the debugger

You might reasonably want to enable/disable these lavish tests at runtime.

The debug flags are stored in {,,msvcr120d}_crtDbgFlag, and the numeric value of _CRTDBG_CHECK_ALWAYS_DF is 4, so one might hope that these lines would enable and disable these intensive memory tests:


Alas, this doesn’t work – _CrtSetDbgFlag contains further logic that routes the input flags further to internal variables. The easiest solution is to just call it:


First two lines enable, last two lines disable.  If you’re running with non default flags, the actual values you’d see might be different.


Hidden Tracepoint Keywords

The tracepoints window includes instructions for several special keywords, the most useful by far being $CALLSTACK:


These are not all there are – two more exist: $TICK and $FILEPOS. Quoting the documentation:

$TICK inserts the current CPU tick count, while $FILEPOS inserts the current file position.

$TICK displays a time counter in hex, but otherwise both work as advertised and are documented and official. There is just a good chance nobody knows them, since – reasonably – no one thought of going on MSDN to dig them out, as the dialog itself goes unusually deep into details.


Debugging Handle Leaks

This is all well documented stuff and I won’t go into details – it’s here mostly for self reference (3rd time I had to chase this down in google).

Steps are:

(1) Install WDK to integrate the WinDbg engine with VS (not strictly necessary, but very convenient).

(2) Attach to the debuggee via ‘User Mode’ transport:


(3) Continue execution, and break at the spot where the handle count is at ‘reference’ value.

(4) At the ‘Debugger Immediate Window’ type ‘!htrace -enable’

(5) Continue execution and break at a point where the handle count is supposed to be at reference value but isn’t.

(6) At the ‘Debugger Immediate Window’ type ‘!htrace -diff’.


The offending stack[s] should be visible at the debugger immediate window. If you get garbage, there’s a good chance you’re debugging a 32-bit process on a 64-bit machine.


UseDebugLibraries and Wrong Defaults for VC++ Project Properties

Many of the projects I’m working on seem to have wrong default properties in Debug configuration.  For example, ‘Runtime Library’ is explicitly set to /MDd but defaults to /MD. ‘Basic Runtime Checks’ is explicitly set to /RTC1 but defaults to  none. ‘Optimization’ is explicitly set to /Od but defaults to /O2, and so on:



This recently caused us some trouble, and the investigation results are dumped below.

The direct reason is that these vcxproj’s are missing the ‘UseDebugLibraries’ element, under the ‘Configuration’ PropertyGroup: it should be set to true in Debug and false in Release.   A correct vcxproj should include some elements like –

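<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration">
  <UseDebugLibraries>true</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration">
  <UseDebugLibraries>false</UseDebugLibraries>
</PropertyGroup>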

Most ‘Configuration’ sub-elements (CharacterSet, ConfigurationType etc.) directly control import of custom property sheets, but UseDebugLibraries doesn’t. Instead, it is expected in various hooks around regular property sheets. For example, Microsoft.Cpp.mfcDynamic.props includes the following –

<RuntimeLibrary Condition="'$(UseDebugLibraries)' != 'true'">MultiThreadedDll</RuntimeLibrary>
<RuntimeLibrary Condition="'$(UseDebugLibraries)' == 'true'">MultiThreadedDebugDll</RuntimeLibrary>

Why UseDebugLibraries was missing from some libraries and present in others remained a mystery until I noticed that the younger libraries tended to have this element. Indeed, the real culprit is the migration from VS2008- (vcproj format) to VS2010+ (vcxproj/MSBuild format).  MS’s migration code just did not add this element. The generated projects are functional – they just explicitly set every individual compilation switch affected by UseDebugLibraries, which makes it overly verbose and a bit sensitive – especially in the presence of junior devs who tend to stick to defaults…

So every library you have which is 4Y+ old is susceptible to this migration bug, and I suggest you manually add UseDebugLibraries.  If you have a central prop sheet where you can control multiple projects – add it there.

Not much point in reporting this to MS, is there? The chances of a fix are practically zero, and the issue would get equal web-presence here.


Reading Specific Monitor Dimensions

Almost 2 years ago I wrote about the proper way of getting the EDID – and in particular the physical monitor size. I did leave a loose end:

I actually had to query the dimensions of a specific monitor (specified HMONITOR). This was an even nastier problem, and frankly I’m just not confident yet that I got it right. If I ever get to a code worth sharing – I’ll certainly share it here.

Several commenters requested the full solution, and two years later I noticed this is still the most highly viewed post on this blog – so while I am still uncertain of the solution, it’s worth dumping here in the hope that it does more good than evil out there.

Bridging the HMONITOR and the HDEVINFO Universes

HMONITOR is the primary user mode handle to per-monitor information, dating back to GDI. This is how you specify your monitor of interest:  you can obtain an HMONITOR from a window or list them all and pick the one whose RECT matches a location of interest.

HDEVINFO is a handle to a device information set, the primary device-installation data type. This is what eventually allows you to read the per-monitor EDID and read – among others – the monitor physical dimensions.

I couldn’t find a direct way of obtaining one handle from the other. There are many description strings scattered along structs obtainable from these two data types, and the closest I have to a match are these two routes:





As an example, one of my monitors returns ‘DeviceID’ of:

MONITOR\GSM4B85\{4d36e96e-e325-11ce-bfc1-08002be10318}\0011

and ‘Instance’ of


So DeviceID and Instance share a common substring.    There is probably more robust information in the last substrings (‘0011’, ‘5&273756F2&0&UID1048833’) but Device/Instance IDs are a mess, and I can’t for the life of me find a way to use this extra info.  I suspect (based on this 2010-2013 discussion) it was once possible but Windows 7 broke it.
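In code, ‘the common substring’ is simply the text between the first and second backslash. A minimal sketch with std::string instead of CString (SecondSlashBlock is a name I made up for illustration):

```cpp
#include <string>

// Hypothetical helper: returns the text between the first and second
// backslash of a device ID, or an empty string if there are fewer than two.
std::string SecondSlashBlock(const std::string& id)
{
    const std::string::size_type first = id.find('\\');
    if (first == std::string::npos)
        return std::string();

    const std::string::size_type second = id.find('\\', first + 1);
    if (second == std::string::npos)
        return std::string();

    return id.substr(first + 1, second - first - 1);
}
```

Fed the DeviceID above, this yields GSM4B85, which is the piece that is then searched for inside each device instance ID.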


Teh Codez

Usual disclaimers apply more than ever – your mileage may seriously vary on this one. Please do tell me in the comments if it worked for you.


#include <atlstr.h>
#include <SetupApi.h>
#include <cfgmgr32.h>   // for MAX_DEVICE_ID_LEN
#pragma comment(lib, "setupapi.lib")

#define NAME_SIZE 128

const GUID GUID_CLASS_MONITOR = { 0x4d36e96e, 0xe325, 0x11ce, 0xbf, 0xc1, 0x08, 0x00, 0x2b, 0xe1, 0x03, 0x18 };

// Returns the text between the first and second backslash of sIn
CString Get2ndSlashBlock(const CString& sIn)
{
	int FirstSlash = sIn.Find(_T('\\'));
	CString sOut = sIn.Right(sIn.GetLength() - FirstSlash - 1);
	FirstSlash = sOut.Find(_T('\\'));
	sOut = sOut.Left(FirstSlash);
	return sOut;
}

// Assumes hEDIDRegKey is valid
bool GetMonitorSizeFromEDID(const HKEY hEDIDRegKey, short& WidthMm, short& HeightMm)
{
	BYTE EDIDdata[1024];
	DWORD edidsize = sizeof(EDIDdata);

	if (ERROR_SUCCESS != RegQueryValueEx(hEDIDRegKey, _T("EDID"), NULL, NULL, EDIDdata, &edidsize))
		return false;
	if (edidsize < 69) // too short to hold the size bytes read below
		return false;

	// Bytes 66 and 67 hold the low 8 bits of width and height (in mm); byte 68 packs the upper 4 bits of each
	WidthMm = ((EDIDdata[68] & 0xF0) << 4) + EDIDdata[66];
	HeightMm = ((EDIDdata[68] & 0x0F) << 8) + EDIDdata[67];

	return true; // valid EDID found
}

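bool GetSizeForDevID(const CString& TargetDevID, short& WidthMm, short& HeightMm)
{
	HDEVINFO devInfo = SetupDiGetClassDevsEx(
		&GUID_CLASS_MONITOR, // class GUID
		NULL, // enumerator
		NULL, // parent window
		DIGCF_PRESENT | DIGCF_PROFILE, // flags
		NULL, // device info, create a new one.
		NULL, // machine name, local machine
		NULL);// reserved

	if (INVALID_HANDLE_VALUE == devInfo) // failure is reported as INVALID_HANDLE_VALUE, not NULL
		return false;

	bool bRes = false;

	SP_DEVINFO_DATA devInfoData;
	memset(&devInfoData, 0, sizeof(devInfoData));
	devInfoData.cbSize = sizeof(devInfoData);

	// SetupDiEnumDeviceInfo fails with ERROR_NO_MORE_ITEMS past the last device; stop at the first usable EDID
	for (DWORD i = 0; !bRes && SetupDiEnumDeviceInfo(devInfo, i, &devInfoData); ++i)
	{
		TCHAR Instance[MAX_DEVICE_ID_LEN];
		if (!SetupDiGetDeviceInstanceId(devInfo, &devInfoData, Instance, MAX_DEVICE_ID_LEN, NULL))
			continue;

		CString sInstance(Instance);
		if (-1 == sInstance.Find(TargetDevID))
			continue; // not the monitor we are after

		HKEY hEDIDRegKey = SetupDiOpenDevRegKey(devInfo, &devInfoData,
			DICS_FLAG_GLOBAL, 0, DIREG_DEV, KEY_READ);

		if (!hEDIDRegKey || (hEDIDRegKey == INVALID_HANDLE_VALUE))
			continue;

		bRes = GetMonitorSizeFromEDID(hEDIDRegKey, WidthMm, HeightMm);
		RegCloseKey(hEDIDRegKey);
	}

	SetupDiDestroyDeviceInfoList(devInfo);
	return bRes;
}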

HMONITOR  g_hMonitor;

BOOL CALLBACK MyMonitorEnumProc(
	_In_  HMONITOR hMonitor,
	_In_  HDC hdcMonitor,
	_In_  LPRECT lprcMonitor,
	_In_  LPARAM dwData
)
{
	// Use this function to identify the monitor of interest: MONITORINFO contains the Monitor RECT.
	MONITORINFOEX mi;
	mi.cbSize = sizeof(MONITORINFOEX);

	GetMonitorInfo(hMonitor, &mi);

	// For simplicity, we set the last monitor to be the one of interest
	g_hMonitor = hMonitor;

	return TRUE;
}

BOOL DisplayDeviceFromHMonitor(HMONITOR hMonitor, DISPLAY_DEVICE& ddMonOut)
{
	MONITORINFOEX mi;
	mi.cbSize = sizeof(MONITORINFOEX);
	if (!GetMonitorInfo(hMonitor, &mi))
		return FALSE;

	DISPLAY_DEVICE dd;
	ZeroMemory(&dd, sizeof(dd));
	dd.cb = sizeof(dd);

	// Outer loop: display adapters. Look for the one whose name matches the monitor's device name.
	for (DWORD devIdx = 0; EnumDisplayDevices(NULL, devIdx, &dd, 0); ++devIdx)
	{
		if (0 == _tcscmp(dd.DeviceName, mi.szDevice))
		{
			DISPLAY_DEVICE ddMon;
			ZeroMemory(&ddMon, sizeof(ddMon));
			ddMon.cb = sizeof(ddMon);

			// Inner enumeration: monitors attached to that adapter. Take the first one.
			if (EnumDisplayDevices(dd.DeviceName, 0, &ddMon, 0))
			{
				ddMonOut = ddMon;
				return TRUE;
			}
		}

		ZeroMemory(&dd, sizeof(dd));
		dd.cb = sizeof(dd);
	}

	return FALSE;
}

int _tmain(int argc, _TCHAR* argv[])
{
	// Identify the HMONITOR of interest via the callback MyMonitorEnumProc
	EnumDisplayMonitors(NULL, NULL, MyMonitorEnumProc, 0);

	DISPLAY_DEVICE ddMon;
	if (FALSE == DisplayDeviceFromHMonitor(g_hMonitor, ddMon))
		return 1;

	CString DeviceID;
	DeviceID.Format(_T("%s"), ddMon.DeviceID);
	DeviceID = Get2ndSlashBlock(DeviceID);

	short WidthMm, HeightMm;
	bool bFoundDevice = GetSizeForDevID(DeviceID, WidthMm, HeightMm);

	return !bFoundDevice;
}
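
The EDID arithmetic can be tried without any hardware. Below is a sketch that runs the same byte math on an in-memory buffer. ParseEdidSizeMm and SizeMm are hypothetical names, and the header check and 128-byte minimum are my additions (the base EDID block is 128 bytes and starts with a fixed 8-byte signature), not part of the code above.

```cpp
#include <cstddef>
#include <cstdint>

struct SizeMm
{
    int width;
    int height;
};

// Hypothetical helper: extracts the physical image size, in millimetres,
// from the first detailed timing descriptor of a base EDID block.
bool ParseEdidSizeMm(const std::uint8_t* edid, std::size_t size, SizeMm& out)
{
    static const std::uint8_t kHeader[8] = { 0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00 };

    if (edid == nullptr || size < 128)
        return false;
    for (std::size_t i = 0; i < 8; ++i)
    {
        if (edid[i] != kHeader[i])
            return false; // not an EDID block
    }

    // Bytes 66 and 67: low 8 bits of width and height.
    // Byte 68: upper 4 bits of width (high nibble) and of height (low nibble).
    out.width  = ((edid[68] & 0xF0) << 4) | edid[66];
    out.height = ((edid[68] & 0x0F) << 8) | edid[67];
    return true;
}
```

With bytes 66..68 set to 0x0C, 0x2C, 0x21 this gives 524 x 300 mm. Some displays report zero here, so treat a zero width or height as ‘unknown’ rather than as a real size.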

Blogging 101

This is post #101, which makes the previous post #100.

When I started all this I didn’t think I’d have 100 things to say.  Glad I was wrong, and hope to still have useful things to say for 100 more posts.

Thanks for sticking around!
