Debugging Memory Corruption II

Some years ago I shared a trick that lets you call _CrtCheckMemory from the debugger anywhere, without re-compilation. The updated (as of VS2013) string to type at a watch window is:

{,,msvcr120d.dll}_CrtCheckMemory()

Let’s expand on that today, in two steps.

Checking memory on every allocation

The CRT debug heap accepts a neat little flag called _CRTDBG_CHECK_ALWAYS_DF. Here’s how it is used:

#include <crtdbg.h>

int main()
{
// Get current flag
int tmpFlag = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);

// Turn on corruption-checking bit
tmpFlag |= _CRTDBG_CHECK_ALWAYS_DF;

// Set flag to the new value
_CrtSetDbgFlag(tmpFlag);

int* p = new int[100]; // allocate,
p[101] = 1;   // corrupt,    and…

int* q = new int[100];  // BOOM! alarm fires here

}

Testing for corruption on every allocation can tangibly slow down your program, which is why the CRT allows testing only every N allocations, N being 16, 128 or 1024.  Usage adds half a line of code – pasted from MSDN:

// Get the current bits
int tmp = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);

// Clear the upper 16 bits and OR in the desired frequency
tmp = (tmp & 0x0000FFFF) | _CRTDBG_CHECK_EVERY_16_DF;

// Set the new bits
_CrtSetDbgFlag(tmp);
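To see what that mask-and-OR does to the flag word, here is a small portable sketch. The constants are local copies of the values I believe <crtdbg.h> assigns to these macros (an assumption, so check your own header), and WithCheckFrequency is a hypothetical helper, not a CRT function.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for the <crtdbg.h> macros. Values are assumed; verify against your header.
constexpr std::uint32_t kAllocMemDf     = 0x00000001; // _CRTDBG_ALLOC_MEM_DF (the default flag)
constexpr std::uint32_t kCheckEvery16   = 0x00100000; // _CRTDBG_CHECK_EVERY_16_DF
constexpr std::uint32_t kCheckEvery128  = 0x00800000; // _CRTDBG_CHECK_EVERY_128_DF
constexpr std::uint32_t kCheckEvery1024 = 0x04000000; // _CRTDBG_CHECK_EVERY_1024_DF

// With these values, each macro is simply N shifted into the upper half of the word.
static_assert(kCheckEvery16   == (16u   << 16), "16 lives in the high 16 bits");
static_assert(kCheckEvery128  == (128u  << 16), "128 lives in the high 16 bits");
static_assert(kCheckEvery1024 == (1024u << 16), "1024 lives in the high 16 bits");

// Hypothetical helper: keep the low 16 flag bits, replace the frequency in the high 16.
constexpr std::uint32_t WithCheckFrequency(std::uint32_t flags, std::uint32_t frequencyBits)
{
    return (flags & 0x0000FFFFu) | frequencyBits;
}

void ExampleCheckFrequency()
{
    // Default flags plus "check every 16 allocations".
    assert(WithCheckFrequency(kAllocMemDf, kCheckEvery16) == 0x00100001u);
    // A previously selected frequency is replaced, not OR-ed on top of the new one.
    assert(WithCheckFrequency(0x00100001u, kCheckEvery1024) == 0x04000001u);
}
```

The masking step matters: OR-ing a new frequency onto an old one without clearing the upper bits would produce some other N entirely.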

Note that testing for corruption on every memory allocation is nothing like testing on every memory write – the alarm would not fire at the exact time of the felony, but since your software allocates memory (even indirectly) very often – this will hopefully help narrow down the crime scene quickly.

Checking memory on every allocation – from the debugger

You might reasonably want to enable/disable these lavish tests at runtime.

The debug flags are stored in {,,msvcr120d}_crtDbgFlag, and the numeric value of _CRTDBG_CHECK_ALWAYS_DF is 4, so one might hope that these lines would enable and disable these intensive memory tests:

image

Alas, this doesn’t work – _CrtSetDbgFlag contains further logic that routes the input flags further to internal variables. The easiest solution is to just call it:

image

First two lines enable, last two lines disable.  If you’re running with non default flags, the actual values you’d see might be different.
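The values involved are plain bit arithmetic on the current flag word. A minimal sketch, assuming the default flag word is 1 (_CRTDBG_ALLOC_MEM_DF alone); the helper names are mine and exist only for illustration:

```cpp
#include <cassert>

// _CRTDBG_CHECK_ALWAYS_DF, using the numeric value quoted in the text above.
constexpr int kCheckAlwaysDf = 0x04;

// Hypothetical helpers computing the argument you would hand to _CrtSetDbgFlag.
constexpr int EnableCheckAlways(int currentFlags)  { return currentFlags | kCheckAlwaysDf; }
constexpr int DisableCheckAlways(int currentFlags) { return currentFlags & ~kCheckAlwaysDf; }

void ExampleToggleValues()
{
    // Starting from the assumed default of 1, enabling yields 5 and disabling returns to 1.
    assert(EnableCheckAlways(0x01) == 0x05);
    assert(DisableCheckAlways(0x05) == 0x01);
    // With non-default flags the numbers differ, but only bit 2 ever changes.
    assert(EnableCheckAlways(0x21) == 0x25);
}
```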

Hidden Tracepoint Keywords

The tracepoints window includes instructions for several special keywords, the most useful by far being $CALLSTACK:

 

These are not all there are – two more exist: $TICK and $FILEPOS. Quoting the documentation:

$TICK inserts the current CPU tick count, while $FILEPOS inserts the current file position.

$TICK displays a time counter in hex, but otherwise both work as advertised and are documented and official. There is just a good chance nobody knows them, since –reasonably – no one thought of going on MSDN to dig them out, as the dialog itself goes unusually deep into details.

Debugging Handle Leaks

This is all well documented stuff and I won’t go into details – it’s here mostly for self reference (3rd time I had to chase this down in google).

Steps are:

(1) Install WDK to integrate the WinDbg engine with VS (not strictly necessary, but very convenient).

(2) Attach to the debuggee via ‘User Mode’ transport:

image

(3) Continue execution, and break at the spot where the handle count is at ‘reference’ value.

(4) At the ‘Debugger Immediate Window’ type ‘!htrace -enable’

(5) Continue execution and break at a point where the handle count is supposed to be at reference value but isn’t.

(6) At the ‘Debugger Immediate Window’ type ‘!htrace -diff’.

 

The offending stack[s] should be visible at the debugger immediate window.  If you get garbage, there’s a good chance you’re debugging a 32bit process on a 64bit machine.

UseDebugLibraries and Wrong Defaults for VC++ Project Properties

Many of the projects I’m working on seem to have wrong default properties in Debug configuration.  For example, ‘Runtime Library’ is explicitly set to /MDd but defaults to /MD. ‘Basic Runtime Checks’ is explicitly set to /RTC1 but defaults to  none. ‘Optimization’ is explicitly set to /Od but defaults to /O2, and so on:

image

image

This recently caused us some trouble, and the investigation results are dumped below.

The direct reason is that these vcxproj’s are missing the ‘UseDebugLibraries’ element, under the ‘Configuration’ PropertyGroup: it should be set to true in Debug and false in Release.   A correct vcxproj should include some elements like –

<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration">
    <ConfigurationType>StaticLibrary</ConfigurationType>
    <UseDebugLibraries>true</UseDebugLibraries>
    <PlatformToolset>v120</PlatformToolset>
    <CharacterSet>Unicode</CharacterSet>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration">
    <ConfigurationType>StaticLibrary</ConfigurationType>
    <UseDebugLibraries>false</UseDebugLibraries>
    <PlatformToolset>v120</PlatformToolset>
    <CharacterSet>Unicode</CharacterSet>
</PropertyGroup>

Most ‘Configuration’ sub-elements (CharacterSet, ConfigurationType etc.) directly control import of custom property sheets, but UseDebugLibraries doesn’t. Instead, it is expected in various hooks around regular property sheets. For example, Microsoft.Cpp.mfcDynamic.props includes the following -

<ClCompile>
<RuntimeLibrary Condition="'$(UseDebugLibraries)' != 'true'">MultiThreadedDll</RuntimeLibrary>
<RuntimeLibrary Condition="'$(UseDebugLibraries)' == 'true'">MultiThreadedDebugDll</RuntimeLibrary>
</ClCompile>

Why UseDebugLibraries was missing from some libraries and present in others remained a mystery until I noticed that the younger libraries tended to have this element. Indeed, the real culprit is the migration from VS2008- (vcproj format) to VS2010+ (vcxproj/MSBuild format).  MS’s migration code just did not add this element. The generated projects are functional – they just explicitly set every individual compilation switch affected by UseDebugLibraries, which makes it overly verbose and a bit sensitive – especially in the presence of junior devs who tend to stick to defaults…

So every library you have which is 4Y+ old is susceptible to this migration bug, and I suggest you manually add UseDebugLibraries.  If you have a central prop sheet where you can control multiple projects – add it there.
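If you have many projects to audit, a quick text scan can point out the ones that need the element. This is only a sketch: CountGroupsMissingUseDebugLibraries is a hypothetical helper that does plain substring matching (not XML parsing) and assumes the Label attribute is spelled exactly as in the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts 'Configuration' PropertyGroups in a vcxproj's text that lack <UseDebugLibraries>.
std::size_t CountGroupsMissingUseDebugLibraries(const std::string& vcxprojText)
{
    const std::string label   = "Label=\"Configuration\"";
    const std::string close   = "</PropertyGroup>";
    const std::string element = "<UseDebugLibraries>";

    std::size_t missing = 0;
    std::size_t pos = 0;
    while ((pos = vcxprojText.find(label, pos)) != std::string::npos)
    {
        // The group body runs from the label up to the next closing tag (or end of text).
        std::size_t end = vcxprojText.find(close, pos);
        if (end == std::string::npos)
            end = vcxprojText.size();

        if (vcxprojText.substr(pos, end - pos).find(element) == std::string::npos)
            ++missing;

        pos = end; // continue searching after this group
    }
    return missing;
}

void ExampleScan()
{
    const std::string withElement =
        "<PropertyGroup Condition=\"a\" Label=\"Configuration\">\n"
        "  <UseDebugLibraries>true</UseDebugLibraries>\n"
        "</PropertyGroup>\n";
    const std::string withoutElement =
        "<PropertyGroup Condition=\"b\" Label=\"Configuration\">\n"
        "  <ConfigurationType>StaticLibrary</ConfigurationType>\n"
        "</PropertyGroup>\n";

    assert(CountGroupsMissingUseDebugLibraries(withElement + withoutElement) == 1);
}
```

Feed it the contents of each vcxproj; any non-zero count marks a project worth opening by hand.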

Not much point in reporting this to MS, is there? The chances of a fix are practically zero, and the issue would get equal web-presence here.

Reading Specific Monitor Dimensions

Almost 2 years ago I wrote about the proper way of getting the EDID – and in particular the physical monitor size. I did leave a loose end:

I actually had to query the dimensions of a specific monitor (specified HMONITOR). This was an even nastier problem, and frankly I’m just not confident yet that I got it right. If I ever get to a code worth sharing – I’ll certainly share it here.

Several commenters requested the full solution, and two years later I noticed this is still the most highly viewed post on this blog – so while I am still uncertain of the solution, it’s worth dumping here in the hope that it does more good than evil out there.

Bridging the HMONITOR and the HDEVINFO Universes

HMONITOR is the primary user mode handle to per-monitor information, dating back to GDI. This is how you specify your monitor of interest:  you can obtain an HMONITOR from a window or list them all and pick the one whose RECT matches a location of interest.

HDEVINFO is a handle to a device information set, the primary device-installation data type. This is what eventually allows you to read the per-monitor EDID and read – among others – the monitor physical dimensions.

I couldn’t find a direct way of obtaining one handle from the other. There are many description strings scattered along structs obtainable from these two data types, and the closest I have to a match are these two routes:

 

HMONITOR -> DISPLAY_DEVICE -> DeviceID

HDEVINFO -> SP_DEVINFO_DATA -> Instance

 

As an example, one of my monitors returns ‘DeviceID’ of:

MONITOR\GSM4B85\{4d36e96e-e325-11ce-bfc1-08002be10318}\0011

and ‘Instance’ of

DISPLAY\GSM4B85\5&273756F2&0&UID1048833

So DeviceID and Instance share a common substring.    There is probably more robust information in the last substrings (‘0011’, ‘5&273756F2&0&UID1048833’) but Device/Instance IDs are a mess, and I can’t for the life of me find a way to use this extra info.  I suspect (based on this 2010-2013 discussion) it was once possible but Windows 7 broke it.
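The matching idea can be tried in isolation with plain std::string, using the two example strings above. This is a portable sketch with names of my own choosing. It compares the second blocks for equality, which is slightly stricter than the substring search used in the full listing further down, and like that listing it cannot tell apart two monitors of the same model.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the text between the first and second backslash, or "" if there are fewer than two.
std::string SecondBackslashBlock(const std::string& id)
{
    const std::size_t first = id.find('\\');
    if (first == std::string::npos)
        return std::string();

    const std::size_t second = id.find('\\', first + 1);
    if (second == std::string::npos)
        return std::string();

    return id.substr(first + 1, second - first - 1);
}

// True when both IDs carry the same non-empty second block (the monitor model code).
bool SameMonitorModel(const std::string& deviceId, const std::string& instanceId)
{
    const std::string model = SecondBackslashBlock(deviceId);
    return !model.empty() && model == SecondBackslashBlock(instanceId);
}

void ExampleMatch()
{
    const std::string deviceId = "MONITOR\\GSM4B85\\{4d36e96e-e325-11ce-bfc1-08002be10318}\\0011";
    const std::string instance = "DISPLAY\\GSM4B85\\5&273756F2&0&UID1048833";

    assert(SecondBackslashBlock(deviceId) == "GSM4B85");
    assert(SameMonitorModel(deviceId, instance));
}
```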

 

Teh Codez

Usual disclaimers apply more than ever – your mileage may seriously vary on this one. Please do tell me in the comments if it worked for you.

 

#include <atlstr.h>
#include <SetupApi.h>
#include <cfgmgr32.h>   // for MAX_DEVICE_ID_LEN
#pragma comment(lib, "setupapi.lib")

#define NAME_SIZE 128

const GUID GUID_CLASS_MONITOR = { 0x4d36e96e, 0xe325, 0x11ce, 0xbf, 0xc1, 0x08, 0x00, 0x2b, 0xe1, 0x03, 0x18 };

CString Get2ndSlashBlock(const CString& sIn)
{
	int FirstSlash = sIn.Find(_T('\\'));
	CString sOut = sIn.Right(sIn.GetLength() - FirstSlash - 1);
	FirstSlash = sOut.Find(_T('\\'));
	sOut = sOut.Left(FirstSlash);
	return sOut;
}

// Assumes hEDIDRegKey is valid
bool GetMonitorSizeFromEDID(const HKEY hEDIDRegKey, short& WidthMm, short& HeightMm)
{
	TCHAR valueName[NAME_SIZE];
	BYTE EDIDdata[1024];

	for (DWORD i = 0; ; ++i)
	{
		// RegEnumValue overwrites both sizes, so reset them on every iteration
		DWORD dwType, ActualValueNameLength = NAME_SIZE;
		DWORD edidsize = sizeof(EDIDdata);

		LONG retValue = RegEnumValue(hEDIDRegKey, i, &valueName[0],
			&ActualValueNameLength, NULL, &dwType,
			EDIDdata, // buffer
			&edidsize); // buffer size

		if (retValue == ERROR_NO_MORE_ITEMS)
			break;

		if (retValue != ERROR_SUCCESS || 0 != _tcscmp(valueName, _T("EDID")))
			continue;

		if (edidsize < 69)
			continue; // too short to contain the size bytes read below

		// Bytes 66 and 67 hold the low 8 bits of width and height in mm,
		// byte 68 packs the upper 4 bits of each
		WidthMm = ((EDIDdata[68] & 0xF0) << 4) + EDIDdata[66];
		HeightMm = ((EDIDdata[68] & 0x0F) << 8) + EDIDdata[67];

		return true; // valid EDID found
	}

	return false; // EDID not found
}

bool GetSizeForDevID(const CString& TargetDevID, short& WidthMm, short& HeightMm)
{
	HDEVINFO devInfo = SetupDiGetClassDevsEx(
		&GUID_CLASS_MONITOR, //class GUID
		NULL, //enumerator
		NULL, //HWND
		DIGCF_PRESENT | DIGCF_PROFILE, // Flags //DIGCF_ALLCLASSES|
		NULL, // device info, create a new one.
		NULL, // machine name, local machine
		NULL);// reserved

	// SetupDiGetClassDevsEx signals failure with INVALID_HANDLE_VALUE, not NULL
	if (INVALID_HANDLE_VALUE == devInfo)
		return false;

	bool bRes = false;

	SP_DEVINFO_DATA devInfoData;
	memset(&devInfoData, 0, sizeof(devInfoData));
	devInfoData.cbSize = sizeof(devInfoData);

	// SetupDiEnumDeviceInfo returns FALSE past the last device, which ends the loop.
	// We also stop at the first matching device that yields a valid EDID.
	for (ULONG i = 0; !bRes && SetupDiEnumDeviceInfo(devInfo, i, &devInfoData); ++i)
	{
		TCHAR Instance[MAX_DEVICE_ID_LEN];
		if (!SetupDiGetDeviceInstanceId(devInfo, &devInfoData, Instance, MAX_DEVICE_ID_LEN, NULL))
			continue;

		CString sInstance(Instance);
		if (-1 == sInstance.Find(TargetDevID))
			continue;

		HKEY hEDIDRegKey = SetupDiOpenDevRegKey(devInfo, &devInfoData,
			DICS_FLAG_GLOBAL, 0, DIREG_DEV, KEY_READ);

		if (!hEDIDRegKey || (hEDIDRegKey == INVALID_HANDLE_VALUE))
			continue;

		bRes = GetMonitorSizeFromEDID(hEDIDRegKey, WidthMm, HeightMm);

		RegCloseKey(hEDIDRegKey);
	}
	SetupDiDestroyDeviceInfoList(devInfo);
	return bRes;
}

HMONITOR  g_hMonitor;

BOOL CALLBACK MyMonitorEnumProc(
	_In_  HMONITOR hMonitor,
	_In_  HDC hdcMonitor,
	_In_  LPRECT lprcMonitor,
	_In_  LPARAM dwData
	)

{
	// Use this function to identify the monitor of interest: MONITORINFO contains the Monitor RECT.
	MONITORINFOEX mi;
	mi.cbSize = sizeof(MONITORINFOEX);

	GetMonitorInfo(hMonitor, &mi);
	OutputDebugString(mi.szDevice);

	// For simplicity, we set the last monitor to be the one of interest
	g_hMonitor = hMonitor;

	return TRUE;
}

BOOL DisplayDeviceFromHMonitor(HMONITOR hMonitor, DISPLAY_DEVICE& ddMonOut)
{
	MONITORINFOEX mi;
	mi.cbSize = sizeof(MONITORINFOEX);
	if (!GetMonitorInfo(hMonitor, &mi))
		return FALSE;

	DISPLAY_DEVICE dd;
	ZeroMemory(&dd, sizeof(dd));
	dd.cb = sizeof(dd);

	// Walk the display adapters until one matches the monitor's device name
	for (DWORD devIdx = 0; EnumDisplayDevices(NULL, devIdx, &dd, 0); ++devIdx)
	{
		if (0 == _tcscmp(dd.DeviceName, mi.szDevice))
		{
			// Take the first monitor attached to that adapter
			DISPLAY_DEVICE ddMon;
			ZeroMemory(&ddMon, sizeof(ddMon));
			ddMon.cb = sizeof(ddMon);

			if (EnumDisplayDevices(dd.DeviceName, 0, &ddMon, 0))
			{
				ddMonOut = ddMon;
				return TRUE;
			}
		}

		ZeroMemory(&dd, sizeof(dd));
		dd.cb = sizeof(dd);
	}

	return FALSE;
}

int _tmain(int argc, _TCHAR* argv [])
{
	// Identify the HMONITOR of interest via the callback MyMonitorEnumProc
	EnumDisplayMonitors( NULL, NULL, MyMonitorEnumProc, NULL);

	DISPLAY_DEVICE ddMon;
	if (FALSE == DisplayDeviceFromHMonitor(g_hMonitor, ddMon))
		return 1;

	CString DeviceID;
	DeviceID.Format(_T("%s"), ddMon.DeviceID);
	DeviceID = Get2ndSlashBlock(DeviceID);

	short WidthMm, HeightMm;
	bool bFoundDevice = GetSizeForDevID(DeviceID, WidthMm, HeightMm);

	return !bFoundDevice;
}
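The only EDID-specific step in the listing is the three-byte decode inside GetMonitorSizeFromEDID. Pulled out into a portable helper it can be tested without any registry access. DecodeEdidSizeMm is a hypothetical name, and the sample bytes in the example are made up for illustration, not read from a real monitor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decodes the image size in millimetres from a raw EDID block, using the same
// offsets as the listing above: bytes 66 and 67 hold the low 8 bits of width and
// height, byte 68 packs the upper 4 bits of each (width in the high nibble).
bool DecodeEdidSizeMm(const std::uint8_t* edid, std::size_t length, int& widthMm, int& heightMm)
{
    if (edid == nullptr || length < 69)
        return false; // too short to contain bytes 66..68

    widthMm  = ((edid[68] & 0xF0) << 4) | edid[66];
    heightMm = ((edid[68] & 0x0F) << 8) | edid[67];
    return true;
}

void ExampleDecode()
{
    std::uint8_t edid[128] = {};
    edid[66] = 0x09; // width, low 8 bits
    edid[67] = 0x25; // height, low 8 bits
    edid[68] = 0x21; // high nibble 0x2 -> width bits 8..11, low nibble 0x1 -> height bits 8..11

    int w = 0, h = 0;
    assert(DecodeEdidSizeMm(edid, sizeof(edid), w, h));
    assert(w == 0x209 && h == 0x125); // 521 mm x 293 mm
}
```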

Blogging 101

This is post #101, which makes the previous post #100.

When I started all this I didn’t think I’d have 100 things to say.  Glad I was wrong, and hope to still have useful things to say for 100 more posts.

Thanks for sticking around!

Vector Deleting Destructor and Weak Linkage

Now that the discussions on weak linker symbols and vector deleting destructors are in place, it is time to discuss a fact that might seem esoteric but has far reaching implications. After that, it is time to ask for your help.

In VC++, Vector deleting destructors are defined with weak linkage at the translation unit that defined the class, and strong linkage at any translation unit that calls new[] on the class.

Say what?

The first part of this statement (v-d-dtors have weak linkage) was already demonstrated at the post on weak linkage – given any cpp file which defines a non trivial class, you can dumpbin its obj file and see for yourself.
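For readers who skipped the earlier posts: the vector deleting destructor is the compiler-generated helper that VC++ invokes for delete[] on such a class. It destroys every element of the array and then frees the block. The portable sketch below shows only that observable behaviour, with a counter and names of my own; the helper symbol itself is a VC++ implementation detail and shows up only in the obj file.

```cpp
#include <cassert>

struct Counted
{
    static int destroyed;
    virtual ~Counted() { ++destroyed; } // runs once per array element on delete[]
};
int Counted::destroyed = 0;

// Allocates and deletes an array of n objects, returning how many destructors ran.
int DestroyArrayOf(int n)
{
    Counted::destroyed = 0;
    Counted* p = new Counted[n];
    delete[] p; // under VC++ this is the call that reaches the vector deleting destructor
    return Counted::destroyed;
}

void ExampleVectorDelete()
{
    assert(DestroyArrayOf(42) == 42);
}
```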

Now some code to demonstrate the full statement:

 
//C.h
struct C
{
  virtual ~C();
};

//C.cpp
#include "C.h"
C::~C() {}

//D.h
struct D
{
  void Func();
};

//D.cpp
#include "D.h"
#include "C.h"
void D::Func()
{
  C* p = new C[42];
}

A dumpbin of C.obj shows:

017 00000000 UNDEF  notype ()    External     | ??3@YAXPAX@Z (void __cdecl operator delete(void *))
018 00000000 SECT4  notype ()    External     | ??1C@@UAE@XZ (public: virtual __thiscall C::~C(void))
019 00000000 SECT6  notype ()    External     | ??_GC@@UAEPAXI@Z (public: virtual void * __thiscall C::`scalar deleting destructor'(unsigned int))
01A 00000000 UNDEF  notype ()    WeakExternal | ??_EC@@UAEPAXI@Z (public: virtual void * __thiscall C::`vector deleting destructor'(unsigned int))

While a dumpbin of D.obj shows:

01D 00000000 UNDEF  notype ()    External     | ??_L@YGXPAXIHP6EX0@Z1@Z (void __stdcall `eh vector constructor iterator'(void *,unsigned int,int,void (__thiscall*)(void *),void (__thiscall*)(void *)))
01E 00000000 UNDEF  notype ()    External     | ??_M@YGXPAXIHP6EX0@Z@Z (void __stdcall `eh vector destructor iterator'(void *,unsigned int,int,void (__thiscall*)(void *)))
01F 00000000 UNDEF  notype ()    External     | ??2@YAPAXI@Z (void * __cdecl operator new(unsigned int))
020 00000000 UNDEF  notype ()    External     | ??3@YAXPAX@Z (void __cdecl operator delete(void *))
021 00000000 SECT8  notype ()    External     | ?Func@D@@QAEXXZ (public: void __thiscall D::Func(void))
022 00000000 UNDEF  notype ()    External     | ??1C@@UAE@XZ (public: virtual __thiscall C::~C(void))
023 00000000 SECT4  notype ()    External     | ??0C@@QAE@XZ (public: __thiscall C::C(void))
024 00000000 SECT6  notype ()    External     | ??_EC@@UAEPAXI@Z (public: virtual void * __thiscall C::`vector deleting destructor'(unsigned int))

What this means is that to successfully complete the linkage of C.obj, the linker must now load D.obj – because both contain implementations of the same function, but C defines a weak external implementation and D defines a strong external implementation (of a C method!).

Ok, that’s kinda weird, but why should I care?

Here’s why:

What happens when C.cpp and D.cpp are part of a static library?

Unlike executables (.exe or .dll), when processing a static lib the linker only loads obj files that are referenced, i.e., whose contents are needed for successful linkage. Once loaded, an obj file must have its contents successfully link (unless you’re building with /GL, but let’s ignore that here). Let’s expand the previous example a bit:

//main.cpp
#include "StaticLib\C.h"

int main()
{
  C c;
  return 0;
}

//StaticLib\C.h
struct C
{
  virtual ~C();
};

//StaticLib\C.cpp
#include "C.h"
C::~C() {}

//StaticLib\D.h
struct D
{
  void Func();
};

//StaticLib\D.cpp
#include "D.h"
#include "C.h"

extern void SomeJunkImplementedElsewhere();
void D::Func()
{
  C* p = new C[42];
  SomeJunkImplementedElsewhere();
}

Can you already see what happens now?

Now for the program to successfully build you must satisfy D.cpp’s linkage – which means dragging in another library – although you never consumed D’s functionality in the first place.

I wish this was just a theoretical peculiarity. The solutions I’m working on consist of a complicated network of literally hundreds of static libraries, and time and time again we find ourselves forced to drag in weird dependencies that the code we actually run never uses.  It seems unbelievable, but almost all of these unexplainable dependencies boil down to this esoteric fact – vector deleting destructors have weak linkage at the point of class definition.

That was nice. Now go and report it.

I did. Over half a year ago.   The report was originally closed as ‘By Design’, and after an explicit request the following explanation from Karl Niu arrived:

To explain the “By Design” resolution, imagine that you have “new A[n]” and “delete[] pA” in different translation units. In such a case, the compiler needs to define the strong external in the translation unit containing the “new A[n]“.

Which I just don’t understand: the weak/strong debate is not over new[] or delete[], but rather over vector deleting destructors, which are not user-overridable in the first place. Wherever delete[] is overloaded, it should be able to fetch the vector-deleting-dtor from the translation unit that defined it – hopefully, the one that defined the class it’s deleting.   I tried to ask again, twice, and got no response for 6 months now.

Now, I regularly report many bugs at MS Connect, almost all of which never get resolved (which I can live with. I’m doing this mostly in hope of helping fellow devs googling their trouble) – but this one leaves me frustrated. It feels as if despite my best efforts I failed to clearly communicate the issue.    It seems like an esoteric technicality, yet it actively hinders decoupling – thereby damaging large software systems at the architecture level!

Why golly Ofek, that’s really bad. But what can I do?

You can either -

(1) Dig in and tell me in the comments where I’m wrong.  It was initially resolved as ‘by design’, and even got an explanation (sorta), so I might be missing some valid reason for this sorry state of affairs.

(2) Go to the bug page and upvote it.  This one really deserves attention from the VC++ team.

But I urge you to do either.  Thanks!
