On OMP_WAIT_POLICY

Some years ago I encountered a crash that I reduced to the following toy code, composed of a DLL:

// DLLwithOMP.cpp : build into a dll *with* /openmp
#include <stdio.h>   // declares printf, which _tprintf maps to in non-Unicode builds
#include <tchar.h>
extern "C"
{
    __declspec(dllexport) void funcOMP()
    {
#pragma omp parallel for
        for (int i = 0; i < 100; i++)
            _tprintf(_T("Please fondle my buttocks\n"));
    }
}

and a console app:

// ConsoleApplication1.cpp : build into an executable *without* /openmp

#include <windows.h>
#include <stdio.h>
#include <tchar.h>

typedef void(*tDllFunc) ();

int main()
{
    HMODULE hDLL = LoadLibrary(_T("DLLwithOMP.dll"));
    tDllFunc pDllFunc = (tDllFunc)GetProcAddress(hDLL, "funcOMP");
    pDllFunc();
    FreeLibrary(hDLL);  // !  BOOM  !
    return 0;
}

As the comment marks, FreeLibrary causes a crash: typically (but not always) an access violation, with weird stacks in weird threads.

To understand what happens, let’s go over the full flow of events.

  1. The app loads the dll.
  2. The dll makes use of OpenMP, and thus the OpenMP runtime (part of the VC redist package) is loaded. It is a single dll, named vcomp[%VS_VER%][d].dll (the [d] suffix is present when you’re running a debug build).
  3. The OMP runtime opens its own thread pool, and does some work.
  4. The work ends and the dll function returns.
  5. The app frees the dll.
  6. The vcompXXX.dll refcount is decremented to zero (since the app doesn’t use it). vcompXXX.dll is thus unloaded as well.
  7. The threads in the OMP thread pool keep spinning, but the code they’re running has just been unloaded! The rug has been pulled from under their feet and they crash spectacularly, with stack frames that seem to point somewhere in outer space.

This much I understood myself. What remained unclear was the correct solution. Was this an OMP implementation bug? Was there some OMP cleanup API that I missed (not for lack of searching)? Are we stuck with a (weird) requirement that components which call into OMP-linked components must link against OMP themselves?

I asked first on StackOverflow and then on Connect (hey, it was 2015). As often happens with Connect reports, it was arbitrarily deleted some time later. I did document part of Eric Brumer’s response in the SO post:

for optimal performance, the openmp threadpool spin waits for about a second prior to shutting down in case more work becomes available. If you unload a DLL that’s in the process of spin-waiting, it will crash in the manner you see (most of the time).

You can tell openmp not to spin-wait and the threads will immediately block after the loop finishes. Just set OMP_WAIT_POLICY=passive in your environment, or call SetEnvironmentVariable(L"OMP_WAIT_POLICY", L"passive"); in your function before loading the dll. The default is “active” which tells the threadpool to spin wait. Use the environment variable, or just wait a few seconds before calling FreeLibrary.

MSDN explicitly mentions (for many versions now) that VC supports only OpenMP 2.0. OMP_WAIT_POLICY is part of the newer OpenMP 3.0 specification, and is the only newer environment variable that MS implemented. There’s a good chance they did it as part of this 2012 hotfix, and in the 5 years since it has remained undocumented.

Eric Brumer did mention in his Connect answer that he would nudge the documentation team to add it, but that either didn’t happen or didn’t help. Oh well, these tidbits are what keep me blogging occasionally.

2 Responses to On OMP_WAIT_POLICY

  1. Anonymous says:

    Hi,
    Nice post.

    Just out of curiosity, because you seem to have a good grasp of what’s going on: in step 4, are you sure the threads finished their work?

    If they are done with their work they should not have any of your code on their stack, right?


    Adar Wesley

    • Ofek Shilon says:

      Hey Adar! A sample stack is shown in a screenshot in the post, and indeed it doesn’t contain any user code.

      However, if I understand your question correctly, that is not always the right criterion. If, e.g., the dll function were asynchronous and the dll were unloaded before it finished, its stack still wouldn’t show user code (since that code is unloaded).
