Archive for the 'coding' Category

Oct 13 2008

VisEmacs 3.1.1 Released

Published by michael under Projects, coding

Well, I’m feeling overwhelmed these days, but at least I’ve knocked one item off my to-do list: I’ve released a new version of VisEmacs. You can get the installer here.

Release 3.1 included support for Emacsclient, but I think it introduced a bug: the new configuration settings weren’t being saved correctly (or maybe they never were; I’m too tired to check). Many thanks to Andrew Ng for, first, catching it and, second, fixing it.

Also, Christoph Conrad tells me that the DotEnvCommand tool makes a really handy companion to VisEmacs. I haven’t tried it yet, but it looks pretty cool. If anyone’s using it, do let me know.


Apr 26 2008

Article on writing Add-Ins restored…

Published by michael under coding

About eighteen months ago, I wrote an article about how to build an Add-In that would load itself into DevStudio 6.0, Visual Studio (2003 or 2005), and Office (2003). Like some other things I’ve written up, it found a small but appreciative audience (here, for instance).

I’d hosted the article, along with source code, on the old site. Over the past week or so, I updated it for Visual Studio 2008 & re-posted it here. To the half-dozen or so people who might be interested in such a thing, enjoy ;)


Apr 23 2008

Figuring out dependencies introduced by static libraries

Published by michael under coding

I don’t know if anyone else has this problem, but I sometimes want to know what dependencies will be introduced into my program by linking against a static library (on Windows). If I’m linking against a DLL, I can just run depends, which will tell me what other DLLs that DLL needs to load (and even which exports it’s pulling in), but in the case of a static library, I can’t find an analogous tool.

This came up for me recently at work: I’ve been asked to port a static library which I wrote some time ago to another platform. To get a sense of what kind of dependencies this library will want to drag along, I wanted to get a list of all its unresolved externals.

This has turned out to be harder than you might think. A few minutes with Google didn’t really turn up anything, so I took a look at dumpbin. dumpbin /symbols prints out a nice report of the library’s symbol table, complete with unresolved symbols clearly marked as UNDEF. Great, I thought: I’ll just type:

dumpbin /symbols foo.lib | grep -e '^[0-9a-fA-F]\+ [0-9a-fA-F]\{8\} UNDEF'

(this would be in a Cygwin shell, obviously) and have my list (which I’d pretty up later).

Not so fast. Perusing the list, I started seeing symbols listed as undefined which I knew to be defined in this library… hmmmm. A few more minutes spent perusing the original dumpbin output showed that they were, in fact, defined in this library! The symbols would show up once as undefined, and a second time as defined.

I can only guess that dumpbin is just concatenating the output I would get if I ran it against each .obj separately. That is, if symbol _XYZ is defined in module a.obj, and referenced in module b.obj, we get two records (one for each module):

67F 00000000 SECT183 notype ()    External    | _XYZ
....
107 00000000 UNDEF  notype ()    External     | _XYZ

Damnit. OK, so I’m going to have to write a little code here. What I want to do is walk dumpbin’s output, pulling three things out of each symbol record: the symbol’s name, its undecorated name (if present), and whether or not it’s defined. The trick is that a symbol may show up more than once.

IOW, a “mark & sweep” approach: as I parse each record, I check whether the symbol has already been recorded. I mark it as undefined only if the current record says it is and it hasn’t already been marked as defined; otherwise, I mark it as defined. Once I’m done, I’ll sweep the data structure of any records corresponding to symbols defined inside my library.

I fired up a Python shell, even tho this kind of little reporting problem “feels” like Perl to me, so that I could horse around with these ideas interactively:

>>> import os, re
>>> f = os.popen("dumpbin /symbols foo.lib", "r")
>>> x = f.readline()
>>> print x

Now, the records we want generally look like this:

023 00000000 SECT9  notype       External     | ?FRAG_ACK@WscMsg@ani8021x@@2EB (public: static unsigned char const ani8021x::WscMsg::FRAG_ACK)

but we get lots of stuff we don’t care about, like:

Section length    1, #relocs    0, #linenums    0, checksum E963A535, selection    2 (pick any)

and some records that carry no undecorated name:

357 00000000 UNDEF  notype ()    External     | _memset

I guessed at a regexp,

^[0-9a-f]{3} [0-9a-f]{8} (SECT[0-9a-f]+|UNDEF) [^|]+\| ([^(]+) ?(?:\((.*)\))?

but how to tell? I tried it a few times in the interpreter:

>>> for i in range(1, 25):
...     x = f.readline()
...     m = re.search(r"^[0-9a-f]{3} [0-9a-f]{8} (SECT[0-9a-f]+|UNDEF) [^|]+\| ([^(]+) ?(?:\((.*)\))?", x, re.I)
...     print x
...     if m: print m.groups()
...     else: print None
...

Cool. This let me watch my regex in action over enough lines to get some confidence in my approach: it was discarding the stuff about which I didn’t care, and parsing what I wanted.
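The same spot check can be written down as a small script. This is only a sketch in Python 3 (the session above predates it); the parse helper is a name made up for illustration, and the sample lines are the three records quoted earlier in this post.

```python
import re

# The regex guessed at above, written as a raw string so the backslashes
# reach the regex engine untouched.
SYMBOL_RE = re.compile(
    r"^[0-9a-f]{3} [0-9a-f]{8} (SECT[0-9a-f]+|UNDEF) [^|]+\| ([^(]+) ?(?:\((.*)\))?",
    re.I,
)

# The three kinds of line quoted in the post: a defined C++ symbol with an
# undecorated name, a line of noise, and an undefined C symbol.
SAMPLES = [
    "023 00000000 SECT9  notype       External     | ?FRAG_ACK@WscMsg@ani8021x@@2EB "
    "(public: static unsigned char const ani8021x::WscMsg::FRAG_ACK)",
    "Section length    1, #relocs    0, #linenums    0, checksum E963A535, "
    "selection    2 (pick any)",
    "357 00000000 UNDEF  notype ()    External     | _memset",
]


def parse(line):
    """Hypothetical helper: return (section, name, undecorated) or None."""
    m = SYMBOL_RE.search(line)
    if m is None:
        return None
    section, name, undecorated = m.groups()
    return section, name.strip(), undecorated


if __name__ == "__main__":
    for sample in SAMPLES:
        print(parse(sample))
```

The noise line is rejected because it doesn’t start with three hex digits; the other two lines come back as tuples, with None in the third slot when there is no undecorated name.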

So, let’s do this:

>>> program = re.compile(r"^[0-9a-f]{3} [0-9a-f]{8} (SECT[0-9a-f]+|UNDEF) [^|]+\| ([^(]+) ?(?:\((.*)\))?", re.I)
>>> print program
<_sre.SRE_Pattern object at 0x00A507B8>

With the regex now compiled, we’re ready to rock:

>>> f.close()
>>> data={}
>>> f = os.popen("dumpbin /symbols foo.lib", "r")
>>> x = f.readline()
>>> while x:
...     m = program.search(x)
...     if m:
...         sym = m.group(2).strip()
...         if sym[0] != '.' and sym[0] != '$':
...             undefd = m.group(1) == "UNDEF"
...             und = m.group(3)
...             if sym not in data:
...                 data[sym] = [ undefd, und ]
...             elif not undefd:
...                 data[sym][0] = False
...     x = f.readline()
...

So at this point, we’ve traversed all the symbols in our library, and marked those that are undefined. Cleanup,

>>> f.close()

& sweep:

>>> undefined_symbols = []
>>> for k in data.keys():
...     if data[k][0]:
...         undefined_symbols.append([k, data[k][1]])
...

That’s it– undefined_symbols is now a list of lists, each sub-list containing two elements: the symbol name and the undecorated version (which may be None).
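For anyone who’d rather have the mark & sweep as one reusable piece, here is a rough Python 3 sketch of the same logic. It takes any iterable of lines, so it can be tried without dumpbin on hand. The function name find_undefined and the sample records are made up for illustration (the records are modeled on the _XYZ and _memset lines above).

```python
import re

# Same regex as in the session above, as a raw string.
SYMBOL_RE = re.compile(
    r"^[0-9a-f]{3} [0-9a-f]{8} (SECT[0-9a-f]+|UNDEF) [^|]+\| ([^(]+) ?(?:\((.*)\))?",
    re.I,
)


def find_undefined(lines):
    """Return [name, undecorated] pairs for symbols no record ever defines."""
    data = {}  # symbol name -> [still_undefined, undecorated_name]
    for line in lines:
        m = SYMBOL_RE.search(line)
        if not m:
            continue  # not a symbol record
        sym = m.group(2).strip()
        if not sym or sym[0] in ".$":
            continue  # skip section names and compiler-generated labels
        undefd = m.group(1).upper() == "UNDEF"
        if sym not in data:
            data[sym] = [undefd, m.group(3)]  # mark: first sighting
        elif not undefd:
            data[sym][0] = False  # a later record defines it
    # Sweep: keep only the symbols that were never defined anywhere.
    return [[name, und] for name, (undefd, und) in data.items() if undefd]


# Hypothetical dumpbin-style records: _XYZ is defined in one module and
# referenced in another, _memset is only ever referenced.
SAMPLE = [
    "67F 00000000 SECT183 notype ()    External     | _XYZ",
    "107 00000000 UNDEF  notype ()    External     | _XYZ",
    "357 00000000 UNDEF  notype ()    External     | _memset",
]

if __name__ == "__main__":
    print(find_undefined(SAMPLE))
```

The order of the records doesn’t matter: a symbol that is seen as UNDEF first and defined later still gets swept out.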

We can just as quickly pretty-print our results to file:

>>> f = open("C:\\tmp\\report.txt", "w")
>>> for x in undefined_symbols:
...     und = ""
...     if x[1]: und = x[1]
...     f.write("%s | %s\n" % (x[0], und))
...
>>> f.close()

Of course, I still have to figure out how to enumerate template instantiations made in my library, but whose definitions were pulled in from external code…


Apr 19 2008

How does WTL connect HWNDs to C++ objects?

Published by michael under coding

This is another recycled post from the old blog.

In any windowing library, the question is always how to connect instances of whatever C++ class represents a window with the HWNDs that the OS actually uses. I was curious as to how ATL & WTL do that, so I did a little digging.

I started with ATL’s support for Property Sheets. Looking at the implementation of CPropertySheetImpl, we see:

ATL::_AtlWinModule.AddCreateWndData(&pT->m_thunk.cd, pT); // 1
INT_PTR nRet = ::PropertySheet(&m_psh);                   // 2

Line number 1 obviously looks interesting. So, what the heck is AddCreateWndData? Like so many ATL methods, it ends up in a global:

AtlWinModuleAddCreateWndData(_ATL_WIN_MODULE* pWinModule,
                             _AtlCreateWndData* pData,
                             void* pObject)
{
    pData->m_pThis = pObject;
    pData->m_dwThreadID = ::GetCurrentThreadId();
    pData->m_pNext = pWinModule->m_pCreateWndList;
    pWinModule->m_pCreateWndList = pData;
...

Here’s what’s happening: ATL maintains a singly linked list of _AtlCreateWndData structures, hanging off the window module. Each structure records the thread that added it and the class instance being created, so each thread effectively sees its own list. Here’s the structure definition:


struct _AtlCreateWndData
{
    void* m_pThis;
    DWORD m_dwThreadID;
    _AtlCreateWndData* m_pNext;
};

For every kind of window (plain-jane windows, dialogs, property pages, &c), ATL finds some kind of “hook” that will be called after the window is actually created, but before it begins receiving messages. There (when it has an HWND lying around), it pops the head of the current list to find the C++ object corresponding to the window being created.

The current head of the list (for the current thread) is retrieved by calling AtlWinModuleExtractCreateWndData. This function is called by CAtlWinModule::ExtractCreateWndData, which is in turn called by:

  • CWindowImplBaseT< TBase, TWinTraits >::StartWindowProc
  • CDialogImplBaseT< TBase >::StartDialogProc
  • CCommonDialogImplBase::HookProc
  • CColorDialogImpl::HookProc
  • CPropertySheetImpl::PropSheetCallback
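To make the add/extract handshake concrete, here is a toy model of it in Python (not C++, and not ATL’s code). The class and method names are invented for illustration. In particular, the way extract skips entries belonging to other threads is my assumption about what the m_dwThreadID field is for, and the lock is there only to keep the toy thread-safe.

```python
import threading


class CreateWndData:
    """Toy stand-in for _AtlCreateWndData: one node in the pending list."""

    def __init__(self, obj, thread_id, next_entry):
        self.this = obj              # the C++ object, in the real thing
        self.thread_id = thread_id   # thread that is creating the window
        self.next = next_entry       # next node in the singly linked list


class WinModule:
    """Toy stand-in for the module object that owns the list head."""

    def __init__(self):
        self._head = None
        self._lock = threading.Lock()

    def add_create_wnd_data(self, obj):
        # Push onto the head, tagged with the calling thread's ID.
        with self._lock:
            self._head = CreateWndData(obj, threading.get_ident(), self._head)

    def extract_create_wnd_data(self):
        # Unlink and return the first entry added by the calling thread
        # (assumed behaviour), or None if this thread has nothing pending.
        tid = threading.get_ident()
        with self._lock:
            prev, entry = None, self._head
            while entry is not None:
                if entry.thread_id == tid:
                    if prev is None:
                        self._head = entry.next
                    else:
                        prev.next = entry.next
                    return entry.this
                prev, entry = entry, entry.next
        return None
```

The idea is that the “add” happens just before the window is created and the “extract” happens inside the hook, on the same thread, so the hook gets back exactly the object that asked for the window.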

Let’s dig into StartWindowProc. This is the WNDPROC that gets registered with the Window class WNDCLASS. What does it do?

It pops the head off the current thread’s list of AtlCreateWndData, sets pThis to the corresponding member, and then:


pThis->m_thunk.Init(pThis->GetWindowProc(), pThis);

Alright, so what’s m_thunk? This is a member variable of type CWndProcThunk. The actual implementation is hidden behind a typedef or two, but the upshot is this: it’s a little structure whose member variables actually make up executable code! Specifically:


mov dword ptr [esp+0x4], pThis
jmp relwndproc

The Init member sets up the pThis & the address of the actual WNDPROC we’re going to call. Here’s the trick: if this code is executed at the beginning of a function in which the first parameter on the stack is an HWND (like, say, a window procedure), this code will write the address of the C++ class instance representing the window that’s just been created over top of the HWND and then jump to the beginning of the window procedure returned from the GetWindowProc function.
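To see that those two instructions really are just data, here is a small Python sketch that assembles the 13 bytes by hand for 32-bit x86. The function name build_thunk and the addresses are made up for illustration. The byte values are the standard encodings of the two instructions; the exact field layout ATL uses is hidden behind those typedefs, so treat the packing order as an assumption.

```python
import struct

THUNK_SIZE = 13  # 8 bytes for the mov, 5 bytes for the jmp


def build_thunk(thunk_addr, this_ptr, wndproc_addr):
    """Return the machine code for a thunk located at thunk_addr (x86-32).

        mov dword ptr [esp+4], this_ptr   ; C7 44 24 04 <imm32>
        jmp wndproc_addr                  ; E9 <rel32>
    """
    # A near jmp is relative to the address of the *next* instruction,
    # which is the first byte past the end of the thunk.
    rel = (wndproc_addr - (thunk_addr + THUNK_SIZE)) & 0xFFFFFFFF
    # 0x042444C7, stored little-endian, is the byte sequence C7 44 24 04.
    return struct.pack("<IIBI", 0x042444C7, this_ptr, 0xE9, rel)


if __name__ == "__main__":
    # Hypothetical addresses, purely for illustration.
    code = build_thunk(thunk_addr=0x1000, this_ptr=0x00ABCDEF, wndproc_addr=0x2000)
    print(code.hex(" "))
```

Since the jump is relative, the bytes depend on where the thunk itself lives, which is why Init has to compute them at run time for each window object.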

Remember that while we get back to StartWindowProc. StartWindowProc next gets the address at which this new code resides by calling:

WNDPROC pProc = pThis->m_thunk.GetWNDPROC();
WNDPROC pOldProc = (WNDPROC)::SetWindowLongPtr(hWnd, GWLP_WNDPROC,
                                               (LONG_PTR)pProc);

What we’ve just done is substituted our little thunk for the new window’s window procedure. With the thunk in place, we can now do this first off in the real WNDPROC:

CWindowImplBaseT< TBase, TWinTraits >* pThis = (CWindowImplBaseT< TBase, TWinTraits >*)hWnd;

which is actually kind of slick (although a maintenance headache, I’d guess…).
