The old ones are the best ones, and I spent too many hours yesterday butting my head against a classic Windows bug. I was testing an optimisation in the SpreadServeEngine‘s handling of nested invocations of XLL supplied functions. I was using QuantLib‘s YieldCurveBootstrapping spreadsheet, and the QuantLib 1.4.0 XLL addin. The sheet invokes the qlPiecewiseYieldCurve function, and the fourth parameter, RateHelpers, is supplied by an invocation of ohPack. Normally in this scenario, the XLOPER returned by ohPack would be marshalled to the engine’s internal representation, before being marshalled back to an XLOPER for handing into qlPiecewiseYieldCurve. I added a shortcut that made the XLOPER available as well as the internal representation, so the marshalling process could proceed by shortcut if possible. Unwittingly I ended up newing an object in one DLL which was released by another DLL when qlPiecewiseYieldCurve returned and released it’s stack frame. My debug build threw a run time assert on _BLOCK_TYPE_IS_VALID. Cue a bug hunt with me toothcombing through the codebase for a double delete or a buffer overrun. Spreadsheet engines are complex beasts since they’re basically a runtime for a functional programming language, so must maintain an object graph and dispatch imperatively at the nodes to tokenised code and XLL supplied C/C++ functions. There’s a lot of memory pool and stack management to do in all that! So I was sure there was a subtle memory management bug somewhere that had been exposed by my optimisation for the nested XLL functions corner case. While I was looking for the bug I did get some good reading on Windows heap debugging done: this and this are recommended. When debugging it’s easy to keep going deeper and deeper in pursuit of a supposedly subtle bug. Better to take a breath and try and widen one’s focus. It was this article that made me think again about the fact that each DLL has it’s own C run time linked in, and if their memory management implementations differ, there will be problems. So it was in this case. Adding a static allocator function to the DLL responsible for freeing the object ensured that the new & delete were done by the same DLL, and the bootstrapping model started running through smoothly. Of course what I need to do now is revisit my build system’s link model and ensure all DLLs and EXEs are linking the same CRT implementation…

SpreadServe Beta

December 6, 2014

For the last few months I’ve been working on a new product: SpreadServe. SpreadServe’s mission is to take all those unwieldy spreadsheets full of XLL addins and VBA macros off trader desktops, and turn them into resilient, automated, scalable enterprise services. Excel spreadsheets are great, because they empower users to create their own solutions quickly. But on the other hand, they’re a liability, because they’re usually poorly tested and they have to be manually operated on the desktop. SpreadServe’s goal is to fix that by providing an alternate runtime for Excel spreadsheets. Retain business agility by continuing to design and build models in Excel. Auto-test, deploy, scale, log and automate those models in SpreadServe. Do contact us if you’re interested in joining the beta program.

Suppose you have a third party Win32 DLL, but don’t have access to the source code. The DLL exports a well known init function that makes callbacks into your code to register more functions. So you can call Win32’s LoadLibrary to load up the DLL, and you can use GetProcAddress to get hold of a pointer to the init func. You can invoke the init function from your code because you know the function prototype – the number and type of parameters. Then you get callbacks into a Register function you supply, which gives your code the names of other functions exported by the third party DLLs, as well as the number and type of parameters. Excel developers will recognise the description of the XLL function registration mechanism. So, given the names of those functions you can use GetProcAddress to get function pointers for them. But how do you invoke them? You don’t have function declarations available at compile time in your code. The functions don’t use varargs, so you can’t use va_start and va_end to figure out the params at run time.

The only way to resolve this dilemma is to kick down to assembler, and hand code some x86 that follows the Win32 calling conventions, which are well explained here and here. So here’s the code I wrote to invoke arbitray functions exported from a DLL. I used a couple of great resources to refresh my ASM chops, which have become very rusty after years of neglect: this primer and this x86 instruction set reference. It’s in inline assembler, together with the C++ preamble that sets up parameters to simplify the assembler.

bool cc_cdecl = true;                         // stdcall if false
int parmBytes = ( parmCount - 1) * 4;         // parmCount includes ret val, so subtract 1
int parmPop = ( cc_cdecl ? parmBytes : 0);    // number of bytes to pop off the stack after call
void* rvInt = 0;                              // for receiving int or ptr return value
double rvDbl = 0.0;                           // for a float return value from ST(0)
int paddr = ( int)parms;                      // parms is void** array of parameters. Cast to int
                                              // to prevent implicit ptr deref by asm
// Then asm code to do a cdecl or stdcall dispatch and call xf.
__asm {                      // push parms onto stack in reverse order
        push eax             // save eax
        mov eax, paddr       // point to start of parms
        add eax, parmBytes   // point to last parm
    pp: push [eax]           // stack a parm
        sub eax, 4           // point to next parm
        cmp eax, paddr       // have we hit the start yet?
        jg pp                // if eax > parms goto pp
        call xf              // invoke the function!
        add esp, parmPop     // pop parms if cdecl
        mov rvInt, eax       // int or ptr retvals are in eax
        fst rvDbl            // float ret vals are in st0
        pop eax              // restore eax

The Resolver

February 1, 2008

So I’ve had a little play with Resolver One, now that it’s out of beta and at 1.0. To give a somewhat simplistic view, it’s an Excel clone implemented in IronPython. Where it scores big over Excel is in exposing its internals, specifically the calc model and worksheet in one seamless view. In Excel, we have several different APIs for injecting our own logic into a spreadsheet: Excel’s own internal C API for coding XLLs, VBA and COM. Those APIs allow us to express worksheet functionality in C, VB and any COM language. Resolver gives us one seamless view of its internals in our favourite programming language: Python. Simple example here.

This is attractive for those of us already using Python for front office systems. I’m not sure how much it will appeal to others though…

A new kind of spreadsheet…

September 28, 2007

Excel is unavoidable for anyone working in front office trading systems. Front office support teams often bemoan the hassle of supporting myriad sheets, all a spaghetti mixture of various internal and external plugins and trader coded VB. Whole lines of business get built on spreadsheets, and replacing them with robust apps soaks up many developer years.

But spreadsheets will always be on the trading floor, no matter how hard IT may try to eliminate them. The reason is simple: Excel is the traders’ preferred development platform. Yet at the same time, it’s a general purpose tool, quite ill suited for trading apps. I’ve long thought that a determined start up could make a killing by building a better spreadsheet for the trading floor.

Being a long time Pythonista, and just recently dipping my toes in the water with IronPython for desktop etrading apps, I was excited to discover Resolver Systems. The Resolver is an IronPython .Net based spreadsheet that supports injection of Python scripts. I hope the Resolver guys are thinking about multi threading and serverisation as well as integration with existing .Net based Excel addins. Multi threading and server side capability would allow serverisation of trader coded pricing, and support for existing .Net addins would enable Resolver sheets to plug into market data infrastructures etc.

Good luck to the Resolver guys !  I’m very happy to see a promising startup furthering the cause of Python on the trading floor.


March 25, 2007

So if you’re building an XLL in VC8, and you’re using a .def file to specify the exported symbols, make sure you give the linker a /DEF:myaddin.xll switch. Unlike the VC6 linker, VC8 doesn’t automatically pick up the .def. Took me a good few hours to figure that, using depends and filemon to look for missing dependencies etc. All the time depends was telling me that my XLL was exporting no symbols, but I couldn’t see it, so fixated was I on more subtle possible problems.


September 26, 2006

Sparklines is a visualisation technique I’ve been following for a while – so I read this with interest. And then something else on cmiles blog caught my eye: a link to Juice Analytics. They have a nice blog on Excel hacks, which mentions a toolkit that gives you sparklines in Excel. Not the same Juice as founded by Charles Ferguson who wrote High Stakes – probably the best book I’ve read on founding a software start up.

Matlab, Python, R

September 22, 2006

Thanks to Vince for this Matlab/Python/R crib sheet. It’s good to see the three way equivalence, especially since I’m an old hand with Python and an R newbie. Matlab is the tool of choice on our trading floor for traders who’ve hit the buffers with Excel.

Excel 2007 at CSFB

August 1, 2006

Just got an email from MS referring me to the presentations from last Tuesday’s Office 2007 presentation at the Credit Suisse offices. Paul Foster’s session on Excel Services is worth checking out…

Playing with R

August 1, 2006

I’m doing some data analysis at the moment, and I’ve gone through some toolsets in search of a combination that will give me power, expressiveness and charting capability. The raw data comes out of our trading system’s RDB. I started off with SQL queries, but soon found the slowness of complex queries and the arcana of SQL was holding me back. So I exported the data as CSV and set to work with Excel. I made some more progress, but didn’t want to resort to VBA for the ad hoc slice and dice coding. So then I switched to processing my 300Mb CSV with Python. Progress was good for a couple of days, but then my script started to get hairier. The data extraction and cleansing logic was all mixed up with the analysis logic. And I still didn’t have any good charts. Back to the drawing board. I vaguely remembered folk on Victor Niederhoffer’s mailing list discussing R. I checked out the programming recommendations on the great man’s site. So I downloaded R, and after 30 minutes playing, I’m very impressed. Scripting, powerful maths, built in vector and matrix data types, and flexible charting all rolled together. I’ll be asking around on the floor to see if any of our traders are using R…