Banks as platforms

September 29, 2014

Zac Townsend’s post on how Standard Treasury aims to turn banks into platforms is intriguing. There’s certainly no lack of ambition in his goal. But I do wonder if he’s setting himself up to tilt against the very nature of both banks and platforms. One of the key phrases in Zac’s post is: “allowing developers to think of banks as platforms”. I’ll just unpack that a little. First, platforms, as explicated in Evans & Hagiu’s excellent Invisible Engines. Platforms are multi-sided markets. One side pays for access to the revenue generating customers that an effective platform aggregates by offering free or cheap access. For example, in gaming, game devs pay licenses to platform (console) owners so they can sell to gamers. The console manufacturers sell consoles at or even below cost. In financial trading, clients pay Bloomberg for access to information & liquidity, and dealers get access to the platform without paying fees to Bloomberg. Famously, Google and Facebook offer services free to consumers to enable them to sell to advertisers. So if banks are going to spend a load of cash adopting Standard Treasury tech so they can become more like real software platforms, who is going to pay?

Let’s bear in mind that banks are already liquidity platforms. They charge fees for access to the liquidity they provide by aggregating client capital. They disguise fees by making some things “free”, and charging for others when they cross sell. If you attempt to commoditise or aggregate by means of a software platform, they lose the cross sell, and so the margins. They will certainly resist that prospect. So, any software platform that integrates banks with software services needs to offer the prospect of more margin in existing deal flow, or new deal flow to justify the cost of adoption. Judging by Zac’s post, it looks as if he thinks the new deal flow would come from the underbanked via mobile apps. Will that deal flow justify the cost of implementing Standard Treasury tech? I’m sceptical…

Standard Treasury should also be considering the cost of decommissioning those expensive legacy systems. In banking old and new systems tend to run in parallel until all stakeholders are confident that the new system supports all legacy system functionality. So new tech that promises cost savings tends to cause a cost spike until the old system really has been put to bed. And, believe me, that can be a lengthy and painful process! I have first hand experience of systems that have resisted retirement for decades…

Suppose you have a third party Win32 DLL, but don’t have access to the source code. The DLL exports a well known init function that makes callbacks into your code to register more functions. So you can call Win32’s LoadLibrary to load up the DLL, and you can use GetProcAddress to get hold of a pointer to the init func. You can invoke the init function from your code because you know the function prototype – the number and type of parameters. Then you get callbacks into a Register function you supply, which gives your code the names of other functions exported by the third party DLLs, as well as the number and type of parameters. Excel developers will recognise the description of the XLL function registration mechanism. So, given the names of those functions you can use GetProcAddress to get function pointers for them. But how do you invoke them? You don’t have function declarations available at compile time in your code. The functions don’t use varargs, so you can’t use va_start and va_end to figure out the params at run time.

The only way to resolve this dilemma is to kick down to assembler, and hand code some x86 that follows the Win32 calling conventions, which are well explained here and here. So here’s the code I wrote to invoke arbitrary functions exported from a DLL. I used a couple of great resources to refresh my ASM chops, which have become very rusty after years of neglect: this primer and this x86 instruction set reference. It’s in inline assembler, together with the C++ preamble that sets up parameters to simplify the assembler.

bool cc_cdecl = true;                         // stdcall if false
int parmBytes = ( parmCount - 1) * 4;         // parmCount includes ret val, so subtract 1
int parmPop = ( cc_cdecl ? parmBytes : 0);    // number of bytes to pop off the stack after call
void* rvInt = 0;                              // for receiving int or ptr return value
double rvDbl = 0.0;                           // for a float return value from ST(0)
int paddr = ( int)parms;                      // parms is void** array of parameters. Cast to int
                                              // to prevent implicit ptr deref by asm
// Then asm code to do a cdecl or stdcall dispatch and call xf.
__asm {                      // push parms onto stack in reverse order
        push eax             // save eax
        mov eax, paddr       // point to start of parms
        add eax, parmBytes   // point to last parm
    pp: push [eax]           // stack a parm
        sub eax, 4           // point to next parm
        cmp eax, paddr       // have we hit the start yet?
        ja pp                // if eax > parms goto pp (unsigned compare, as these are addresses)
        call xf              // invoke the function!
        add esp, parmPop     // pop parms if cdecl
        mov rvInt, eax       // int or ptr retvals are in eax
        fstp rvDbl           // float ret vals are in st0; pop it so the FPU stack stays balanced
        pop eax              // restore eax
 }
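For comparison, here is a minimal sketch of a different technique that stays in portable C++: switch on the parameter count at run time and cast the function pointer to a matching prototype. It is not the assembler dispatch above, and it only works under some loud assumptions: every parameter and the return value is pointer sized, the arity is capped (at 4 here), and floating point returns are not handled. All the names (Arg, AnyFn, invoke_by_arity, box, unbox, add3, answer) are hypothetical. On 32 bit Windows each prototype would also need the right __cdecl or __stdcall annotation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical names throughout. Assumption: every parameter and the
// return value fits in a pointer sized slot.
using Arg   = void*;
using AnyFn = void (*)();            // generic function pointer, like FARPROC

using Fn0 = Arg (*)();
using Fn1 = Arg (*)(Arg);
using Fn2 = Arg (*)(Arg, Arg);
using Fn3 = Arg (*)(Arg, Arg, Arg);
using Fn4 = Arg (*)(Arg, Arg, Arg, Arg);

// Call fn with argc arguments from argv, picking the prototype at run time.
inline Arg invoke_by_arity(AnyFn fn, const Arg* argv, std::size_t argc) {
    switch (argc) {
    case 0: return reinterpret_cast<Fn0>(fn)();
    case 1: return reinterpret_cast<Fn1>(fn)(argv[0]);
    case 2: return reinterpret_cast<Fn2>(fn)(argv[0], argv[1]);
    case 3: return reinterpret_cast<Fn3>(fn)(argv[0], argv[1], argv[2]);
    case 4: return reinterpret_cast<Fn4>(fn)(argv[0], argv[1], argv[2], argv[3]);
    default: throw std::out_of_range("arity not covered by this sketch");
    }
}

// Helpers to pack small integers into pointer sized slots for the demo.
inline Arg box(std::intptr_t v) { return reinterpret_cast<Arg>(v); }
inline std::intptr_t unbox(Arg a) { return reinterpret_cast<std::intptr_t>(a); }

// Stand-ins for functions whose prototypes are only known at run time.
inline Arg add3(Arg a, Arg b, Arg c) { return box(unbox(a) + unbox(b) + unbox(c)); }
inline Arg answer() { return box(42); }

// Usage: the caller knows only a pointer and a parameter count.
inline std::intptr_t usage_example() {
    Arg args[3] = { box(1), box(2), box(3) };
    Arg rv = invoke_by_arity(reinterpret_cast<AnyFn>(&add3), args, 3);
    assert(unbox(rv) == 6);
    return unbox(rv);
}
```

The trade off is plain: the switch needs one case per arity and cannot cope with mixed int and double parameters, which is exactly where the hand coded assembler earns its keep.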

Invisible Engines

June 4, 2014

I first read Invisible Engines back in 2007. I still rate it highly now, and I’m pleased to see you can download the whole book as a PDF from the MIT site. Its topic is the economic and technical aspects of software platforms. Anyone who’s followed the fortunes of IBM, Microsoft, Apple and Sun, and their respective software and hardware platforms should get a lot from this book. I had high expectations when I originally read it, and I wasn’t disappointed. Looking at the download again today, I can see it’s stood the test of time. The book goes well beyond operating systems as platforms and has excellent material on gaming consoles and the three way tension between mobile handsets, their OSes and network operators. It came out in 2006, so predates the rise of social software platforms. But the principles it elucidates on multi sided markets and APIs are obviously applicable to Facebook, Google and Twitter. And in the financial sector they apply equally to industry giants like Bloomberg who are a classic multi sided play. Bloomberg as a platform derives revenue from clients paying $1000/month for a terminal. The other sides of its market are dealers contributing quotes and liquidity to Bloomberg as an ECN, and software developers using Bloomberg APIs.

When you’re building a server product that will support C++ APIs you need to consider your ABI – the binary interface. Typically, C++ APIs are distributed as headers and libs. If the functions your API exports include parameters that use, for instance, std::string, you immediately have a problem, as you’re requiring client code to use the same STL implementation as you did to build the lib. That’s OK if client code has access to the source, and can rebuild. But commercial, proprietary products tend not to distribute source. So how to avoid forcing dependencies on API client code? I went searching for some resources, and found two especially good ones I had to flag.

Here’s Thiago Macieira on binary compatibility: an excellent presentation with guidelines for library authors. Here’s a summary of Thiago’s recommendations…

 

  1. Use pimpl idiom to hide object size
  2. Use plain old data types in function signatures
  3. Don’t hand out ptrs or refs to internals
  4. No inline funcs
  5. All classes need one non inline virtual; probably the dtor
  6. Avoid virtuals in template classes
  7. Do not reorder or remove public members, or change access levels

2 means no STL or Boost types in function parameters. I’ll address 6 by avoiding templates in my API.
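To make a few of those points concrete, here is a minimal sketch, assuming a hypothetical PriceFeed class that is not part of my actual API. It shows recommendations 1, 2, 4 and 5 together: the std::string lives behind a pimpl, the public signatures use only const char*, char* and size_t, no member function is inline, and the destructor is a non inline virtual. In a real library the first half would be the shipped header and the second half would be compiled into the lib.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>   // needed by the library source only, never by the public header

// ---- public header: all that client code compiles against ----
class PriceFeed {                                 // hypothetical class name
public:
    PriceFeed();
    virtual ~PriceFeed();                         // rule 5: one non inline virtual
    PriceFeed(const PriceFeed&) = delete;
    PriceFeed& operator=(const PriceFeed&) = delete;

    // Rule 2: plain old data in signatures, so no std::string crosses the boundary.
    void set_symbol(const char* symbol);
    // Copies the symbol into buf, truncating to fit; returns chars written.
    std::size_t copy_symbol(char* buf, std::size_t buf_len) const;

private:
    struct Impl;    // rule 1: defined only inside the library
    Impl* impl_;    // object layout is fixed whatever Impl grows to hold
};

// ---- library source: free to use any STL type internally ----
struct PriceFeed::Impl {
    std::string symbol;
};

PriceFeed::PriceFeed() : impl_(new Impl) {}
PriceFeed::~PriceFeed() { delete impl_; }

void PriceFeed::set_symbol(const char* symbol) {
    impl_->symbol = symbol ? symbol : "";
}

std::size_t PriceFeed::copy_symbol(char* buf, std::size_t buf_len) const {
    if (buf == nullptr || buf_len == 0) return 0;
    std::size_t n = impl_->symbol.size();
    if (n > buf_len - 1) n = buf_len - 1;         // leave room for the terminator
    std::memcpy(buf, impl_->symbol.data(), n);
    buf[n] = '\0';
    return n;
}

// Usage: the client supplies its own buffer, so no allocator is shared either.
inline std::size_t usage_example() {
    PriceFeed feed;
    feed.set_symbol("VOD.L");
    char buf[16];
    std::size_t n = feed.copy_symbol(buf, sizeof buf);
    assert(std::strcmp(buf, "VOD.L") == 0);
    return n;
}
```

Having the caller own the buffer is a design choice rather than part of Thiago’s list: it avoids memory being allocated by one runtime and freed by another.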

This article by Agner Fog is a superb detailed survey of data sizes & alignments, stack alignment, call conventions for register usage and parameter handling, name mangling schemes and object file formats including COFF for all the major x86 & x64 OSes and C++ compilers. Strongly recommended.

I’ve been getting into windbg while working on my POC. I’ve long been a fan of Microsoft’s Visual Studio debugger. Even back in the late 90s, when I got a serious case of Open Source Religion after falling under the spell of Eric Raymond’s Cathedral and the Bazaar, and went through an anti Microsoft phase, I never stopped rating the debugger. Visual Studio debugger is great, but windbg is a whole ‘nother thing. It’s the debugger MS use themselves for debugging Windows. Yes, the interface is a little clunky compared to the VS debugger, but the power of the command set more than compensates. It’s got its own scripting system built in, so you can construct custom breakpoints: for instance, break the 5th time round the loop when this int is greater than 100. And it has an API too, so the debug engine can be driven from other languages like Python. Personally, I’m really enjoying discovering the power of windbg. While I do so I’m capturing tips and tricks here.

Python DAGs

May 5, 2014

Tales flags some interesting developments in the Python world. The demand for Python developers in finance does seem to be building. Both Man and Getco are big users, and as Tales points out, JP Morgan and BAML both use Python as the primary programming language in their Athena and Quartz systems, both of which are inspired by Goldman’s SecDB/Slang. Tales wonders if Washington Square Tech will be the fourth implementation of this paradigm; I believe it may be the fifth, as Morgan Stanley had an Athena like project called Pioneer during Jay Dweck’s ill fated tenure. Apparently that project is now defunct. The Athena paradigm is technically a very powerful solution for trading businesses that have run on ad hoc solutions using Excel for pricing and risk. Partly because they seek to replace Excel based pricing and risk, and partly because it’s compute efficient, all implementations of the Athena SecDB/Slang paradigm implement Directed Acyclic Graphs. I’m guessing that Washington Square Tech will think this could be very appealing for buy side firms that don’t have big in house tech stacks, together with incumbent tech teams defending them against replacements.

Directed Acyclic Graphs (DAGs) are a powerful implementation technique for minimising the load of compute intensive tasks like pricing and risk, and this is one good reason why Excel has been so successful in this area, as Ben Lerner of DataNitro explains persuasively here. So it should be no surprise to see that Man Group’s own open source Python codebases include a DAG implementation, MDF. The MDF docs include a very good illustration of the power of the DAG approach.
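To show why a DAG cuts the compute load, here is a minimal sketch of lazy recalculation with dirty flags. It is written in C++ to match the code elsewhere on this page, not in Python, and it is not MDF’s API or anything from Athena, Quartz or SecDB. The Node class and the toy spot, vol, fwd and price nodes with their formulas are all hypothetical. Changing an input marks only its downstream nodes dirty, and a dirty node is recomputed only when someone asks for its value.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical Node class: a value plus the nodes it depends on.
class Node {
public:
    using Fn = std::function<double(const std::vector<double>&)>;

    // Leaf node holding an input value.
    explicit Node(double v) : value_(v) {}

    // Computed node; registers itself as a dependent of each input.
    // Inputs must outlive the nodes that depend on them.
    Node(std::vector<Node*> inputs, Fn fn)
        : inputs_(std::move(inputs)), fn_(std::move(fn)), dirty_(true) {
        for (Node* in : inputs_) in->dependents_.push_back(this);
    }

    Node(const Node&) = delete;
    Node& operator=(const Node&) = delete;

    // Change a leaf input and mark everything downstream as stale.
    void set(double v) {
        value_ = v;
        for (Node* d : dependents_) d->invalidate();
    }

    // Recompute only if stale; otherwise return the cached value.
    double get() {
        if (dirty_) {
            std::vector<double> args;
            args.reserve(inputs_.size());
            for (Node* in : inputs_) args.push_back(in->get());
            value_ = fn_(args);
            dirty_ = false;
            ++evaluations_;
        }
        return value_;
    }

    int evaluations() const { return evaluations_; }

private:
    void invalidate() {
        if (dirty_) return;          // everything downstream is already stale
        dirty_ = true;
        for (Node* d : dependents_) d->invalidate();
    }

    std::vector<Node*> inputs_;
    std::vector<Node*> dependents_;
    Fn fn_;
    double value_ = 0.0;
    bool dirty_ = false;
    int evaluations_ = 0;
};

// Usage: bumping vol reprices, but does not recompute fwd.
inline int usage_example() {
    Node spot(100.0), vol(0.5);
    Node fwd({&spot}, [](const std::vector<double>& a) { return a[0] * 2.0; });
    Node price({&fwd, &vol}, [](const std::vector<double>& a) { return a[0] + a[1]; });
    price.get();
    vol.set(0.25);
    price.get();
    assert(price.evaluations() == 2);
    return fwd.evaluations();        // fwd was only ever computed once
}
```

This is the same trick a spreadsheet plays when you change one cell: only the cells that depend on it get recalculated.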

tornado & websockets

April 26, 2014

The core of my POC prototype is a server engine, which doesn’t really make for a great demo. Generally, people grasp concepts quicker if they can see a tangible realization. So I needed a realistic way to show live ticking data getting cranked out by the server. A browser GUI seemed a natural candidate. And being a Pythonista I wanted to do the server coding in Python. Until recently getting live ticking data pushed up to a browser was a big deal, requiring sophisticated server products like the Caplin Liberator, and rich GUI toolkits like Caplin Trader. Fortunately, it’s now possible to hack some demoware in the form of a live, ticking webpage using some really simple jQuery & websockets in the browser, and tornado on the server side. JavaScript and browser GUIs are not my forte, so I won’t comment any further, except to note how much easier it seems than five years ago. On the server side, though, I do have more experience. Back in 2000 I was doing server side web dev in Python using Zope. Zope is a very powerful system, featuring a built in Object DB and an inheritance by instance rather than class mechanism called acquisition. Consequently it has a rather steep learning curve. In recent years Plone has had some traction as a CMS built on top of Zope. In 2001/2 I discovered Twisted Matrix, a general networking toolkit you can use to build any IP based networking functionality. Again there’s a steep learning curve, but it’s much lighter than Zope, and is now very mature. I will be using Twisted to build a general socket server capability for my core product: I’ve got C++ and Python APIs, but I’ll need a socket server for Java support. But what I needed for my demo purpose was real time server push to the browser. And tornado proved to be a good choice. Simple, lightweight, with websocket support built in and lots of worked examples. It didn’t take long to get ticking data into a webpage. Recommended!

Nodally Xenograte

April 10, 2014

Well, xenograte from nodally sounds pretty cool: loosely coupled software components in the cloud. Brad Cox’s dream of snap together building blocks, finally realised. Yahoo Pipes, anyone? I’ve even hacked around similarly motivated code myself, but never got so far of course. The problem with these new paradigms is that they ask you to throw away all your old software assets so you can rebuild them again in the new framework. A bit like media companies asking us to buy the same content over and over on different formats: LPs, tapes, CDs, audio DVDs, downloads, pono, VHS, DVD, bluray… Why can’t someone find a way to breathe fresh life into existing assets without reengineering them? Why not, indeed?

POC M1 running

February 16, 2014

I’ve been building a proof of concept for a new product since last August. I’ve just got the first milestone running, which is a big, big step forward. When I’ve got M2 done it will be time to come out of stealth mode, get the message out, do some demos, and start fund raising so I can put a team together…

procmon

November 9, 2013

I’ve long been a fan of Mark Russinovich’s sysinternals utilities, especially procmon and procexp. Today I discovered that in procmon, when browsing filtered file system events, you can get a stack trace by right clicking on the event. Wow!  I didn’t know that. Very powerful diagnostic technique.