Windows Containers

October 13, 2015

Recently Microsoft has added support for containers to Windows Server. It’s available on the Azure cloud on VMs running Windows Server 2016 Tech Preview 3. I’ve been playing with it and I’ve got SpreadServe running inside a container. There’s much more detail here. But to summarise I found three workarounds were necessary…

  • A two step process to build images as Windows container doesn’t like SpreadServe’s NSIS installer
  • Web server inside the container should be on port 80 only internally
  • A one line launch script that sets up environment variables is necessary

Recently I’ve been using the excellent Very Sleepy profiler to performance tune the SpreadServeEngine’s loader, compiler and interpreter. One of our beta users had helpfully supplied a very large spreadsheet which was causing very long load and calc cycles. Back in the 90s you had to spend serious money on licenses for Pure Software‘s Purify and Quantify tools for this kind of work. Now tools like Dr Memory and Very Sleepy are free and OSS. It’s a while since I did a real, systematic performance profiling and tuning exercise, and I was soon reminded of how quickly preconceptions about which parts of the code might be CPU hogs can be shattered. It wasn’t long before I was soon nose to nose with one of the eternal truths of C++ development, or indeed development in any language: malloc & free are expensive. That’s why the LMAX team coded their own Java Collections. Printf isn’t cheap either. The answer was to introduce memory pooling for many of the most heavily used compiler and interpreter classes, and to set up config switches for the interpreter tracing. Interpreter tracing needs to be available in release builds as it a tremendously useful way of looking inside the execution of your spreadsheet. The result was a thirty fold improvement in load time, and much snappier calc cycles on very large sheets.

Upgrading the SpreadServe GUI

September 19, 2015

Recently I’ve been updating SpreadServe‘s JavaScript web GUI. The edit facility presented on the livesheet page was quite crude. It used a bootstrap-editable to pop up an edit field in a message box. Not really a seamless user experience, so I’ve upgraded to a JavaScript grid that supports in place editing. The end result is a user experience closer to Excel itself. I chose to use webismymind’s editablegrid since it can attach to an existing HTML table. In the case of SpreadServe the HTML table is generated in C++ code inside the SpreadServeEngine whenever a running spreadsheet is published to the RealTimeWebServer. I changed the C++ code to add attributes to the table and tr elements that editablegrid uses to latch on to. I made a couple of small changes to the editablegrid JavaScript so that the isEditable and modelChanged callbacks passed through more information. My isEditable implementation needed to get hold of the DOM element as well as the row column address so it could check for an ssedit attribute. And modelChanged needed the element ID so it could POST the changed data back to the RealTimeWebServer with a unique ID. The RealTimeWebServer sends the updated value back in to the SpreadServeEngine. I’ve forked editablegrid on github here with my changes. While I was doing this JS dev I found two resources very helpful: Oscar Rotero’s jQuery cheat sheet, and the Chrome debugger. I was a Firefox advocate for a long time, as Firebug was the first JavaScript debugger I used. The Chrome debugger is just faster, snappier, breakpointing is easier, and examining the DOM and variables seems better too.

Blockspring symphony

September 10, 2015

Blockspring is a cool startup that enables users to invoke web service APIs from Excel. There are ready made integrations with Amazon, Bing, Dropbox, Facebook, Github, Google, linkedin, Quandl, Slack, Twitter and many more. The beauty of Blockspring is that it normalises all of these diverse web APIs so they can be invoked from a single Excel worksheet function: =BLOCKSPRING(…) As we all know, the great strength of Excel is that it enables users who are not developers to build all kinds of solutions independently. Blockspring has dramatically extended the range of what you can do in Excel. To make a service available via =BLOCKSPRING(…) you code a ‘Block’ in Python, Ruby, JavaScript, PHP or R. The blocks are cloud hosted on Amazon servers. All well and good, but this means that you still need to be a developer to create a block. What if Excel users could implement blocks in Excel itself? To do that you’d need a cloud hosted server that can execute a spreadsheet automatically. And, of course, that’s exactly what SpreadServe does! All that’s necessary is some code for a block that can talk to a SpreadServe instance. Here’s the source on github, and here it is deployed on Blockspring. Using this SpreadServe block, a client spreadsheet can use any SpreadServe hosted spreadsheet as a web service. Here’s an example. InvestExcel‘s BlackScholes spreadsheet can calculate Call and Put prices for an option, given the usual inputs: stock and strike prices, volatility, risk free rate and tenor. Here’s that same spreadsheet deployed on an Amazon hosted SpreadServe instance, which makes it’s calculation functionality available to Blockspring client spreadsheets. This spreadsheet calculates the Put and Call prices of a series of options at different strike prices. Make sure you have Formulas/Calculation Options set to automatic when you run it. There are nine different strike prices in the series, so there are nine round trips to the back end spreadsheet, one for each call & put calculation. Try changing one of the inputs; stock or strike price. It will cause all the put and call prices to recalculate. If you point your browser at the SpreadServe host’s RealTimeWebServer while the recalculation is happening you’ll see all the inputs get pumped through, and all the outputs calculate. So there you have it: Blockspring and SpreadServe – two great tastes that taste great together!

There are a couple of spreadsheets in the SpreadServe beta that illustrate point 3 (component reuse) from my recent Spreadsheets are code post. One of them – ycb_quandl_pub.xls – is running on the AWS host, and a recent post explained in detail how it uses Quandl data to drive QuantLib’s yield curve bootstrapping functions. ycb_quandl_pub.xls is paired with ycb_quandl_sub.xls. You can download both of them from here, and as their names suggest, ycb_quandl_pub.xls is a publisher, and ycb_quandl_sub.xls is a subscriber. ycb_quandl_pub.xls will run equally happily in Excel or SpreadServe, but it only becomes a reuasable component when it’s running in SpreadServe. Try downloading ycb_quandl_sub.xls and running it in Excel on your desktop. You’ll need to install SSAddin to make it work. Then you’ll see that ycb_quandl_sub.xls is updated with the dates and rates of the bootstrapped curve calculated by ycb_quandl_pub.xls. You may see #N/A in the cells for a few minutes until the first tick arrives from the server, which recalcs every five minutes. The s2cfg sheet in ycb_quandl_sub.xls configures the SSAddin to use its s2websock function to subscribe to the rates published by the RealTimeWebServer every time the ycb_quandl_pub.xls sheet hosted in a SpreadServeEngine instance recalculates. The RealTimeWebServer can support many subscribers, so all the logic in ycb_quandl_pub.xls from Quandl, QuantLib and the worksheet formula is shared by all the subscribers. A user with edit permission could change some aspect of the model on the publisher side, the Interpolator or TermStructureCalendar perhaps, and all the subscribers would get the same updated data as a result. Those familiar with typical pricing engine architectures in investment banks will recognise the makings of a graph of pricing engines here. But the major difference is that no server side C++, C# or Java coding is necessary to make it happen. Graphs of quant or trader developer spreadsheets can be strung together very rapidly. The benefit of the spreadsheet level component reuse that SpreadServe makes possible should be apparent.

Spreadsheets are code

August 13, 2015

Felienne Hermans has made it her mission to point out that “spreadsheets are code”. She’s most definitely right about that, and a whole host of the other consequences that she draws from that insight, specifically that we should apply the techniques developed by mainstream software engineering to spreadsheets: version control, testing and design guidelines for clean structure, like the FAST standard. Whenever you create a sheet with formulae in it you’re programming. Ignoring that fact is one of the reasons spreadsheet disasters keep happening. I couldn’t agree more with Prof Hermans on that score. But I think we need to go further in the comparison of spreadsheets with code, and point out some major differences.

  1. Conventional code, when deployed to its production runtime environment does not come with an IDE that enables any user to change the implementation! A trader can’t reach inside his Bloomberg or TradeWeb terminal and change its implementation. But Excel allows any user to change any formula in a financial model.
  2. Well structured conventional code separates user interface from business logic. This separation of concerns is often called the MVC pattern. A typical modern web application has an HTML/CSS/JavaScript user interfaces running in browsers talking to a server hosted backend coded in Python on top of Django and an RDBMS. Excel makes it impossible to decouple the user interface from the logic expressed in formulae and VBA. Modern applications present a choice for their architects; does a given piece of code belong in the browser, the server tier or the DB ?
  3. Conventional code enables reuse through components. Each Excel spreadsheet is like an island, and monolithic. How can spreadsheets be composed together to draw input and feed output to each other? Only with manual, error prone operations.
  4. Unit testing: the unit testing philosophy calls for any significant component to have a set of separate test code that proves compliance with pre and post conditions as well as yielding specified results. Also required is the ability to run a set of tests automatically and record the results. All of that is a capability that Excel simply doesn’t have.

To realise points 1 to 4 for spreadsheets we need an alternate run time that can host spreadsheets on a server, and decouple the financial logic expressed in worksheet formulae, VBA & XLLs from the user interface. In the next post I’ll give more detail on how SpreadServe solves all the issues raised above.

In yesterday’s post I promised to give more detail on the Yield Curve Bootstrapping sheet running on the Amazon hosted SpreadServe instance. If you’d like to try running the sheet on your own desktop you can download it from the repository; just click on ycb_quandl_pub.xls. To run the sheet in your own Excel you’ll need to download the QuantLib and SpreadServe addins. ycb_quandl_pub.xls is based on one of QuantLibXL’s example spreadsheets, YieldCurveBootstrapping.xls, which gives a sample QuantLib Excel solution to a common fixed income rates maths problem: bootstrapping a yield curve. If you look at the original sheet you’ll see that all input data is present as simple cell values. To change it you must rekey it. Ideally this would be automated, so that deposit, futures and swap rates could be regularly pulled from a clean data source, and the bootstrapping results recalculated and published. ycb_quandl_pub.xls uses the SpreadServe Addin to pull the depo, futures and swap rates from quandl. Look at the top left block on the Quandl sheet within the ycb_quandl_pub workbook to see the invocations of the s2quandl function that pull the rates into the sheet from Lower down on the same sheet you can see the s2cron invocation that schedules a timer to go off every 5 minutes and trigger a new download of the same data. The same trigger is used as input to QuantLib’s qlPieceWiseYieldCurve function on the Bootstrapping sheet to force a recalculation when freshly downloaded data arrives. All that is great for automating an Excel spreadsheet. With SpreadServe we can take it one step further and get the sheet off the desktop and onto a server. The whole process is then automated, centralised and freed from possible manual disruption on the desktop.

NB QuantLib date calcs mean the results of this sheet are only good on weekdays, Mon-Fri, and not Sat or Sun.

The readthedocs github workflow is so smooth I had to knock together some docs for the SpreadServe Addin. Here they are

SpreadServe resources

August 7, 2015

In preparation for the launch of SpreadServe‘s beta program I’ve added a page of resources to this blog. I’ve just finished moving the documentation on to It’s very cool to be able to edit the docs on my laptop, push the changes to github, and have them appear automatically, via webhook, on readthedocs. The source ReStructured Text docs are on the SpreadServe github repository. Also on github is the SpreadServe Addin which extends Excel with background thread quandl queries and cron like scheduled triggers. And there’s a link to the Amazon hosted instance running a yield curve bootstrapping sheet that automatically pulls depo, futures and swap rates from quandl. More on that in another post. Finally, there’s a link to the Google Group for SpreadServe. Please join the group if you’d like to download the SpreadServe beta and kick the tyres.

Python 3 & PyCharm

July 1, 2015

I’ve been coding in Python since 2000, and for a long time my dev env preferences haven’t changed. Like many I used Python 1.5.2 with a basic text editor, often vim, for a long time. Once the 2.x series of Python releases started I held off and stuck with 1.5.2 for a long time. I never used 1.6.x. I can’t remember whether I made the jump to 2.1 or 2.2, but I’ve been using 2.x for a long time now, usually with notepad++ as my editor. Part of the reason is that it takes time for the extensive Python ecosystem to catch up and port all the libraries and frameworks. Anyway, I’ve just finished a contract where I used Python 3.3 and the PyCharm IDE, and it was a breath of fresh air. I’d never consider development in Java or C++ without an IDE, and my preferences are IntelliJ & MS Visual C++ respectively. Previously I’d felt an IDE was unnecessary in Python, mainly because the rapid cycle time is so quick. Unlike C++ the cycle is not edit, compile, link, test. In Python one just edits and tests, which makes the printf style of debugging far more effective. PyCharm turbocharges the debugging process with breakpoints and visual object graph traversal. And during coding it interactively highlights syntax errors and variable references. That’s a big time saver too, since it makes code run at the first attempt without throwing syntax errors. +1 for PyCharm!

So what about the shift from Python 2.x to 3.x ?  For me the important points have been the move to more iterator based coding. The iteritems( )/iterkeys( )/itervalues( ) methods no longer exist as items( )/keys( )/values( ) no longer return lists, they return iterable view objects. Those view objects are not stand in replacements for lists. And I had to get used to using the next( ) operator with generators. And, of course, print is now a function and no longer a statement. But apart from that it was straightforward.

Update 2015-07-01: I’ve just been pinged by an old coding compadre who downloaded PyCharm on my recommendation, and needs a tip on fixing up interpreter paths to pick up libs. I had to read a couple of StackOverflow articles to figure this out too, so I though I’d document it here. I’m using PyCharm Community Edition 4.5.2, and to add libraries to my interpreter search path I go to the File/Settings dialog. In the left had tree control, under the Project: <myproj> node I select Project Interpreter. Then I click on the cog icon in the top right, next to the selected interpreter, and choose the More… option. This throws up another dialog: Project Interpreters. On the right are several icons. The bottom one is a mini tree control that shows a pop up tooltip saying “show paths for the selected interpreter”. Click on that, and finally you get the Interpreter Paths dialog, and you can add your library. Phew!! Could this config be buried any deeper? IntelliJ: sort it out! PyCharm is very, very good, but this is quite a useability flaw….