Analysing the Enron spreadsheets with SpreadServe

March 13, 2017

Recently I’ve been using the Enron spreadsheets to test SpreadServe, simply because it’s good testing practice to expose any codebase to a high volume of diverse inputs. Felienne made them available on figshare, but they’re in a slightly obscure zip format, so I posted them on GitHub to make them more accessible. SpreadServe records information about the formulae used in each sheet in its DB, so I ran a simple analysis of Enron formula use. The results are on GitHub here. To summarise: there are 15,927 sheets, and only 8,421 of them use formulae. 152 different functions are used across all sheets, and only 170 sheets use maths functions that go beyond arithmetic. So the Enron spreadsheets weren’t as diverse as I’d hoped. They made for a good volume test though. Here’s a short video about exploring the Enron spreadsheets with SpreadServe…
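
For anyone who wants to try a similar tally, the kind of formula analysis described above can be sketched in a few lines of Python. This is only a rough illustration, not how SpreadServe does it: the input layout (a dict mapping a sheet name to its list of formula strings), the names `functions_in` and `summarise`, and the sample data are all made up for the example, and the regex is a crude stand-in for a real formula parser.

```python
import re
from collections import Counter

# An identifier immediately followed by "(", e.g. SUM( or VLOOKUP(.
FUNC_RE = re.compile(r"([A-Za-z][A-Za-z0-9.]*)\(")
# Double-quoted string literals, blanked out so their contents are not scanned.
STRING_RE = re.compile(r'"[^"]*"')


def functions_in(formula):
    """Return the set of upper-cased function names found in one formula."""
    stripped = STRING_RE.sub('""', formula)
    return {name.upper() for name in FUNC_RE.findall(stripped)}


def summarise(sheets):
    """Count sheets that contain formulae, and how many sheets use each function.

    `sheets` is a hypothetical mapping: sheet name -> list of formula strings.
    """
    sheets_with_formulae = 0
    usage = Counter()
    for formulae in sheets.values():
        if formulae:
            sheets_with_formulae += 1
        used = set()
        for formula in formulae:
            used |= functions_in(formula)
        usage.update(used)  # each function counted once per sheet
    return sheets_with_formulae, usage


# Made-up sample data, not taken from the Enron corpus.
sample = {
    "a.xls": ["=SUM(A1:A3)", "=IF(B1>0, sum(B1:B9), 0)"],
    "b.xls": ["=A1+B1"],
    "c.xls": [],
}
count, usage = summarise(sample)
```

Counting each function once per sheet rather than once per cell keeps a single sheet with thousands of copied-down formulae from swamping the totals.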

