Playing with R

August 1, 2006

I’m doing some data analysis at the moment, and I’ve gone through several toolsets in search of a combination that will give me power, expressiveness and charting capability. The raw data comes out of our trading system’s RDB. I started off with SQL queries, but soon found that the slowness of complex queries and the arcana of SQL were holding me back. So I exported the data as CSV and set to work with Excel. I made some more progress, but didn’t want to resort to VBA for the ad hoc slice-and-dice coding.

So then I switched to processing my 300MB CSV with Python. Progress was good for a couple of days, but then my script started to get hairier. The data extraction and cleansing logic was all mixed up with the analysis logic. And I still didn’t have any good charts. Back to the drawing board.

I vaguely remembered folk on Victor Niederhoffer’s mailing list discussing R, so I checked out the programming recommendations on the great man’s site. I downloaded R, and after 30 minutes of playing, I’m very impressed. Scripting, powerful maths, built-in vector and matrix data types, and flexible charting all rolled together. I’ll be asking around on the floor to see if any of our traders are using R…
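
For the record, the tangle in the Python script (cleansing mixed up with analysis) could be avoided with something like the minimal sketch below. It is only an illustration, not the original script: the column names `symbol`, `quantity` and `price`, the sample rows and both function names are hypothetical. One function does nothing but read and clean rows; the other does nothing but compute.

```python
import csv
import io
from collections import defaultdict


def load_trades(lines):
    """Extraction and cleansing only: yield (symbol, quantity, price) tuples.

    The column names are assumptions made for this sketch.
    """
    for row in csv.DictReader(lines):
        try:
            symbol = row["symbol"].strip().upper()
            quantity = int(row["quantity"])
            price = float(row["price"])
        except (KeyError, ValueError, TypeError, AttributeError):
            continue  # skip rows with missing or malformed fields
        yield symbol, quantity, price


def notional_by_symbol(trades):
    """Analysis only: total traded value (quantity * price) per symbol."""
    totals = defaultdict(float)
    for symbol, quantity, price in trades:
        totals[symbol] += quantity * price
    return dict(totals)


# Made-up sample data standing in for the real CSV export.
SAMPLE = """symbol,quantity,price
abc,100,10.5
XYZ,50,20.0
abc,bad,10.0
ABC,200,11.0
"""

totals = notional_by_symbol(load_trades(io.StringIO(SAMPLE)))
print(totals)
```

Because `load_trades` is a generator, the same shape should work on a large export by passing an open file instead of the in-memory sample, without reading the whole file at once.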


2 Responses to “Playing with R”

  1. Rob Steele Says:

    R is most excellent. I love it. It’s my main ax. I’ve never been so productive. It simply cannot be beat for exploratory data analysis, and it’s great for building production systems too, though you have to practice some discipline, like using asserts to make sure functions get the parameter types they’re expecting. It’s certainly possible to produce write-only code in R, though not as easy as in Perl or MATLAB.

    Make sure to check out these too:

