May 15, 2007

Egor Kraev: Just two more items on my MATLAB wishlist

I know you have probably had enough of MATLAB by now, and this is indeed the last post in this series. After I wrote the last one, I realized there are two more things I have long been wishing for.

The first one is simple: MATLAB does allow you to call Perl scripts straight from its command line. However, as far as I could see, it doesn’t allow you to call Perl functions, and thus you cannot pass any kind of object besides strings between Perl and MATLAB.

Now in my view, this renders that feature virtually useless. The one time I’ve used a perl script it was pumping data from a remote source – but the only halfway reasonable way to put that data into MATLAB that I could find was to open a socket and pump it over as a string, and parse that in MATLAB – how much easier would it be if I could return a perl array of doubles, or a hash of such arrays, and MATLAB would convert it!

This, then, is my next-to-last wish: allow calling Perl and Python functions from inside MATLAB, with on-the-fly conversion of at least arrays, strings, and structs to/from their Perl/Python equivalents.

My very very last wish is a bit trickier to explain. Suppose you have a vector v of length N, and a square NxN matrix A, and you want to multiply each row of A by the corresponding entry of v. Currently, you have to construct a diagonal matrix from v, and then multiply it by A, and try to remember whether you need to do A=diag(v)*A or A=A*diag(v). Now suppose that instead of one colon operator, you had a family of named colon operators, so that you could just write

A(:a, :b) = A(:a,:b).*v(:a)

This makes it rather clearer which indexes of A are matched to those of v, no?
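For comparison, here is a minimal sketch of how the row-scaling case (and the matrix-vector product discussed next) can be written in existing MATLAB syntax. The values of A and v are made up for illustration, and bsxfun is assumed to be available, which may not be the case in older releases.

```matlab
A = [1 2; 3 4];
v = [10; 100];                        % one scale factor per row of A

rows_scaled = diag(v) * A;            % multiplies row i of A by v(i)
cols_scaled = A * diag(v);            % multiplies column j of A by v(j)

% Same row scaling without building the diagonal matrix
rows_bsx = bsxfun(@times, A, v);      % v is a column, so it is matched to rows

% A sum over the shared column index is the ordinary product A*v
Av = sum(bsxfun(@times, A, v.'), 2);  % v.' is a row, so it is matched to columns
```

The orientation of v (column or row) is what decides which index it is matched to, which is exactly the bookkeeping the named colon operators would make explicit.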

Now suppose you wanted to sum across some of those ranges, but not all – then you could mark those that you want to sum over by an exclamation sign, say. Then, multiplying a vector by a matrix could be written as

sum(A(:a,:b!) .* v(:b!)).

MATLAB already has matrix multiplication? Sure, but what about multiplying two multi-dimensional tensors across a given dimension, such as

A(:a,:b,:d)=sum( B(:a,:c!,:d).* C(:c!,:b) )

(Yes, there are legitimate modelling uses for this kind of thing!)
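In current syntax, one way to get the same contraction is a loop over the untouched third index. This is only a sketch with random made-up tensors; the sizes are arbitrary.

```matlab
B = rand(2, 3, 4);            % indices (a, c, d)
C = rand(3, 5);               % indices (c, b)

A = zeros(size(B, 1), size(C, 2), size(B, 3));   % indices (a, b, d)
for d = 1:size(B, 3)
    A(:, :, d) = B(:, :, d) * C;   % matrix product sums over the shared index c
end

% Spot check of one entry against the explicit sum over c
a = 2; b = 5; d = 3;
explicit = sum(B(a, :, d).' .* C(:, b));
```

It works, but the reader has to reconstruct from the shapes which index is being summed over.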

In true physicist fashion, once ! is seen in an expression using colon operators, the summing might even be automatic, without needing to write sum().

No existing code would be affected, as all the expressions I suggest are not legal in current MATLAB syntax.

Would that not be a neat extension to the already concise and powerful colon operator?

This is the very last post in this series. Coming up: a discussion of HDF5, my favourite format for structured numerical data.

Send in your comments!
Egor

May 10, 2007

Egor Kraev: What I miss most in MATLAB

My last post dealt with the annoyances of MATLAB. This, the final post of the series, is about one bit of extra functionality. Unlike most of the entries in my last post, it is not something that is embarrassing not to have (that is, if you hold them to the supremely high standard of the rest of MATLAB); rather, it is an extra feature that would make my use of MATLAB a lot more productive still.

The request is very simple to state: make using external C++ code with MATLAB as easy as using external Java code. When using Java, you issue one command to load the jars, and you can instantiate any objects therein, call their functions and receive the results. As you do so, MATLAB auto-converts a lot of native Java types to/from MATLAB types, such as strings and double arrays. Thus, the integration of the Java classes is truly seamless.
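To make the Java side concrete, here is a small sketch using a class that ships with the JVM, so no jar is needed. The jar name in the commented-out line is hypothetical, and the exact conversions may vary between versions.

```matlab
% javaaddpath('mylib.jar');     % hypothetical jar; not needed for built-in JDK classes

list = java.util.ArrayList();   % instantiate a Java object from MATLAB
list.add('hello');              % a MATLAB string is passed as a Java String
list.add(3.14);                 % a MATLAB double is passed as a Java Double

n = list.size();                % number of stored elements
s = char(list.get(0));          % char() makes sure we end up with a MATLAB string
x = list.get(1);                % the boxed Double comes back as a MATLAB number
```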

Now compare that to the options of using C++ in MATLAB: either you need to write wrappers to make the C++ code available as MEX functions (and first you need to understand the C++ object model for MATLAB types); or you can load existing dlls, but only if they have been C-linked and you have access to the header files and the header files are simple enough that the MATLAB built-in parser can digest them. To me, neither of the options has been worth the effort involved, so far.

How nice it would be if you could just load an existing library and instantiate its classes, with MATLAB doing at least simple conversions such as strings to std::strings, MATLAB arrays to std::vectors, and std::maps to MATLAB structs? That would allow me to do moving-boundary coding, ‘graduating’ bits of code into C++ as they mature, yet continue to use them in MATLAB. Wouldn’t that be cool?

Yes, I know such tricks are inherently easier in Java than in C++, but I’m sure the smart people at MATLAB could come up with a way. Maybe leverage the CINT interpreter?

Egor

May 08, 2007

Egor Kraev: The bits of MATLAB I really hate!

While there are a lot of things to like in MATLAB, there are of course also annoyances and things I miss. I’ll devote this post to itemizing annoyances and bits of behaviour that in my opinion are just plain embarrassing; next post will be devoted to an additional major bit of functionality that I would like to see that’d make MATLAB an even cooler environment to work in. (Most of the problems refer to version 2006b, the one I’ve been playing with. If any of these have been fixed in 2007a, please let me know).

 

  • The debugger loses settings and breakpoints. If I want to go into debug mode each time an error is encountered, I have to set that option EVERY TIME I start MATLAB. Also, during a session I often find the breakpoint I’d set has disappeared, and the program happily zaps past the spot I wanted to drill into.
  • In the editor, there is no option for collapsing cells, for-loops, comment blocks etc. Even Visual Studio, hardly anybody’s candidate for The Friendliest GUI Ever, can do that!
  • The only way to spawn a separate process is in a DOS window. This means that if I want to run a script in the background that launches, say, a handful of Perl processes, each of these pops up in a DOS window and steals my focus (that actually happened to me).
  • Secondary axis plotting. Even Excel can do that. Plot some series with scaling on the left axis, some more series on the same plot with scaling on the right axis, sharing the x axis – how hard is that? And no, plotyy is not nearly good enough; and ‘this is not easy, because of the way MATLAB graphics work’ is not a good enough excuse.
  • The HDF5 file format support is a great thing to have, but the implementation at least in 2006b appears to have a major memory leak. At least, reading a 300MB HDF5 file increased MATLAB memory consumption by the same amount (fair enough in itself), and neither clearing all variables nor any other action I could think of, short of restarting MATLAB, managed to release that memory. Also, why does the MATLAB version not support szip decompression? I know compression must be licensed, but decompression is free - why not enable it?
  • That brings me to my major gripe, namely memory management. MATLAB runs in a Java virtual machine, and appears bound by its memory limit. If I try to set the JVM memory limit to over 1GB (on a 2GB RAM machine), it won’t even start. After working with 100+ MB matrices, after some point memory fragmentation appears to set in, so that you can’t create any reasonably-sized objects before restarting the program. There is the ‘pack’ command that is supposed to remedy that, but you can’t call it from scripts, only interactively (why???), and even then it does not always help greatly. Likewise, the textscan command for parsing long strings is reasonably fast, but requires the strings to fit into contiguous RAM chunks (which, as you can see from the above, is a tough call). If there’s not enough space, it gives up. Did MATLAB programmers never hear about swap?

Do you have your own favourite MATLAB annoyances? Can you tell me of ways to fix the above problems? Do tell.

Egor

May 06, 2007

Egor Kraev: Why I Love MATLAB: the GUI

  • The side of MATLAB that in my opinion puts it well ahead of any comparable software that I’m aware of is the very mature integration of the command line with the Graphical User Interface. For example, you can select any portion of code in the editor, and right-click to evaluate it at the command line. Also, most interactive GUIs (such as plot windows in interactive mode, or the wonderful Distribution Fitting Tool) will generate a script for you that will reproduce the result, so that you can tweak things with the mouse and then look at that code to find out how to script it.
  • Another nifty feature is the ‘cell mode’ (not related to cell arrays mentioned in my last post): by formatting your comments in a particular way, you can split your files into ‘cells’ and execute them one cell at a time using ctrl-enter (and a range of related tricks).
  • Oh, and you can dock/undock any window from the MDI mother window – so on a four-monitor screen, I have a swarm of detached windows, and when I’m remoting into my machine, they can all be docked nicely into one window. A minor thing perhaps, but having to break my thought flow to hunt for a lost window is just so annoying and distracting.
  • Thus, I’d say MATLAB manages to give me the best of both the command line and the GUI world by making scripts more interactive and creating scripts from my mouse-clicks on GUIs (though admittedly some bits of that code are more readable than others).
  • It has the tools you’d expect from any decent development environment, such as a debugger and a profiler (on a side note, does S-plus have a debugger yet? I’m pretty sure they didn’t a year ago), and also an on-the-fly code analyzer that points out, for example, if a variable is written to but never read (usually signifying a typo), and even occasionally suggests a better way of using a command.
  • Finally, MATLAB graphing facilities are just awesome. I’d keep it around just to do my plotting even if it couldn’t do anything else.
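As an illustration of cell mode: a comment line starting with %% opens a new cell, and each cell can then be evaluated on its own from the editor. The script below is a made-up example, not taken from any real project.

```matlab
%% Load data
x = linspace(0, 2*pi, 100);    % with the cursor here, ctrl-enter runs only this cell

%% Transform
y = sin(x);

%% Summarize
peak = max(y);
```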

So much for the praises – it is no accident I’m using MATLAB for most of my data exploration and prototyping. The next two posts will be devoted to annoyances and things I wish MATLAB had but doesn’t.

Egor

May 04, 2007

Egor Kraev: Why I Love MATLAB: The Language

A bad workman always blames his tools. But even a good workman finds himself wasting a lot of time when faced with tools not adequate to the job at hand. Thus, after a lengthy exploration of confidence intervals, I’m devoting the next couple of posts to one of my favourite tools, MATLAB.

Why do I like it so much? Well, there are two reasons. First of all, there’s the language: wonderfully concise and expressive. Second, the mature GUI. I’ll devote the next post to the GUI, and talk about the language today. It starts simple - just treat everything as a matrix; operations on vectors can be written really simply. For example, if you are given a vector x and want to extract the excess of each element over a threshold t, but only if that excess is positive, all you have to do is to write y=x(x>t)-t;

Suppose you want to make a function that does that, and call it thresh:

           thresh=@(x0,t0) x0(x0>t0)-t0;

Then y=thresh(x1, t1) will give exactly the same result as above. And the ‘function handle’ thus created can be assigned to variables and stored in cell arrays, as described next.
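A quick usage sketch with made-up numbers, including a handle stored in a cell array:

```matlab
thresh = @(x0, t0) x0(x0 > t0) - t0;   % the handle defined in the text

x1 = [1 5 3 8];
t1 = 4;
y = thresh(x1, t1);         % keeps 5 and 8 and returns their excess over 4

handles = {thresh, @sin};   % function handles stored in a cell array
z = handles{1}([2 7], 5);   % call a handle pulled back out of the cell array
```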

  • If you want to store heterogeneous objects, there is something called a ‘cell array’ which is like a matrix whose elements can be anything – structs, matrices, other cell arrays, functions, etc. And the objects are self-aware – there is a bunch of functions to ask an unknown object whether it’s a struct, a number, or a function.

  • Another feature that makes MATLAB ideal for prototyping and data exploration is that if you assign to something that doesn’t exist, it’s normally created for you, be it a variable, a field in a struct, or extra elements in an array.

  • A final little-known feature of MATLAB that I find really cool is its close Java integration. After issuing a single command to load your jar, the classes therein can be instantiated inside MATLAB and intermixed freely with native objects. With just a little effort, it’s also possible to hook MATLAB up to Eclipse via JDWP, so that you can set breakpoints in your Java code in Eclipse, manipulate the java objects inside MATLAB, and use Eclipse debugger if they throw exceptions or hit your breakpoints. Thus, MATLAB can serve as a scripting playground for your Java code.
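The first two points can be sketched in a few lines. The contents of the cell array and the labels are made up for illustration.

```matlab
c = {42, 'text', struct('a', 1), @sin, {1, 2}};   % heterogeneous cell array

% Ask each element what it is
kinds = cell(size(c));
for i = 1:numel(c)
    item = c{i};
    if isnumeric(item)
        kinds{i} = 'number';
    elseif ischar(item)
        kinds{i} = 'string';
    elseif isstruct(item)
        kinds{i} = 'struct';
    elseif isa(item, 'function_handle')
        kinds{i} = 'function';
    elseif iscell(item)
        kinds{i} = 'cell';
    end
end

% Assigning to something that does not exist yet creates it
s = struct();
s.price = 1.5;    % new field appears on assignment
v = [];
v(4) = 7;         % array grows to length 4, padded with zeros
```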

The other side of MATLAB that I like a lot is the really mature integration of the command line with the GUI/development environment, which I’ll discuss in the next post.

Do you also have your favourite corners of the language? Do let me know.

Egor

April 06, 2007

Egor Kraev: 6) How Not to Estimate Standard Deviations

To finish off this series, let me cover a very common mistake in estimating standard deviation. Say you’ve got a daily series, going back 1.5 years (say 382 points), and you want to compute its one-year (252-day) rolling moving average. That gives you 130 points of rolling moving average, each representing the average of one year prior to a given date. Now, how do you estimate the standard deviation of the last point in that series?

Did you think ‘take the std of the 130 points’? If so, congratulations - you just made the mistake I’m talking about. The above approach would work if the errors in the 130 points were independent - but of course in reality they are heavily correlated. They are, after all, averages over largely overlapping periods. Therefore, the std of the 130 points will be much lower than the std of the last point.

   What should you do instead? Why, just treat that last point (and each other point in the rolling series) as a point estimate, and apply the methods I discussed earlier.

Let’s illustrate this with the following MATLAB simulation. We generate 382 points with mean 2.0 and std of 1.0, compute the rolling averages, and repeat that simulation 1000 times. My previous posts suggest the std of the final point should be $1/\sqrt{252} \approx 0.063$.

nsim = 1000;
final = zeros(1, nsim);     % preallocate the result vectors
rollstd = zeros(1, nsim);
for n = 1:nsim
    x = 2 + randn([1 382]);      % generate iid numbers with mean=2 and std=1
    y = cumsum(x)/252;
    y = y(253:end) - y(1:130);   % compute the 130 rolling 252-day averages
    final(n) = y(end);
    rollstd(n) = std(y);
end
std(final)      % the 'true' standard deviation of the final point
mean(rollstd)   % the average std of the 130 rolling points

The simulation gives the std of the final point of 0.067, and the average std of the rolling series of 0.024. So using the rolling series can lead us to underestimate our margin of error by almost a factor of 3!

   This is the last post in this series. The next series will be about MATLAB - why I love it, why I hate it, and what I miss most in it.

Send in your comments please!

Egor

April 04, 2007

Egor Kraev: 5) Confidence Intervals of Almost Anything at All

In my posts to date, I’ve covered a variety of simple confidence interval/standard deviation estimates for particular estimators. Now, I’ll present a more general, if slower, method that should work with virtually any estimator you want.

Suppose you have a sample of N points (each point can be a tuple of real numbers) from some population, and you want to estimate a population parameter Θ. For that purpose, you have a function F that takes k points and returns an estimate $\hat{\Theta}$. Suppose this function is a black box; that is, all you know about it is that you can feed k points into it and get an estimate back. How do you compute the confidence interval for such an estimate?

If you had $M \cdot k$ distinct points from your population, with M ‘very large’, you could split them into M groups of k, feed each group to F, and thus get M independent estimates of Θ. The appropriate percentiles of that group of estimates would give you the confidence interval, and you could use the standard deviation of that group of estimates as a proxy for the std of your estimator.

However, in reality one seldom has access to huge amounts of independent samples - what do you do then? You bootstrap. That is, you choose a ‘very large’ M and draw $M \cdot k$ random points from your sample of size N, with repetition (that is, drawing a particular point does not exclude that point from subsequent draws). Then proceed as above. Simple? Yes. Crude? Definitely. Does it make sense? I think so. After all, all you know about your population are those N samples you have - so using them as a proxy for your population seems to me both natural and meaningful.
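A minimal sketch of the procedure, under stated assumptions: the sample is made up, the black box F is stood in for by the mean, k is set equal to N, and the percentiles are read off a sorted vector rather than computed with a toolbox function.

```matlab
x = 2 + randn(200, 1);        % made-up sample of N points
N = numel(x);
k = N;                        % points fed to the estimator per call (assumed k = N)
M = 2000;                     % number of bootstrap repetitions
F = @(pts) mean(pts);         % stand-in for the black-box estimator

est = zeros(M, 1);
for m = 1:M
    idx = ceil(N * rand(k, 1));   % k indices drawn with repetition
    est(m) = F(x(idx));
end

boot_std = std(est);              % proxy for the std of the estimator
sorted = sort(est);
ci95 = [sorted(round(0.025 * M)), sorted(round(0.975 * M))];   % crude 95% interval
```

Swapping in a different F (a median, a correlation, a fitted parameter) changes nothing else in the loop, which is the whole appeal of the method.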

   If you have other (prior) information about the distribution of Θ, you can then integrate it with the confidence intervals you just obtained using Bayesian tricks (if you don’t know what these are, drop me a line and I’ll do a post about that).

   In the next, final post of this series: how not to estimate standard deviations.

Egor

March 15, 2007

Egor Kraev: 4) Confidence intervals for Correlation

   

Let’s now look at confidence intervals for correlation. The honest way of doing it can be found in Wikipedia under ‘Fisher transform’. Basically, the Fisher transform

$$F(\rho) = \frac{1}{2}\ln\frac{1+\rho}{1-\rho} = \operatorname{artanh}(\rho)$$

is a way of transforming the [-1,1] interval onto the real line, such that the standard deviation of F(ρ) equals $1/\sqrt{N-3}$ if you have used N pairs of points to compute correlation. You get the confidence interval for the transform, transform it back, and there you are. Not too hard, but still too fiddly for my taste.

A simpler way is to notice that $\partial_\rho F(\rho) = \frac{1}{1-\rho^2}$, and thus, linearizing the inverse Fisher transform, we get the pleasingly simple formula

$$\mathrm{std}(\rho) \approx \frac{1-\rho^2}{\sqrt{N-3}}$$

So for a correlation of 50% computed using a year’s worth of days, the standard deviation is about $0.75/\sqrt{252} \approx 0.05$, so a 95% confidence interval is about 19 percentage points wide.

Be careful for correlations very close to ±1 though, especially for small samples - the distribution there is so skewed that you might be better off generating asymmetric confidence intervals with the exact Fisher transform.

By the way, all this should work for any correlation estimate, be it Pearson, Spearman, or Kendall (ask Wikipedia if you don’t know what some of these are).
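To compare the two routes, the sketch below evaluates both intervals for a correlation of 0.5 estimated from 252 pairs. It assumes the usual standard error of $1/\sqrt{N-3}$ for the transformed value and uses ±2 standard deviations as the 95% interval.

```matlab
rho = 0.5;     % estimated correlation
N = 252;       % number of pairs used for the estimate

% Exact route: interval on the transformed scale, mapped back with tanh
z = atanh(rho);                        % Fisher transform
se_z = 1 / sqrt(N - 3);
ci_exact = tanh([z - 2*se_z, z + 2*se_z]);

% Linearized route: symmetric interval around rho
se_lin = (1 - rho^2) / sqrt(N - 3);
ci_lin = [rho - 2*se_lin, rho + 2*se_lin];
```

The exact interval is slightly asymmetric around rho, and the asymmetry grows as rho approaches ±1 or N shrinks.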

Send me more comments and questions!

Egor


March 09, 2007

Egor Kraev: 3) More fun with confidence intervals

So, how do we take a standard deviation of a standard deviation? Two formulas are helpful here, namely the results of applying a function before and after averaging. Firstly, for any function f, the mean value of f over the sample is $\overline{f(x)} = \frac{1}{N}\sum_{n=1}^{N} f(x_n)$ and the standard deviation of f over the sample is

$$\mathrm{std}(f(x)) = \sqrt{\frac{1}{N}\sum_{n=1}^{N} f(x_n)^2 - \overline{f(x)}^2}$$

For independent points, the standard deviation of the mean value itself is this divided by $\sqrt{N}$. Apply that to $f(x) = (x-\bar{x})^2$, whose mean value is var(x), and you get

$$\mathrm{std}(\mathrm{var}(x)) = \frac{1}{\sqrt{N}}\sqrt{\frac{1}{N}\sum_{n=1}^{N} (x_n-\bar{x})^4 - \mathrm{var}(x)^2}$$

Secondly, for any reasonably smooth function F, $\mathrm{std}(F(x)) \approx |\partial_x F(x)|\,\mathrm{std}(x)$.

Applying that to $F(x) = \sqrt{x}$ and using $\mathrm{std}(x) = \sqrt{\mathrm{var}(x)}$ we get

$$\mathrm{std}(\mathrm{std}(x)) = \frac{\mathrm{std}(\mathrm{var}(x))}{2\cdot\mathrm{std}(x)}$$

Piece of cake, eh? But if it’s still too much hassle, and you are ready to pretend that the residual distribution is normal, you can always just use

$$\mathrm{std}(\mathrm{std}(x)) \approx \frac{\mathrm{std}(x)}{\sqrt{2N}}$$

So, say, on a 30-vol stock, if you use one month (22 days’) worth of daily returns, the standard deviation of the realized vol estimate is $30/\sqrt{44} \approx 4.5$ vol points, thus the 95% confidence interval is 18 vol points wide. Something to be aware of, no?
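The 30-vol example can be checked with a quick simulation. This is a sketch assuming normally distributed returns; in the sample-based chain, the division by sqrt(N) (to go from the spread of the individual squared deviations to the spread of their average) is spelled out explicitly as an assumption.

```matlab
sigma = 30;    % true vol
N = 22;        % days in the estimation window
M = 20000;     % number of simulated windows

% Monte Carlo: spread of the realized-vol estimate across many windows
s = zeros(M, 1);
for m = 1:M
    s(m) = std(sigma * randn(N, 1));
end
simulated = std(s);

% Normal shortcut from the text
approx = sigma / sqrt(2 * N);

% Sample-based chain, evaluated on a single window
x = sigma * randn(N, 1);
dev2 = (x - mean(x)).^2;                                  % squared deviations
std_var = sqrt(mean(dev2.^2) - mean(dev2)^2) / sqrt(N);   % std of var(x), assumed /sqrt(N)
std_std = std_var / (2 * sqrt(mean(dev2)));               % std of std(x)
```

The simulated value and the shortcut should agree closely, while the single-window estimate std_std is itself quite noisy for N this small.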

   Posts forthcoming in this series: std of correlation, and how to compute std of absolutely anything.

Egor

March 05, 2007

Egor Kraev: 2) Confidence Intervals

One of the most basic points of estimation is that an estimate is quite meaningless without some kind of confidence interval around it. If I don't know what the confidence interval is, how can I rely on the number? Is the 1.205 really [1.204, 1.206], or [0.8, 1.5]? If the parameter estimated on last year's data is 1.2 and on this year's data 0.9, has it changed or is it just measurement error? I can't know what I'm doing unless I can answer questions like that - and yet there is a surprising amount of ignorance around regarding confidence intervals. A recent interviewee (and not a bad one at that), when asked “does it even make sense to speak of a confidence interval for a correlation value”, could only suggest [-1, 1].

So how do you estimate confidence intervals? To first approximation, which is usually good enough, it's ±2 standard deviations around the value (unless your distribution is really skewed or has unusual tails). Let's look at a couple of cases I've often needed.

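The very basic one is the standard deviation of an average. Say you've got N points $x_1, \ldots, x_N$. Then the sample mean is $\bar{x} = \sum_{n=1}^{N} x_n / N$ and the standard deviation of the sample is

$$\mathrm{std}(x) = \sqrt{\left(\frac{1}{N}\sum_{n=1}^{N} x_n^2\right) - \bar{x}^2}$$

For independent points, the standard deviation of the mean itself is then $\mathrm{std}(x)/\sqrt{N}$.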
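As a worked example, the sketch below computes a rough 95% interval for the mean of a small made-up sample. It assumes the points are independent, so that the standard deviation of the mean is std(x) divided by sqrt(N).

```matlab
x = [1.1 0.9 1.3 1.2 0.8 1.0 1.4 1.1];   % made-up sample
N = numel(x);

xbar  = mean(x);
std_x = sqrt(mean(x.^2) - xbar^2);    % sample std from the formula above (normalizes by N)
se    = std_x / sqrt(N);              % std of the mean, assuming independent points
ci    = [xbar - 2*se, xbar + 2*se];   % roughly a 95% confidence interval
```

The value of std_x matches the built-in std(x, 1), which uses the same normalization by N.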

 

Elementary? Perhaps. If so, what is the standard deviation of the standard deviation? This is not a strange question: if $x_n$ is a log returns series, std(x) is an estimate of its volatility, something we do want a confidence interval for. So, any suggestions on how to get it?

February 28, 2007

Egor Kraev: 1) The Moons of Jupiter and the Whale

Virtually any computational finance project at some point involves estimating some parameters from real data. Suppose that you know what data series you need, and that you have the luxury of time series going back for decades. How much should you use?

The major problem with a lot of estimation procedures has been summed up beautifully by Prof. Stephen Figlewski, in his treatise “Forecasting Volatility”:

“I asked [a statistician presenting a complex model at a conference] why we should expect prices observed in the market to behave according to the postulated pricing relationship when there were no actual market participants who understood the model or used it to do the trades that would push prices to the correct values. The answer was that he thought of an equilibrium pricing model for a financial market the way an astronomer thinks of the physical laws of motion that apply to a celestial body. Although the moons of Jupiter do not understand why they behave in a particular way, an outside observer who knows the laws of motion they follow can make very accurate predictions about where they will be thousands of years in the future. [...]

This is essentially the way classical statistics models an estimation problem. There is assumed to be some fixed but unknown underlying structure, or “data generating process,” and the statistician has a set of observations produced by that process from which estimates of its parameters will be deduced. One aspect of this conceptual framework is that estimation and forecasting are very similar to each other. [...] A second feature of this framework is that one might hope to get arbitrarily good parameter estimates if one has a large enough data sample.

I believe the classical statistics framework fundamentally misrepresents the nature of a financial market [...]. Consider a different and more earthly estimation and prediction problem from that of the moons of Jupiter. Suppose we wanted to predict the movements of a whale, based on observing it over a period of time. Being a large animal with a lot of momentum, the movements of a whale must be fairly predictable over the short run simply by extrapolation. Yet we do not think of a whale as following a fixed and immutable pattern the way the moons of Jupiter do, at least not one that we could ever hope to understand completely.

In this case, we are not looking at a fixed structure with constant but unknown parameters, but rather at a system that evolves over time, and perhaps alters its behavior rapidly on occasion. [...] Prediction is possible only because the system usually evolves slowly and therefore our accumulated information from observing it only decays slowly. In this case, there may be an enormous difference between how well a model fits in-sample and how well it can forecast out-of-sample, and classical goodness of fit statistics may give little guidance about the latter. Also, [...] given that the data is generated by a structure that changes in unknown ways over time, expanding the estimation data set by adding observations from the distant past can easily make the estimates of the current state of the system worse rather than better. Finally, given that the structure does not remain constant, there is a great premium on models and estimation procedures that are robust against small changes. The more detailed and elaborate a model is, the better the fit one is generally able to obtain in-sample, but the faster the model tends to go off track when it is taken out-of-sample.[...]

As should be obvious, I believe a financial market is much more like a whale than like the moons of Jupiter.”

In this series of posts, I will be discussing parameter estimation from the above perspective - things that are simple, robust and adaptive.


Egor

February 07, 2007

Egor Kraev: Why I Hate C++ and Still Use It (Part 2)

In my last post, I talked about the main source of pain in C++, namely having to know the type of all objects at compile time.

Consider a simple example: I want to parse a comma-separated text file, with an a priori unknown number of columns and rows, and store the result in my C++ program. What data structure should I use? If all of the data in the file are integers, I could use a map<string, vector<int> >, with column headers as keys. If it’s a mixture of doubles and integers, I could store them all as doubles (though that already has its problems). What if some of the data is strings?

A response frequently heard from C++ die-hards is ‘Why would you want to store data like that?’ (an actual quote from comp.lang.c++: “By definition, an array is a contiguous sequence of objects of the same type. So what you're asking is not possible.”) – a typical example of how a language can warp your brain ;) .

The obvious brute force method is using void* everywhere, and then bravely casting every entry back to what you want it to be, but the scary thing about that is that it might work when it shouldn’t. A more promising option is using boost::any and any_cast for the same purpose, and the most elegant one I’m aware of is using map<string, vector<string>> (it came from a text file after all), and only converting the data to integers/doubles when you are about to access it. Far uglier solutions can be found by Googling “parse csv c++”, or by asking job candidates you happen to be interviewing.

All this creativity is spent to achieve something that is a non-issue in dynamically typed languages such as Python or MATLAB, where you can simply have an array of heterogeneous objects.
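For contrast, here is a sketch of the same task in MATLAB. The file content is made up and held in a string rather than read from disk; columns that convert cleanly become numeric arrays and everything else stays as strings.

```matlab
csv_text = sprintf('name,qty,price\nabc,3,1.5\ndef,4,2.25');   % made-up file content

txt_lines = regexp(csv_text, '\n', 'split');
header    = regexp(txt_lines{1}, ',', 'split');
rows      = cellfun(@(s) regexp(s, ',', 'split'), txt_lines(2:end), 'UniformOutput', false);

data = struct();
for c = 1:numel(header)
    col  = cellfun(@(f) f{c}, rows, 'UniformOutput', false);   % column as strings
    nums = str2double(col);                                    % NaN where not numeric
    if all(~isnan(nums))
        data.(header{c}) = nums;    % numeric column stored as doubles
    else
        data.(header{c}) = col;     % anything else stays a cell array of strings
    end
end
```

The struct ends up with one field per column header, each holding whatever type suits that column, with no type declared anywhere in advance.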

A different gripe is that C++ objects are not self-aware. You cannot ask an arbitrary object what data members or functions it has, and if you are referring to it through a pointer to its base class, you can’t really tell what type it is either. Thus, again, the need for header files and hard-coded function names, and a need to rewrite glue code each time you add a new function.

Yet one more reason to hate C++ is that it is not at all cross-platform unless written with real understanding and intent to make it so. There is Windows vs. Unix, there is gcc vs. the three or so different versions of Visual C++ in circulation at any given moment, each of them with its unique set of quirks (and they also interact with different processors in interesting and surprising ways).

Why, then, do I find myself persistently using C++ anyway? The first obvious reason is runtime speed. However, this is not as much of an advantage as it seems, as I tend to spend way more time writing code and especially debugging it, than I spend waiting for the code to finish running – but still, sometimes you just need things to be fast, especially in realtime systems.

A more important reason in my opinion is that it allows you to exercise absolute control over memory allocation. That is one area where MATLAB, otherwise my preferred data exploration platform, runs into a brick wall (more on that in a later post). You don’t need control over memory allocation very often, but when you do, you really, really need it.

Neither of these is in my opinion nearly enough to justify using C++ as the primary language for quant work, but they do mean you have to know at least enough of it to do the heavy lifting when necessary.

Are there other reasons why you hate C++? Do you think the things I hate are actually features? Is there a more elegant solution to the comma-separated file example? Please let me know.

In spite of all my complaints, there is an undeniable beauty to C++, because of its very weirdness. Singletons, factories, recursive templates and other beasts of the abyss might not have a friendly inclination, but having tamed them yet again into actually doing something useful does give one the proverbial warm glow.

Egor

February 03, 2007

Egor Kraev: Why I Hate C++ and Still Use It

From almost the first time I tried to use C++ to get a job done, I found myself hating the language intensely. The first and biggest reason is that C++ is statically typed. That means that the type of every single variable you use must be known at compile time (thankfully, by using STL at least you don’t have to know in advance things like the sizes of arrays anymore).

When I think of C++, the first things that come to my mind are header files, templates, and inheritance. It came as somewhat of a shock to me when I realized that EVERY ONE of these three is either caused by, or is a way of working around, the static typing problem.

Header files need to be separated from source files to tell the user code just as much about the library code as it needs to know (which is a lot). Templates are basically a way of generating exactly the same code using different types, and inheritance means you can pretend that a derived object is actually the base class object, for the purposes you have in mind for it.

What that means is that at least half the actual code you write (and a substantial part of the mental energy you spend) is devoted to working around the language’s quirks, making your design just flexible enough that it can do what you want, yet specific enough that the compiler will buy it – taking precious attention away from the actual problem you’re trying to solve. A side effect is verbosity – I would say to do the same task you need about five lines of C++ to one line of MATLAB.

In my next post, we’ll see how an innocuous task of parsing and storing a comma-separated file becomes a fair-sized puzzle because of static typing.

Egor

February 02, 2007

Egor Kraev: Programming Languages, Estimation Methods and more to come...

Hello!

This is Egor Kraev and I will be covering two broad areas here in the C(omp)++ blog, namely programming languages and estimation methods, both from the perspective of getting things done.

Why talk about languages? In principle, just about any task can be accomplished in just about any language (it is easy to write bad Fortran in any language), so why discuss the tools rather than the tasks? The reason I believe such discussion is worthwhile is that the languages you use shape your thinking – you are not proficient in, say, C++, until you can think in it; and each language makes some things easier and others harder to achieve - think of the mentality difference between a chainsaw and a Swiss Army knife. Thus, I believe any competent practitioner of computational finance must be genuinely multilingual – and the merits of the different alternatives are well worth discussing.

The other broad area I will be roaming in is numerical estimation, especially on time series. After almost 10 years of working and playing with time series of many shapes, sizes and sources (from analog electronic circuits to human development indicators, not to mention financial data), I would say about 80% of all numerical work is pure commonsense: a collection of ideas and tricks that in hindsight appear almost embarrassingly simple. Nine times out of ten, after trying out a lot of fancy tricks, from GARCH to wavelets, the trick that finally does the job at hand is the stupidest thing you can think of, ex post. However, that is often little help ahead of time, as the challenge lies in doing the right stupid thing, which only the pain of experience can really teach. To quote Terry Pratchett, ‘Not using almost any magic at all is what being a wizard is all about’.

Though most of the things I plan to cover won’t look like much at first glance, a lot of them are anything but easy to find or invent on your own (as I mentioned, it took me years).

Next post: 'Why I Hate C++, and Still Use it Regularly'.
