September 07, 2007

An Introduction to the Boost Library

by Daniel Duffy

C++ continues to evolve, and the STL is already part of the official standard. I would like to discuss some new developments that are taking place and, in particular, to give an overview of the Boost library (www.boost.org), a suite of useful C++ libraries containing functionality in the form of template classes:

  • Multiarray: defines a generic interface to multidimensional containers
  • Random numbers: contains a number of classes for generators and statistical distributions
  • Property map: classes that embody key-value pairs and definition of the corresponding access (for example read and write)
  • Smart pointers: objects that store pointers to dynamically allocated (heap) objects
  • Graph library: a C++ library implementing graphs, networks and related graph algorithms

The authors (as well as many other software developers) have developed a subset of the functionality contained in these libraries. For example, in Duffy 2004 and Duffy 2006 we have implemented a template Property Map class with properties similar to that in Boost; furthermore, we have created classes for two-dimensional matrices and three-dimensional tensors using nested STL classes. If we were to build such classes again, we would choose to use the Boost library as the underlying substrate.
We give a short description of the functionality in the above libraries. At this stage we avoid dealing with the C++ details on how to use these libraries in an application (they will be discussed later). We summarise each library as a list of features:

   

Multiarray
    Array classes for n-dimensional data
    Accessing the elements of array using () and [] operators
    Creating views of an array having the same or fewer dimensions
    Determining storage ordering of data (C-style, Fortran-style or user-defined)
    Defining or changing array index (zero is default)
    Changing an array’s shape
    Resizing an array (increasing or decreasing the extent of a dimension in an array)

   

Random Numbers
    Linear congruential generators
    Mersenne Twister generator
    Lagged Fibonacci generators
    Classes for continuous statistical distributions
    Classes for discrete statistical distributions
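
As a rough illustration of the split between generators and distributions listed above, the following sketch uses the C++11 `<random>` header, whose `std::mt19937` and `std::normal_distribution` were modelled on the Boost random number library. It does not use Boost itself, and the function name `sampleMean` is purely illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Draw n standard-normal variates from a Mersenne Twister engine and
// return their sample mean. The caller fixes the seed so that repeated
// runs are reproducible.
double sampleMean(unsigned int seed, std::size_t n)
{
    assert(n > 0);                                      // at least one draw
    std::mt19937 engine(seed);                          // generator
    std::normal_distribution<double> normal(0.0, 1.0);  // distribution N(0,1)

    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += normal(engine);  // the distribution consumes the engine's output
    return sum / static_cast<double>(n);
}
```

Calling `sampleMean(42u, 100000)` twice gives the same value, because the engine is seeded identically each time; the result should lie close to zero.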

   

Property Maps
    Key-value pair concept
    Readable and writable data
    Support for built-in C++ data types
    Applicability to Boost Graph Library (BGL)

   

Smart Pointers
    Sole ownership to single objects and arrays
    Shared ownership of objects and arrays
    Shared ownership of objects with embedded reference counting
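
The difference between sole and shared ownership can be sketched with the C++11 standard smart pointers. This is an assumption made for the sake of a self-contained example: `std::shared_ptr` descends from the Boost class of the same name, and `std::unique_ptr` is only the closest standard counterpart of a sole-ownership pointer. The `Option` type and the function names are hypothetical.

```cpp
#include <cassert>
#include <memory>

struct Option  // hypothetical payload type
{
    double strike;
    explicit Option(double k) : strike(k) {}
};

// Sole ownership: exactly one owner; the object is deleted when that
// owner goes out of scope.
std::unique_ptr<Option> makeSoleOwner(double strike)
{
    return std::unique_ptr<Option>(new Option(strike));
}

// Shared ownership: report the number of owners while a copy is alive.
long ownersWhileShared(double strike)
{
    std::shared_ptr<Option> first = std::make_shared<Option>(strike);
    std::shared_ptr<Option> second = first;  // copying shares ownership
    assert(first.get() == second.get());     // both refer to one object
    return first.use_count();
}

// The count drops again once the second owner has been destroyed.
long ownersAfterRelease(double strike)
{
    std::shared_ptr<Option> first = std::make_shared<Option>(strike);
    {
        std::shared_ptr<Option> second = first;
    }  // second is destroyed here
    return first.use_count();
}
```

In both cases no explicit `delete` appears in client code; the last owner to go out of scope frees the heap object.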

The gradual introduction of these features in applications will promote the reliability, maintainability and efficiency of your software. We shall see next how to use these libraries and to apply them to the Monte Carlo method and other applications in finance.

In the next blogs we shall give some examples in C++ and applications to the Monte Carlo method.

June 06, 2007

C++ Parallel Processing in Computational Finance, Part I: An Overview of OpenMP www.openmp.org

This blog is the first in a series of three blogs on the application of parallel programming techniques to Computational Finance with special attention being paid to Monte Carlo and PDE techniques. The current blog is meant as an overview of OpenMP at 30,000 feet.

OpenMP is a specification consisting of a collection of compiler directives (called pragmas in C and C++), library functions and environment variables that developers can use to specify shared-memory parallelism in C, C++ and Fortran. By shared memory we mean that all threads share a single address space and they communicate with each other by writing and reading shared variables. The OpenMP Application Programming Interface (API) is portable between various shared-memory architectures. It has been designed to support programs that run in both parallel and sequential modes. It is especially useful for large array-based applications. The main features are:

  • Support for parallelization of loops and iterations
  • Work-sharing constructs and the construction of parallel regions
  • Synchronisation constructs
  • Sharing and privatization of data

We shall discuss how to realize the above features in OpenMP in the next blog.

Many engineering, scientific and computational finance applications are expressed in the form of iterative constructs. In these cases the code consists of various kinds of loops. In general, loops are needed when we need to navigate in (hierarchical) data structures such as vectors, matrices, trees and graphs. Improving the performance of applications that rely heavily on loop constructs is a high priority. Given that many developers will choose to port their serial programs to the equivalent parallel ones using OpenMP, we now discuss the forces to be reckoned with when we port loop-based code written for serial machines to parallelized loop code. These forces relate to the correctness and efficiency of the parallel code:

. Sequential equivalence: a given program should produce the same results when executed with one thread or with more than one thread. We use a series of transformations that we apply on the loops in order to achieve this end. These are called semantically neutral transformations if they leave the semantics of the program unchanged. A point to note is that due to round-off errors the results from a serial loop may be different from those in the equivalent parallel loop. For example, adding floating-point numbers in a loop can give slightly different answers depending on the order in which they are added, for instance in their original order or sorted in ascending order. If such differences are unacceptable, we should decide not to parallelise the loop.

. Incremental parallelism: we parallelise a program by examining one loop at a time, for example. We apply a sequence of incremental transformations and we test the sequential equivalence after each increment. In this way we can be assured of correctness on the one hand and we can measure the performance improvement on the other hand.

. Memory utilization: In order to achieve good performance it is important to ensure that the way data is accessed (for example, using indexing operators or by dereferencing) is consistent with the memory model of the underlying system.
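
The sequential-equivalence check described above can be sketched as follows. This is a minimal example under stated assumptions: the function names are hypothetical, the tolerance is chosen by the caller, and the `reduction` clause is just one way of parallelising a summation loop. The pragma is guarded by the standard `_OPENMP` macro, so the same source also compiles and runs serially when OpenMP is not enabled.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Serial reference: add the terms in index order.
double serialSum(const std::vector<double>& v)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        sum += v[i];
    return sum;
}

// Parallel candidate: each thread accumulates a private partial sum and
// the partial sums are combined at the end, so the order of the
// additions may differ from the serial loop.
double parallelSum(const std::vector<double>& v)
{
    double sum = 0.0;
    const long n = static_cast<long>(v.size());
#ifdef _OPENMP
    #pragma omp parallel for reduction(+:sum)
#endif
    for (long i = 0; i < n; ++i)
        sum += v[i];
    return sum;
}

// Two results count as equivalent if they agree to a relative tolerance;
// exact equality is too strict because of round-off.
bool sequentiallyEquivalent(double a, double b, double relTol)
{
    assert(relTol >= 0.0);
    return std::fabs(a - b) <= relTol * std::fmax(std::fabs(a), std::fabs(b));
}
```

With GCC or Clang the parallel version is enabled by compiling with `-fopenmp`. Applying the comparison after each loop is parallelised is one way to carry out the incremental approach described above.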

 

Any comments or questions, I’d be happy to answer them.

Daniel

 

June 05, 2007

"Free Seminar Evening – Software Frameworks in C++ and C# for Applications in Computational Finance" by Daniel Duffy

Speaker: Daniel J. Duffy, Datasim

Date and Time: Tuesday 26 June 2007, 18.30 to 20.30

Venue: City of London

Registration: contact Ilona Hooft Graafland

In this seminar we discuss how modern software design techniques are used to create applications in Computational Finance using C++ and C# as the implementation languages. We elaborate on the object-oriented and generic programming features that we employ to create robust and maintainable software systems for a range of derivatives applications such as equities and interest rates. In particular, we concentrate on PDE/FDM and Monte Carlo methods.

Some of the topics to be considered are:

  • Designing architectural Frameworks for financial applications
  • Combining the object-oriented and generic programming models
  • Developing applications for Monte Carlo and PDE
  • Choosing between C++ and C#: creating interoperable applications

Who should attend:

  • Quants and quant developers
  • Managers who wish to gain insight into the software development process in Computational Finance 

Program:

18.00 – 18.30 Refreshments and Registration

18.30 – 19.30 C++ and C# for Computational Finance

19.30 – 19.45 Short break/refreshments

19.45 – 20.30 Option Pricing with PDE and Monte Carlo

20.30 end of seminar

About the Speaker: view Daniel's profile

May 22, 2007

Is Computational Finance a Multi-Disciplinary Activity? by Daniel Duffy

Software is an indispensable part of most businesses. In the case of computational finance the main challenge is to propose new financial models, define mathematical structures and algorithms for them and then map these algorithms to code. Some of the questions that arise are:

  • How do we define a repeatable process for modelling derivatives in C++ or C#?
  • How do the financial, mathematical and software models link up?
  • What are the requirements for our resulting software products in terms of maintainability, efficiency, functionality and reliability?

These questions are more easily posed than answered for a number of reasons; there are a number of challenges to be resolved. First, the ideal development team consists of quant analysts, developers and other professionals with knowledge of finance, mathematics and numerical analysis. Second, the members of the team need to have a common language (or even several languages) that they can use to promote interoperability. Finally, we need to take account of constraints such as time-to-market, project schedule and budget as well as the requirements on the resulting software product.
The techniques that we can use to promote computational finance as a multidisciplinary activity are:

  • Good mathematical and numerical models that unambiguously define derivatives products
  • Using the de facto modeling language UML (Unified Modeling Language)
  • Using proven Design and System Patterns
  • Using multi-paradigm programming languages that support the object-oriented and generic programming models

I shall be speaking about these software-related issues at the free evening seminar in the City of London on June 26 2007 at Charing Cross. For more information and registration (you must register in order to attend), see http://www.datasimfinancial.com/course_detail.php?courseId=13

For queries, please contact Ilona Hooft Graafland at ilona@datasim.nl

Let me know if you have any questions.

Daniel


 

April 29, 2007

Daniel Duffy: C++ Programming for Financial Research Students

with Samvit Prakash, University of Maryland

[Group photograph: Daniel at the University of Maryland]

In April 2007 I gave a C++ course and its applications to finance for a group of research students of Professor Dilip Madan at the University of Maryland UMD (see group photograph).

The students had programming experience in Matlab and C, and a number had some experience of C++ (incidentally, one of these students was working on a Quantlib project). The course was based on the contents of the 2006 C++ book. The ratio of theory to practice was approximately 55:45. The first two days concentrated on the essential fundamental syntax of the C++ language, advanced inheritance and templates. On the last day we introduced design patterns with applications to PDE/FDM and a student exercise on CDO/CDS.

The motivation was high with lots of questions being thrown at the trainer. Never a dull moment!

Some of the tips and guidelines in the context of a university environment are:

Duffy’s comments

. The distinction between the compile and link phases when building a project; coming to terms with compiler and linker errors

. Setting up a C++ project and defining project settings and properties

. Realising that C++ uses data in a different way from Matlab; in particular, we do not use global data in C++. The fact that objects need member data needs to be emphasized

. Design and implement a C++ application as a network of communicating objects having well-defined interfaces; this promotes maintainability and extendibility

. ‘Thinking object-oriented’ and not procedural


Samvit’s comments

We requested Daniel to come to the University of Maryland for giving us a jump-start in C++. That is exactly what we got! One can learn C++ from a book. But nothing can match the efficiency of the author guiding you through the chapters, while focusing on the most pertinent concepts. Most of us had no background in C++, though some of us had attempted to learn it from books. After the course, we were able to create our own projects, build and compile our own codes with applications in Mathematical Finance. These were simple projects, but way more advanced than “Hello World!”.

If other universities want to invite Daniel for such courses, here are some suggestions from our experience:

-Organize a 4-day course or at the minimum, a 3-day one. It is best to have spare time for topics relevant to the participants!

-Even better to split the courses into two - a beginner’s one, with no C++/OOP background, and an intermediate one where students know the basic concepts of C++ and OOP: classes, polymorphism, overloading etc.

-The intermediate class can focus more on design patterns for a larger project.

- Divide students in groups of 2 or 3. Encourage participants to debug each other’s code- it’s a great way to learn and very efficient for the pace of the class!

We wish to thank all attendees for their enthusiasm.



April 08, 2007

Daniel Duffy: C++ Frameworks for Monte Carlo, Part II - The Top-level Architecture

In the last blog I discussed the design of the SDE and FDM components that allows us to create paths for the underlying assets. In this blog I would like to describe the other components in the framework as well as alluding to some of the POSA and GOF design patterns that help to promote maintainability, suitability and efficiency of the resulting C++ implementation.

The software system consists of three major subsystems/components. Each component has a well-defined responsibility and it provides standardized interfaces to other components that require services from it:

.  SDE/FDM: simulates paths of the underlying asset and in particular provides simulated asset prices at the expiry date, for example;
.  Instrument/Payoff: models the different kinds of payoff and related instruments. In particular, this component is able to calculate prices and sensitivities. This component encapsulates the Monte Carlo engine for a single instrument;
.  Client Components: customizable and extendible functionality for a range of applications. For example, we can produce statistical reports, VAR calculations and multi-asset applications.

We model each of these components using the Mediator pattern because this ensures loose coupling. Each participant in this pattern is modeled using a Layers pattern because this allows us to separate the UI aspects of the problem from the application logic. At the code level, we use Visitor because it allows developers to extend the functionality of the framework in a non-intrusive way.
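
To make the remark about Visitor concrete, here is a minimal sketch of how a new operation can be added without modifying the classes it operates on. All class and function names (`Payoff`, `CallPayoff`, `PutPayoff`, `IntrinsicValue`) are hypothetical and are not taken from the framework itself.

```cpp
#include <algorithm>
#include <cassert>

class CallPayoff;
class PutPayoff;

// Visitor interface: one overload per concrete payoff type.
class PayoffVisitor
{
public:
    virtual ~PayoffVisitor() {}
    virtual void visit(const CallPayoff& p) = 0;
    virtual void visit(const PutPayoff& p) = 0;
};

// Element interface: each payoff only knows how to accept a visitor.
class Payoff
{
public:
    virtual ~Payoff() {}
    virtual void accept(PayoffVisitor& v) const = 0;
};

class CallPayoff : public Payoff
{
public:
    explicit CallPayoff(double k) : strike(k) {}
    void accept(PayoffVisitor& v) const { v.visit(*this); }
    double strike;
};

class PutPayoff : public Payoff
{
public:
    explicit PutPayoff(double k) : strike(k) {}
    void accept(PayoffVisitor& v) const { v.visit(*this); }
    double strike;
};

// New functionality lives in a visitor; the payoff classes are untouched.
class IntrinsicValue : public PayoffVisitor
{
public:
    explicit IntrinsicValue(double s) : spot(s), value(0.0) {}
    void visit(const CallPayoff& p) { value = std::max(spot - p.strike, 0.0); }
    void visit(const PutPayoff& p) { value = std::max(p.strike - spot, 0.0); }
    double spot;
    double value;
};

// Convenience wrapper: run the visitor over a payoff and return its result.
double intrinsicValue(const Payoff& p, double spot)
{
    assert(spot >= 0.0);
    IntrinsicValue v(spot);
    p.accept(v);
    return v.value;
}
```

Adding another operation (a report writer, say) would mean writing one more class derived from `PayoffVisitor`, which is the non-intrusive extension the pattern is meant to provide.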

One important remark: we combine the OO way with generic programming. In general we have used a number of template classes in the framework and the class hierarchies are at most two to three levels deep. A combination of the inheritance and delegation mechanisms seems like a good choice.

In the next blog we shall discuss some more detailed design issues, for example using Property Sets for modeling payoffs.

Any comments?

Daniel

March 22, 2007

Daniel: Designing Monte Carlo Frameworks, Part 1 - Generating Sample Paths

This is the first in a series of three blogs in which we discuss the application of system and design patterns to the development of customizable C++ code for Monte Carlo applications. We focus on the issue of sample path generation (as discussed in the book by Paul Glasserman, for example) and to this end, we have designed a system that approximates the solution of n-factor stochastic differential equations (SDEs) using numerical methods, for example Finite Difference Methods (FDM).  The system consists of a set of loosely coupled components. Each component has one major responsibility and it offers services to other components in the form of standardized and unambiguous interface functions as shown in the figure that is documented as a component diagram in UML 2.0, the de facto standard in object-oriented and component analysis. In this case we use this ball-and-socket notation to externalize and specify the contracts between the different components in the system:

. SDE: models n-factor stochastic differential equations. We support GBM and jump-diffusion models based on Lévy processes

. RNG: the random number component that generates pseudo-random numbers. It implements the Mersenne Twister generator, the Box-Muller method for normal variates and other generators

. FDM Solver: this is the component that implements a range of finite difference solvers such as Euler, Milstein, Predictor Corrector and others. It requires the services of a mesher that is responsible for the generation of suitable mesh data.

. Solver: this mediator integrates the other components. It produces multi-dimensional data for use by other systems, for example when we need to model payoffs in n-factor options models.

[Figure: UML 2.0 component diagram of the framework]

Having defined the components and their interfaces we can now concentrate on designing them. To this end, we use a combination of object-oriented and generic programming styles and in our opinion this combination is optimal in terms of adaptability, efficiency and functionality. This means that developers can define their own models and integrate them into the framework with a minimum of programmer effort. In other words, the infrastructure is in place and all that needs to be done is to insert your own specific code and C++ classes. For example, it is possible to model a wide range of problems such as baskets, stochastic volatility and fixed income applications.
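
As a rough sketch of how the SDE, RNG and FDM solver components described above might fit together, the following code advances a one-factor GBM with the explicit Euler scheme. The names `GbmSde` and `eulerTerminalValue` are hypothetical, the generator is taken from the C++11 `<random>` header, and the design (a template function parameterised on the SDE and generator types) is only one possible reading of the combination of object-oriented and generic styles.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>

// SDE component: dS = drift(S) dt + diffusion(S) dW, here geometric
// Brownian motion with constant coefficients.
struct GbmSde
{
    double r;      // drift rate
    double sigma;  // volatility
    double drift(double s) const { return r * s; }
    double diffusion(double s) const { return sigma * s; }
};

// FDM solver component: explicit Euler scheme on a uniform mesh. It is a
// template, so any SDE type offering drift() and diffusion(), and any
// standard random engine, can be plugged in.
template <typename Sde, typename Rng>
double eulerTerminalValue(const Sde& sde, Rng& rng,
                          double s0, double T, std::size_t steps)
{
    assert(steps > 0 && T > 0.0);
    const double dt = T / static_cast<double>(steps);
    const double sqrtDt = std::sqrt(dt);
    std::normal_distribution<double> normal(0.0, 1.0);

    double s = s0;
    for (std::size_t n = 0; n < steps; ++n)
    {
        const double dW = sqrtDt * normal(rng);  // Wiener increment
        s += sde.drift(s) * dt + sde.diffusion(s) * dW;
    }
    return s;  // simulated asset price at expiry
}
```

A mediator in the sense of the Solver component would call such a function repeatedly, for example `eulerTerminalValue(sde, rng, 100.0, 1.0, 100)` with `rng` a `std::mt19937`, and pass the terminal values on to a payoff component.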

Finally, one of the advantages of having loosely-coupled components is that the software can be ported to a parallel environment; each component is (potentially) a parallel thread. Then we have the possibility to employ parallel design patterns such as Master-Slave, for example.

March 18, 2007

Daniel's Vlog: Creating Application Frameworks Using C++

Daniel discusses issues involved in creating application frameworks using C++ for Monte Carlo applications.


Email Daniel or leave a comment if you have any questions/feedback.

February 24, 2007

Daniel Duffy: Using C++ and C# for Application Development

C# and C++ are descendants of the C programming language.  It is worthwhile to consider whether it is ‘better’ (in some sense) to develop new applications in C# or C++. We discuss the problem from three perspectives:

    P1: The skills and knowledge of those engineers developing QF applications
    P2: The type of application and related application requirements
    P3: The technical, organizational risks involved when we choose a given language

First, C++ is a multi-paradigm language and it supports the modular, object-oriented and generic programming models. C# is a relatively new language and it supports the object-oriented and generic programming models, but not the modular programming model.
In general, C# is much easier to learn than C++. It shields the developer from much of the low-level detail that we see in C++, in particular the dreaded pointer mechanism and manual memory management. The C# developer does not have to worry about these details because they are automatically taken care of by the garbage collector. C++ is a vendor-neutral language (it is an ISO standard) while C# was originally developed by Microsoft for its Windows operating system.

We now discuss perspective P2. This perspective is concerned with the range of applications that C++ or C# can be applied to, how appropriate they are to these applications and how customer wants and needs determine which language will be most suitable in a particular context. In general, customers wish to have applications that perform well, are easy to use and easy to extend.

We now compare the two languages from the perspective of developer productivity. To do so we need to define what we are measuring. C# comes with many libraries that the developer can use, thus enhancing productivity. The C++ standard library, on the other hand, is much smaller, and additional libraries must be obtained from third parties.

Finally, perspective P3 is concerned with the consequences and risks to the organization after a choice has been made for a particular language. C++ is a large and difficult language: it takes some time to master, and C++ applications tend to be complex and difficult to maintain. But a careful design can mitigate the risks.

I'd be happy to answer any of your questions - feel free to leave a comment!

Daniel

February 19, 2007

Daniel Duffy's First Vlog Post

Watch Daniel highlight three key issues he would like to discuss over the coming weeks and invite your participation:

Applications, Design and using C++/C#

View video:

       

Email Daniel or leave a comment if you have any questions/feedback.

February 16, 2007

Daniel Duffy: Learning C++, Part II

 In my previous blog I discussed a number of fundamental techniques that should be mastered before one can make use of C++ to build an application.
In this blog I would like to introduce a number of advanced C++ syntax constructs that promote developer productivity and the reliability of code:

1. Template class and template functions: C++ supports the generic programming metaphor. This is the realisation of the Abstract Data Type (ADT) in computer science. An ADT is essentially a data structure and operations acting on that structure. But the type of the data is unspecified or generic. C++ supports a wide range of ADTs in its Standard Template Library (STL). Examples are lists, vectors, maps and sets.

The advantages of templates are:

. Once you have written and tested a template class it can be reused in many contexts by replacing the generic type by a specific type (this is called template instantiation)
. Templates are resolved at compile time, so they incur no run-time dispatch overhead and are fast
. They complement the object-oriented programming paradigm. You can inherit a template class from another template class, for example
. In some cases, using the ubiquitous inheritance mechanism is just not the correct solution. In particular, deriving all classes from the ‘cosmological’ object or Object leads to performance and reliability issues. Using templates in this way avoids dynamic casting
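
The points above can be illustrated with a small sketch. The class names `Range` and `NamedRange` are hypothetical; the example shows one template class instantiated for a specific type, and a second template class inheriting from the first.

```cpp
#include <cassert>
#include <string>

// Generic closed interval [low, high]. T can be any type that supports
// operator<; the actual type is supplied when the template is instantiated.
template <typename T>
class Range
{
public:
    Range(const T& low, const T& high) : lo(low), hi(high)
    {
        assert(!(hi < lo));  // the interval must not be empty
    }
    bool contains(const T& x) const { return !(x < lo) && !(hi < x); }
    const T& low() const { return lo; }
    const T& high() const { return hi; }
private:
    T lo;
    T hi;
};

// A template class derived from another template class: templates and
// inheritance complement each other.
template <typename T>
class NamedRange : public Range<T>
{
public:
    NamedRange(const std::string& name, const T& low, const T& high)
        : Range<T>(low, high), label(name) {}
    const std::string& name() const { return label; }
private:
    std::string label;
};
```

Writing `Range<double> r(0.0, 1.0);` or `NamedRange<int> strikes("strikes", 90, 110);` instantiates the same code for different types, with all type checking done by the compiler and no dynamic casting.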

2. Exception handling: C++ supports try/catch/throw, and I advise using this technique rather than the errno variable or even assert(). In general, exception handling is useful for catching and handling logic errors in your code. For example, here is a piece of code for printing arrays in Excel. The called function checks whether the input arrays from a finite difference method are aligned:

try
{
    // Print option price
    printMatrixInExcel(fdm.result(), fdm.TValues(), fdm.XValues(), string("BSEulerE"));
}
catch (DatasimException& e)
{
    // If arrays are not compatible catch the exception here
    e.print();
    return 0;
}

In this case the function must be called with the arguments in the correct order, otherwise a run-time error will result because of array misalignment. Having an exception handler allows us to pinpoint the logic error.
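
What such an alignment check might look like on the called side is sketched below. This is an assumption made for illustration: the function names are hypothetical, a nested `std::vector` stands in for the matrix type, and the standard `std::invalid_argument` is used in place of the author's own exception class, whose interface is not shown here.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

typedef std::vector<std::vector<double> > Matrix;  // stand-in matrix type

// Throw if the matrix does not have one row per time value and one
// column per space value.
void checkAligned(const Matrix& result,
                  const std::vector<double>& tValues,
                  const std::vector<double>& xValues)
{
    if (result.size() != tValues.size())
        throw std::invalid_argument("row count differs from number of time values");
    for (std::size_t i = 0; i < result.size(); ++i)
        if (result[i].size() != xValues.size())
            throw std::invalid_argument("column count differs from number of space values");
}

// Client side: catch the exception and recover instead of crashing.
bool isAligned(const Matrix& result,
               const std::vector<double>& tValues,
               const std::vector<double>& xValues)
{
    try
    {
        checkAligned(result, tValues, xValues);
        return true;
    }
    catch (const std::invalid_argument&)
    {
        return false;  // arrays are not compatible
    }
}
```

Passing the time and space arrays in the wrong order makes `checkAligned` throw, and the message carried by the exception says which dimension is at fault.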

3. STL has a number of generic algorithms for sorting, searching and modifying STL containers. Please use these rather than creating your own.
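
A few of these algorithms in action, as a minimal sketch. The wrapper function names are hypothetical; `std::sort`, `std::binary_search`, `std::transform` and `std::accumulate` are the standard algorithms, and the lambda requires C++11.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Sorting: return an ascending copy, leaving the caller's data untouched.
std::vector<double> sortedCopy(std::vector<double> v)
{
    std::sort(v.begin(), v.end());
    return v;
}

// Searching: binary search requires the range to be sorted already.
bool containsSorted(const std::vector<double>& sorted, double x)
{
    return std::binary_search(sorted.begin(), sorted.end(), x);
}

// Modifying: apply a function to every element and store the results.
std::vector<double> scaled(const std::vector<double>& v, double factor)
{
    std::vector<double> out(v.size());
    std::transform(v.begin(), v.end(), out.begin(),
                   [factor](double x) { return factor * x; });
    return out;
}

// Numeric algorithm: sum of all elements.
double total(const std::vector<double>& v)
{
    return std::accumulate(v.begin(), v.end(), 0.0);
}
```

Each wrapper is a single call to a tested library routine, which is the main argument for not writing such loops by hand.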

4. The last technique is more of a tip. The misuse of the inheritance mechanism is harmful in my opinion. Many C++ applications become difficult to maintain for the following reasons:

. Using deep C++ class hierarchies (deriving a class from a class from class …)
. Using multiple inheritance, which I consider very harmful
. Creating classes with many member functions

These problems can be avoided by first designing the application and using the appropriate design patterns. 

Some C++ applications start small and in the early phases functionality, performance and accuracy are important factors for success. When the application is accepted it needs to be extended, at which time maintenance and ease of extension become important, especially when time-to-market forces come into play.

This will be the subject of the next blog when we discuss application development and design.

Anyone have any questions/comments? Feel free to reply!

Daniel


February 12, 2007

Daniel Duffy: Learning C++, Part I

Get it working

A well-kept secret concerning the taming of the C++ lion is that you must learn the fundamentals before you start using the more advanced features.
Here is my own personal list that I have used with (very many) students since 1991. It corresponds to green belt knowledge:

1.  Basic class structure; private and public members; header and code files
2.  The vital keyword ‘const’ and the 4 places where it is used
3.  Function and operator overloading; creating your own operator
4.  Memory management; STACK and HEAP
5.  Basic templates; STL, list<T> container and simple iterators
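
Several of these fundamentals fit into one small sketch. Note the assumptions: the list above does not enumerate the four places where `const` is used, so the four marked below are simply common ones, and the `Point` class and the function `sumOfX` are hypothetical.

```cpp
class Point
{
public:
    Point(double x, double y) : m_x(x), m_y(y) {}

    double x() const { return m_x; }  // (1) const member function
    double y() const { return m_y; }

    // Overloaded operator; (2) the argument is a const reference.
    Point operator+(const Point& rhs) const
    {
        return Point(m_x + rhs.m_x, m_y + rhs.m_y);
    }

private:  // member data is private; access goes through the public interface
    double m_x;
    double m_y;
};

double sumOfX()
{
    const Point onStack(1.0, 2.0);              // (3) const object, on the STACK
    const Point* onHeap = new Point(3.0, 4.0);  // (4) pointer to const, on the HEAP

    const double result = (onStack + *onHeap).x();

    delete onHeap;  // heap memory must be released explicitly
    return result;  // the stack object is cleaned up automatically
}
```

Forgetting the `delete` would leak memory, whereas the stack object needs no such care; this asymmetry is a large part of why point 4 deserves attention.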

I have found that once these issues are mastered (ESPECIALLY point 4) then you will have few major problems later.

C++ is a large and flexible language. By neglecting the fundamentals we run the risk of floundering when embarking on real projects. As in judo, learn how to break fall before attempting a hip throw!

Daniel

February 05, 2007

Daniel Duffy: Monte Carlo Frameworks in C++: From Problem Domain to Working Code

In this blog I would like to give a bird’s-eye overview of some collaborative work with Dr. Jörg Kienitz of the German Postbank AG.

The main goal is to design, develop and deploy a customizable software system that can be adapted to suit different types of derivatives products and that meets certain performance criteria. In particular, we wish to calculate the price of a number of equity and interest rate products using the Monte Carlo (MC) method. The important thing to remember about the MC method is that it is robust, converges to an accurate solution in most cases and it can be applied to a range of problems that other methods are not able to solve. For this reason alone we are attempting to develop flexible frameworks using modern system and design patterns.

The language of choice is C++ and we have chosen it for a number of reasons, some of which are based on the simple fact that we know the language well, we like it and it is interoperable with many software libraries. It is also an industry standard. Finally, it supports the object-oriented, generic and modular programming techniques and we use all three metaphors to help us create software that is as flexible and malleable as possible.

Some of the major challenges (nice ones!) in this project are:

-Such a framework has not been attempted before to the best of our knowledge, in the sense that we cannot see any results in published literature

-The project demands a number of skills such as finance, stochastic and MC theory, numerical analysis, design techniques, C++ and project management skills (the last in the sense that the project must be delivered 31 June 2007)

-The number of robust finite difference schemes that are able to approximate the solution of the Stochastic Differential Equations (SDE) that describe the behaviour of the underlying assets seems to be restricted to the Euler and Milstein methods and for this reason we are developing and applying other schemes that are applicable to a range of SDE. Much research needs to be done in this area

-Jörg and myself are located in different countries (Germany, Netherlands) which makes communication more difficult than if we were sitting in the same office. For this reason, we needed to develop a vocabulary which would allow us to impart ideas at a higher level than just ‘raw’ C++ code

-When the application is in the ‘get it working’ phase we wish to port it to MPI and OpenMP environments

In later blogs both JK and myself will discuss a number of related issues in more detail. For the moment, I would like to conclude the blog with a number of do’s and don’ts that we learned during this project (we are still learning):

Do’s
-Partition the problem into loosely coupled subsystems/components with well-defined interfaces
-Spend 30% of your time on design, 35% on programming. The rationale is that once you know what must be done then it is easy to do
-Adopt a multi-disciplinary and multi-paradigm approach
-Spiral, risk-driven, incremental project management model
-Make sure you learn C++ well to get optimal results

Don’ts
-Start the project by creating deep C++ class hierarchies; using inheritance is an optimization step in a sense. The hierarchy will come once you get the basic classes up and running; at all costs, avoid spaghetti code
-Try to make everything into an object. Not everything needs to be a class. And such a step may be semantically wrong. Nice class for the wrong problem
-Generalize before you have some specialisation (this means that we work with concrete examples at each stage of the project; it keeps us on the rails)

My next blog will be a video presentation ‘Daniel in the C++ Lion’s Den: Do’s and Don’ts when learning C++’. I dedicate it to my co-blogger Dr. Egor Kraev.

Daniel Duffy

January 29, 2007

Daniel Duffy: "Welcome to C(omp)++"

I was pleased to be asked by the team at Wiley to initiate discussions on issues that are of interest to those professionals who create, design and deploy derivatives models in Quantitative Finance (QF).  During my term as moderator, I will concentrate on a number of practical issues:

  • Numerical Methods for derivatives pricing
  • Developing robust algorithms and implementing them in modern object-oriented programming languages
  • Learning C++, design patterns (C++ is a standard in Quantitative Finance)
  • Creating multi-threaded and high-performance QF applications

As Christopher Merrill of the University of Chicago (Financial Mathematics – see his community profile) has so clearly expressed: the finance literature is heavy on mathematics and light on implementation and for this reason we hope that this site will fill this gap to some extent. In particular, I want to introduce a number of special topics to the community, which I hope will generate good discussion and debate.  Some of these are:

  • Monte Carlo Methods and C++ frameworks for multi-factor models
  • PDE methods such as Finite Differences (FDM) and Finite Elements (FEM)
  • Fundamental numerical algorithms and C++ libraries
  • High-performance and multithreaded programming for Monte Carlo and PDE models

Our objective is to be able to model derivatives from the mathematical/financial model through numerical methods, algorithms, C++/C# code and performance tuning. I think the above topics cover the objectives well and it is my hope that colleagues, friends and co-bloggers will share their expertise and real-life experiences with you. We intend to use a variety of formats and techniques to do so, such as videos, podcasts, Q&A sessions and more. 

For more detailed support and continued feedback via forums, download areas for code and training you are welcome to visit my site.

With your participation, I’m confident that C(omp)++ will be a great portal for our collective knowledge in QF.

Daniel J. Duffy
Datasim
