Guiding principles for back-testing

When back-testing, or for that matter performing any statistical analysis, it is important to be consistent.

Here are some guiding principles I use:

Always use SAMOA

The SAMOA platform has been in use for many years. It has been (and continues to be) put through rigorous tests that verify that every component produces the expected results (you can read more here). Furthermore, it unifies the research process, since it has hundreds of procedures (performance reports, statistical routines, etc.) that were programmed using cutting-edge technologies and state-of-the-art mathematical methods. In addition, the data (historical prices, curves, etc.) are always checked and rechecked against several sources and alternative calculation methods. Thus, using SAMOA avoids reinventing the wheel and reduces the risk of programming and data mistakes.

Avoid look-ahead bias (i.e. always out-of-sample)

Every signal or trade generated can ONLY be based on historical information (it cannot use data from the future). In other words, always use lagged historical data: a signal computed from the close of day t can be traded on day t+1 at the earliest. In-sample analyses are acceptable only as a way to get an initial idea of the results; the out-of-sample run is a must.
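To make the lagging rule concrete, here is a minimal sketch in plain Python. The toy momentum signal, the price series and all function names are my own illustrative assumptions, not SAMOA procedures. It shows how applying a signal to the same bar it was computed from inflates the results, while shifting it by one bar gives the honest figure.

```python
def momentum_signal(prices):
    """+1 if the close is above the previous close, else -1 (0 on the first bar)."""
    return [0] + [1 if prices[t] > prices[t - 1] else -1
                  for t in range(1, len(prices))]


def bar_returns(prices):
    """Simple close-to-close returns, with 0.0 for the first bar."""
    return [0.0] + [prices[t] / prices[t - 1] - 1.0
                    for t in range(1, len(prices))]


def lagged(signals):
    """Shift signals by one bar so the position on bar t only uses data up to t-1."""
    return [0] + list(signals[:-1])


prices = [100.0, 101.0, 100.0, 102.0, 103.0]  # made-up closes
signals = momentum_signal(prices)
returns = bar_returns(prices)

# WRONG: the signal for bar t already contains the return of bar t.
look_ahead_pnl = [s * r for s, r in zip(signals, returns)]

# RIGHT: trade bar t with the signal known at the end of bar t-1.
positions = lagged(signals)
honest_pnl = [p * r for p, r in zip(positions, returns)]

print("look-ahead:", sum(look_ahead_pnl))
print("lagged:    ", sum(honest_pnl))
```

In this toy series the look-ahead version never has a losing bar, which is exactly the kind of result that should make you suspicious.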

Avoid data-snooping (i.e. don’t overfit)

If you overfit a back-testing dataset, you run the risk that the measured performance is inflated relative to the future performance of the strategy, because the parameters have been calibrated on transient noise in the historical data. There are many ways to avoid overfitting (training/testing subsets, rolling re-optimization of the parameters, etc.), and all of them are recommended. But in my opinion, the most important criterion is: “don’t work on a strategy for too long, and always follow your economic intuition”.
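One way to set up training/testing subsets with a moving optimization window is a walk-forward split. The sketch below is only an illustration: the function name and the window sizes are assumptions. It generates index ranges in which each test window lies strictly after its training window, so parameters are never evaluated on the data they were fitted to.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train, test) index ranges; each test window follows its train window."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll forward by one test window


splits = list(walk_forward_splits(n_obs=10, train_size=4, test_size=2))
for train, test in splits:
    # Here you would calibrate parameters on `train` and evaluate them on `test`.
    print(list(train), "->", list(test))
```

Stitching the test windows together gives a single out-of-sample track record for the whole period.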

Consider transaction costs

Always consider transaction costs, especially for higher-frequency strategies. There are various types of costs that should be considered:

  1. Commissions
  2. Liquidity cost
  3. Opportunity cost
  4. Market impact
  5. Slippage
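As a rough sketch of how such costs enter a back-test, the example below lumps all of the items above into a single proportional charge per unit of turnover, quoted in basis points. That simplification, the function name and the sample numbers are assumptions for illustration; a realistic model would treat each cost type separately.

```python
def net_returns(gross_returns, positions, cost_bps):
    """Subtract a proportional cost each time the position changes.

    gross_returns[t] is the return earned while holding positions[t];
    cost_bps is an assumed all-in cost per unit of turnover, in basis points.
    """
    cost_rate = cost_bps / 10_000.0
    net = []
    previous = 0.0  # start flat
    for gross, position in zip(gross_returns, positions):
        turnover = abs(position - previous)
        net.append(gross - turnover * cost_rate)
        previous = position
    return net


gross = [0.0, 0.004, -0.002, 0.003]  # made-up per-bar strategy returns
positions = [1, 1, -1, -1]           # long, long, flip to short, short
net = net_returns(gross, positions, cost_bps=5)
print(net)
```

Note that flipping from long to short counts as two units of turnover, which is why frequent reversals are so expensive.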

Perform Sensitivity Analysis

Once you have optimized the parameters, vary each of them by a small amount and make sure that the performance doesn’t change drastically.
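A simple way to do this is to bump one parameter at a time and record how much the performance measure moves. In the sketch below, `toy_performance` is a made-up stand-in for a real back-test, and the parameter names and step sizes are assumptions.

```python
def sensitivity(performance, base_params, rel_steps=(-0.10, -0.05, 0.05, 0.10)):
    """Bump each parameter in turn and record the change in performance."""
    base = performance(**base_params)
    report = {}
    for name, value in base_params.items():
        for step in rel_steps:
            # Note: integer parameters (e.g. a lookback) become floats here;
            # a real back-test would round them.
            bumped = dict(base_params, **{name: value * (1 + step)})
            report[(name, step)] = performance(**bumped) - base
    return base, report


def toy_performance(lookback, threshold):
    """Hypothetical smooth performance surface peaking at (20, 0.5)."""
    return 1.0 - 0.001 * (lookback - 20) ** 2 - 2.0 * (threshold - 0.5) ** 2


base, report = sensitivity(toy_performance, {"lookback": 20, "threshold": 0.5})
worst = min(report.values())
print("base:", base, "worst change:", worst)
```

If a 5% or 10% bump in a single parameter wipes out most of the performance, the optimum is probably a spike fitted to noise rather than a robust region.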

Use tick data whenever possible

Closing prices are not always tradable levels. It is always preferable to use prices from actual trades, even if your strategy trades at a daily frequency. If the data source contains bids and asks, make sure that you always buy at the ask and sell at the bid. If only daily data is available, use the High and the Low to evaluate stops and profit-takings under a worst-case scenario.

Back-test at least 10 years’ worth of data
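Both rules are easy to encode. The sketch below uses hypothetical function names and made-up prices. It fills buys at the ask and sells at the bid, and for a long position on a daily bar it assumes the stop was hit first whenever the bar’s range contains both the stop and the profit target. It ignores gaps through the stop, which would make the fill even worse.

```python
def fill_price(side, bid, ask):
    """Cross the spread: buy at the ask, sell at the bid."""
    if side == "buy":
        return ask
    if side == "sell":
        return bid
    raise ValueError("side must be 'buy' or 'sell'")


def long_exit_on_daily_bar(high, low, stop, target):
    """Worst-case exit for a long position using only the daily High and Low.

    If both levels lie inside the bar's range we cannot know which traded
    first, so we assume the stop did. Returns None if neither was touched.
    """
    if low <= stop:
        return stop
    if high >= target:
        return target
    return None


print(fill_price("buy", bid=99.5, ask=100.5))
print(long_exit_on_daily_bar(high=105.0, low=94.0, stop=95.0, target=104.0))
```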

Usually a decade will include a few recessions, crises of different types, seasonal changes, etc. It’s important to analyze how your strategy works in each regime.

Check your input data from at least 2 sources
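A minimal sketch of such a cross-check, assuming each source has been loaded into a dictionary of date to price (the vendor names, the prices and the tolerance are made up for illustration):

```python
import math


def find_discrepancies(source_a, source_b, rel_tol=1e-4):
    """List dates where the two sources disagree or one of them has no value."""
    issues = []
    for date in sorted(set(source_a) | set(source_b)):
        a = source_a.get(date)
        b = source_b.get(date)
        if a is None or b is None:
            issues.append((date, a, b, "missing"))
        elif not math.isclose(a, b, rel_tol=rel_tol):
            issues.append((date, a, b, "mismatch"))
    return issues


vendor_a = {"2020-01-02": 100.00, "2020-01-03": 101.50, "2020-01-06": 99.80}
vendor_b = {"2020-01-02": 100.00, "2020-01-03": 101.05}
issues = find_discrepancies(vendor_a, vendor_b)
for issue in issues:
    print(issue)
```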

Whenever possible verify your results using another calculation method

Whenever possible have a colleague replicate your results
