


The first issue of Advanced Trading Strategies, our monthly newsletter, has finally arrived in your mailbox. Instructions on how to subscribe or unsubscribe are in the last section. Datashaping has grown significantly over the last few months. We are now serving several thousand unique visitors every month and have achieved top rankings in the search engines. It is our aim to continue to grow steadily and become a favorite destination on the internet for sophisticated investors and financial analysts interested in statistical trading strategies. Some of the most important requirements of statistical algorithms for stock traders are summarized in our White Paper, available online. An HTML version of the newsletter will be published.
Pitfalls in Optimizing Statistical Trading Strategies. Part I: Overparametrization.

One of the common mistakes in optimizing statistical trading strategies consists of overparametrizing the problem and then computing a global optimum. It is well known that this technique provides extremely high returns on historical data but does not work in practice. We shall investigate this problem, see how it can be sidestepped, and explain how to build a very efficient 6-parameter strategy. This issue is relevant to many real-life statistical and mathematical situations. The problem itself is referred to as overparametrization or overfitting.

Why this approach fails can be illustrated by a simple example. Imagine you fit data with a 30-parameter model. If you have 30 data points (that is, the number of parameters is equal to the number of observations), then you can have a perfect, fully optimized fit with your data set. However, any future data point (e.g. tomorrow's stock prices) might fit the model very badly, resulting in huge losses. Why? We have the same number of parameters as data points, so on average each estimated parameter of the model is worth no more than one data point. From a statistical viewpoint, you are in the same situation as if you were estimating the median US salary by interviewing only one person. Chances are your estimate will be very poor, even though the fit with your one-person sample is perfect. In fact, you run a 50% chance that the salary of the interviewee will be either very low or very high. Roughly speaking, this is what happens when overparametrizing a model.

You obviously gain by reducing the number of parameters. However, if handled correctly, the drawback can actually be turned into an advantage. You can build a model with many parameters that is more robust and more efficient (in terms of return rate) than a simplistic model with fewer parameters. How is it possible?
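To make the effect concrete, here is a minimal sketch of the same experiment, scaled down to 10 simulated points and a 10-parameter (degree-9) polynomial so the arithmetic stays well conditioned; all data are made up for illustration:

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(42)

# Simulated data: a noisy linear trend, 10 observations.
x = np.linspace(0, 1, 10)
y = 2.0 * x + rng.normal(scale=0.2, size=10)

# 10 parameters for 10 points: the degree-9 polynomial fits exactly.
overfit = Polynomial.fit(x, y, deg=9)
# A simple 2-parameter straight line, for comparison.
simple = Polynomial.fit(x, y, deg=1)

in_sample_err = np.abs(overfit(x) - y).max()  # essentially zero

# "Tomorrow's" data point, just outside the fitted range.
x_new, y_new = 1.05, 2.0 * 1.05
print(abs(overfit(x_new) - y_new))  # the overfit model extrapolates wildly
print(abs(simple(x_new) - y_new))   # the 2-parameter model stays close
```

The overparametrized model wins in-sample by construction, yet the very first out-of-sample point exposes it.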
The answer lies in the way you test the strategy. When you use a model with more than three parameters, the strategy that provides the highest return on historical data will not be the best. You need more sophisticated optimization criteria. One solution is to add boundaries to the problem, thus performing constrained optimization. Look for strategies that meet one fundamental constraint: reliability. That is, you want to eliminate all strategies that are too sensitive to small variations. Thus, you focus on the tiny part of the parameter space that shows robustness against all kinds of noise. Noise, in this case, can be trading errors, spread, or small variations in the historical stock prices or in the parameter set.

From a practical viewpoint, the solution consists in trying millions of strategies that work well under many different market conditions. It usually requires several months' worth of data to cover various market patterns and achieve some statistical significance. Then, for each of these strategies, you must introduce noise in millions of different ways and look at the impact. You then discard all strategies that can be badly impacted by noise and retain the tiny fraction that are robust. The computational problem is complex, since it is in fact equivalent to testing millions of millions of strategies. But it is worth the effort. The end result is a reliable strategy that can be adjusted over time by slightly varying the parameters. Data Shaping's strategies are actually designed this way. They are associated with 6 parameters:
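The screening procedure above can be sketched as follows. The actual 6 parameters are not listed here, so the sketch uses a hypothetical 2-parameter threshold strategy on simulated prices: it replays history many times with noise added to both the prices and the parameters, and keeps only candidates whose worst noisy outcome is still acceptable:

```python
import numpy as np

rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, 250))  # simulated price history

def backtest(series, buy_thresh, sell_thresh):
    """Toy 2-parameter strategy: buy below buy_thresh, sell above
    sell_thresh. Returns final wealth starting from 1.0 in cash."""
    position, cash = 0.0, 1.0
    for p in series:
        if position == 0.0 and p < buy_thresh:
            position, cash = cash / p, 0.0
        elif position > 0.0 and p > sell_thresh:
            cash, position = position * p, 0.0
    return cash + position * series[-1]

def worst_noisy_return(params, n_trials=50, noise=0.5):
    """Replay history n_trials times with noise added to BOTH the
    prices and the parameters; report the worst outcome."""
    results = []
    for _ in range(n_trials):
        noisy_prices = prices + rng.normal(0, noise, prices.size)
        noisy_params = [q + rng.normal(0, noise) for q in params]
        results.append(backtest(noisy_prices, *noisy_params))
    return min(results)

# Screen candidates: keep only those whose WORST noisy outcome
# still ends above the starting wealth.
candidates = [(b, s) for b in range(96, 104, 2) for s in range(102, 110, 2)]
robust = [c for c in candidates if worst_noisy_return(c) > 1.0]
```

In a real setting the candidate grid and the number of noisy replays would both be in the millions, which is what makes the computational problem heavy.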
It would have been possible to reduce the dimensionality of the problem by imposing symmetry on the parameters (e.g. identical parameters for the buy and sell prices). Instead, our approach combines the advantage of low dimensionality (reliability) with returns appreciably higher than you would normally expect when being conservative.

A final note of advice. When you backtest a trading system, optimize the strategy using historical data that are more than one month old. Then check whether the real-life return obtained during the last month (outside the historical data time window) is satisfactory. If your system passes this test, optimize the strategy using the most recent data, and use it. Otherwise, do not use your trading system in real life. More on backtesting in the next issue.
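This backtesting discipline can be sketched in a few lines, again with a hypothetical toy strategy and simulated prices; treating one month as 21 trading days is an assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
prices = 100 + np.cumsum(rng.normal(0, 1, 300))  # simulated daily closes

def backtest(series, buy_thresh, sell_thresh):
    """Same toy threshold strategy; returns final wealth from 1.0 cash."""
    position, cash = 0.0, 1.0
    for p in series:
        if position == 0.0 and p < buy_thresh:
            position, cash = cash / p, 0.0
        elif position > 0.0 and p > sell_thresh:
            cash, position = position * p, 0.0
    return cash + position * series[-1]

def walk_forward_check(series, candidates, holdout=21, min_return=1.0):
    """Optimize on data more than one month old (all but the last
    `holdout` days), then validate on the held-out last month."""
    train, test = series[:-holdout], series[-holdout:]
    best = max(candidates, key=lambda c: backtest(train, *c))
    return backtest(test, *best) >= min_return, best

candidates = [(b, s) for b in range(95, 105, 2) for s in range(101, 111, 2)]
passed, best = walk_forward_check(prices, candidates)
if passed:
    # Passed out-of-sample: re-optimize on ALL the data, then trade it.
    final_params = max(candidates, key=lambda c: backtest(prices, *c))
else:
    final_params = None  # do not trade this system in real life
```

The point of the holdout month is that the optimizer never sees it, so a strategy that only memorized the training window fails the check.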
The Editor does not necessarily endorse the material being advertised. To place an ad, email us. The rate is $20 for one month, or $15 per month for six months. Book now while the fee is low. Membership is growing at a steady pace thanks to cost-effective advertising. We currently have 400 subscribers. The fee may be eliminated through reciprocal advertising.
To appear in the next issues:
To subscribe or unsubscribe, go to our web site and fill out the online form. You can also use this form to refer friends, business partners, colleagues or clients. Contact:
Vincent Granville, Ph.D., Editor 


