Data Mining, Quant, Statistics, Computer Science: Jobs, Resumes, Directory

These are difficult mathematical questions arising from real applications such as fraud detection, arbitrage, and scoring systems. If you have an interesting answer to any question, feel free to email us your comments or solution. The best answers will be published here. Companies and organizations interested in submitting problems should email us.

Iterative Algorithm for Linear Regression

I am trying to solve the regression Y=AX where Y is the response, X the input, and A the regression coefficients. I came up with the following iterative algorithm:

A_{k+1} = cYU + A_k (I - cXU), where:

c is an arbitrary constant
U is an arbitrary matrix such that YU has same dimension as A. For instance U = transposed(X) works.
A₀ is the initial estimate for A. For instance A₀ is the correlation vector between the independent variables and the response.
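The recursion above can be sketched in a few lines of NumPy. This is a minimal illustration, not the poster's own code: it assumes A is a (1,n) row vector, X is (n,m) and Y is (1,m), takes U = transposed(X), and uses synthetic noiseless data. The function name `iterate`, the step size c = 1/(largest eigenvalue of XU) and the all-ones starting point are choices made only for this example.

```python
import numpy as np

def iterate(Y, X, A0, c, U=None, n_iter=100):
    """Run A_{k+1} = c*Y@U + A_k @ (I - c*X@U) for n_iter steps."""
    if U is None:
        U = X.T                      # the choice suggested in the post
    n = X.shape[0]
    M = np.eye(n) - c * (X @ U)      # (n, n) iteration matrix
    b = c * (Y @ U)                  # (1, n) constant term
    A = A0.copy()
    for _ in range(n_iter):
        A = b + A @ M
    return A

# Synthetic, noiseless data (an assumption made for this example only).
rng = np.random.default_rng(0)
n, m = 3, 50
X = rng.normal(size=(n, m))
A_true = np.array([[1.0, 2.0, 3.0]])
Y = A_true @ X

A0 = np.ones((1, n))                     # illustrative starting point
c = 1.0 / np.linalg.norm(X @ X.T, 2)     # 1 / largest eigenvalue of X X^T
A_hat = iterate(Y, X, A0, c, n_iter=2000)
print(A_hat)
```

With this step size and well-conditioned data the iterates should approach the coefficients used to generate Y; other values of c or U may behave quite differently.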
Questions:

What are the conditions for convergence? Do I have convergence if and only if the largest eigenvalue (in absolute value) of the matrix I-cXU is strictly less than 1?
In case of convergence, will it converge to the solution of the regression problem? For instance, if c=0, the algorithm converges, but not to the solution. In that case, it converges to A₀.
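One way to reason about both questions, offered as a sketch rather than a definitive answer: suppose the iteration has a fixed point, written A^* here (a symbol introduced for this note, not used in the post). Substituting it into the recursion and subtracting gives:

```latex
\begin{aligned}
A^{*} &= cYU + A^{*}(I - cXU)
  \;\Longrightarrow\; c\,(Y - A^{*}X)\,U = 0, \\
A_{k+1} - A^{*} &= (A_k - A^{*})(I - cXU)
  \;\Longrightarrow\; A_k - A^{*} = (A_0 - A^{*})(I - cXU)^{k}.
\end{aligned}
```

For c ≠ 0 and U = transposed(X), the first line reads A^* X transposed(X) = Y transposed(X), which is the set of normal equations, so any limit is a least-squares solution. The second line shows that the error vanishes for every A₀ exactly when the spectral radius of I - cXU is strictly less than 1. For a particular A₀, convergence can still occur without that condition if the initial error has no component along the offending eigen-directions. The case c = 0 fits this picture: the iteration matrix is the identity, every A is a fixed point, and the sequence stays at A₀.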
Parameters:

n: number of independent variables
m: number of observations
Matrix dimensions:

A: (1,n) (one row, n columns)
I: (n,n)
X: (n,m)
U: (m,n)
Y: (1,m)
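The dimensions listed above, together with the eigenvalue condition raised in the questions, can be checked numerically. The sketch below assumes U = transposed(X) and random data, and the helper name `spectral_radius` is hypothetical. Under that assumption, and when X transposed(X) is positive definite, the eigenvalues of I - cXU are 1 - c·λ_i, so the spectral radius is below 1 exactly for 0 < c < 2/λ_max.

```python
import numpy as np

def spectral_radius(X, U, c):
    """Largest eigenvalue modulus of the iteration matrix I - c*X@U."""
    n = X.shape[0]
    M = np.eye(n) - c * (X @ U)
    return np.max(np.abs(np.linalg.eigvals(M)))

rng = np.random.default_rng(1)
n, m = 4, 30                      # n variables, m observations
X = rng.normal(size=(n, m))       # X: (n, m)
U = X.T                           # U: (m, n)
assert (X @ U).shape == (n, n)    # matches the shape of I

lam_max = np.linalg.eigvalsh(X @ U).max()
rho_small = spectral_radius(X, U, 1.0 / lam_max)   # c inside (0, 2/lam_max)
rho_large = spectral_radius(X, U, 2.5 / lam_max)   # c outside that interval
print(rho_small, rho_large)
```

For a general U that is not transposed(X), the matrix XU need not be symmetric, and no such simple interval for c is implied here.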
Why use an iterative algorithm instead of the traditional solution?

We are dealing with an ill-conditioned problem; most independent variables are highly correlated.
Many solutions (as long as the regression coefficients are positive) provide a very good fit, and the global optimum is not that much better than a solution where all regression coefficients are equal to 1.
The plan is to use an iterative algorithm to start at iteration #1 with an approximate solution that has interesting properties, then move to iteration #2 to improve a bit, then stop.
Note: this question is not related to the ridge regression algorithm described here.
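The plan described above (start from a simple solution, take one or two steps, then stop) might look like the following sketch. Everything specific in it is an assumption for illustration: the synthetic, highly correlated predictors, the all-ones starting coefficients, U = transposed(X), and the step size c = 1/λ_max. It relies on the observation that the recursion can be rewritten as A_{k+1} = A_k + c(Y - A_k X)U, which for U = transposed(X) is a gradient step on half the sum of squared errors.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 5, 200
common = rng.normal(size=(1, m))
X = common + 0.1 * rng.normal(size=(n, m))     # highly correlated rows
A_true = np.array([[0.8, 1.1, 0.9, 1.2, 1.0]])
Y = A_true @ X + 0.1 * rng.normal(size=(1, m))

def sse(A):
    """Sum of squared residuals for coefficient row vector A."""
    r = Y - A @ X
    return float(np.sum(r ** 2))

G = X @ X.T                                    # (n, n) Gram matrix
c = 1.0 / np.linalg.eigvalsh(G).max()          # illustrative step size

A = np.ones((1, n))                            # iteration #1: all coefficients 1
errors = [sse(A)]
for _ in range(2):                             # a couple of refinement steps
    A = c * (Y @ X.T) + A @ (np.eye(n) - c * G)
    errors.append(sse(A))

# Reference: the classical least-squares solution of A G = Y X^T.
A_ols = np.linalg.solve(G, (Y @ X.T).T).T
print(errors, sse(A_ols))
```

Comparing `errors` with `sse(A_ols)` gives a rough sense of how much fit is left on the table by stopping early; how close the two are depends entirely on the data.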
Contributions:

From Ray Koopman
No need to apologize for not using "proper" weights. See
Dawes, Robyn M. (1979). The robust beauty of improper linear models in decision making. American Psychologist, 34, 571-582.
