Regression with multiple dependent variables

Thread starter: senoz
Do we still calculate the "b" coefficient matrix as

b = inv(x`x)*x`y, where x is the matrix of independent variables and y is the matrix of dependent variables?
 
If you mean dependence among the X elements, yes we do. If the variables (columns of X) are highly correlated, we face a "multicollinearity" problem, which results in the coefficients "b" having large (inflated) variances, i.e., a small change in a variable may cause a big change in the corresponding "b". To detect this issue, you can use the Variance Inflation Factor, VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing the j-th variable on the remaining ones. A VIF above roughly 5-10 indicates multicollinearity.
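To make the VIF check above concrete, here is a minimal sketch in NumPy (the thread's snippets are MATLAB, but the computation is identical); the data and variable names are invented for the example:

```python
import numpy as np

def vif(X):
    """Variance Inflation Factor for each column of the n x k matrix X.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on the remaining columns (with an intercept).
    """
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + 0.05 * rng.normal(size=500)   # nearly collinear with x1
x3 = rng.normal(size=500)               # independent of the others
print(vif(np.column_stack([x1, x2, x3])))
```

The first two VIFs come out large (the columns are nearly collinear), while the third stays close to 1.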

But if you mean dependence among the responses y, then it is usually more complicated. If your data is a time series (which is the case in most financial applications), then you may draw on the time-series literature, e.g., model the serial correlation via an AR model.
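One simple version of that AR suggestion, sketched in NumPy with simulated data (the coefficients and sample size are invented): fit ordinary least squares first, then estimate an AR(1) coefficient from the residuals by regressing each residual on its lag.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 1000
x = rng.normal(size=T)

# Simulate y with AR(1) errors: e_t = phi * e_{t-1} + u_t
phi_true = 0.7
u = rng.normal(size=T)
e = np.zeros(T)
for t in range(1, T):
    e[t] = phi_true * e[t - 1] + u[t]
y = 1.0 + 2.0 * x + e

# Step 1: ordinary least squares, ignoring the serial correlation
A = np.column_stack([np.ones(T), x])
b, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ b

# Step 2: estimate the AR(1) coefficient from the residuals
phi_hat = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
print(b, phi_hat)  # recovers the intercept/slope and the AR(1) coefficient
```

With serially correlated errors the OLS point estimates are still consistent, but their usual standard errors are not; the estimated phi is what a feasible GLS or Cochrane-Orcutt-style correction would then use.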

I hope I answered the question.
 
No, I mean the y values depend on the x values, e.g. y1 + y2 + y3 = a0 + a1*x1
 
I'm afraid your question is not clear. In a conventional regression problem, the y values always depend on the x values, so the dependence structure in question could be either among the X or among the Y. Are the y_i time series?
 
Yes, both the X and Y values are time series. At a given time there is more than one variable (different time series Y1, Y2, Y3) that depends on the X values.
 
In answer to your first question: the formula b = inv(x`x)*x`y still evaluates when y is an N×m matrix rather than an N×1 vector; b then becomes a matrix whose columns are the coefficient vectors for the individual responses. In practice, though, you should avoid forming the explicit inverse.
 
Do we still calculate the "b" coefficient matrix as

b = inv(x`x)*x`y, where x is the matrix of independent variables and y is the matrix of dependent variables?

I don't think it was explicitly stated, so I'm assuming that this is in MATLAB? If so, then simply use the backslash operator to find the least squares fit, thus:

>> BTrue = [-1 2 3; -2 3 -4]'; X = [ones(100,1) randn(100,2)]; Y = X * BTrue; X \ Y

ans =

   -1.0000   -2.0000
    2.0000    3.0000
    3.0000   -4.0000


-Will Dwinnell
Data Mining in MATLAB
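For readers without MATLAB, the same experiment can be sketched in NumPy, where np.linalg.lstsq plays the role of the backslash operator (same made-up coefficients as above; the fit is exact because no noise is added to Y):

```python
import numpy as np

# Three responses, two predictors plus an intercept, no noise,
# so the least-squares fit recovers BTrue exactly.
rng = np.random.default_rng(2)
BTrue = np.array([[-1.0, 2.0, 3.0],
                  [-2.0, 3.0, -4.0]]).T            # 3 x 2 coefficient matrix
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])
Y = X @ BTrue                                      # 100 x 2 response matrix
B, *_ = np.linalg.lstsq(X, Y, rcond=None)          # analogue of X \ Y
print(B)
```

Each column of B is the coefficient vector for one response, which is exactly the column-by-column behaviour of the multivariate least-squares fit.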


If you mean dependence among the X elements, yes we do. If the variables (columns of X) are highly correlated, we face a "multicollinearity" problem, which results in the coefficients "b" having large (inflated) variances, i.e., a small change in a variable may cause a big change in the corresponding "b". To detect this issue, you can use the Variance Inflation Factor, VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing the j-th variable on the remaining ones. A VIF above roughly 5-10 indicates multicollinearity.


I suggest a simpler alternative: check the out-of-sample results. If the regression got the coefficients wrong, it should come out in the testing.
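The out-of-sample check can be sketched like this (NumPy, with invented data and split sizes): fit on the early part of the sample, then compare in-sample and held-out error. The split is sequential rather than random, since the thread's data are time series.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + 0.1 * rng.normal(size=n)

# Sequential holdout: fit on the first 300 rows, test on the last 100.
X_tr, X_te = X[:300], X[300:]
y_tr, y_te = y[:300], y[300:]

b, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
rmse_tr = np.sqrt(np.mean((y_tr - X_tr @ b) ** 2))
rmse_te = np.sqrt(np.mean((y_te - X_te @ b) ** 2))
print(rmse_tr, rmse_te)  # comparable values suggest the fit generalizes
```

A held-out error much larger than the in-sample error is the symptom of the inflated-variance problem (or of overfitting more generally) that the VIF check tries to diagnose directly.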


-Will Dwinnell
Data Mining in MATLAB
 