ARIMA model to forecast a stock price

I'm learning ARIMA modelling and have run into some practical questions:
  1. When should we analyze the stock price itself, and when its logarithm? What about other economic and financial data such as GDP, interest rates, and exchange rates? What is the rationale?
  2. The data should be stationary. I'm using the Augmented Dickey-Fuller unit root test in 3 versions: no constant, drift, and drift + trend. All of these tests indicate that my data is non-stationary, but in the 1st version the estimated coefficient was positive, so I excluded it and kept the other 2. How can I tell whether I have a stochastic or a deterministic trend? I need to know this in order to choose between first differencing and detrending. What should I look at to decide?
  3. How do I properly identify the lag orders in AR and MA?
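
For reference, the three ADF versions mentioned in question 2 are usually written as the regressions below. This is the standard textbook form, not something taken from the poster's software; the symbols (gamma for the coefficient on the lagged level, alpha for drift, beta for the trend slope, k for the number of lagged differences) are notation chosen here for illustration.

```latex
\begin{align*}
\text{no constant:}\quad
  \Delta y_t &= \gamma\, y_{t-1} + \sum_{i=1}^{k} \delta_i\, \Delta y_{t-i} + \varepsilon_t \\
\text{drift:}\quad
  \Delta y_t &= \alpha + \gamma\, y_{t-1} + \sum_{i=1}^{k} \delta_i\, \Delta y_{t-i} + \varepsilon_t \\
\text{drift + trend:}\quad
  \Delta y_t &= \alpha + \beta t + \gamma\, y_{t-1} + \sum_{i=1}^{k} \delta_i\, \Delta y_{t-i} + \varepsilon_t
\end{align*}
% Null hypothesis in all three: \gamma = 0 (unit root).
% Alternative: \gamma < 0 (stationary, or trend-stationary in the third form).
```

In each case the test is one-sided against gamma < 0, so a positive estimate of gamma gives no evidence against the unit root.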
 
Pulco,
what you are saying is the following:

we should work with the log stock price because it fits the theoretical assumptions (distribution, iid, etc.) more closely? Did I understand you correctly?
So any financial time series should be analyzed as the logarithm of the data and not the data itself. Is that a general rule, or is there a principle for deciding when to apply it?

Thanks for the link!
 
When you model a time series y(t) as an ARMA process, for instance, you assume that the series is stationary. This assumption needs to hold before you can apply further results in this area.

The stock price x(t) is generally not a stationary quantity, but the log-return y(t) = ln(x(t)/x(t-1)) usually is. Hence this is the quantity you will work with, and there are statistical tests to verify that y(t) is indeed a stationary series.
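
As a minimal sketch of the transformation described above, the snippet below computes log-returns from a price list and checks that they equal the first difference of log prices. The price values and the function name `log_returns` are made up for illustration.

```python
import math

def log_returns(prices):
    """Return y(t) = ln(x(t) / x(t-1)) for consecutive prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [math.log(cur / prev) for prev, cur in zip(prices, prices[1:])]

# Hypothetical closing prices, invented for illustration only.
prices = [100.0, 102.0, 101.0, 105.0]
returns = log_returns(prices)

# The same quantity written as a first difference of log prices.
log_prices = [math.log(p) for p in prices]
diffs = [b - a for a, b in zip(log_prices, log_prices[1:])]
```

A side effect of working in logs is that returns add up over time: the sum of `returns` equals ln(105/100), the log-return over the whole window.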
 
Did you mean that the return itself, and not the log-return, is stationary? Or the difference in log stock prices, which is the (log) return? The formula you wrote is y(t) = ln(x(t)/x(t-1)) = ln(x(t)) - ln(x(t-1)).
What about other financial time series such as interest rates, exchange rates, and GDP? Do we also take the logarithm of the original data and then work with the percentage change (like a stock return)?
 
Pulco, your help is great.
I read the material you sent me (the EViews manual). My interest was in how to correctly identify p in AR(p) and q in MA(q). That book says that p equals the number of significant spikes in the PAC and q equals the number of significant spikes in the AC. Compared with Gujarati's Econometrics, the rule is the same. There is only one difference:
  • EViews says that we should look at the original non-stationary data
  • Gujarati says that we should look at the n-th order differenced data, i.e. the stationary series.
Question: which one is more correct, or are they both OK? Which rule do you use in practice?
Please also explain how to build a correlogram for MA. A correlogram for AR is easy: you take your time series and look at its correlogram. How do you do the same for an MA process?
 
Guys, please correct me if I'm wrong. For the specialists it's a 1-minute question.
So, in order to correctly identify ARIMA(p,n,q), I should:
  1. Make the series stationary by finding the proper order of differencing n
  2. Take the correlogram of the stationary data and look at the PAC to identify the order p of the AR part
  3. With stationary data and the proper AR(p), take a correlogram again and look at the AC to identify the order q of the MA part. So MA(q) comes only after n and p have been found?
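
Step 1 in the list above (n-th order differencing) can be sketched in a few lines. The helper name `difference` and the toy series are assumptions for illustration; a quadratic trend is used because its second difference is constant, which makes the effect easy to see.

```python
def difference(series, d=1):
    """Apply first differencing d times: y(t) - y(t-1), repeated."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

# Toy series with a deterministic quadratic trend: 0, 1, 4, 9, ...
quadratic = [t * t for t in range(8)]

first = difference(quadratic, 1)   # still trending upward
second = difference(quadratic, 2)  # constant, trend removed
```

Each round of differencing shortens the series by one observation, so the correlogram in steps 2 and 3 is computed on n fewer points than the original data.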
 
I think you switched something (I'm using your notation): you find the p-order of AR with the autocorrelation plot, and the q-order for MA with the partial autocorrelation plot.

Edit: the order of finding p and q doesn't matter.
 
Mr Doe,

I read a couple of sources and I'm sure that p in AR(p) is found from the PAC and q in MA(q) from the AC. For example, if you go to http://www.duke.edu/~rnau/411arim3.htm and have a look at Rules 6 and 7, it says:
  • The lag at which the PACF cuts off is the indicated number of AR terms.
  • The lag at which the ACF cuts off is the indicated number of MA terms.
Also, I think the order matters, because if you go to the link again, you will see that ARIMA(0,1,0) has 2 spikes in the PACF and 4 spikes in the ACF. The author suggests looking only at the PACF and choosing AR(2). Then he fits ARIMA(2,1,0) and looks at the ACF, which has 0 spikes. That's why he concludes that the proper model is ARIMA(2,1,0).
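
The PACF cut-off rule quoted above can be checked on simulated data. The sketch below uses only the standard library: it simulates an AR(2) process with hypothetical coefficients 0.6 and -0.3, computes the sample ACF, and derives the sample PACF with the Durbin-Levinson recursion. The function names, coefficients, seed, and sample size are all choices made here for illustration.

```python
import math
import random

def acf(x, max_lag):
    """Sample autocorrelations at lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def pacf(x, max_lag):
    """Sample partial autocorrelations at lags 1..max_lag (Durbin-Levinson)."""
    rho = acf(x, max_lag)
    phi_prev, out = [], []
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the order-k AR coefficients from the order-(k-1) ones.
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out

# Simulate y(t) = 0.6*y(t-1) - 0.3*y(t-2) + e(t); hypothetical coefficients.
rng = random.Random(1)
y = [0.0, 0.0]
for _ in range(2200):
    y.append(0.6 * y[-1] - 0.3 * y[-2] + rng.gauss(0.0, 1.0))
y = y[200:]  # drop burn-in so the start values do not matter

p = pacf(y, 8)
band = 1.96 / math.sqrt(len(y))  # approximate 95% band for "significant" spikes
```

For this process the PACF should be clearly non-zero at lags 1 and 2 and close to zero afterwards, which is the pattern Rule 6 reads as "2 AR terms". The ACF of the same series decays gradually instead of cutting off, so it does not give p directly.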
 
Sorry for misleading you, I refreshed my knowledge in the morning and realized I was wrong.
 
I would appreciate it if some experienced practitioners could share with me how they identify the right ARIMA model, step by step. Books show 1-2 general examples, from which it is not possible to draw general conclusions.
 
To find the order (p,q) of an ARMA process you need to use
the EACF (Extended Autocorrelation function). Any statistical
software that has Time Series package would have this function.
R-software has this function. It spits out the order of an ARMA
process. Using PACF and then using ACF to determine the orders
p and q, respectively would not help.
 
Suman, I pretty much disagree with you. Choosing p and q is part of fitting your model,
so "you do what you want as long as it is consistent". Choosing p and q is more of an art.
You can't say that choosing them separately wouldn't help.
Basically, you don't want to estimate too many parameters, so you would like p and q to be small.
What you absolutely should do is test, e.g. check whether the residuals are white noise as expected.
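
One common way to run the white-noise check mentioned above is the Ljung-Box statistic, Q = n(n+2) * sum over k of rho_k^2 / (n-k). The poster does not name a specific test, so this is a swapped-in example, sketched with the standard library only. The series here are simulated stand-ins for model residuals, and the function name, seed, and lag count h = 10 are assumptions.

```python
import random

def ljung_box_q(resid, h):
    """Ljung-Box Q statistic using autocorrelations at lags 1..h."""
    n = len(resid)
    mean = sum(resid) / n
    d = [v - mean for v in resid]
    c0 = sum(v * v for v in d)
    total = 0.0
    for k in range(1, h + 1):
        rho_k = sum(d[t] * d[t - k] for t in range(k, n)) / c0
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total

# 95% chi-square critical value with 10 degrees of freedom.
# For residuals of a fitted ARMA(p, q), the degrees of freedom are
# usually reduced to h - p - q, so the critical value would change.
CHI2_95_DF10 = 18.307

rng = random.Random(42)
white = [rng.gauss(0.0, 1.0) for _ in range(500)]  # stand-in for good residuals

ar1 = [0.0]                                         # stand-in for bad residuals
for e in white[1:]:
    ar1.append(0.8 * ar1[-1] + e)

q_white = ljung_box_q(white, 10)
q_ar1 = ljung_box_q(ar1, 10)
```

A Q value above the critical value suggests the residuals still carry autocorrelation and the model is missing structure; `q_ar1` lands far above it, while white noise should stay below it about 95% of the time.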
 
I would like to add some thoughts on OP's 1st question:

>In which case should we analyze a stock price or the logarithm transformation of the stock price? And what about other economic/financial data such as GDP, interest rate, exchange rate? What is the rationale?

If you wish to use other data as explanatory variables for your stock price process, you can use an ARIMA-X process (ARIMA with exogenous variables). The tough part is getting forecasts for your other financial data.
 
ARIMA by itself won't work. Only the first lag will be significant. If there were any more persistence than that, it would be arbitraged out already.

Do it on returns, not prices. It has to be stationary.
 
1) PACF vs. ACF: this is very strange. I was taught that the ACF is for AR lags and the PACF for MA lags.
2) Time series models are pretty poor for forecasting and modeling financial data. You can simulate a certain ARIMA(p,r,q) model and then try to fit different models to the simulated data. You will be surprised to find that recovering the exact p, r, q is almost impossible, because, for example, an ARMA(1,1) can be approximated closely by an AR(3) or another variation. More generally, the difference between models with different parameters is not so evident (when there is no trend).
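
The point about ARMA(1,1) and AR(3) can be made concrete. An invertible ARMA(1,1), written as y(t) = phi*y(t-1) + e(t) + theta*e(t-1), has an infinite AR representation whose weight on y(t-j) is (phi + theta) * (-theta)^(j-1). The sketch below computes these weights and checks the representation on a simulated path started from zero. The sign convention, the parameter values 0.5 and 0.4, and the function name are assumptions chosen for illustration.

```python
import random

def arma11_ar_weights(phi, theta, n_lags):
    """AR(infinity) weights of y(t) = phi*y(t-1) + e(t) + theta*e(t-1)."""
    return [(phi + theta) * (-theta) ** (j - 1) for j in range(1, n_lags + 1)]

phi, theta = 0.5, 0.4  # hypothetical parameters, |theta| < 1 for invertibility
weights = arma11_ar_weights(phi, theta, 6)

# Simulate the ARMA(1,1) with zero pre-sample values.
rng = random.Random(0)
n = 200
eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = []
for t in range(n):
    prev_y = y[t - 1] if t > 0 else 0.0
    prev_e = eps[t - 1] if t > 0 else 0.0
    y.append(phi * prev_y + eps[t] + theta * prev_e)

# Rebuild each y(t) from its own past using the AR weights plus e(t).
full = arma11_ar_weights(phi, theta, n)
max_err = max(
    abs(y[t] - eps[t] - sum(full[j - 1] * y[t - j] for j in range(1, t + 1)))
    for t in range(n)
)
```

The weights shrink geometrically at rate |theta|, so with theta = 0.4 everything past lag 3 is already small. That is why a fitted AR(3) can look nearly as good as the true ARMA(1,1) on a finite sample, even though the two are not exactly the same model.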
 