ARIMA model


  • When there is a trend in the data, we take differences to remove it
  • ARIMA stands for Auto-Regressive Integrated Moving Average
  • The Integrated term is the order of differencing d; in the example below d = 2
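
To make the role of d concrete, here is a minimal sketch of differencing in plain Python. The helper name `difference` and the quadratic-trend series are made up for illustration; they are not taken from the gist.

```python
def difference(series, d=1):
    """Apply the first-difference operator d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

# Hypothetical series with a quadratic trend: x(t) = t^2
x = [t * t for t in range(8)]
first = difference(x, d=1)   # still has a (linear) trend
second = difference(x, d=2)  # constant, so the trend is gone
print(first)
print(second)
```

A quadratic trend needs two rounds of differencing before the series stops drifting, which is what d = 2 expresses.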


Below is the sample GitHub gist; the output PDF is available at ARIMA model.pdf


Fitting AR Processes

Yule Walker Equation in Matrix Form



  • If we write the above equation for k = 1, 2, . . ., n and use the fact that ρ(k) = ρ(-k), we can write it in matrix form
  • Using the data we have, we can estimate the values of ρ (the autocorrelation coefficients)
  • The acf() routine in R gives us these estimates
  • Using the values of ρ, we can then estimate the values of Φ (the parameters of the AR process)



  • Above is an example for an AR process
  • We can solve these equations for the values of Φ1, Φ2 and Φ3
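
As a rough illustration of the matrix form, the sketch below builds the matrix of autocorrelations and solves it for the Φ values. The function names `solve_linear` and `yule_walker` are my own, and the ρ values are hypothetical: they are computed from a known AR(2) with Φ1 = 0.5 and Φ2 = 0.3 so the answer can be checked.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def yule_walker(rho):
    """rho = [rho(1), ..., rho(p)]; returns [phi_1, ..., phi_p]."""
    p = len(rho)
    full = [1.0] + list(rho)  # full[k] = rho(k), with rho(0) = 1
    # Entry (i, j) is rho(|i - j|), using rho(k) = rho(-k)
    big_r = [[full[abs(i - j)] for j in range(p)] for i in range(p)]
    return solve_linear(big_r, list(rho))

# Hypothetical autocorrelations of an AR(2) with phi1 = 0.5, phi2 = 0.3
rho1 = 0.5 / (1 - 0.3)
rho2 = 0.5 * rho1 + 0.3
phi = yule_walker([rho1, rho2])
print(phi)
```

In practice the ρ values would come from the data (for example from acf() in R) rather than from a known model.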



Moving average and Auto-regressive Processes

Moving Average Processes MA(q)

  • Example: a stock price that depends on the announcements of the last two days
  • The autocorrelation function cuts off after lag q (it is zero for lags greater than q)
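
The cut-off can be seen from the theoretical ACF. The sketch below assumes the convention X(t) = Z(t) + θ1 Z(t-1) + ... + θq Z(t-q); the function name `ma_acf` and the coefficients 0.7 and 0.2 are made up for illustration.

```python
def ma_acf(thetas, max_lag):
    """Theoretical ACF of an MA(q) process for lags 0..max_lag."""
    beta = [1.0] + list(thetas)  # beta[0] is the weight on Z(t)
    q = len(thetas)
    gamma0 = sum(b * b for b in beta)  # variance, up to the factor sigma^2
    acf = []
    for k in range(max_lag + 1):
        if k > q:
            acf.append(0.0)  # no shared noise terms beyond lag q
        else:
            gk = sum(beta[i] * beta[i + k] for i in range(q - k + 1))
            acf.append(gk / gamma0)
    return acf

# Hypothetical MA(2): X(t) = Z(t) + 0.7 Z(t-1) + 0.2 Z(t-2)
rho = ma_acf([0.7, 0.2], max_lag=4)
print(rho)
```

The values at lags 3 and 4 are exactly zero, because X(t) and X(t-3) share no noise terms.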



Auto regressive Processes AR(p)






  • Below are the plots for an AR(2) process
  • Depending on the values of phi1 and phi2, the ACF can alternate between positive and negative values
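
A small sketch of this behaviour, assuming the usual AR(2) recursion ρ(k) = phi1 ρ(k-1) + phi2 ρ(k-2) with ρ(0) = 1 and ρ(1) = phi1 / (1 - phi2). The function name `ar2_acf` and the coefficient pairs are hypothetical choices, both inside the stationary region.

```python
def ar2_acf(phi1, phi2, max_lag):
    """ACF of a stationary AR(2) process for lags 0..max_lag."""
    rho = [1.0, phi1 / (1.0 - phi2)]
    for k in range(2, max_lag + 1):
        rho.append(phi1 * rho[k - 1] + phi2 * rho[k - 2])
    return rho[: max_lag + 1]

alternating = ar2_acf(-0.5, 0.3, max_lag=6)  # signs flip at every lag
smooth = ar2_acf(0.5, 0.3, max_lag=6)        # stays positive and decays
print([round(v, 3) for v in alternating])
print([round(v, 3) for v in smooth])
```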




Writing the AR process as an MA process by repeatedly substituting for X(t-1). Note that phi is a single constant here, so we do not need phi1, phi2 anymore.
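
A quick numerical check of the substitution idea, assuming a single constant phi (the AR(1) case), a hypothetical value phi = 0.6, and the starting value X(0) = Z(0). The recursive form and the expanded MA form should give the same series.

```python
import random

random.seed(0)
phi = 0.6  # hypothetical coefficient with |phi| < 1
z = [random.gauss(0.0, 1.0) for _ in range(50)]

# Recursive form: X(t) = phi * X(t-1) + Z(t), started at X(0) = Z(0)
x_rec = [z[0]]
for t in range(1, len(z)):
    x_rec.append(phi * x_rec[-1] + z[t])

# MA form after repeated substitution: X(t) = sum over k of phi^k * Z(t-k)
x_ma = [sum(phi ** k * z[t - k] for k in range(t + 1)) for t in range(len(z))]

max_gap = max(abs(a - b) for a, b in zip(x_rec, x_ma))
print(max_gap)
```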



Mean, variance and autocorrelation of the AR(p) process, assuming Z(t) ~ Normal(0, sigma2)




ACF of AR-p using Yule-Walker Equation

  • It is a method of solving a difference equation (recurrence relation)
  • We first obtain the auxiliary equation (also known as the characteristic equation), which is a polynomial, and find its roots
  • Using these roots we get a weighted sum of geometric sequences, and find the weights from the initial conditions
  • This way of solving difference equations is closely related to the way linear differential equations are solved
  • In the course this was worked out for the Fibonacci sequence, where one root came out to be the golden ratio
  • For AR(p) the ACF satisfies a difference equation; solving it gives the ACF at different lags
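
The Fibonacci case can be sketched as follows, assuming the convention F(0) = 0, F(1) = 1. The characteristic equation r^2 = r + 1 has two roots, and the weights follow from the two initial conditions; variable and function names are my own.

```python
import math

sqrt5 = math.sqrt(5.0)
r1 = (1 + sqrt5) / 2  # golden ratio
r2 = (1 - sqrt5) / 2  # the other root of r^2 = r + 1

# Weights from the initial conditions:
#   c1 + c2 = 0 (F(0) = 0) and c1*r1 + c2*r2 = 1 (F(1) = 1)
c1 = 1 / (r1 - r2)
c2 = -c1

def fib_closed(n):
    """Fibonacci number from the weighted geometric sequences."""
    return round(c1 * r1 ** n + c2 * r2 ** n)

def fib_iter(n):
    """Fibonacci number from the recurrence itself, for comparison."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print([fib_closed(n) for n in range(10)])
```

The closed form relies on floating point, so it is only trustworthy for moderate n.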









Stationarity Conditions for MA(q) and AR(p) Processes

Sequence and Series

Convergent Sequence

1/2, 2/3, 3/4, . . . , n/(n+1)

Divergent Sequence

3, 9, 27, . . . . , 3^n

Series => Partial Sum of sequence

Convergent Series => if sum converges

Convergence Test

  • Integral Test
  • Comparison Test
  • Limit comparison test
  • Alternating Series Test
  • Ratio test
  • Root test

Geometric Series

  • a, ar, ar^2, . . . , ar^n
  • Convergent if |r| < 1, with sum a/(1 - r)

Representing function as (geometric) series
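
For instance, the function a/(1 - r) can be represented by the series a + ar + ar^2 + ... when |r| < 1. A minimal sketch, with the helper name `geometric_partial_sum` chosen for illustration:

```python
def geometric_partial_sum(a, r, n_terms):
    """Sum of the first n_terms terms a, a*r, a*r^2, ..."""
    total, term = 0.0, a
    for _ in range(n_terms):
        total += term
        term *= r
    return total

converging = geometric_partial_sum(1.0, 0.5, 60)  # approaches 1 / (1 - 0.5) = 2
diverging = geometric_partial_sum(1.0, 2.0, 20)   # keeps growing
print(converging, diverging)
```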


Backward shift operator

  • B^kX(t) = X(t – k)



  • Two different models can have the same ACF
  • Given an ACF, how do we find out which model it came from?
  • We go for the model that is invertible
  • We can invert an MA(1) into an AR(∞)
  • Inverting is basically the act of expanding a function as a geometric series
  • It is possible only when the common ratio satisfies |r| < 1
  • Out of the two models, only one satisfies this condition
  • That is the model we select for the given ACF
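
A sketch of the MA(1) case, assuming the convention X(t) = Z(t) + θ Z(t-1), for which ρ(1) = θ / (1 + θ^2). The values θ = 0.5 and θ = 2 are hypothetical; the function names are my own.

```python
import math

def ma1_rho1(theta):
    """Lag-1 autocorrelation of X(t) = Z(t) + theta * Z(t-1)."""
    return theta / (1 + theta * theta)

def invertible_theta(rho1):
    """Given rho(1) with 0 < |rho1| < 0.5, return the invertible theta."""
    # theta solves rho1 * theta^2 - theta + rho1 = 0; the two roots are reciprocals
    disc = math.sqrt(1 - 4 * rho1 * rho1)
    t1 = (1 - disc) / (2 * rho1)
    t2 = (1 + disc) / (2 * rho1)
    return t1 if abs(t1) < 1 else t2

same_a = ma1_rho1(0.5)
same_b = ma1_rho1(2.0)  # reciprocal coefficient, same ACF
chosen = invertible_theta(same_a)
print(same_a, same_b, chosen)
```

Both coefficients give ρ(1) = 0.4, but only θ = 0.5 allows the geometric expansion, so that is the model picked.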

Conditions for Invertibility [MA(q)] and Stationarity [AR(p)]


How to check if a series is both invertible and stationary

  • Check the AR(p) polynomial for stationarity
  • Check the MA(q) polynomial for invertibility
  • Both conditions should hold
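
The standard form of both checks is that all roots of the relevant polynomial lie outside the unit circle. The sketch below is limited to orders 1 and 2 so that it needs only the quadratic formula, and it assumes the sign conventions 1 - phi1 z - phi2 z^2 for the AR polynomial and 1 + theta1 z + theta2 z^2 for the MA polynomial. All function names are hypothetical.

```python
import cmath

def roots_up_to_quadratic(c0, c1, c2=0.0):
    """Roots of c0 + c1*z + c2*z^2 (degree 1 or 2 only)."""
    if c2 == 0:
        if c1 == 0:
            return []  # constant polynomial, no roots to check
        return [complex(-c0 / c1)]
    disc = cmath.sqrt(c1 * c1 - 4 * c2 * c0)
    return [(-c1 + disc) / (2 * c2), (-c1 - disc) / (2 * c2)]

def all_outside_unit_circle(roots):
    return all(abs(r) > 1 for r in roots)

def is_stationary(phi1, phi2=0.0):
    # AR polynomial: 1 - phi1*z - phi2*z^2
    return all_outside_unit_circle(roots_up_to_quadratic(1.0, -phi1, -phi2))

def is_invertible(theta1, theta2=0.0):
    # MA polynomial: 1 + theta1*z + theta2*z^2
    return all_outside_unit_circle(roots_up_to_quadratic(1.0, theta1, theta2))

print(is_stationary(0.5, 0.3), is_invertible(0.5))  # both conditions hold
print(is_stationary(1.0), is_invertible(2.0))       # random walk; non-invertible MA(1)
```

For higher orders a general polynomial root finder would be needed instead of the quadratic formula.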



[Time Series] Correlation and Stationarity

Co-variance vs Correlation

  • Correlation is the covariance divided by the product of the standard deviations of both variables
  • Hence it is independent of units and always lies between -1 and 1, which makes comparison easier
  • Formula on the right is time series specific
    • It is the autocorrelation coefficient at lag k
    • It is defined as the ratio of the autocovariance at lag k to the autocovariance at lag 0
    • These values are plotted on a correlogram (see the one for an MA(2) process below)
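
A minimal sketch of the sample autocorrelation coefficient, assuming the common convention of dividing both autocovariances by the series length N. The function name `sample_acf` and the alternating test series are made up for illustration.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelation coefficients r(0)..r(max_lag)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n  # autocovariance at lag 0
    acf = []
    for k in range(max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf

# Hypothetical series that flips sign at every step
series = [1.0, -1.0] * 50
r = sample_acf(series, max_lag=3)
print(r)
```

Plotting these values against the lag gives the correlogram.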



Stationary Time Series

  • No systematic change in mean (no trend)
  • No systematic change in variance
  • No periodic variation (no seasonality)

If a time series is not stationary, we apply transformations to make it stationary.

For example, applying the difference operator to a random walk makes it stationary.



Random Walk

  • The current value is the previous value plus noise
  • If the first value is zero, the current value is the sum of all the noise terms so far
  • X(t) = X(t-1) + Z(t)
  • Z(t) ~ Normal(mu, sigma2)
  • If X(0) = 0 then X(t) = sum(Z(k)) for k from 1 to t
  • Expectation[X(t)] = t*mu (changes with time)
  • Variance[X(t)] = t*sigma2 (changes with time)
  • Not a stationary process
  • Let Y(t) = X(t) - X(t-1) = Z(t); Y(t) is a stationary process
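
A small simulation sketch of these properties, using hypothetical settings (mu = 0, sigma = 1, 2000 paths of 100 steps). It estimates the variance across paths at two time points and differences one path.

```python
import random

random.seed(42)
mu, sigma = 0.0, 1.0
n_steps, n_paths = 100, 2000

def random_walk(n):
    """One path X(1)..X(n) with X(0) = 0."""
    x, path = 0.0, []
    for _ in range(n):
        x += random.gauss(mu, sigma)
        path.append(x)
    return path

def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

paths = [random_walk(n_steps) for _ in range(n_paths)]
var_at_10 = variance([p[9] for p in paths])    # expected near 10 * sigma^2
var_at_100 = variance([p[99] for p in paths])  # expected near 100 * sigma^2

# Differencing one path recovers its noise terms Z(2)..Z(n)
y = [b - a for a, b in zip(paths[0], paths[0][1:])]
print(var_at_10, var_at_100, len(y))
```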


Example of Stationary Process

Moving average and autoregressive processes (described above) can be stationary under the conditions given in the section on stationarity conditions above.










Iterative Method for Unconstrained Optimization

Newton’s Method



  • Based on Taylor series expansion
  • Advantages
    • Convergence is rapid in general, and quadratic near the optimal point
    • Insensitive to the number of variables
    • Performance does not depend on the choice of a tuning parameter
      • The gradient method, by contrast, depends on the learning rate
  • Disadvantages
    • Cost of computing and storing the Hessian
    • Cost of computing a single Newton step
      • You need the second derivative (the example in the note is a simple root-finding problem)
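
A one-dimensional sketch of the Newton update x <- x - f'(x) / f''(x). The test function f(x) = exp(x) - 2x, whose minimum is at ln 2, and the name `newton_minimize` are my own choices for illustration; in one dimension the Hessian is just the second derivative.

```python
import math

def newton_minimize(grad, hess, x0, tol=1e-12, max_iter=50):
    """Minimise a 1-D function given its first and second derivatives."""
    x = x0
    for _ in range(max_iter):
        step = grad(x) / hess(x)  # Newton step: f'(x) / f''(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Hypothetical objective f(x) = exp(x) - 2x, so f'(x) = exp(x) - 2 and f''(x) = exp(x)
x_star = newton_minimize(lambda x: math.exp(x) - 2.0, lambda x: math.exp(x), x0=0.0)
print(x_star, math.log(2.0))
```

No learning rate appears anywhere; the second derivative sets the step size.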


Gradient Descent

  • Very popular method that does not need any write-up
  • Exhibits approximately linear convergence
  • Advantage
    • Very simple to implement
  • Disadvantage
    • Convergence rate depends on the condition number of the Hessian
    • Very slow for a large number of variables (say 1000 or more)
    • Performance depends on the choice of parameters such as the learning rate
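
A sketch of how the condition number affects convergence, using a hypothetical quadratic f(x, y) = 0.5 * (x^2 + kappa * y^2), whose Hessian has condition number kappa. The learning rate 1 / kappa is an assumption, chosen as a safe step for this function.

```python
def gradient_descent(grad, x0, learning_rate, n_iter):
    """Plain gradient descent with a fixed learning rate."""
    x = list(x0)
    for _ in range(n_iter):
        g = grad(x)
        x = [xi - learning_rate * gi for xi, gi in zip(x, g)]
    return x

def grad_quadratic(kappa):
    # Gradient of f(x, y) = 0.5 * (x^2 + kappa * y^2)
    return lambda v: [v[0], kappa * v[1]]

# Well-conditioned (kappa = 1) versus ill-conditioned (kappa = 100), 100 steps each
well = gradient_descent(grad_quadratic(1.0), [1.0, 1.0], learning_rate=1.0, n_iter=100)
ill = gradient_descent(grad_quadratic(100.0), [1.0, 1.0], learning_rate=0.01, n_iter=100)
print(well, ill)
```

With kappa = 100 the x coordinate shrinks by only 1% per step, so it is still far from the minimum after 100 iterations.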


Golden Section Search

  • Typically applicable to one dimension only
  • We used it to calculate mobile/tablet adjustments
    • As a good practice we avoided recursion and ran at most 20 iterations, breaking out of the loop once a stopping criterion was met
  • Applicable to strictly unimodal functions
  • Maintains three points whose spacing keeps the golden ratio (phi)
  • The bisection method is fine for finding a root, but for finding an extremum golden section search is preferred
  • Sample code :
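
The original sample is not reproduced here. As a stand-in, this is a minimal iterative sketch of golden section search for a minimum, not the code referred to above. It tracks two interior points inside the bracket, uses no recursion, and stops after at most 20 iterations or once the bracket is narrower than a tolerance. The function name, the tolerance and the test function are assumptions.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1) / 2  # 1 / phi, about 0.618

def golden_section_minimize(f, lo, hi, tol=1e-6, max_iter=20):
    """Minimise a strictly unimodal f on [lo, hi] without recursion."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    for _ in range(max_iter):
        if hi - lo < tol:  # stopping criterion
            break
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new d
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new c
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2

x_min = golden_section_minimize(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
print(x_min)
```

Each iteration reuses one of the two previous function values, so only one new evaluation is needed per step.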













Time series week 1

  • Plotting in R
  • Checking whether a linear regression is properly fitted
    • Residuals are the important thing to observe
    • Q-Q plots for normality test
    • Residuals over time
      • Zoomed-in residuals over time
  • Hypothesis test
    • One-sided and two-sided t test
    • Confidence interval
      • Where we think the mean lies
      • If it does not contain 0, we tend to reject the null hypothesis (a very broad statement, but it conveys the concept)
  • Correlation function
    • Which quarter the data falls in
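
To illustrate why residuals matter when judging a fit, here is a minimal least-squares sketch on hypothetical curved data (y = x^2). The course material uses R; this is only an equivalent plain-Python illustration, and the function name `fit_line` is made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

xs = [0, 1, 2, 3, 4, 5]
ys = [0, 1, 4, 9, 16, 25]  # hypothetical curved data, y = x^2
slope, intercept = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(slope, intercept)
print([round(r, 3) for r in residuals])
```

The residuals are positive at both ends and negative in the middle. A systematic pattern like this suggests the straight line is not an adequate fit, even though the residuals sum to zero.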


Ref :