Application of Robust Regression vs. Classical Regression

Introduction

In statistical modeling, regression analysis is employed to understand the nature and size of the relationship among selected variables. It can also be understood as an instrument that measures the relationship between the mean of one (dependent) variable and the corresponding values of one or more other (independent) variables (Sen & Srivastava, 2013). There are various types of regression analysis, such as simple linear, multiple, and robust regression, each employed for specific objectives and under particular conditions (Boslaugh, 2012). The nature of the data and of the research question also strongly influences which type of regression analysis should be employed to estimate the nature and size of the relationship among the variables. Each type of regression analysis has its own set of assumptions, which must be observed. Because regression allows us to detect any significant relationship among variables (as well as its type and nature), regression analysis has enormous corporate application (Ozyasar, 2018).

Classical Regression

Like any other regression model, the Classical Linear Regression Model (CLRM) is a statistical instrument employed to project the future behavior or values of the dependent variable when changes occur in the independent (explanatory) variable(s). The model assumes that the explanatory variables are non-random or non-stochastic (Applied Econometrics, 2015). Several further conditions and assumptions are associated with the CLRM when Ordinary Least Squares (OLS) is used to estimate its parameters. These assumptions are:

  • The model must be linear in the parameters (though not necessarily in the variables).
  • In repeated sampling, the values of the explanatory variable X must be fixed; that is, X must not be random or stochastic.
  • The conditional mean of the error term e(i), given the value of X, must be equal to zero.
  • The variance of e(i) must be the same for all observations. This property is known as homoscedasticity (equal variance of e(i)).
  • There should be no autocorrelation between the error terms of any two X values.
  • The covariance between the error term and the independent variable must be equal to zero.
  • The number of observations n must be greater than the number of parameters p (n > p).

These assumptions, particularly homoscedasticity, are vital for the valid application of the CLRM.
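To make the OLS estimator concrete, the slope and intercept of a simple regression can be computed directly from sample moments. The sketch below is illustrative only; the actual estimates in this paper come from Stata, and the function name and toy data are our own.

```python
def ols_simple(x, y):
    """Fit y = a + b*x by ordinary least squares; return (intercept, slope)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    # Slope is the sample covariance of (x, y) over the sample variance of x.
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Toy data lying exactly on y = 2 + 3x, so OLS recovers the line exactly.
x = [0, 1, 2, 3, 4]
y = [2, 5, 8, 11, 14]
a, b = ols_simple(x, y)
print(round(a, 6), round(b, 6))  # 2.0 3.0
```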

Robust Regression

Robust regression is designed to address and overcome the limitations of conventional parametric and non-parametric models. Whenever any assumption of the CLRM is violated, the model may produce biased or misleading results. Robust regression is mostly applied when there is a strong suspicion of heteroscedasticity. Because heteroscedasticity permits the error variance to depend on the independent variable X, heteroscedasticity-aware regression models are often more appropriate for real-world scenarios. Robust methods can also address the issue of corrupted observations (Bhatia, Jain, & Kar, 2015).

Data

For this academic exercise, which aims to apply classical and robust regression models and compare their outcomes, we use data for two European Union countries, Germany and France. To carry out the statistical analyses, we have opted for Stata, a statistical package known for its simplicity and precision. Each variable has 20 observations, covering the years 1998 through 2017 (World Bank, 2018).

Tests to be performed

  • OLS
  • Robust Regression

Results/Analysis

Summary of Statistics

Variable |       Obs        Mean    Std. Dev.        Min        Max
---------+---------------------------------------------------------
     INF |        20    1.349438    .7572429    .136763   2.619995
     UNE |        20       9.275    1.227306       7.06      11.88
     ING |        20    1.131517    .6358419  -.4497879   2.013046
     UNG |        20      7.4595    2.316607        3.4      11.17

France

OLS

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  1,    18) =    7.19
       Model |  .091862426     1  .091862426           Prob > F      =  0.0153
    Residual |  .230104892    18  .012783605           R-squared     =  0.2853
-------------+------------------------------           Adj R-squared =  0.2456
       Total |  .321967318    19  .016945648           Root MSE      =  .11306

------------------------------------------------------------------------------
      logUNE |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      logINF |  -.0799733   .0298334    -2.68   0.015     -.142651   -.0172957
       _cons |   2.222781   .0253172    87.80   0.000     2.169592    2.275971
------------------------------------------------------------------------------

Robust

Linear regression                                      Number of obs =      20
                                                       F(  1,    18) =   12.85
                                                       Prob > F      =  0.0021
                                                       R-squared     =  0.4592
                                                       Root MSE      =  .92726

------------------------------------------------------------------------------
             |               Robust
         UNE |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         INF |  -1.098319   .3063915    -3.58   0.002    -1.742024   -.4546142
       _cons |   10.75711   .5538499    19.42   0.000     9.593518    11.92071
------------------------------------------------------------------------------

Germany

OLS

      Source |       SS       df       MS              Number of obs =      19
-------------+------------------------------           F(  1,    17) =    8.41
       Model |  .732759321     1  .732759321           Prob > F      =  0.0099
    Residual |  1.48033768    17  .087078687           R-squared     =  0.3311
-------------+------------------------------           Adj R-squared =  0.2918
       Total |    2.213097    18  .122949833           Root MSE      =  .29509

------------------------------------------------------------------------------
      logUNG |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      logING |  -.3567767   .1229905    -2.90   0.010    -.6162641   -.0972893
       _cons |   1.976637   .0682407    28.97   0.000     1.832662    2.120612
------------------------------------------------------------------------------

Robust

Linear regression                                      Number of obs =      20
                                                       F(  1,    18) =    7.29
                                                       Prob > F      =  0.0147
                                                       R-squared     =  0.2879
                                                       Root MSE      =  2.0085

------------------------------------------------------------------------------
             |               Robust
         UNG |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         ING |  -1.954766   .7241738    -2.70   0.015    -3.476199   -.4333332
       _cons |   9.671352   .9647997    10.02   0.000     7.644383    11.69832
------------------------------------------------------------------------------

Comparison

The Classical Linear Regression Model is employed when the data are homoscedastic, and its assumptions (around seven) must be observed; if they are violated, the model may produce biased results. Robust regression is employed when the data exhibit heteroscedasticity. The series of statistical analyses above shows that robust regression produces noticeably different results from the CLRM. For instance, the CLRM specifications, for which we generated log values to reduce any heteroscedasticity, had different t-statistics and coefficient estimates than the robust regressions. In the case of Germany, the t-value did not change much, but the standard error and coefficient changed markedly. This confirms that robust regression and the CLRM can produce different results. From this methodical scrutiny of the two models, we also learn that the choice of model depends upon the attributes of the data (homoscedasticity versus heteroscedasticity). Under heteroscedasticity, robust regression tends to produce more convincing results than the classical linear regression model.
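The contrast can also be illustrated numerically. A regression that is "robust" in the sense of Stata's vce(robust) option keeps the OLS coefficients but replaces the classical standard errors with White-type heteroscedasticity-robust ones. The pure-Python sketch below (our own illustrative function and toy data, using plain HC0 without Stata's small-sample adjustment) computes both standard errors for the slope of a simple regression.

```python
def slope_standard_errors(x, y):
    """OLS slope plus its classical and White/HC0 standard errors."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    r = [yi - a - b * xi for xi, yi in zip(x, y)]
    # Classical SE: assumes a constant error variance (homoscedasticity).
    s2 = sum(ri ** 2 for ri in r) / (n - 2)
    se_classical = (s2 / sxx) ** 0.5
    # White/HC0 SE: lets the squared residual vary with x.
    se_robust = (sum((xi - mx) ** 2 * ri ** 2
                     for xi, ri in zip(x, r)) / sxx ** 2) ** 0.5
    return b, se_classical, se_robust

# Toy data with error magnitude growing in x (deterministic heteroscedasticity).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2 * xi + (-1) ** i * 0.3 * xi for i, xi in enumerate(x, start=1)]
b, se_c, se_r = slope_standard_errors(x, y)
```

The slope estimate is identical under both treatments; only the standard error, and hence the t-statistic, changes. Note that the robust figure can come out either larger or smaller than the classical one in a given sample.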

Conclusion

In conclusion, the CLRM and robust regression are two different types of regression model, each suited to particular types of data. The choice of model is generally not at the discretion of the researcher; rather, it depends upon the attributes of the retrieved data, and the results of the CLRM can differ from those of robust regression.

References

Applied Econometrics. (2015). Macmillan International Higher Education.

Bhatia, K., Jain, P., & Kar, P. (2015). Robust regression via hard thresholding. Advances in Neural Information Processing Systems, 721-729.

Boslaugh, S. (2012). Statistics in a Nutshell. O’Reilly Media, Inc.

Ozyasar, H. (2018, June 26). Application of Regression Analysis in Business. Retrieved September 6, 2018, from https://smallbusiness.chron.com/application-regression-analysis-business-77200.html

Sen, A. K., & Srivastava, M. S. (2013). Regression Analysis: Theory, Methods and Applications. Springer.

World Bank. (2018, July 1). World Bank. Retrieved September 6, 2018, from http://databank.worldbank.org/data/reports.aspx?Code=CHN&id=556d8fa6&report_name=Popular_countries&populartype=country&ispopular=y#
