High-dimensional data in economics and their robust analysis. Econometric Analysis of Large Factor Models, Jushan Bai and Peng Wang, August 2015. Abstract: Large factor models use a few latent factors to characterize the comovement of economic variables in a high-dimensional data set. How to use the glmnet and lars packages in R to implement lasso-type estimators. Journal of the American Statistical Association, forthcoming. High-dimensional sparse models: we have a large number of parameters p, potentially larger than the sample size n. The idea is that a low-dimensional submodel captures very accurately all essential features of the full-dimensional model. Appendices F-J include additional results for Sections 2-7, respectively. Lopez (2019), "Monitoring Banking System Connectedness with Big Data," Journal of Econometrics, vol, pages. Existing variable selection methods can be computationally intensive and may not perform well; the conditions required for those methods are very ... One simple example concerns the estimation of an average treatment effect in a high-dimensional regression model, where the econometrician has hundreds of ...
Focusing on linear and nonparametric regression frameworks, we discuss various econometric examples, present basic theoretical results, and illustrate the concepts and methods with Monte Carlo simulations. The HDS regression model has a large number of regressors p, possibly much larger than the sample size n, but only a relatively small number s ... May 22, 2017. Big data in econometric modeling: here's a speakers' photo from last week's Penn conference, Big Data in Dynamic Predictive Econometric Modeling. A growing and successful new branch of econometric literature asks how unbiased estimates of key structural parameters such as average treatment effects can be obtained in big data problems. Jun 26, 2011. In this chapter we discuss conceptually high-dimensional sparse econometric models as well as estimation of these models using l1-penalization and post-l1-penalization methods. Analysis of the most recent modelling techniques for big data with ...
Big data are characterized by high dimensionality and large sample size. Some results from applications to large macroeconomic data sets. High-dimensional sparse models arise in situations ... This book presents the econometric foundations and applications of multi-dimensional panels, including modern methods of big data analysis. Estimating and understanding high-dimensional dynamic stochastic econometric models for volatility, derivatives, and more. Prakasa Rao, CR Rao Advanced Institute of Mathematics, Statistics and Computer Science (AIMSCS), University of Hyderabad Campus, Gachibowli, Hyderabad 500046. Conventional econometric models and techniques often work with many economic and financial data, but there are issues unique to big datasets that may require new tools. Fan, Jianqing, Wenyan Gong, and Ziwei Zhu (2019), "Generalized High-Dimensional Trace Regression via Nuclear Norm Regularization," Journal of Econometrics, vol, pages. This work is devoted to statistical methods for the analysis of economic data with a large number of variables. Endogenous econometric models and multistage estimation in ... The course introduces key concepts and tools demanded in the business environment. Estimating treatment effects with high-dimensional data. 1. Big Data in Dynamic Predictive Econometric Modeling, Penn Arts ... Estimation and inference wi ... Outline for econometric theory of big data, part I.
Statistical significance in big data, Matthew Harding, December 28, 20. Tags: Bayesian information criterion, big data, critical values, statistical significance. An interesting problem when analyzing big data is whether one should report the statistical significance of the estimated coefficients at the 1% level, instead of the ... Students should be able to ... Learning methods. Assessment methods. Endogenous econometric models and multistage estimation. High-dimensional sparse models (HDSM): models, motivating examples. 2. Students will learn how to explore, visualize, and analyze high-dimensional datasets, build predictive models, and estimate causal effects. Prediction with a large number of covariates (big p): Varian, Hal R. Supplement to "Program Evaluation and Causal Inference with High-Dimensional Data": this supplement contains 11 appendices with additional results and some omitted proofs. As n gets very large we have high-dimensional data. Macroeconomic nowcasting and forecasting with big data.
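On the question of which significance level to report in very large samples, one commonly used heuristic (not necessarily the argument of the post quoted above) ties the critical value to the Bayesian information criterion: when two nested Gaussian regressions differ by one coefficient, BIC prefers the larger model roughly when the squared t-statistic exceeds log(n). The sketch below compares that implied threshold with fixed-level normal critical values. The function names are hypothetical and the approximation t^2 ≈ likelihood-ratio statistic is assumed.

```python
import math
from statistics import NormalDist


def bic_t_threshold(n):
    """Approximate |t| above which BIC favours adding one regressor (t^2 > log n)."""
    return math.sqrt(math.log(n))


def normal_critical(alpha):
    """Two-sided standard normal critical value at level alpha."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def implied_level(n):
    """Two-sided significance level implied by the BIC threshold at sample size n."""
    return 2.0 * (1.0 - NormalDist().cdf(bic_t_threshold(n)))


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, round(bic_t_threshold(n), 3), round(implied_level(n), 5))
```

Under this heuristic the threshold grows slowly with n, so a fixed 5% or 1% cutoff becomes comparatively lenient as the sample gets large.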
In this chapter we discuss conceptually high-dimensional sparse econometric models as well as estimation of these models using l1-penalization and post-l1-penalization methods. Econometric Methods for Cross Section and Panel Data, Fall 2018. Instructor ... The book contains an introduction to and a summary of the actively developing field of statistical learning with sparse models. Without a clear prior understanding of the underlying data ... Estimation and inference on TE in a general model. Conclusion. Econometrics of big data. The best new econometric research on big data will be presented. New analytic approaches are needed to make the most of big data in economics. This reference gives a helicopter tour of various methods. Statistics for High-Dimensional Data: Methods, Theory and Applications. This article is about estimation and inference methods for high-dimensional sparse (HDS) regression models in econometrics. Standard time-series dynamic econometric modeling (VAR estimation, forecasting, understanding), but new tools are required for big-data environments. What it actually is, however, appears to differ from field to field, and even among practitioners within fields.
Click through to find the program, copies of papers and slides, a participant list, and a few more photos. L1-penalized quantile regression in high-dimensional sparse models, arXiv 2009, Annals of Statistics 2011, with A. ... Statistical significance in big data, Big Data Econometrics. In large data sets, however, machine learning methods shine. Estimation of regression functions via penalization and selection. 3. High-dimensional sparse econometric models, an introduction, Springer Lecture Notes 2009, with A. ... Large k is effectively high-dimensional because endogenizing the regressors in a large-k univariate ... Researchers and policymakers should thus pay close attention to recent developments in machine ... The following list of potential topics is provided to stimulate ideas. Parameter estimates of these models without corrective measures may be inconsistent. Examples of techniques include an advanced overview of linear and logistic regression.
Together with the recent developments in information technology that permit the collection of high-dimensional data, this special issue will focus on econometric model selection theories and applications concerning the econometric analysis of high-dimensional data. Most of the large data statistical and econometric literature attempts to reduce the data dimension by penalising the model for complexity. Big Data in Dynamic Predictive Econometric Modeling, University of Pennsylvania. This first part of the course introduces students to contemporary methods of microeconometric ... High-dimensional sparse framework: the framework, two examples. 2. Estimation of regression functions via penalization and selection. 3.
For instance, the availability of big datasets facilitates the applicability of nonlinear models and estimation methods, which normally require large sample sizes. It helps us quantify new trends and exploit new dimensions, giving timely answers on the impact of different events. The Econometrics of Multi-dimensional Panels: Theory and ... Analyzing a large panel of economic and financial data is ... Econ 590: Big Data and Machine Learning in Econometrics, Spring ... High-dimensional problems in econometrics, ScienceDirect. The event, hosted by the Wang Yanan Institute for Studies in Economics (WISE) at Xiamen University, focused on recent developments in econometric theory with applications. I recently had the opportunity to attend a conference held in honor of the great econometrician Dr. ... Econometrics, high-dimensional data, dimensionality reduction, linear regression.
Ultrahigh-dimensional modeling is a more common task than before due to the emergence of ultrahigh-dimensional data sets in many fields such as economics, finance, genomics and health studies. Uniformly valid inference in high-dimensional models when the number of variables is larger than the number of parameters. Dealing with high dimensionality in large data sets, QuantUniversity.
Over the last two decades or so, the use of panel data has become standard in many areas of economic analysis. Regularization to assist with variable selection in high-dimensional trade. Editorial: Big Data in Dynamic Predictive Econometric Modeling. Econometric models based on observational data are often endogenous due to measurement error, autocorrelated errors, simultaneity and omitted variables, nonrandom sampling, self-selection, etc. High-dimensional sparse econometric models, 2010, Advances in ... Estimation of regression functions via penalization and selection methods. The R package BigVAR allows for the simultaneous estimation of high-dimensional time series by applying structured penalties to the conventional vector autoregression (VAR) and vector autoregression with exogenous variables (VARX) frameworks. Our methods can be utilized in many forecasting applications that make use of time-dependent data such as ... Big data, lecture 2: high-dimensional regression with the ... Estimation methods for linear/nonparametric regression. Next month, the cemmap center at University College London is organizing a very exciting workshop on high-dimensional econometric models. High dimensionality brings challenges as well as new insights into the advancement of econometric theory. It is expected that all students will have taken intermediate-level courses covering ... To understand problems related to the analysis of big data and the application of high-dimensional models, and to get acquainted with the relevant tools coming from statistical learning, machine learning and econometrics.
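To make concrete why the conventional VAR mentioned above needs penalties once many series are modelled jointly, the following sketch counts the mean parameters of a k-variable VAR(p) and builds the lagged regressor matrix. It is an illustration only and does not reproduce the BigVAR interface; the function names are hypothetical.

```python
def var_coefficient_count(k, p, intercept=True):
    """Mean parameters in a k-variable VAR(p): k equations, each with k*p lag terms."""
    return k * (k * p + (1 if intercept else 0))


def var_design(series, p):
    """Turn a T x k series (list of rows) into VAR(p) regressors Z and targets Y.

    Each row of Z is [y_{t-1}, ..., y_{t-p}] flattened; the matching row of Y is y_t.
    """
    T = len(series)
    Z = [[v for lag in range(1, p + 1) for v in series[t - lag]]
         for t in range(p, T)]
    Y = [list(series[t]) for t in range(p, T)]
    return Z, Y


if __name__ == "__main__":
    # 20 series with 4 lags already need far more coefficients than a
    # typical quarterly macro sample has observations per equation.
    print(var_coefficient_count(20, 4))
    Z, Y = var_design([[1, 10], [2, 20], [3, 30], [4, 40]], p=2)
    print(Z, Y)
```

Because the count grows with k squared, each equation of a large VAR is itself a high-dimensional regression, which is where penalized estimators of the kind described in the text come in.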
You may be interested in what I had to say on this topic back in 2011. More generally we refer to situations involving large n or k, or both, as high-dimensional. Oracle inequalities and inference in high-dimensional VAR models. Oracle inequalities for high-dimensional vector autoregressions.
Sparse high-dimensional regression: lasso estimation, application, motivation, trouble with large dimension, goals, important balance. High-dimensional sparse econometric models (HDSM): models, motivating examples for linear/nonparametric regression. 2. Whether it is stock data for individual companies or economic data used for macroeconomic modeling. Appendix K gathers auxiliary results on algebra of covering entropies. Program Evaluation and Causal Inference with High-Dimensional ... The HDS regression model has a large number of regressors p, possibly much larger than the sample size n, but only a relatively small number s ... The key assumption is that the number of relevant parameters is smaller than the sample size.
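The sparse setting described above (p regressors, n observations, only s relevant coefficients) and the l1-penalization and post-l1-penalization estimators can be sketched in a few lines. This is a minimal pure-Python illustration under assumed choices: cyclic coordinate descent for the lasso, a fixed penalty level lam, and a simulated Gaussian design. All function names and parameter values are hypothetical and not taken from any of the cited works.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t (the proximal map of the l1 penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=200):
    """Minimise (1/(2n))*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # equals y - X*beta because beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def ols(X, y):
    """Least squares via normal equations and Gauss-Jordan (small full-rank X only)."""
    n, k = len(X), len(X[0])
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[c][k] / A[c][c] for c in range(k)]


def post_lasso(X, y, lam):
    """Lasso for selection, then OLS refit on the selected columns only."""
    beta = lasso_cd(X, y, lam)
    support = [j for j, b in enumerate(beta) if b != 0.0]
    refit = [0.0] * len(beta)
    if support:
        Xs = [[row[j] for j in support] for row in X]
        for j, b in zip(support, ols(Xs, y)):
            refit[j] = b
    return beta, refit


def simulate(n=40, p=60, s=3, seed=0):
    """Hypothetical sparse design with p > n: only the first s coefficients are nonzero."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [sum(2.0 * row[j] for j in range(s)) + 0.5 * rng.gauss(0, 1) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = simulate()
    beta, refit = post_lasso(X, y, lam=0.5)
    print("selected columns:", [j for j, b in enumerate(beta) if b != 0.0])
    print("post-lasso coefficients:", [round(b, 2) for b in refit if b != 0.0])
```

The refit step is included because the l1 penalty shrinks the selected coefficients toward zero; re-estimating by least squares on the selected columns removes that shrinkage, provided the selected set is smaller than n and has full rank.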