Factor Exposure Analysis 102
Finding the optimal regression method for factor exposure analysis
October 2024. Reading Time: 10 Minutes. Author: Abhik Roy, CFA.
- Different regression techniques can be used to measure factor exposures
- Linear regression provides the best in-sample fit
- However, regularized models like Elastic Net and Lasso provide better out-of-sample fits in general
INTRODUCTION
Since the launch of the first U.S. ETF, the SPDR S&P 500 ETF, in 1993, the industry has seen massive inflows, reaching about $9 trillion in assets as of May 2024. Investors have a wide range of funds to choose from, some of which focus on equity factors like value or momentum. This breadth brings a challenge of its own: measuring a fund's exposure to these factors becomes pivotal to evaluating it.
One way to measure these exposures is a regression of the fund returns against a set of benchmark indices. As is often the case with quantitative methods, the choice of technique can lead to different conclusions, so it is essential to explore the available options and understand the trade-offs and complexities of each.
In this research article, we compare regression methods to determine the optimal approach for factor exposure analysis.
IN-SAMPLE FIT
We focus on three regression methods in this article: linear, Lasso, and Elastic Net. For readers unfamiliar with these methods, linear regression is the simplest form of regression, where the objective is to minimize the sum of squared residuals. Lasso regression adds a regularization term to that objective that penalizes the sum of the absolute values of the coefficients, while Elastic Net regression builds on Lasso by adding a second regularization term that penalizes the sum of the squared coefficients (read Factor Exposure Analysis 101).
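For reference, the three objectives can be written side by side. The notation is an assumption introduced here, not taken from the article: r_t is the fund return on day t, f_{k,t} is the return of index or factor k, \beta_0 is the intercept, and \lambda_1, \lambda_2 \ge 0 are the penalty strengths. Software packages scale these terms differently, so this is a sketch of the general form only.

```latex
\begin{aligned}
\text{Linear:}\quad
  & \min_{\beta_0,\,\beta}\ \sum_{t=1}^{T}\Big(r_t-\beta_0-\sum_{k=1}^{K}\beta_k f_{k,t}\Big)^{2} \\
\text{Lasso:}\quad
  & \min_{\beta_0,\,\beta}\ \sum_{t=1}^{T}\Big(r_t-\beta_0-\sum_{k=1}^{K}\beta_k f_{k,t}\Big)^{2}
    +\lambda_1\sum_{k=1}^{K}\lvert\beta_k\rvert \\
\text{Elastic Net:}\quad
  & \min_{\beta_0,\,\beta}\ \sum_{t=1}^{T}\Big(r_t-\beta_0-\sum_{k=1}^{K}\beta_k f_{k,t}\Big)^{2}
    +\lambda_1\sum_{k=1}^{K}\lvert\beta_k\rvert
    +\lambda_2\sum_{k=1}^{K}\beta_k^{2}
\end{aligned}
```

Setting \lambda_2 = 0 reduces Elastic Net to Lasso, and setting both penalties to zero recovers linear regression. The absolute-value penalty is what allows some coefficients to be set exactly to zero.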
We take the iShares Morningstar Value ETF (ILCV) as an example and run a rolling regression of its daily returns on four asset class indices covering equities, bonds, commodities, and currencies. We also include five long-short factors, namely value, momentum, quality, low volatility, and size, with factor definitions in line with industry standards.
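The setup above can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the article's implementation. The data are synthetic (three made-up factors with known betas, returns in percent) rather than ILCV and the nine regressors. The function names, the 125-day window, and the penalty values are hypothetical. The penalty is parametrized by alpha and l1_ratio, the convention used by scikit-learn's ElasticNet, and all three models are fitted with one coordinate-descent routine.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; this is what zeroes out Lasso coefficients."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, alpha=0.0, l1_ratio=1.0, n_iter=200):
    """Coordinate descent on
        (1 / 2n) * RSS + alpha * l1_ratio * sum|b| + 0.5 * alpha * (1 - l1_ratio) * sum(b^2).
    alpha = 0 gives plain linear regression; l1_ratio = 1 gives Lasso.
    Assumes no factor column is identically zero. Returns (intercept, betas).
    """
    n, p = len(X), len(X[0])
    betas = [0.0] * p
    intercept = 0.0
    resid = list(y)  # residuals of the current fit
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        # The intercept is not penalized: move it to the mean of the residuals.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            # Covariance of factor j with the residual that excludes factor j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * betas[j]
            new_beta = soft_threshold(rho, alpha * l1_ratio) / (col_sq[j] + alpha * (1.0 - l1_ratio))
            delta = new_beta - betas[j]
            resid = [resid[i] - delta * X[i][j] for i in range(n)]
            betas[j] = new_beta
    return intercept, betas


def r_squared(X, y, intercept, betas):
    """In-sample R-squared of a fitted model."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((v - intercept - sum(b * x for b, x in zip(betas, row))) ** 2
                 for row, v in zip(X, y))
    return 1.0 - ss_res / ss_tot


def rolling_betas(X, y, window, step, **penalty):
    """Refit on each rolling window and collect the factor betas."""
    return [fit_elastic_net(X[s:s + window], y[s:s + window], **penalty)[1]
            for s in range(0, len(y) - window + 1, step)]


# Synthetic data (NOT real fund data): 250 days, three hypothetical factors, returns in percent.
random.seed(0)
TRUE_BETAS = [0.9, 0.3, 0.0]
factors = [[random.gauss(0.0, 1.0) for _ in TRUE_BETAS] for _ in range(250)]
fund = [sum(b * f for b, f in zip(TRUE_BETAS, row)) + random.gauss(0.0, 0.3)
        for row in factors]

models = {
    "linear": dict(alpha=0.0),
    "lasso": dict(alpha=0.1, l1_ratio=1.0),
    "elastic_net": dict(alpha=0.1, l1_ratio=0.5),
}
fits = {name: fit_elastic_net(factors, fund, **kw) for name, kw in models.items()}
for name, (a, b) in fits.items():
    print(name, [round(v, 3) for v in b], "R2 =", round(r_squared(factors, fund, a, b), 4))

# Rolling Elastic Net exposures: one list of betas per 125-day window.
rolling = rolling_betas(factors, fund, window=125, step=25, alpha=0.1, l1_ratio=0.5)
```

On real data the regressors would be the index and factor return series, and the penalty strength would normally be chosen by cross-validation rather than fixed by hand. A production analysis would more likely rely on an established library than on a hand-written solver; the point here is only to show how the three methods differ by their penalty terms.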
Looking at the R2 and p-values from the regressions, we find a sli