Why Most Equity Mutual Funds Underperform and How to Identify Those that Outperform
January 26, 2016
by C. Thomas Howard, PhD
Why do most active equity mutual funds underperform? I have researched this question over the last few years and have unearthed some surprising answers. It is neither because managers lack stock-picking skill nor because of high fees, the two most often cited reasons. On the contrary, nearly 90% of active equity fund managers are superior stock pickers and, in addition, the funds most likely to outperform charge higher fees.
The real culprits behind most underperformance are structural decisions made by fund companies: asset bloat, closet indexing and over-diversification. Collectively, I refer to these as portfolio drag. These structural inefficiencies can be measured and ranked using a methodology dubbed the Portfolio Drag Index (PDI). Once portfolio drag is understood, it is fairly straightforward to avoid high-drag funds and reap the value added by skill.
There is a long line of research showing that stock-picking skill exists among active equity mutual funds, generating returns that more than offset fees. Overall, these studies reveal a universe of investment teams who are very good at identifying profitable opportunities and portray an industry where superior skill is common. This contradicts a large body of literature that concludes, based on studies of aggregate active equity fund performance, that managers lack skill. I, along with others, argue that this underperformance is due to a variety of non-performance pressures and incentives that lead managers to build underperforming portfolios, rather than to a lack of skill. The resulting impact of these performance-destroying portfolio decisions, which I call portfolio drag, can be seen in a portfolio’s structure.
In this study, the average skill among funds is found to be 3.81%, portfolio drag averages 2.71% and fees average 1.39%. Overall, 88% of funds display positive skill, with 79% of these showing skill large enough to cover fees, the latter being virtually the same as Berk and Green’s estimate. Even so, portfolio drag absorbs so much of this skill that, on average, none of it reaches investors once fees are paid. This explains why, in spite of significant skill, the average alpha is -0.29% across all funds.
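The reported averages are consistent with a simple additive breakdown, in which alpha is whatever remains of skill after portfolio drag and fees are subtracted. The short Python sketch below checks that arithmetic. The additive form and the name net_alpha are assumptions made for illustration; they are not a description of how the study estimated these quantities.

```python
# Averages reported in the article, in percent per year.
AVG_SKILL = 3.81
AVG_PORTFOLIO_DRAG = 2.71
AVG_FEES = 1.39


def net_alpha(skill, drag, fees):
    """Assumed additive breakdown: alpha = skill - portfolio drag - fees."""
    return skill - drag - fees


skill_after_drag = AVG_SKILL - AVG_PORTFOLIO_DRAG
alpha = net_alpha(AVG_SKILL, AVG_PORTFOLIO_DRAG, AVG_FEES)

print(f"Skill left after drag: {skill_after_drag:.2f}%")  # 1.10%
print(f"Average alpha: {alpha:.2f}%")                     # -0.29%
```

On these numbers, the skill that survives portfolio drag (1.10%) is smaller than the average fee (1.39%), which is why the average alpha ends up slightly negative.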
Given the important role of portfolio drag in determining performance, it would be useful to have a measure of the extent to which an individual fund is imposing such a drag. The PDI is introduced as this measure and is derived from measurements of size, closet indexing and conviction.
The resulting PDI is predictive of subsequent fund alpha and, in addition, future fund flows. Both alpha and flows decline precipitously and turn negative as PDI increases. A PDI of 40 or less is predictive of positive alpha and flows, 41 to 60 is predictive of near-zero alpha and weak flows, and 61 to 100 is predictive of both negative alpha and flows. Thus, an effective way to identify funds with the best chance of subsequently outperforming and also generating positive flows is to focus on those with the lowest PDI, more specifically, those with a PDI of 40 or less.
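The PDI ranges above can be written as a simple lookup. The Python sketch below encodes only the stated cutoffs. The function name pdi_outlook is hypothetical, the lower bound of 0 is an assumption (the text gives only the upper end of the scale), and the calculation of the PDI itself is not reproduced here.

```python
def pdi_outlook(pdi):
    """Map a PDI score to the outlook described in the article.

    Assumes scores run from 0 to 100; the lower bound is an assumption.
    """
    if not 0 <= pdi <= 100:
        raise ValueError("PDI is assumed to lie between 0 and 100")
    if pdi <= 40:
        return "positive alpha and flows"
    if pdi <= 60:
        return "near-zero alpha and weak flows"
    return "negative alpha and flows"


# Hypothetical scores, for illustration only.
for score in (25, 40, 55, 80):
    print(score, "->", pdi_outlook(score))
```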
Disaggregating performance: Stock-picking skill and portfolio drag
A number of recent studies show that truly active funds (that is, funds that do not just call themselves active but actually take high-conviction positions) outperform their closet-indexing counterparts. The three most important variables identified in these studies (asset bloat, closet indexing and conviction) are used for the purpose of disaggregating fund performance into skill and portfolio drag. Let’s look at how each of them degrades portfolio performance.
This study shows that fund performance declines with fund size as measured by assets under management (AUM). This likely reflects the practical limits of managing an active equity portfolio. In pursuing a narrowly defined investment strategy, a fund manager ends up with a small number of “best idea” stocks (it will be shown shortly that this number is fewer than 20). Put another way, a highly developed strategy allows a manager to identify only a few stocks worthy of investment.
As the fund grows, it becomes increasingly difficult to effectively trade this small number of stocks. Eventually the fund reaches a size where it is not possible to limit the portfolio to best idea stocks, so a decision must be made either to limit the size of the fund or to begin investing in stocks other than best ideas. Unfortunately, the investment industry provides incentives to do the latter, most importantly because managers are compensated based on AUM. Thus begins the transformation from a truly active fund to a closet indexer.
This study shows that fund performance declines as a fund’s R-squared relative to its benchmark increases. That is, the more closely a fund tracks its benchmark, turning itself into a closet indexer, the worse the performance. This is painfully obvious: in order to beat the benchmark, one must differ from the benchmark. It is truly strange that the industry has evolved to the point where funds are expected to closely track their benchmark while, at the same time, beating it!
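For readers who want to see what a benchmark R-squared measures, one common way to compute it is as the squared correlation between a fund’s returns and its benchmark’s returns over the same periods. The Python sketch below follows that definition; it is not necessarily the exact calculation used in the study, and the return series are made-up numbers chosen only to contrast a closet indexer with a fund that departs from its benchmark.

```python
def r_squared(fund_returns, benchmark_returns):
    """Squared Pearson correlation between fund and benchmark returns."""
    n = len(fund_returns)
    if n != len(benchmark_returns) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_f = sum(fund_returns) / n
    mean_b = sum(benchmark_returns) / n
    cov = sum((f - mean_f) * (b - mean_b)
              for f, b in zip(fund_returns, benchmark_returns))
    var_f = sum((f - mean_f) ** 2 for f in fund_returns)
    var_b = sum((b - mean_b) ** 2 for b in benchmark_returns)
    if var_f == 0 or var_b == 0:
        raise ValueError("returns must vary for R-squared to be defined")
    return cov * cov / (var_f * var_b)


# Made-up monthly returns in percent, for illustration only.
benchmark = [1.0, -2.0, 3.0, 0.5, -1.0, 2.0]
closet_indexer = [0.9 * b + 0.1 for b in benchmark]  # shadows the benchmark
active_fund = [3.0, 1.0, -2.0, 4.0, -3.0, 0.5]       # goes its own way

print(f"Closet indexer R-squared: {r_squared(closet_indexer, benchmark):.2f}")
print(f"Active fund R-squared:    {r_squared(active_fund, benchmark):.2f}")
```

A fund whose returns are a near-copy of the benchmark scores close to 1, while a fund that behaves differently scores much lower.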
The widespread requirement that funds maintain a high R-squared to a benchmark is the result of two important drivers.
First, investors suffer from myopic loss aversion (MLA), arguably the most important cognitive error identified by behavioral science. Individuals display 2-to-1 loss aversion; that is, they react twice as strongly to losses as they do to equivalent gains. In addition, investors focus on short-term performance even when they face a long investment horizon. Combining loss aversion with a short-term focus leads to bad, myopic decisions by investors, resulting in poor long-term performance.
A fund that does not track its benchmark stirs up MLA in its investors as they emotionally react to short-term underperformance. This may prompt investors to leave the fund, thus turning investor emotion into business risk for the fund. One way to avoid those investor emotions, and the related business risk, is to closely track the fund’s benchmark. That’s why maintaining a high R-squared is a common approach for catering to investor emotions.
Second, virtually every platform, broker-dealer and institutional consultant within the fund distribution system assigns a fund to a specific style box and demands that it stay there over time. In fact, style drift is often viewed as a more serious problem than underperformance. However, the assigned style box has little to do with a fund’s strategy and, in turn, purchasing stocks in order to remain style consistent means not staying true to the fund’s investment approach.
This study shows that funds with the greatest amount of style drift outperform those with the least drift by 3.00%. It also finds that a fund cannot outperform, on average, if its style does not drift.
Emotional catering and style drift avoidance encourage a fund to maintain a high benchmark R-squared.
This study shows that individual stock alphas decline as a stock’s relative portfolio weight rank declines. A fund’s best idea or high-conviction stocks can be identified by ranking stocks based on their relative weight within the portfolio. The rank of these weights is predictive of future stock returns for up to a year ahead, which means the “shelf life” of fund holdings is at least 12 months.
Surprisingly, then, high-conviction stocks can be identified from a fund’s holdings and, in turn, these high-conviction stocks subsequently outperform. The corollary is that low-ranked stocks reflect a lack of conviction on the part of the manager and, in turn, underperform.
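Ranking holdings by weight is straightforward to sketch. The Python example below sorts a portfolio from largest to smallest position and totals the weight held in the top-ranked group, the next group and everything else. It uses plain portfolio weights as a stand-in, which is an assumption: the study’s exact definition of relative weight is not given here. The tickers, weights and function names are hypothetical.

```python
def rank_by_weight(holdings):
    """Return tickers sorted from largest to smallest portfolio weight."""
    return sorted(holdings, key=holdings.get, reverse=True)


def conviction_buckets(holdings, top_n=10, next_n=10):
    """Total the weight in the top_n, the next_n and all remaining stocks."""
    ranked = rank_by_weight(holdings)
    top = sum(holdings[t] for t in ranked[:top_n])
    middle = sum(holdings[t] for t in ranked[top_n:top_n + next_n])
    rest = sum(holdings[t] for t in ranked[top_n + next_n:])
    return top, middle, rest


# Hypothetical five-stock portfolio; weights sum to 1.0.
example = {"AAA": 0.10, "BBB": 0.30, "CCC": 0.15, "DDD": 0.25, "EEE": 0.20}

ranking = rank_by_weight(example)
top, middle, rest = conviction_buckets(example, top_n=2, next_n=2)

print("Ranked by weight:", ranking)
print(f"Top 2: {top:.2f}  Next 2: {middle:.2f}  Rest: {rest:.2f}")
```

With a real fund, the defaults of 10 and 10 would separate the top-10 and next-10 positions from the lower-ranked remainder of the portfolio.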
As reported in Figure 1, an increase in the top-20 stock weighting leads to an increase in a fund’s subsequent alpha, with the gain from the top-10 stocks being nearly triple that from the next 10 stocks (6.1 versus 2.3 basis points annually for each percentage point of added weight). For example, increasing the top-10 weighting by 10 percentage points improves fund alpha by 61 basis points. However, increasing the weighting of stocks ranked lower than 20 hurts fund performance, as their impact on fund alpha is negative.
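The 61 basis point example follows from scaling the top-10 figure linearly. The Python sketch below reproduces that arithmetic using the two figures quoted above. Reading them as basis points of annual alpha per percentage point of weight, and extrapolating linearly, are assumptions inferred from the article’s own example. The negative effect for stocks ranked below 20 is left out because its size is not stated in the text.

```python
# Sensitivities quoted in the article: basis points of annual alpha
# per one-percentage-point increase in weight (assumed interpretation).
TOP10_BP_PER_POINT = 6.1
NEXT10_BP_PER_POINT = 2.3


def alpha_change_bp(top10_points=0.0, next10_points=0.0):
    """Estimated change in annual alpha, in basis points (assumed linear)."""
    return (TOP10_BP_PER_POINT * top10_points
            + NEXT10_BP_PER_POINT * next10_points)


gain_top10 = alpha_change_bp(top10_points=10)
gain_next10 = alpha_change_bp(next10_points=10)

print(f"Top 10, +10 points:  {gain_top10:.0f} basis points")   # 61
print(f"Next 10, +10 points: {gain_next10:.0f} basis points")  # 23
```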
The 4,000+ funds in the sample hold an average of 113 (median of 75) stocks. For the average fund, that translates into four to five times as many low-conviction as high-conviction stocks. Although there are legitimate reasons why some funds may choose to diversify more broadly, the results show fund managers heavily dilute performance by doing so.
One possible explanation is that investors and funds falsely believe that a large number of stocks are needed to achieve proper diversification in spite of the evidence to the contrary. Another possibility is that holding a small number of stocks exposes the manager to criticism if a stock dramatically underperforms and thus has a significant impact on fund performance. As will be shown shortly, over-diversification may also be a byproduct of asset bloat and closet indexing.
Whatever the reason, investing in low-conviction instead of high-conviction stocks is a performance-destroying decision.
Note to Figure 1: Based on ex-post gross fund alpha regressions on cumulative relative weights, estimated using a data set of 44 million stock-month equity fund holdings from January 2001 through September 2014.