Innumerable studies have shown that it’s well-nigh impossible to beat the averages consistently by investing in equity funds. But what about bonds? Bonds, after all, have more structure – perhaps there are ways an expert fund manager could exploit that structure and gain an edge over other investors. Is it possible to predict how well a bond fund will perform relative to other funds?
In a recent article, “Predicting Bond Fund Returns,” in the Winter 2011 issue of the Journal of Investing, finance professors Dale L. Domian of York University and William Reichenstein of Baylor University pose that question. They study specifically whether the funds’ expense ratios or their previous five-year returns could have helped predict their returns over the five-year periods 1995-1999, 2000-2004, and 2005-2009.
Bond returns vary considerably, as a more-or-less predictable function of two risk measures – their durations and their quality ratings. To prevent the wide variation caused by these factors from swamping other discernible effects, Domian and Reichenstein studied separately groups of bond funds that fall into each of the Morningstar style categories.
This was not an isolated study, but rather the latest in a line of research stretching back more than a decade. Reichenstein previously studied this issue in 1999, 2002 and 2004 for both taxable and tax-exempt bond funds, collaborating both with Domian and with others. In each case, the researchers found that fund expense ratios were a good predictor of subsequent performance – the higher the expense ratio, the worse the performance.
In fact, the results failed to rule out the efficient market null hypothesis – that performance was lower by exactly the amount of the fees. This result, if perhaps counterintuitive, will not be unfamiliar to those who are knowledgeable about research on equity returns. A long string of studies, dating back to Nobelist William F. Sharpe’s paper on mutual fund performance in 1966, through an article by Mark Carhart in 1997, to recent studies by Morningstar, have shown that expense ratios are the best predictors of equity fund performance.
Domian and Reichenstein conclude from their research that for bonds too, there is “a one-to-one negative relationship between expense ratios and net returns.” The clear implication is that advisors should look first at the expense ratio of the fund and place primary importance on that variable. Those hoping that bonds can be an exception to the law of averages will have to keep searching.
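To make the "one-to-one" claim concrete, here is a minimal sketch of what such a relationship looks like in a regression of net returns on expense ratios. The five funds and all of their numbers are hypothetical, chosen so that gross returns are unrelated to fees; this is not the authors' data or their regression specification, which the article does not describe.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical funds in a single style category (annualized, in percent).
expense_ratios = [0.20, 0.45, 0.70, 0.95, 1.20]
# Hypothetical gross returns, constructed to be uncorrelated with fees.
gross_returns = [5.1, 4.8, 5.2, 4.8, 5.1]
# Net return is simply gross return minus the expense ratio.
net_returns = [g - e for g, e in zip(gross_returns, expense_ratios)]

slope, intercept = ols_slope_intercept(expense_ratios, net_returns)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Because the gross returns here are built to be unrelated to fees, the fitted slope comes out at about -1: each extra percentage point of expenses costs one percentage point of net return. A slope reliably above -1 would be the signature of managers earning back part of their fees.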
The Morningstar categories
The Morningstar categories were created, in part, as a means to separate a decision that is in principle the responsibility of the investor – the risk posture – from decisions that are the responsibility of the investment managers. The recent conventional wisdom has been that an investor, or the investor’s advisor, should choose the “style” categories and the allocation to each of them, while the manager of each style should either attempt to match the average performance for that style category at a low fee or outdo it for a higher fee.
In the Morningstar mutual fund classifications, equity and fixed income funds are each divided into nine styles. Each of the two asset classes is divided into three partitions on two axes depending on the securities it holds. In the case of equities, the axes are large-small capitalization and value-growth. In the case of fixed income, the axes are long-short duration and low-high credit quality.
Hence, the “Morningstar Style Box” for fixed income has nine cubbyholes, for various permutations of short, intermediate, or long duration combined with low, medium, or high credit quality.
Morningstar Fixed Income Style Box

                       | Average Duration
Average Credit Quality | Short | Intermediate | Long
High                   |       |              |
Medium                 |       |              |
Low                    |       |              |
Short duration for taxable bond funds is defined as 3.5 years or less, intermediate as more than 3.5 to 6 years, and long as more than 6 years. The quality rating is high if the average of S&P’s and Moody’s bond ratings is AAA or AA, medium if the average is A or BBB, and low if the average is BB or lower.
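The cutoffs above can be written down as a small lookup. This is only a sketch of the rules as stated in this article, not Morningstar's own code: the function names are hypothetical, duration is assumed to be measured in years, and the "average rating" is assumed to arrive already expressed as a letter grade.

```python
def duration_bucket(duration_years):
    """Map an average duration (assumed to be in years) to a style-box column."""
    if duration_years <= 3.5:
        return "Short"
    if duration_years <= 6:
        return "Intermediate"
    return "Long"


# Letter grades as named in the text; "below B" stands for everything under B.
HIGH = {"AAA", "AA"}
MEDIUM = {"A", "BBB"}
LOW = {"BB", "B", "below B"}


def quality_bucket(average_rating):
    """Map an average letter rating to a style-box row."""
    if average_rating in HIGH:
        return "High"
    if average_rating in MEDIUM:
        return "Medium"
    if average_rating in LOW:
        return "Low"
    raise ValueError(f"unrecognized rating: {average_rating!r}")


def style_box(duration_years, average_rating):
    """Return the (credit quality, duration) cell for a fund."""
    return quality_bucket(average_rating), duration_bucket(duration_years)


print(style_box(4.2, "AA"))
```

A fund reporting a duration of 4.2 years with an average rating of AA lands in the high-quality, intermediate-duration cell.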
According to Domian and Reichenstein, duration is basically self-reported. Funds report their “adjusted duration” and thereby whether they belong in the short, intermediate, or long category. There is no independent verification of this self-assessment.
The quality box, however, is determined by Morningstar, based on the holdings of the fund. Until recently, the procedure was as follows: Eight quality ratings, AAA, AA, A, BBB, BB, B, and below B, were assigned the numbers 1 to 8, respectively; the quality category of the fund could then be based on a value-weighted average of the numerical scores assigned to the bonds in the fund. Morningstar recently discontinued this approach, for reasons I’ll address later.
You need to know these details in order to understand an important result that Domian and Reichenstein cite, which I will come to shortly.
Domian and Reichenstein’s latest findings
For all the periods they studied, Domian and Reichenstein – echoing the earlier studies by Reichenstein and his collaborators – found that a bond fund’s expense ratio was a good predictor of that fund’s subsequent five-year performance. The higher the expense ratio, the lower the subsequent five-year performance, net of fees. The prior five-year performance of the fund was also a good predictor of the subsequent five-year performance, though less so than the expense ratio. Domian and Reichenstein compare in particular the subsequent performance of a portfolio that a low-cost investor would likely choose for a specific style with the performance of other portfolios. (They assumed this frugal investor would choose either only the single lowest-expense ratio fund, or the three with lowest expenses, or the very few funds with expense ratios less than 30 basis points.) In such cases, the low-cost portfolio clearly beats the others after expenses.
Domian and Reichenstein’s analyses are short on the statistical formalities often found in highly academic financial journals. They compare averages for low-cost bond funds and for the average of all bond funds in a style category, and they cite the differences, but they don’t specify how statistically significant these differences are. They do perform regression analyses and cite those significance findings, which they say point in the same direction but not as strongly. They say little about how the regressions were constructed.
Some academic purists may find this less than scientific – and it may disqualify the articles from publication in the more technically-oriented financial journals – but, personally, I have no problem with the way in which Domian and Reichenstein have performed their research and presented their findings. The ostensibly very careful attention that is given to constructing regressions in many of the technical academic journals, including lengthy discussions of heteroscedasticity and the like, imposes, in my opinion, only a veneer of science on the underlying results. Meanwhile, it makes the analysis more esoteric, less transparent, and less accessible to the general reader, without necessarily increasing the reliability of the inferences. Furthermore – paradoxically – applying more esoteric methods of analysis can, and often does, blind the practitioner to fundamental errors in the mathematics or in the data itself. Consider, for example, the mathematically complex analyses that have been performed using hedge fund databases that are intrinsically riddled with biases and poorly constructed estimates (of valuations of illiquid assets, for example, and hence of volatility).
Domian and Reichenstein find that for the five-year period 2005-2009, while the same predictive capacity of both expense ratios and prior returns is apparent, there is less clear evidence of predictability. This, they say, is unsurprising given the wide variation of performance in bond funds during that period, which encompassed the financial crisis, and the subset of extremely poor-performing fixed income instruments that were strongly present in some funds – even highly-rated ones – while not in others.
An oversight that I find surprising in Domian and Reichenstein’s analyses is that although they present findings that relative performance in one five-year period net of fees is a fairly good predictor of relative performance in the subsequent five-year period, they don’t explore whether the same relationship prevails if the performance in both periods is gross of fees. The predictability they find could be, of course – and probably is – due merely to the fact that expense ratios are highly predictable. Hence, if a fund has a high expense ratio in one five-year period, it is highly likely to have a high expense ratio in the next.
Since in both periods the expense ratio reduces the fund’s performance net of fees, the first period’s net performance will be a good predictor of the subsequent period’s net performance, even if gross performance is not. In other words, Domian and Reichenstein’s two results – that expense ratios and historical five-year net returns are both predictors of subsequent five-year net returns – are in all likelihood both manifestations of the same phenomenon, namely that performance is, on average, only impaired, and not enhanced, by the expenditure of fees.
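A small simulation illustrates the argument. It is a sketch under deliberately stark assumptions, none of which come from the study: every fund keeps the same expense ratio in both periods, gross returns are drawn independently in each period (so there is no skill at all), and the fee range and return volatility are hypothetical.

```python
import random


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(2011)  # fixed seed so the run is repeatable
NUM_FUNDS = 2000

# Hypothetical expense ratios (percent per year), unchanged across periods.
expense = [rng.uniform(0.1, 1.5) for _ in range(NUM_FUNDS)]
# Hypothetical gross returns (percent per year), independent across periods.
gross_1 = [rng.gauss(5.0, 0.3) for _ in range(NUM_FUNDS)]
gross_2 = [rng.gauss(5.0, 0.3) for _ in range(NUM_FUNDS)]
# Net returns are gross returns minus the same fee in each period.
net_1 = [g - e for g, e in zip(gross_1, expense)]
net_2 = [g - e for g, e in zip(gross_2, expense)]

gross_persistence = correlation(gross_1, gross_2)
net_persistence = correlation(net_1, net_2)
print(f"gross-of-fee persistence: {gross_persistence:+.2f}")
print(f"net-of-fee persistence:   {net_persistence:+.2f}")
```

Under these assumptions the gross-of-fee correlation between periods is close to zero, while the net-of-fee correlation is strongly positive, even though no simulated manager has any skill. Persistent fees alone are enough to make past net returns look predictive.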
Gaming the Morningstar system
Domian and Reichenstein cite two earlier studies in the Journal of Investing by Geng Deng, Craig McCann, and Edward O’Neal, one on average credit quality, and one on average duration, to show how some bond funds are gaming the Morningstar fixed income style box. They also cite their own corroborating findings to this effect.
The evidence that at least some funds are falsely reporting low average durations in order to be compared with funds in the short duration boxes is overwhelming. For example, Domian and Reichenstein say, “the Evergreen Ultrashort Opportunities fund lost approximately 40% in 2008 despite being listed as an ultrashort fund. Its losses could be explained by its weighted average maturity exceeding 25 years.” This fund was not alone – many other funds in the short-term category also lost heavily in 2008. Domian and Reichenstein conclude – as did Deng, McCann, and O’Neal – that misreporting of average duration is rampant in the short-term boxes. Because of this reality, Domian and Reichenstein did not analyze predictability of returns for the short-term style categories.
On the credit quality axis, the gaming – if indeed it was gaming – took a different form. Morningstar’s past practice of obtaining average quality for a fund by taking a weighted average of the numbers 1 through 8, representing the credit qualities AAA through below B, resulted in some bond funds with a high likelihood of defaults still being categorized as high-grade. The reason for such miscategorization is simple: The probability of default increases at much greater than a linear rate as the quality falls off from AAA to below B, yet Morningstar’s weightings were only linear. Hence, a portfolio consisting of both very high and very low quality instruments will have a high probability of default, yet it would receive a medium quality rating, when it really deserves low. Domian and Reichenstein point out that Morningstar has changed its weighting system since the publication of Deng et al.’s papers, but the old system infects the historical data nonetheless.
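The arithmetic behind this miscategorization can be sketched with a "barbell" portfolio. Everything numeric below is an assumption made for illustration: the exact assignment of numbers to ratings is not fully specified here, so the sketch simply scores the seven named ratings 1 through 7; the bucket cutoffs on the average score are a guess at how an average maps back to a letter grade; and the default probabilities are hypothetical, chosen only to rise much faster than linearly.

```python
# Assumed linear scores for the seven ratings named in the text.
SCORE = {"AAA": 1, "AA": 2, "A": 3, "BBB": 4, "BB": 5, "B": 6, "below B": 7}

# Hypothetical default probabilities, rising far faster than linearly.
DEFAULT_PROB = {
    "AAA": 0.0005, "AA": 0.001, "A": 0.003, "BBB": 0.01,
    "BB": 0.05, "B": 0.15, "below B": 0.40,
}


def average_score(weights):
    """Value-weighted average of the linear rating scores."""
    return sum(w * SCORE[rating] for rating, w in weights.items())


def quality_from_score(score):
    """Assumed mapping: nearest rating AAA/AA is High, A/BBB Medium, else Low."""
    if score < 2.5:
        return "High"
    if score < 4.5:
        return "Medium"
    return "Low"


def expected_default_rate(weights):
    """Value-weighted average default probability of the holdings."""
    return sum(w * DEFAULT_PROB[rating] for rating, w in weights.items())


# Barbell: 60% in the best credits, 40% in the worst.
barbell = {"AAA": 0.6, "below B": 0.4}

barbell_bucket = quality_from_score(average_score(barbell))
barbell_default = expected_default_rate(barbell)
print(barbell_bucket, round(barbell_default, 4))
```

With these assumed numbers the barbell averages out to a medium-quality score, yet its expected default rate is several times that of a portfolio held entirely in BB bonds, which the same scheme would label low quality.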
Boxed in
The practice of choosing an asset allocation based on Morningstar style boxes, then choosing a manager or managers for each style, has boxed in the money management business. “Style drift,” meaning a portfolio in which the average style moves from one box to another over time, has become a pejorative, applied to what are presumed to be sloppy or irresponsible management practices. Remaining within rigid boxes may impose a certain superficial order on the process of selecting, managing, and monitoring a portfolio, but it also imposes senseless and counterproductive constraints on that process.
For example, Reichenstein points out that the best predictor of the five-year return on a bond portfolio is its average yield-to-maturity at the beginning of the period. If the expected return on a portfolio is its average yield-to-maturity, then the best way to minimize risk – not to mention transaction cost – would be to keep a portfolio of five-year bonds until maturity, then buy another. But such a portfolio would exhibit style drift – it would move from intermediate duration to short over time – and therefore would be unacceptable to anyone who imposes a rigid standard of style adherence. If there really were any advantage to be gained from expert knowledge of the structure of fixed income investments, it might well only have effect if the expert manager is allowed to let his or her style “drift.”
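The drift in that buy-and-hold example is easy to see numerically. The sketch below is illustrative only: the bond is hypothetical (annual 4% coupon, yield held constant at 4%), Macaulay duration stands in for whatever duration measure a fund would actually report, and the cutoffs of 3.5 and 6 are the style-box breakpoints quoted earlier, assumed to be in years.

```python
def macaulay_duration(years_to_maturity, coupon_rate, yield_rate):
    """Macaulay duration in years of an annual-pay bond with face value 100."""
    weighted_pv = 0.0
    total_pv = 0.0
    for t in range(1, years_to_maturity + 1):
        cash_flow = 100 * coupon_rate
        if t == years_to_maturity:
            cash_flow += 100  # principal repaid at maturity
        pv = cash_flow / (1 + yield_rate) ** t
        weighted_pv += t * pv
        total_pv += pv
    return weighted_pv / total_pv


def duration_bucket(duration_years):
    """Style-box duration column, using the cutoffs quoted earlier."""
    if duration_years <= 3.5:
        return "Short"
    if duration_years <= 6:
        return "Intermediate"
    return "Long"


# Hold a hypothetical five-year bond and watch its duration as it ages.
for years_left in range(5, 0, -1):
    d = macaulay_duration(years_left, coupon_rate=0.04, yield_rate=0.04)
    print(f"{years_left} years to maturity: duration {d:.2f} -> {duration_bucket(d)}")
```

Under these assumptions the bond starts with a duration of about 4.6 years, which places it in the intermediate column, and falls below 3.5 once three years remain, which moves it into the short column without the holder having traded anything.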
The counterargument is that if a manager exhibits style drift, it could threaten to move the overall portfolio outside of its established risk boundaries. But that too invokes a conventional – and senseless – concept, that risk is determined at a point in time. Risk is, of course, something that has meaning only over time; it cannot be measured by the composition of a portfolio at a single moment. That reality still seems to be – as it has been for a long time – too advanced for most of the investment management field to properly grasp.
Given managers who will stay in their style boxes, however, Domian and Reichenstein’s results provide yet more support for that old maxim in the investment management field: You don’t get what you pay for.
Or, some might say, you get what you don’t pay for.
Michael Edesess is an accomplished mathematician and economist with experience in the investment, energy, environment and sustainable development fields. He is a Visiting Fellow at the Hong Kong Advanced Institute for Cross-Disciplinary Studies, as well as a partner and chief investment officer of Denver-based Fair Advisors. In 2007, he authored a book about the investment services industry titled The Big Investment Lie, published by Berrett-Koehler.