Recent articles in Advisor Perspectives by Wade Pfau and David Blanchett have explained the benefits and limitations of using Monte Carlo simulations in financial planning. Pfau and Blanchett compared Monte Carlo to the similar approach of using rolling historical periods and addressed the importance of return assumptions.
I'll add to the discussion by examining this technique from a different perspective, focusing on the metrics used to evaluate the outputs from Monte Carlo simulations. Popular financial-planning software packages have shortcomings in this respect, and other metrics can provide more useful information. I will address how to measure the performance of financial plans when variable investment returns and longevity are introduced and demonstrate that the most commonly used measures have weaknesses.
Measuring performance
When we move from using deterministic projections of financial-planning outcomes to using Monte Carlo simulations, we need performance measures that can handle the variability of outcomes. The best measures will be those that appropriately summarize the outcomes so advisors and clients can communicate effectively in choosing withdrawal levels and asset allocations and deciding whether to purchase financial products such as annuities.
I'll use an example to show how the measures used with Monte Carlo simulations differ from those used with deterministic forecasts and how some of the commonly used measures can be improved. I'll also discuss measures that can be applied in more complex planning, when withdrawal amounts vary as a function of investment performance.
Example
This article is based on a 65-year-old female with a 25-year life expectancy who has reached retirement with $1 million in savings that she will use to generate retirement income. Her goal is to be able to withdraw $40,000 in the first year with inflation increases each year thereafter, following the classic 4% rule. Her investment options include stocks with an arithmetic average real return after inflation and expense charges of 6.85% and a standard deviation of 20%, and bonds with a 0.45% return and 5.5% standard deviation. She also has the option of using a portion of savings to purchase an inflation-adjusted single-premium immediate annuity (SPIA) that will pay 4.5% of the purchase price in the first year, with increases based on actual inflation each year thereafter.
My assumed fixed-income investment returns are lower than historical averages, reflecting current lower bond yields. For stocks, I've assumed the same return premium over bonds as the historical averages. The SPIA pricing reflects the current level of rates from direct-purchase sites such as Income Solutions® or the Thrift Savings Plan available to federal employees. The analysis is pre-tax.
Deterministic forecasting
With deterministic forecasting, probabilities do not come into play when measuring success — either a plan works or it doesn't. The success measures are whether the plan depletes savings over a set retirement period, and if not, how much savings remains. In the example, the plan involving 4% inflation-adjusted withdrawals will provide retirement income for 25 years as long as the compound real return is greater than zero. However, with zero return, the plan will fail if the retiree lives to more than 90 years old. A compound real return greater than 4% means that savings will actually grow over the course of retirement, no matter how long retirement lasts.
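The deterministic arithmetic above can be checked with a minimal sketch. It assumes that each year's real return is credited before a level $40,000 real withdrawal is taken at year-end; that timing and the function name `years_funded` are illustrative assumptions, not details given in the article.

```python
def years_funded(savings, withdrawal, real_return, max_years=100):
    """Count how many full annual withdrawals a constant real return can fund.

    Assumption (not from the article): the return is credited first and the
    withdrawal is taken at the end of each year. Returns None when savings
    are still non-negative after max_years.
    """
    balance = savings
    for year in range(1, max_years + 1):
        balance = balance * (1 + real_return) - withdrawal
        if balance < 0:
            return year - 1  # the withdrawal in this year could not be fully funded
    return None


# Zero real return: $1 million funds exactly 25 withdrawals of $40,000.
print(years_funded(1_000_000, 40_000, 0.0))
# A real return above 4%: savings never run out under this sketch.
print(years_funded(1_000_000, 40_000, 0.05))
```

Under these assumptions, a zero real return funds 25 years of withdrawals, and any return above 4% leaves the balance growing, which matches the reasoning in the paragraph above.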
Deterministic forecasts can give rise to overconfidence. It might not seem that much of a challenge to beat 4%, although that is a real rate, not a nominal rate. However, when we apply Monte Carlo simulations, we get a clearer picture of the actual risks.
Monte Carlo forecasts
Chart 1 is based on the example of the 65-year-old female and shows a variety of performance measures from Monte Carlo forecasts with differing asset-allocation assumptions. Software packages may show either average or median bequest (or both), but the choice can make a big difference. Even if annual returns are assumed to be normally distributed, projected bequest values from Monte Carlo simulations will be skewed to the right, meaning that the average bequest will exceed the median. The difference between the average and median becomes very large for asset allocations tilted heavily toward stocks. For example, with 75% stocks, the average bequest (sometimes referred to as "expected") is over $1.8 million, but there is only a 50% chance that the bequest will be greater than about $1.1 million. To quote Dick Purcell, who developed Portfolio Pathfinder's projection software, "The expected return is not expected!" Both average and median bequests are meaningful numbers, but advisors need to be aware of the differences.
Chart 1
Performance measures with differing asset allocations
Stock/bond mix | Average bequest | Median bequest | Failure rate | 5th percentile bequest | Average failure
0/100 | $66,000 | $34,000 | 46.7% | -$565,000 | -$293,000
25/75 | $362,000 | $360,000 | 23.4% | -$379,000 | -$240,000
50/50 | $899,000 | $729,000 | 14.5% | -$322,000 | -$275,000
75/25 | $1,852,000 | $1,066,000 | 13.0% | -$360,000 | -$325,000
100/0 | $3,419,000 | $1,322,000 | 14.7% | -$454,000 | -$366,000
Source: Author's calculations
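The gap between average and median bequests can be illustrated with a simplified simulation. The sketch below is not the model behind Chart 1: it assumes an all-stock portfolio, a fixed 25-year horizon rather than variable longevity, independent normally distributed annual real returns, and no return on a negative balance. The function name `simulate_bequests` and the seed are hypothetical, so the output will not reproduce the chart's figures.

```python
import random
import statistics


def simulate_bequests(n_sims=5000, years=25, savings=1_000_000,
                      withdrawal=40_000, mean=0.0685, sd=0.20, seed=1):
    """Simulate ending balances (bequests) for a fixed real withdrawal.

    Simplifying assumptions (not the article's model): fixed horizon,
    all-stock portfolio, normal annual real returns, and shortfalls that
    accumulate without interest once savings are depleted.
    """
    rng = random.Random(seed)
    bequests = []
    for _ in range(n_sims):
        balance = savings
        for _ in range(years):
            if balance > 0:
                # Floor the return at -100% so a positive balance cannot flip sign.
                balance *= 1 + max(rng.gauss(mean, sd), -1.0)
            balance -= withdrawal
        bequests.append(balance)
    return bequests


bequests = simulate_bequests()
print(f"average bequest: {statistics.mean(bequests):,.0f}")
print(f"median bequest:  {statistics.median(bequests):,.0f}")
```

Because compounding turns symmetric annual returns into a right-skewed distribution of ending balances, the average in a run like this sits well above the median.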
The three rightmost columns of Chart 1 are downside risk measures. The measure most commonly used in planning software is the failure rate: the probability of running out of money during retirement. Most software programs use the probability of depleting one's assets over a set period such as 20 or 30 years, but in the figures shown here, I also apply Monte Carlo variability to longevity. Using variable longevity produces more realistic probability measures.
For example, an all-bond portfolio using my return assumptions would have a failure rate of zero at 20 years and 100% at 30 years. But with variable mortality, the probability of failure is just under 50%, reflecting the combination of a return assumption slightly above zero, a 4% withdrawal rate and a life expectancy of 25 years. On average, the money will run out just beyond 25 years, affecting most investors who live beyond life expectancy.
I also show two other measures of downside performance. The failure rate only considers the probability of running out of money, but not the dollar magnitude of the failures. Running out of money at 75 and living to 95 is much more catastrophic than depleting savings at 84 and dying at 85, but both cases are treated the same in determining failure rate. The other two measures take magnitude into account.
The measure I personally favor is the 5th-percentile bequest, because it combines probability of failure and magnitude. It's analogous to the median, except in this case, there is a 95% chance of doing better and a 5% chance of doing worse.
I show negative dollar bequests, which may seem like an odd concept. Conceptually, a negative bequest represents the amount of funds the retiree would need from relatives, friends, charities or the government to continue spending at the 4% withdrawal level until death. Some software programs show zero bequests in these cases by assuming zero withdrawals after funds are depleted, but I find showing negative bequests more meaningful. The other downside measure I show focuses on magnitude only. It's the average negative bequest for the subset of Monte Carlo simulations that run out of money.
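The three downside measures described above can be computed from any list of simulated bequests. The following is a minimal sketch; the function name `downside_measures` and the nearest-rank percentile convention are assumptions, since the article does not say how its percentiles were calculated.

```python
import statistics


def downside_measures(bequests):
    """Return (failure rate, 5th-percentile bequest, average failure).

    A negative bequest is treated as a failure, following the article's
    convention of letting shortfalls accumulate as negative balances.
    The percentile uses a nearest-rank rule, which is an assumption.
    """
    n = len(bequests)
    failures = [b for b in bequests if b < 0]
    failure_rate = len(failures) / n
    ordered = sorted(bequests)
    rank = max((5 * n + 99) // 100, 1)  # ceil(0.05 * n), at least 1
    fifth_percentile = ordered[rank - 1]
    average_failure = statistics.mean(failures) if failures else 0.0
    return failure_rate, fifth_percentile, average_failure


# Hypothetical outcomes: 2 failures out of 20 simulated retirements.
sample = [-300_000, -100_000] + [500_000] * 18
print(downside_measures(sample))
```

In this made-up sample the failure rate is 10%, yet the 5th-percentile bequest and the average failure show how deep the shortfalls run, which the failure rate alone cannot convey.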
In Chart 1, based on failure rate alone, it appears that the 75/25 allocation is the least risky. But when we look at the 5th-percentile bequest, it is clear that 50/50 does better. Although the failure rate drops from 14.5% to 13% when we go to 75% stocks, the average failure goes from $275,000 to $325,000, more than offsetting the decline in failure rate. The 5th-percentile bequest incorporates both rate and magnitude and points to 50/50 as the lowest-risk allocation. Unfortunately, most planning software uses failure rate as a downside measure and ignores magnitude. This becomes even more of a problem when annuities are an option, as I show in the next section.
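One rough way to see the offset is to multiply each failure rate by its average failure, giving a probability-weighted shortfall. This product is an illustrative calculation added here, not one of the article's measures, and the helper name is hypothetical; the inputs come from Chart 1.

```python
def probability_weighted_shortfall(failure_rate, average_failure):
    """Failure rate times average failure size (an illustrative combination)."""
    return failure_rate * average_failure


# Failure rate and average failure (as a positive dollar amount) from Chart 1.
chart1 = {
    "50/50": (0.145, 275_000),
    "75/25": (0.130, 325_000),
}

for mix, (failure_rate, average_failure) in chart1.items():
    weighted = probability_weighted_shortfall(failure_rate, average_failure)
    print(f"{mix}: ${weighted:,.0f}")
```

The 50/50 mix comes out at about $39,875 against about $42,250 for 75/25, so the larger failures at 75% stocks outweigh the lower failure rate.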
Bringing in SPIAs
In Chart 2, I focus on the 50/50 asset allocation and show the impact on the performance measures of substituting SPIAs for bonds. The mix percentages I come up with such as 48/27/25 look a bit odd, but I'm attempting to provide an apples-to-apples comparison when substituting SPIAs for bonds. Research by Michael Kitces and Pfau has shown that part of the benefit of substituting SPIAs for bonds is that such substitutions cause the proportion of stocks in the portfolio to increase over time rather than stay level. I've adjusted for this impact by slightly lowering stock allocations to generate outcomes with roughly the same median bequests in order to focus the demonstration on a comparison of the downside measures.
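For a sense of what these mixes mean in income terms, the sketch below splits the $40,000 first-year target between SPIA income and portfolio withdrawals, using the 4.5% payout rate from the example. The function name `income_sources` is hypothetical, and the calculation covers the first year only.

```python
def income_sources(savings, spia_share, spia_payout=0.045, target=40_000):
    """Split first-year income between the SPIA and portfolio withdrawals.

    Returns (SPIA income, amount needed from the portfolio, withdrawal
    rate on the savings left after the SPIA purchase).
    """
    spia_income = savings * spia_share * spia_payout
    from_portfolio = target - spia_income
    remaining_savings = savings * (1 - spia_share)
    return spia_income, from_portfolio, from_portfolio / remaining_savings


for share in (0.25, 0.50):
    spia_income, from_portfolio, rate = income_sources(1_000_000, share)
    print(f"SPIA share {share:.0%}: SPIA pays ${spia_income:,.0f}, "
          f"portfolio supplies ${from_portfolio:,.0f} ({rate:.2%} of remaining savings)")
```

With 25% in the SPIA, $11,250 of income is guaranteed for life and the remaining savings need to support a withdrawal of about 3.83%. With 50%, the guaranteed amount is $22,500 and the required withdrawal rate falls to 3.5%.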
Chart 2
Impact of guaranteed lifetime income (SPIAs) on performance measures
Stock/bond/SPIA mix | Average bequest | Median bequest | Failure rate | 5th percentile bequest | Average failure
50/50/0 | $899,000 | $729,000 | 14.5% | -$322,000 | -$275,000
48/27/25 | $1,077,000 | $722,000 | 10.6% | -$196,000 | -$210,133
46/4/50 | $1,549,000 | $719,000 | 9.5% | -$115,000 | -$142,000
Source: Author's calculations
As we move to a higher percentage of SPIAs in the allocation, we see failure rates going down, but the shortfalls captured by the other downside measures, which take magnitude into account, shrink by much more. The SPIA allocation is providing support for retirement in the form of guaranteed lifetime income, so the shortfall from depleting savings is not as great. The failure rate, while giving a directional indication, does not illustrate the dramatic change in magnitude. Much of the software in use today is not built to handle annuities, but as annuity capabilities are added, it will also be necessary to reconsider the downside measures.
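The relative size of these improvements can be read straight from Chart 2. The sketch below compares the first and last rows; the dictionary layout and the helper `pct_reduction` are illustrative, and the figures are copied from the chart.

```python
def pct_reduction(before, after):
    """Fractional reduction in size when moving from `before` to `after`."""
    return (before - after) / before


# Downside measures from Chart 2 for the no-SPIA and 50% SPIA mixes.
chart2 = {
    "50/50/0": {"failure_rate": 0.145, "p5_bequest": -322_000, "avg_failure": -275_000},
    "46/4/50": {"failure_rate": 0.095, "p5_bequest": -115_000, "avg_failure": -142_000},
}

base, with_spia = chart2["50/50/0"], chart2["46/4/50"]
for measure in base:
    change = pct_reduction(base[measure], with_spia[measure])
    print(f"{measure}: {change:.1%} smaller")
```

The failure rate falls by roughly a third, while the 5th-percentile shortfall shrinks by nearly two-thirds and the average failure by almost half.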
Variable withdrawals
So far, we have looked at performance measures where withdrawals do not vary from 4%. However, one would expect advisors to recommend adjustments from time to time based on realized investment performance. Planner Jonathan Guyton has developed decision rules for adjusting withdrawals in response to investment experience, and there are other, simpler methods such as calculating withdrawals as a percentage of remaining assets.
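The simplest of these methods, withdrawing a fixed percentage of remaining assets, can be sketched in a few lines. The return sequence below is made up for illustration, withdrawals are assumed to occur at the start of each year, and the function name is hypothetical. This is not an implementation of Guyton's decision rules.

```python
def percent_of_assets_withdrawals(savings, rate, real_returns):
    """Withdraw a fixed percentage of the current balance each year.

    Assumption: the withdrawal is taken at the start of the year and the
    remainder earns that year's real return.
    Returns (list of withdrawals, ending balance).
    """
    balance = savings
    withdrawals = []
    for annual_return in real_returns:
        withdrawal = balance * rate
        withdrawals.append(withdrawal)
        balance = (balance - withdrawal) * (1 + annual_return)
    return withdrawals, balance


# Hypothetical real returns for three years: +10%, -20%, +5%.
withdrawals, ending = percent_of_assets_withdrawals(1_000_000, 0.04, [0.10, -0.20, 0.05])
print([round(w) for w in withdrawals], round(ending))
```

In this example the withdrawals are $40,000, $42,240 and about $32,440. The balance can never be fully depleted under this rule, but income moves with the market, which is the measurement problem discussed next.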
But adding this complexity creates an additional performance measurement problem. When evaluating Monte Carlo output, we need to consider not only projected bequests and failure measures but also the amount of withdrawals taken. One straightforward way to do this is to measure the average annual withdrawals produced over an average lifetime. However, variability of withdrawals also needs to be considered.
Most retirees would prefer to take withdrawals that don't bounce around a lot from year to year, but this is a matter of personal preference. If retirees were allowed to choose between high average withdrawals with a lot of volatility and lower average withdrawals with less volatility, some would choose the latter and some the former. Such considerations involve what economists refer to as the level of risk aversion, which characterizes individuals by their willingness to accept varying amounts of consumption. Economists use the risk-aversion measure to convert variable consumption or withdrawal patterns into level amounts that individuals would be willing to accept in trade. For example, an individual might be indifferent when choosing between $40,000 withdrawals or withdrawals that bounce around between $30,000 and $60,000 and average $45,000. In this case $40,000 is called the "certainty equivalent" of the variable withdrawals.
In Chart 3, I apply differing risk-aversion parameters to calculate certainty equivalents for two different withdrawal patterns. One averages $45,000 per year but varies between $30,000 and $60,000, and the other averages $40,000 but varies in a narrower range between $35,000 and $45,000. A straightforward analysis that compares average withdrawals would always favor the more-variable withdrawal pattern that averages $45,000. But this example shows that only the individual with high withdrawal flexibility (low risk aversion) would favor this pattern and those with medium or low flexibility would favor the pattern averaging $40,000 with less variation.
Chart 3
Certainty equivalent withdrawal comparison
Withdrawal pattern | High client flexibility | Medium client flexibility | Low client flexibility
Averages $45,000 per year, but bounces around between $30,000 and $60,000 | $41,500 | $38,300 | $35,500
Averages $40,000 per year, but bounces around between $35,000 and $45,000 | $39,600 | $39,160 | $38,600
Source: Author's calculations
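A certainty equivalent of this kind can be sketched with a constant relative risk aversion (CRRA) utility function, a common choice in the economics literature. The article does not state which utility function, risk-aversion parameters or withdrawal distributions lie behind Chart 3, so the utility form, the `gamma` values and the three-point withdrawal patterns below are all assumptions, and the results will not reproduce the chart.

```python
import math


def certainty_equivalent(withdrawals, gamma):
    """Level withdrawal with the same average CRRA utility as the pattern.

    gamma is the coefficient of relative risk aversion (an assumed
    parameter): 0 means risk-neutral, and higher values mean less
    tolerance for variation. Each withdrawal is weighted equally.
    """
    n = len(withdrawals)
    if gamma == 1:
        # Log utility: the certainty equivalent is the geometric mean.
        return math.exp(sum(math.log(w) for w in withdrawals) / n)
    mean_power = sum(w ** (1 - gamma) for w in withdrawals) / n
    return mean_power ** (1 / (1 - gamma))


# Hypothetical equally likely withdrawal levels for the two patterns.
wide = [30_000, 45_000, 60_000]    # averages $45,000
narrow = [35_000, 40_000, 45_000]  # averages $40,000

for gamma in (0, 1, 5):
    print(gamma,
          round(certainty_equivalent(wide, gamma)),
          round(certainty_equivalent(narrow, gamma)))
```

With low risk aversion the wider pattern has the higher certainty equivalent, and with high risk aversion the ranking reverses, which is the same qualitative result as Chart 3.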
Economists have gotten very comfortable working with consumption patterns that vary with investment performance and using risk-aversion measures and certainty equivalents, but these concepts have not crossed over to financial-planning practice. There are a few financial-planning packages in existence or under development that apply these concepts, but the Monte Carlo techniques used in the most popular packages don't delve into this area. Most simply show projected bequest values and failure rates and leave it at that. This is definitely an area for future development.
Conclusion
Monte Carlo simulations provide a powerful way for advisors to illustrate the variability in outcomes and help clients make choices involving risk and reward tradeoffs. In order to make those simulations as useful as possible, it is vital to develop and utilize appropriate performance measures. The measures being applied by researchers may be more useful than those provided in financial-planning software packages, which provides an opportunity for software developers to introduce new measures to improve the usefulness of their products.
Joe Tomlinson, an actuary and financial planner, is managing director of Tomlinson Financial Planning, LLC in Greenville, Maine. His practice focuses on retirement planning. He also does research and writing on financial planning and investment topics.