Solving for the Efficient Frontier in Stock Portfolios
The Efficient Frontier, a core concept in Harry Markowitz's Modern Portfolio Theory (MPT), is used in quantitative finance to build optimal portfolios that offer the highest expected return for a given level of risk.
It is constructed by diversifying assets to find the best risk-return trade-off, which is what we will do in this article using Python and free financial data.
The optimal portfolios we construct from our universe of assets should lie on the efficient frontier.
Prepare your Environment
Have a Jupyter environment ready, and pip install these libraries:
You'll need access to the analysis_utils library for common functions.
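The exact dependency list isn't reproduced here, but at a minimum the snippets below rely on numpy, pandas, and matplotlib, plus the price-loading helper. A minimal setup sketch (the import path for load_ticker_prices_ts_df is an assumption; adjust it to wherever your copy of analysis_utils lives):
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Assumed location of the price-loading helper used throughout this article.
from analysis_utils import load_ticker_prices_ts_df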
Before we start, let's load two loosely correlated securities from the latest post-COVID market regime (2020-2023) to build a test portfolio:
tickers = ["DIS", "AAPL"]
START_DATE = "2021-01-01"
END_DATE = "2022-12-31"
tickers_orig_df = load_ticker_prices_ts_df(tickers, START_DATE, END_DATE)
tickers_df = tickers_orig_df.dropna(axis=1).pct_change().dropna() # first % is NaN
# 1+ to allow the cumulative product of returns over time, and -1 to remove it at the end.
tickers_df = (1 + tickers_df).cumprod() - 1
plt.figure(figsize=(16, 8))
for ticker in tickers_df.columns:
    plt.plot(tickers_df.index, tickers_df[ticker] * 100.0, label=ticker)
plt.xlabel("Date (Year-Month)")
plt.ylabel("Cummulative Returns(%")
plt.legend()
plt.show()
For each stock i above, we are interested in the highest, average, and lowest cumulative return over the period, and the standard deviation of those returns:
mean_returns = tickers_df.mean()
highest_returns = tickers_df.max()
lowest_returns = tickers_df.min()
std_deviation = tickers_df.std()
summary_table = pd.DataFrame(
{
"Highest ret (%)": highest_returns * 100.0,
"Average ret (%)": mean_returns * 100.0,
"Lowest ret (%)": lowest_returns * 100.0,
"Deviation (%)": std_deviation * 100.0,
}
)
summary_table.transpose()
Investment Preferences with Indifference Curves
To understand the efficient frontier we need to experiment with the investor's choice of risk/reward, through indifference curves.
These curves model the satisfaction (or utility) an investor derives from investment decisions, and help visualize the optimal positions to take. We use opportunity sets to outline the viable combinations of asset risk and return within linear budget constraints.
Assume the following opportunity set (plotted below):
Let's assume a risk-averse investor, whom we model with the quadratic utility function defined in the code below.
The investor's goal is to move up and to the left, onto curves of ever-higher utility. Nothing below the low-utility curve's intersection with the opportunity set is worth taking, and nothing above the moderate-utility curve is attainable, because no opportunity exists there.
The optimal choice is where the opportunity-set line is tangent to a curve, in our case the moderate-utility curve, giving us roughly 40% returns for 10% risk.
def utility_fn(x, a=0.15, b=0.1, c=1):
    # Quadratic utility curve; a larger constant c shifts the whole curve to a higher utility level.
    return a * x**2 + b * x + c
plt.figure(figsize=(16, 8))
x_values_1 = np.linspace(0, 40, 100)
y_values_1 = utility_fn(x_values_1, c=10)
plt.plot(x_values_1, y_values_1, label="Low Utility", alpha=0.6)
x_values_2 = np.linspace(0, 35, 100)
y_values_2 = utility_fn(x_values_2, c=24)
plt.plot(x_values_2, y_values_2, label="Moderate Utility", color="g", alpha=0.6)
x_values_3 = np.linspace(0, 30, 100)
y_values_3 = utility_fn(x_values_3, c=35)
plt.plot(x_values_3, y_values_3, label="High Utility", color="r", alpha=0.6)
# Opportunity set: attainable risk/return combinations (risk-free asset vs AAPL)
plt.plot([0, 40], [5, 140], label="Opportunity Set", color="b", linewidth=2)
# Lower intersection with the low-utility curve (~2% risk, ~12% return)
plt.plot([2, 2], [0, 12], linestyle="--", color="b", alpha=0.6, linewidth=2)
plt.plot([0, 2], [12, 12], linestyle="--", color="b", alpha=0.6, linewidth=2)
# Tangency point with the moderate-utility curve (~10% risk, ~40% return)
plt.plot([10, 10], [0, 40], linestyle="--", color="g", alpha=0.6, linewidth=2)
plt.plot([0, 10], [40, 40], linestyle="--", color="g", alpha=0.6, linewidth=2)
# Upper intersection with the low-utility curve (~20% risk, ~72% return)
plt.plot([20, 20], [0, 72], linestyle="--", color="b", alpha=0.6, linewidth=2)
plt.plot([0, 20], [72, 72], linestyle="--", color="b", alpha=0.6, linewidth=2)
plt.xlabel("Risk %")
plt.ylabel("Returns %")
plt.title("Utility Curves (Risk Free VS AAPL)")
plt.legend()
plt.grid()
plt.xlim(0, 25)
plt.ylim(0, 100)
plt.show()
Mean-Variance Portfolios
To build portfolios, we need to weight our stocks according to their risk or dispersion. The weights give us the expected return of each portfolio:
E(R_p) = Σ_i w_i · E(R_i)
Where:
- w_i is the weight of asset i in the portfolio
- E(R_i) is the expected (annualized) return of asset i
First we convert the DIS and AAPL daily returns to annualized returns and covariances. We annualize to scale the figures up to a full year of risk/return, e.g. by geometrically compounding the daily returns over the 252 trading days in a year:
TRADING_DAYS_IN_YEAR = 252
tickers_df = tickers_orig_df.dropna(axis=1).pct_change().dropna()  # back to plain daily returns
rets = ((1 + tickers_df).prod() ** (TRADING_DAYS_IN_YEAR / len(tickers_df))) - 1  # geometric annualization
cov_matrix = tickers_df.cov() * TRADING_DAYS_IN_YEAR  # annualized covariance matrix
summary_table = pd.DataFrame(
{
"Annualized Returns (%)": round(rets * 100.0, 2),
"Annualized Covariances (%)": [
round(cov_matrix.iloc[1, 0] * 100.0, 2),
round(cov_matrix.iloc[0, 1] * 100.0, 2),
],
}
)
summary_table.transpose()
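As a quick aside, here is how the weighting formula above would play out for a single, hypothetical 60/40 DIS/AAPL portfolio (the weights are made up purely for illustration):
# Hypothetical 60% DIS / 40% AAPL portfolio, priced with the annualized
# returns and covariance matrix computed above.
example_weights = pd.Series({"DIS": 0.6, "AAPL": 0.4}).reindex(rets.index)
example_ret = example_weights @ rets  # weighted expected return
example_risk = np.sqrt(example_weights @ cov_matrix @ example_weights)  # portfolio std deviation
print(f"Expected return: {example_ret * 100:.2f}%, risk: {example_risk * 100:.2f}%")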
A portfolio is a weighted collection of assets. We can create some portfolios with arbitrary weights, which we will sample randomly below:
RISK_FREE_RATE = 0.05
MAX_PORTS = 10000
MAX_WEIGHT = 1.05
def port_generator(rets, cov_matrix):
    """Generate MAX_PORTS random portfolios and scatter-plot their risk vs return."""
    port_rets = []
    port_risks = []
    port_sharpes = []
    port_weights = []
    for _ in range(MAX_PORTS):
        # Allow short positions by sampling weights in [-MAX_WEIGHT, MAX_WEIGHT];
        # for long-only portfolios, sample with np.random.random instead.
        weights = np.random.uniform(-MAX_WEIGHT, MAX_WEIGHT, len(rets))
        weights /= np.sum(weights)  # normalize weights to sum to 1
        if any(weights > MAX_WEIGHT):
            continue  # discard portfolios where normalization produced an extreme long weight
        port_weights.append(weights)
        port_ret = np.dot(weights, rets)
        port_rets.append(port_ret)
        port_risk = np.sqrt(weights.T @ cov_matrix @ weights)
        port_risks.append(port_risk)
        port_sharpe = (port_ret - RISK_FREE_RATE) / port_risk
        port_sharpes.append(port_sharpe)
    port_rets = np.array(port_rets)
    port_risks = np.array(port_risks)
    plt.scatter(
        port_risks * 100.0,
        port_rets * 100.0,
        c=port_sharpes,
        cmap="viridis",
        alpha=0.75,
    )
    plt.xlabel("Risk (%)")
    plt.ylabel("Expected Returns (%)")
    plt.colorbar(label="Sharpe Ratio")
    plt.grid()
    return port_risks, port_rets, port_sharpes
plt.figure(figsize=(16, 8))
plt.title("Random Portfolios")
port_generator(rets, cov_matrix)
plt.show()
Already we can see a frontier forming.
Here we add Sharpe ratios so we can compare portfolios. The ratio tells us whether a portfolio beats the risk-free option, usually when it is 1 or above:
Sharpe = (E(R_p) - R_f) / σ_p
where E(R_p) is the portfolio's expected return, R_f is the risk-free rate, and σ_p is the portfolio's standard deviation (its risk).
The Sharpe ratio was not the best benchmark here, and neither were the variance calculations, as most stocks in this market regime had extreme positive variance. We would have done better with the Sortino ratio and semi-variance calculations, which only measure downside risk (returns below the risk-free rate).
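For reference, a minimal sketch of how such a Sortino-style ratio could be computed for a given weight vector, using the daily returns already loaded (the helper name and the downside-deviation details are my assumptions, not code from the original notebook):
# Hypothetical Sortino ratio: like Sharpe, but the denominator only counts
# daily returns that fall below the (daily) risk-free rate.
def sortino_ratio(weights, daily_rets_df, risk_free_rate=RISK_FREE_RATE):
    daily_rf = risk_free_rate / TRADING_DAYS_IN_YEAR
    port_daily = daily_rets_df @ weights  # daily portfolio returns
    downside = np.minimum(port_daily - daily_rf, 0)  # keep only the shortfalls
    downside_dev = np.sqrt((downside**2).mean() * TRADING_DAYS_IN_YEAR)
    annual_ret = (1 + port_daily).prod() ** (TRADING_DAYS_IN_YEAR / len(port_daily)) - 1
    return (annual_ret - risk_free_rate) / downside_dev

# e.g. an equally weighted DIS/AAPL portfolio:
print(sortino_ratio(np.array([0.5, 0.5]), tickers_df))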
Finding an Investable Universe
To simplify the process of building portfolios and frontiers, we will ignore riskless lending and borrowing and assume unlimited short selling is possible. This means you'll see negative weights and weights above 100%; they will still sum to 100% once the negative weights are included, because we 'leverage' our long positions with the proceeds of the short sales, e.g. -20% DIS and 120% AAPL.
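As a quick illustration (these weights are made up), such a short/leveraged two-asset portfolio still behaves like any other weighted portfolio:
# Hypothetical short DIS / leveraged AAPL portfolio; the weights still sum to 1.
short_weights = pd.Series({"DIS": -0.2, "AAPL": 1.2}).reindex(rets.index)
print(f"Sum of weights: {short_weights.sum():.0%}")
print(f"Expected return: {(short_weights @ rets) * 100:.2f}%")
print(f"Risk: {np.sqrt(short_weights @ cov_matrix @ short_weights) * 100:.2f}%")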
The following sections are better illustrated with a larger universe of instruments over a longer timeframe, for diversification:
tickers = [
"AAPL",
"DIS",
"MSFT",
"AMZN",
"GOOG",
]
START_DATE = "2014-01-01"
END_DATE = "2022-12-31"
tickers_orig_df = load_ticker_prices_ts_df(tickers, START_DATE, END_DATE)
tickers_df = tickers_orig_df.dropna(axis=1).pct_change().dropna() # first % is NaN
rets = ((1 + tickers_df).prod() ** (TRADING_DAYS_IN_YEAR / len(tickers_df))) - 1
cov_matrix = tickers_df.cov() * TRADING_DAYS_IN_YEAR
plt.figure(figsize=(16, 8))
plt.title("Random Portfolios")
port_risks, port_rets, port_sharpes = port_generator(rets, cov_matrix)
plt.show()
Now let's build two benchmark portfolios: the Minimum Variance Portfolio (MVP), the lowest-risk portfolio on the frontier, and the Tangency Portfolio (TAN), the one with the best return per unit of risk.
To get to those portfolios, we take the inverse of the covariance matrix (approximated with the pseudo-inverse, and also known as the precision matrix) and use it to build the min_risk_vect and expect_ret_vect vectors that drive the risk adjustment.
The MVP solution is straightforward: all portfolio attributes are scaled by min_risk_vect.
The TAN solution is the same, but scaled by expect_ret_vect.
# Vector of ones (a unit stake in every asset), used in the matrix algebra below
equal_weights = np.ones(len(rets))
rets = ((1 + tickers_df).prod() ** (TRADING_DAYS_IN_YEAR / len(tickers_df))) - 1
cov_matrix = tickers_df.cov() * TRADING_DAYS_IN_YEAR
# Precision matrix (pseudo-inverse of the covariance matrix)
inv_cov_matrix = np.linalg.pinv(cov_matrix)
min_risk_vect = equal_weights @ inv_cov_matrix  # 1' Sigma^-1
expect_ret_vect = inv_cov_matrix @ rets  # Sigma^-1 mu
# Minimum variance portfolio
# Weights are normalized to sum to 1, and risk to std deviation.
mvp_weights = min_risk_vect / np.sum(min_risk_vect)
mvp_ret = mvp_weights @ rets
mvp_risk = np.sqrt(mvp_weights.T @ cov_matrix @ mvp_weights)
# Tangency portfolio
tan_weights = expect_ret_vect / np.sum(expect_ret_vect)
tan_ret = tan_weights @ rets
tan_risk = np.sqrt(tan_weights.T @ cov_matrix @ tan_weights)
summary_data = {
"Asset": tickers,
"MVP Weights": mvp_weights,
"TAN Weights": tan_weights,
}
print(f"mvp_ret: {mvp_ret*100:0.02f}%, mvp_risk {mvp_risk*100:0.02f}%")
print(f"tan_ret: {tan_ret*100:0.02f}%, tan_risk {tan_risk*100:0.02f}%")
summary_df = pd.DataFrame(summary_data)
summary_df.T
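As a quick sanity check on the two benchmarks, we can also compare their Sharpe ratios, reusing the RISK_FREE_RATE defined earlier (this comparison is my addition, not part of the original notebook):
# Excess-return Sharpe ratios of the two benchmark portfolios.
mvp_sharpe = (mvp_ret - RISK_FREE_RATE) / mvp_risk
tan_sharpe = (tan_ret - RISK_FREE_RATE) / tan_risk
print(f"MVP Sharpe: {mvp_sharpe:.2f}, TAN Sharpe: {tan_sharpe:.2f}")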
Let's trace the risk curve (the frontier) by solving for the portfolio risk implied by each level of expected return, starting from the MVP:
σ(r) = sqrt((c·r² − 2·b·r + a) / (a·c − b²))
Where:
- a = μ' Σ⁻¹ μ (the returns dotted with expect_ret_vect)
- b = 1' Σ⁻¹ μ (the returns dotted with min_risk_vect)
- c = 1' Σ⁻¹ 1 (the ones vector dotted with min_risk_vect)
- r is the target expected return
The quadratic in r is normalized by the a·c − b² term (the utility function in the code below), and the square root converts variance back into standard deviation.
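Before plotting, an optional sanity check (my addition, not part of the original notebook): the MVP sits on this curve, so plugging its return into the formula should recover its risk. The coefficients are simply recomputed from the vectors defined above:
# Frontier coefficients and a check that the MVP lies on the curve.
a = np.sum(rets * expect_ret_vect)  # mu' Sigma^-1 mu
b = np.sum(rets * min_risk_vect)  # 1' Sigma^-1 mu
c = np.sum(equal_weights * min_risk_vect)  # 1' Sigma^-1 1
risk_at_mvp_ret = np.sqrt((c * mvp_ret**2 - 2 * b * mvp_ret + a) / (a * c - b**2))
print(f"Frontier risk at the MVP return: {risk_at_mvp_ret * 100:.2f}% vs MVP risk: {mvp_risk * 100:.2f}%")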
Finally we will add a Capital Market Line (CML) to represent the market's risk and return at the time of this analysis; any portfolio under this line may be overpriced and suboptimal, and any portfolio above it may be underpriced for its risk.
The CML is calculated for every level of expected return as follows:
σ_CML(r) = r / sqrt(a)
Where a = μ' Σ⁻¹ μ as above, so sqrt(a) is the slope of the line: the best attainable return per unit of risk (with riskless lending and borrowing ignored, the line passes through the origin).
When plotted against the expected returns, the CML is tangent to the efficient frontier at the TAN portfolio, which is the optimal market portfolio.
MAX_RETS = 0.51
TEN_BASIS_POINTS = 0.0001 * 10
c = np.sum(equal_weights * min_risk_vect)  # c = 1' Sigma^-1 1 (quadratic coefficient)
b = np.sum(rets * min_risk_vect)  # b = 1' Sigma^-1 mu (linear coefficient)
a = np.sum(rets * expect_ret_vect)  # a = mu' Sigma^-1 mu (constant coefficient)
utility_func = a * c - b**2  # normalizing denominator that penalizes risk
# The frontier curve & MCL, scaled by utility function
exp_rets = np.arange(0, MAX_RETS, TEN_BASIS_POINTS)
ports_risk_frontier = np.sqrt(
((c * (exp_rets**2)) - (2 * b * exp_rets) + a) / utility_func
)
cml_vector = exp_rets * (1 / np.sqrt(a))  # CML: sigma = r / sqrt(a)
plt.figure(figsize=(12, 6))
plt.plot(
ports_risk_frontier,
exp_rets,
linestyle="--",
color="blue",
label="Efficient Frontier",
linewidth=2,
alpha=0.6,
)
plt.plot(
cml_vector,
exp_rets,
label="MCL",
linewidth=2,
alpha=0.6,
color="black",
)
plt.scatter(mvp_risk, mvp_ret, color="green", label="MVP")
plt.annotate(
f"MVP\nRisk: {mvp_risk*100:.2f}%\nReturn: {mvp_ret*100:.2f}%",
(mvp_risk, mvp_ret),
textcoords="offset points",
xytext=(-30, 10),
)
plt.scatter(tan_risk, tan_ret, color="red", label="TAN")
plt.annotate(
f"Tangency\nRisk: {tan_risk*100:.2f}%\nReturn: {tan_ret*100:.2f}%",
(tan_risk, tan_ret),
textcoords="offset points",
xytext=(10, 10),
)
plt.legend(loc="upper left", fontsize=10)
plt.xlabel("Risk %")
plt.ylabel("Returns %")
plt.tight_layout()
plt.show()
With the above formulas we can solve for the optimal portfolio at a target return r. The weights are a mix of the two vectors from before: the expect_ret_vect (tangency) contribution is scaled by (c·r − b) and the min_risk_vect (MVP) contribution by (a − b·r), both normalized by the same a·c − b² denominator.
We assume that the target return is never less than the MVP return:
TARGET_RET = 0.3
# Scalar mixing coefficients from the two-fund separation above (not asset weights).
mvp_coef = (a - (b * TARGET_RET)) / utility_func
tan_coef = ((c * TARGET_RET) - b) / utility_func
opt_port_weights = (mvp_coef * min_risk_vect) + (tan_coef * expect_ret_vect)
opt_ret = np.sum(opt_port_weights * rets)
opt_risk = np.sqrt(((c * (opt_ret**2)) - (2 * b * opt_ret) + a) / utility_func)
plt.figure(figsize=(12, 6))
plt.plot(
ports_risk_frontier,
exp_rets,
linestyle="--",
color="blue",
label="Frontier",
linewidth=2,
alpha=0.6,
)
plt.scatter(opt_risk, opt_ret, color="green", label="Optimal Portfolio")
plt.annotate(
f"Optimal \nRisk: {opt_risk*100:.2f}%\nReturn: {opt_ret*100:.2f}%",
(opt_risk, opt_ret),
textcoords="offset points",
xytext=(-30, 10),
)
plt.legend(loc="upper left", fontsize=10)
plt.xlabel("Risk %")
plt.ylabel("Returns %")
plt.tight_layout()
plt.show()
Let's bring everything together in one chart so we can visualize the random, benchmark, optimal, and frontier portfolios (the lengthy code is available in the notebook):
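If you want to reproduce it without opening the notebook, a minimal sketch of such a combined chart, reusing the variables from the previous cells (the notebook version adds more styling and annotations), could look like this:
# Combined view: random portfolios, the efficient frontier, and the benchmark/optimal portfolios.
plt.figure(figsize=(16, 8))
sc = plt.scatter(port_risks, port_rets, c=port_sharpes, cmap="viridis", alpha=0.3, label="Random portfolios")
plt.plot(ports_risk_frontier, exp_rets, "--", color="blue", linewidth=2, label="Efficient Frontier")
plt.scatter(mvp_risk, mvp_ret, color="green", s=80, label="MVP")
plt.scatter(tan_risk, tan_ret, color="red", s=80, label="TAN")
plt.scatter(opt_risk, opt_ret, color="orange", s=80, label=f"Optimal ({TARGET_RET:.0%} target)")
plt.xlabel("Risk")
plt.ylabel("Returns")
plt.colorbar(sc, label="Sharpe Ratio")
plt.legend(loc="upper left")
plt.grid()
plt.show()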
Conclusion
This was not an easy article to digest, though it described a fundamental topic in quantitative finance.
We constructed optimal portfolios, showcased the Capital Market Line, and solved portfolios for risk and returns expectations. We understood how the Minimum Variance and Tangency Portfolios are efficient portfolios, and learned to build these from our universe of assets.
Be warned: if you run the above code on the most recent market regime only (post-COVID, 2020-2023), you'll see unexpected results and extreme MVP or TAN weights, because of the high volatility the market experienced in that period. This also spotlights a weakness of the frontier: it depends entirely on past performance, and is blind to market events and changes in the market regime.
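To see this for yourself, rerun the universe-loading cell with a window restricted to that regime (the dates below are just the regime boundaries mentioned above) and compare the resulting MVP/TAN weights:
# Post-COVID-only window for the investable-universe section.
START_DATE = "2020-01-01"
END_DATE = "2023-12-31"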
References
GitHub
This article is also available on GitHub.
The Kaggle notebook is available here.
Media
All media used (in the form of code or images) are either solely owned by me, acquired through licensing, or part of the Public Domain and granted use through Creative Commons License.
CC Licensing and Use
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.