Online Portfolio Optimization with Risk Control

YAMIM, J. D. M.; BORGES, C. C. H.; NETO, R. F.

doi:10.5540/tcam.2021.022.03.00475

ABSTRACT

Portfolio selection is undoubtedly one of the most challenging topics in the area of finance. Since Markowitz’s initial contribution in 1952, portfolio allocation strategies have been intensely discussed in the literature. With the development of online optimization techniques, dynamic learning algorithms have proven to be an effective approach to building portfolios, although they do not assess the risk related to each investment decision. In this work, we compared the performance of the Online Gradient Descent (OGD) algorithm and a modification of the method, that takes into account risk metrics controlling for the Beta of the portfolio. In order to control for the Beta, each asset was modeled using the CAPM model and a time varying Beta that follows a random walk. We compared both the traditional OGD algorithm and the OGD with Beta constraints with the Uniform Constant Rebalanced Portfolio and two different indexes for the Brazilian market, composed of small caps and the assets that belong to the Ibovespa index. Controlling the Beta proved to be an efficient strategy when the investor chooses an appropriate interval for the Beta during bull markets or bear markets. Moreover, the time varying Beta was an efficient metric to force the desired correlation with the market and also to reduce the volatility of the portfolio during bear markets.

Keywords:
online gradient descent; portfolio optimization; time varying CAPM

1 INTRODUCTION

The portfolio selection problem (PSP) is a decision process in which the investor must allocate a quantity of wealth to a set of assets within a time horizon. To solve this problem, the investor decides how much of his wealth will go to each of the assets available in the market. Each asset represents a distinct investment opportunity, and each allocation decision defines a portfolio. In this problem, the investor seeks to allocate his money in a stock market so as to obtain a good trade-off between expected return and risk. In general, higher-return portfolios are associated with higher risks.

Choosing the optimal portfolio is as old a problem as the stock market itself. However, it was with the work of Markowitz [23] in 1952 that this question became a mathematical problem. The model proposed by Markowitz, classically known as Mean-Variance (MV), marks the beginning of modern Portfolio Theory, presenting risk and portfolio diversification as factors inherent in investment decisions. Based on statistical assumptions, the MV model aims to maximize the return for a given level of risk or to minimize the risk for a given level of return. However, in practical applications, it is difficult to find adequate probability distributions to describe asset prices. As an alternative, Cover [8] proposed a portfolio optimization model that does not rely on statistical assumptions. The Universal Portfolios (UP) algorithm, introduced by Cover, marks the beginning of a new dynamic investment strategy called Online Portfolio Selection (OPS).

Li and Hoi [19] point out that in recent decades, approaches based on machine learning techniques have been intensively applied, becoming an important and active area of research.

For Dochow [11], there are two communities that differ considerably in how they model this problem: (i) the community of finance researchers, influenced mainly by Markowitz’s work [23], which focuses on the study of market risk and assesses performance through statistical tools; and (ii) the machine learning research community, influenced by Cover’s work [8], which is based on the concept of competitive analysis, focuses on maximizing wealth at the end of the investment horizon, and avoids the usual statistical assumptions.

Dochow [11] classifies the interaction between the two lines of work in the literature as low or nonexistent, and points to this lack of integration as the reason why the analysis of the risk structure of portfolios built through online optimization algorithms remains an open question.

Online optimization algorithms only look at portfolio returns. However, the decision to invest in an asset is not related only to the total return obtained: the associated risk must also be evaluated, since it is the combination of these two factors that affects the value of the asset. In this sense, this work combines the strengths of the portfolio selection methods developed by the finance community, which consider risk, with online methods, which are by nature non-parametric, highly adaptive and computationally efficient.

The proposed solution was to implement the Online Gradient Descent (OGD) algorithm and modify its projection step so that important risk factors are taken into account in the portfolio composition. In particular, we used the Capital Asset Pricing Model (CAPM) with time-varying β coefficients in order to control the risk of the portfolio investments.

2 RELATED WORK

It is known [3, 5, 6, 18] that CRP portfolios are a strong benchmark for portfolio optimization. Therefore, machine learning methods usually track the best constant rebalanced portfolio (BCRP), as proposed by Cover [8].

Given the BCRP as a benchmark, a common metric for the efficiency of machine learning methods is the loss in gain with respect to the BCRP. This defines an objective function called the Regret. The goal of an online machine learning method is to minimize the Regret over the investment horizon. According to Li and Hoi [19], from the perspective of online machine learning, these strategies can be grouped into three categories according to the approach used: Follow-the-Winner (FTW), Pattern-Matching Approaches (PMA) and Meta-Learning Algorithms (MLA).

According to Hazan [16], FTW algorithms are the most applicable to the portfolio selection context. Basically, the algorithms that use this strategy aim to follow the BCRP by increasing the weight of the assets that concentrate the highest return. Cover and Ordentlich [9] proposed the Weighted Universal Portfolio algorithm, which has Regret $O(\log T)$ and time complexity $O(T^{n})$, where n is the number of stocks and T the number of periods analyzed. Since the complexity grows exponentially with the number of stocks, this algorithm is usually infeasible in practice.

Helmbold et al. [17] introduced the Exponential Gradient algorithm, a variation of the gradient descent optimization method, which has linear processing complexity per asset but Regret $O(\sqrt{T})$.

With a strategy similar to that used in [8], the Successive Constant Rebalanced algorithm proposed by Gaivoronski and Stella [13] discretizes the set of feasible solutions in a simplex, iteratively selecting the best portfolios in each trading period.

Zinkevich [25] proposed the Online Gradient Descent algorithm, which uses first-order information, i.e., first derivatives, and therefore approximates the loss functions by linear functions, reaching Regret $O(\sqrt{T})$. The studies [2, 15, 22] use second-order information on the current return of each asset, thus exploiting the curvature of the reward functions. This additional information allows the Online Newton Step algorithm to achieve Regret $O(\log T)$. Works such as [10, 14, 20] have focused on transaction costs as a way to enable brokers and investors to use online algorithms but, on the other hand, do not consider the risk of the resulting portfolio.

3 PROBLEM SETTING

Before formulating the problem, let’s define the CRP portfolio. Due to daily financial market movements, the portfolio will have a different allocation from the original asset allocation. CRP is an investment strategy in which the portfolio is rebalanced in such a way that the proportion allocated to each of the assets in the original portfolio is preserved.

In a CRP portfolio, assets are bought and sold in each time period t so that the percentage of wealth allocated to each asset remains constant. Let $b_{t} \in \mathbb{R}_{+}^{n}$ be the vector of the percentages of wealth allocated to the n available assets, and denote by $b_{it}$ the i-th entry of $b_{t}$, i.e., $b_{it}$ is the percentage of wealth allocated to asset i at time t. Then, under the CRP portfolio, $b_{it} = b_{i(t + 1)} = b_{i}$ for every i, t.

We also define $r_{t} \in ℝ^{n}$ as the vector containing the return of each of the n assets at time t. Then, under the CRP portfolio, the total wealth over an investment period T is given by,

$$f_{T}(\mathrm{CRP}(b)) = \prod_{t = 1}^{T} b^{\top} r_{t}. \tag{3.1}$$

Note that each factor $b^{\top} r_{t}$ of the previous function is linear with respect to b, so the function is continuous in b. Let $\mathbb{S}^{n}$ denote the unit sphere in $\mathbb{R}^{n}$ with respect to the ℓ1 norm. Imposing the additional assumption that $\sum_{i} b_{i} = 1$, i.e., there is no leverage and the total allocation must sum to one, we can formulate the problem of finding the allocation b that maximizes the wealth of the CRP portfolio as

$$b^{*} = \underset{b \in \mathbb{R}_{+}^{n} \cap \mathbb{S}^{n}}{\arg\max} \prod_{t = 1}^{T} (b^{\top} r_{t}), \tag{3.2}$$

which is well defined: since $\mathbb{R}_{+}^{n} \cap \mathbb{S}^{n}$ is a compact set, the maximum is attained by the Weierstrass theorem. Moreover, after taking logarithms, the problem is the maximization of a concave function and can be solved as a convex optimization problem. In what follows, we refer to $b^{*}$ as the Best Constant Rebalanced Portfolio (BCRP).
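As an illustration of (3.1) and (3.2), the following minimal sketch evaluates the wealth of a CRP and approximates the BCRP by a grid search on a two-asset market. The data are invented for illustration, the function names are hypothetical, and the sketch assumes that $r_{t}$ holds gross returns (price relatives), so that $b^{\top} r_{t} > 0$.

```python
from math import prod

def crp_wealth(b, returns):
    """Final wealth of the constant rebalanced portfolio b, as in (3.1)."""
    return prod(sum(bi * ri for bi, ri in zip(b, r)) for r in returns)

def bcrp_two_assets(returns, grid=1000):
    """Approximate the BCRP of (3.2) for two assets by grid search on the simplex."""
    candidates = [(1 - k / grid, k / grid) for k in range(grid + 1)]
    return max(candidates, key=lambda b: crp_wealth(b, returns))

# Invented toy market: asset 1 is flat, asset 2 alternately doubles and halves.
returns = [(1.0, 2.0), (1.0, 0.5)] * 5

b_star = bcrp_two_assets(returns)
wealth_star = crp_wealth(b_star, returns)      # wealth of the approximate BCRP
wealth_hold = crp_wealth((0.0, 1.0), returns)  # wealth from holding asset 2 only
```

In this toy market the grid search selects the equal-weight allocation, whose wealth grows by a factor of 1.125 every two periods, while holding either asset alone leaves the wealth unchanged. A grid search is only practical for very few assets; it is used here for transparency, not efficiency.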

The challenge with the previous formulation is that it requires knowledge of all the returns $r_{t}, t = 1, \dots, T$. However, we are interested in the case where the investment strategy is a nonanticipative policy, so that only the information $r_{i}, i = 1, \dots, t - 1$ can be used to make an investment decision at t. Formally, we require the policy to be measurable with respect to the information set available up to time t.

Now we introduce the investor’s (decision-maker’s) problem. In each trading period t, for $t = 1, \dots, T$, an investor chooses an allocation $b_{t}$. At the end of that period, the investor collects a return (possibly negative) of $b_{t}^{\top} r_{t}$. The allocation $b_{t}$ is chosen according to a policy π. We denote by 𝒫 the set of nonanticipative policies, i.e., policies π that are measurable with respect to the past history of returns $r_{1}, \dots, r_{t - 1}$. After T periods, the accumulated wealth is given by $\prod_{t = 1}^{T} (b_{t}^{\top} r_{t})$.

Now we define the main objective function that we use to quantify the performance of a policy $π \in 𝒫$. As is commonly done in the online optimization literature, we define the Regret function as the suboptimality with respect to some benchmark. In our case, we proceed as in [2, 9] and take the BCRP as the benchmark, i.e., someone with hindsight over all the returns $r_{1}, \dots, r_{T}$ who has to choose a fixed allocation $b^{*}$ for every time period. Minimizing the Regret with respect to the BCRP is equivalent to minimizing the difference between the log return of the BCRP and that of the investor’s strategy. Then, we have that

$$\mathrm{Regret}(\mathrm{Alg}) = \sum_{t = 1}^{T} \log ((b^{*})^{\top} r_{t}) - \sum_{t = 1}^{T} \log (b_{t}^{\top} r_{t}) \tag{3.3}$$

and our goal is to apply algorithms with sublinear Regret to this objective function. In particular, if the Regret of the proposed policy is sublinear in T, then, on average, the asymptotic behavior of the algorithm matches that of the BCRP. In the next section, we review some aspects of asset pricing and risk measures before introducing the policies that we analyze.
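The Regret in (3.3) can be computed directly once the returns and the allocations are known. The sketch below is a minimal illustration on invented data; the function names are hypothetical, the returns are assumed to be gross returns, and the benchmark $b^{*}$ is taken as given rather than computed.

```python
from math import log

def portfolio_return(b, r):
    """Gross one-period return b^T r of allocation b."""
    return sum(bi * ri for bi, ri in zip(b, r))

def regret(b_star, allocations, returns):
    """Regret (3.3): log-wealth of the benchmark minus log-wealth of the policy."""
    benchmark = sum(log(portfolio_return(b_star, r)) for r in returns)
    achieved = sum(log(portfolio_return(b, r))
                   for b, r in zip(allocations, returns))
    return benchmark - achieved

# Invented toy data: asset 1 is flat, asset 2 alternately doubles and halves.
returns = [(1.0, 2.0), (1.0, 0.5)] * 5
b_star = (0.5, 0.5)                         # BCRP of this toy market, taken as given
hold_asset_2 = [(0.0, 1.0)] * len(returns)  # a policy that never rebalances

regret_hold = regret(b_star, hold_asset_2, returns)
average_regret = regret_hold / len(returns)
```

A policy has sublinear Regret when `average_regret` tends to zero as the number of periods grows. The buy-and-hold policy in this example does not have that property, since it loses the same amount of log-wealth to the benchmark in every pair of periods.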

4 RISK AND THE CAPM MODEL

This section is divided in three parts. First we review the standard CAPM model. Second, we show how the Kalman Filter can be used to estimate time varying Betas modelling the CAPM process with a hidden factor that describes the beta of the assets. Third, we introduce the risk metrics VaR and CVaR that will be used to compare the risk among our proposed policies and alternative portfolios that will be introduced later in the text. The reader familiar with the asset pricing theory and risk metrics is invited to skip this section.

4.1 CAPM Model

Based on the classic model proposed by Markowitz in [23], the works [21, 24] introduced a capital asset pricing model classically called CAPM (Capital Asset Pricing Model). Until the mid-1950s, the consensus was that performance should be measured by return over a period without risk adjustment. The CAPM added two new premises to Markowitz’s classic model: homogeneous expectations and a risk-free rate. The assumption of homogeneous expectations says that investors have the same views on the expected returns, standard deviations and covariances of assets (efficient market assumption). The assumption of the risk-free rate is that there is an investment in the market whose remuneration is assured exactly as expected; economic and cyclical factors are not able to affect the liquidity of such an investment.

The model works as follows. Let $r_{i}$ denote the return of an asset i, $r_{f}$ the return of the risk-free asset and $r_{m}$ the return of the market. Then,

$$r_{i} - r_{f} = β_{i} (r_{m} - r_{f}) + ε_{i}, \tag{4.1}$$

for

$$β_{i} = \frac{\mathrm{cov}(r_{i}, r_{m})}{σ_{m}^{2}},$$

where $ε_{i}$ is an idiosyncratic shock with $E[ε_{i}] = 0$ and $E[ε_{i}^{2}] = σ_{i}^{2}$, $β_{i}$ characterizes the systematic risk of asset i, and $σ_{m}$ is the market volatility.
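For a fixed estimation window, the β of (4.1) can be estimated by replacing the covariance and the variance with their sample counterparts. The sketch below is a minimal illustration on invented data; the function name is hypothetical, and the risk-free rate is assumed constant, so that it does not affect the covariance.

```python
def capm_beta(asset_returns, market_returns):
    """Sample estimate of beta_i = cov(r_i, r_m) / var(r_m)."""
    n = len(market_returns)
    mean_i = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    cov_im = sum((ri - mean_i) * (rm - mean_m)
                 for ri, rm in zip(asset_returns, market_returns)) / (n - 1)
    var_m = sum((rm - mean_m) ** 2 for rm in market_returns) / (n - 1)
    return cov_im / var_m

# Invented data: a constant risk-free rate and an asset generated from (4.1)
# with beta = 1.5 and no idiosyncratic shock.
r_f = 0.0004
market = [0.010, -0.020, 0.015, 0.003, -0.007, 0.012]
asset = [r_f + 1.5 * (rm - r_f) for rm in market]

beta_hat = capm_beta(asset, market)
```

Because the example has no idiosyncratic shock, the estimate recovers the β used to generate the data. With real returns, the estimate depends on the window chosen.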

This relationship between the return achieved versus risk incurred can be extended to assess portfolio performance. To analyze its performance, it is necessary to compare the portfolio in question with other alternatives available in the market.

Of course there are criticisms of the model, the most notable being the sensitivity of β in relation to the estimation period. It is reasonable to assume that the risk of a particular company changes over time, and that this investor perception is very difficult to predict.

To incorporate this aspect, some authors propose the use of structural models in which β is a latent variable in time, that is, it is not observed directly and needs to be estimated [7]. A widely used tool for estimating such structural models is the Kalman filter: under certain hypotheses of normality of the observations, it is possible to propose a structure for the dynamics of the β of the assets and extract it with optimization algorithms [12].

4.2 Time Varying β

In order to estimate the $β_{i}$ of each asset, we used a state-space model in which $β_{i}$ is a latent variable or, equivalently, a hidden factor. We assume that each $β_{i}$ follows a random walk with Gaussian shocks over time, which yields a stochastic level model.

The problem is formulated as follows. The return of a specific asset i at period t is given by

$$r_{i, t} = β_{t}^{i} r_{m, t} + r_{f, t} + ε_{i, t}, \tag{4.2}$$

where $r_{m, t}$ is the return of the market at period t, $r_{f, t}$ is the risk-free rate at period t, and $ε_{i, t}$ is an idiosyncratic shock at period t in asset i with zero mean and variance $σ_{i}^{2}$ that differs among assets. Moreover, $β_{t}^{i}$ follows a random-walk process, i.e.,

$$β_{t + 1}^{i} = β_{t}^{i} + ν_{t}^{i}, \tag{4.3}$$

where $ν_{t}^{i}$ is a Gaussian random variable with zero mean and variance $σ_{β, i}^{2}$.

The parameters of the model, $σ_{i}^{2}$ and $σ_{β, i}^{2}$, and the initial condition $β_{0}^{i}$ can be estimated by maximum likelihood, and the state $β_{t}^{i}$ of each asset i can be extracted using the Kalman filter recursive equations.

For the proposed model, the Kalman-filter update equations will be given by

$$ε_{i, t} = r_{i, t} - \hat{r}_{i, t}, \tag{4.4}$$

$$F_{t} = r_{m, t}^{2} P_{t} + σ_{i}^{2}, \tag{4.5}$$

$$β_{t | t}^{i} = β_{t}^{i} + P_{t} r_{m, t} F_{t}^{-1} ε_{i, t}, \tag{4.6}$$

$$P_{t | t} = P_{t} - P_{t} r_{m, t} F_{t}^{-1} r_{m, t} P_{t}, \tag{4.7}$$

where $K_{t} = P_{t} r_{m, t} F_{t}^{-1}$ is the Kalman gain. The prediction equations are

$$β_{t + 1}^{i} = β_{t | t}^{i}, \tag{4.8}$$

$$P_{t + 1} = P_{t | t} + σ_{β, i}^{2}, \tag{4.9}$$

$$\hat{r}_{i, t + 1} = β_{t + 1}^{i} r_{m, t + 1} + r_{f, t + 1}, \tag{4.10}$$

where $P_{t}$ and $P_{t | t}$ are the predicted and updated variances of the state $β_{t}^{i}$, respectively.
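The recursions of this subsection can be sketched in a few lines for a single asset. The code below is a minimal illustration, not the KFAS implementation used in this work: it uses the textbook scalar Kalman recursions for model (4.2)-(4.3) with observation coefficient $r_{m, t}$, it takes the variances as given instead of estimating them by maximum likelihood, and it replaces the exact diffuse initialization with a large initial variance `p0`. All names and data are hypothetical.

```python
def filter_beta(asset, market, risk_free, var_eps, var_beta, beta0=0.0, p0=1e6):
    """Filtered betas beta_{t|t} for r_it = beta_t * r_mt + r_ft + eps_it."""
    beta, p = beta0, p0                          # predicted state and its variance
    filtered = []
    for r_i, r_m, r_f in zip(asset, market, risk_free):
        innovation = r_i - (beta * r_m + r_f)    # one-step prediction error
        f = r_m * r_m * p + var_eps              # variance of the prediction error
        gain = p * r_m / f                       # Kalman gain
        beta_upd = beta + gain * innovation      # updated state
        p_upd = p - gain * r_m * p               # updated state variance
        filtered.append(beta_upd)
        beta, p = beta_upd, p_upd + var_beta     # random-walk prediction step
    return filtered

# Invented data: the asset follows (4.2) with a constant beta of 1.2 and no noise.
market = [0.010, -0.020, 0.015, 0.003, -0.007, 0.012] * 5
risk_free = [0.0004] * len(market)
asset = [1.2 * r_m + r_f for r_m, r_f in zip(market, risk_free)]

betas = filter_beta(asset, market, risk_free, var_eps=1e-4, var_beta=1e-6)
```

In this noise-free example the filtered β settles close to the value used to generate the data after the first observation. Each filtered value uses only observations up to the current period, which is the property needed to avoid look-ahead bias.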

4.3 Risk Metric Performance

In order to compare the risk performance among different strategies, we will make use of two measures widely applied in financial studies, the VaR and CVaR.

VaR is an assessment of the maximum potential loss to which an investor would be exposed over a given time horizon, for a specified confidence level α, i.e., it attempts to summarize in a single number the maximum expected loss within a time horizon, to a certain degree of statistical confidence. VaR can be interpreted as the amount that losses will not exceed in 100(1 − α)% of scenarios. Generally speaking, a portfolio’s VaR represents a higher quantile of the portfolio’s estimated loss (or a lower return quantile). Artzner et al. [4] define the VaR at the 100(1 − α)% confidence level as:

$$\mathrm{VaR}_{α} = \inf \{r \mid P (R \leq r) > α\},$$

where r is a return in the portfolio distribution, $\inf \{r \mid A\}$ is the infimum of r given an event A, and $\inf \{r \mid P (R \leq r) > α\}$ is the corresponding α-quantile of the return distribution.

CVaR is a measure that indicates the average loss that exceeds VaR, that is, given a probability α, CVaR is defined as the average of returns less than the (1 − α) quantile of distribution of returns.

If all scenarios have the same probability of occurrence, CVaR is computed as the expected return over the (1 − α)% worst-case scenarios. The α-level CVaR can be formalized as [1]:

$$\mathrm{CVaR}_{α} = E [r \mid r \leq \mathrm{VaR}_{α} (r)],$$

where r represents the portfolio’s returns and $\mathrm{VaR}_{α}(r)$ is the VaR at the 100(1 − α)% confidence level.
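Both metrics can be computed from a sample of returns when all observations are treated as equally likely. The sketch below is a minimal illustration on invented data; the function names are hypothetical, and the CVaR follows the verbal definition above, i.e., it averages the returns at or below the VaR.

```python
from bisect import bisect_right

def empirical_var(returns, alpha):
    """Smallest observed return r whose empirical P(R <= r) exceeds alpha."""
    ordered = sorted(returns)
    n = len(ordered)
    for r in ordered:
        if bisect_right(ordered, r) / n > alpha:
            return r
    return ordered[-1]  # only reached when alpha >= 1

def empirical_cvar(returns, alpha):
    """Average of the returns at or below the empirical VaR."""
    var = empirical_var(returns, alpha)
    tail = [r for r in returns if r <= var]
    return sum(tail) / len(tail)

# Invented daily returns, treated as equally likely scenarios.
sample = [-0.05, -0.03, -0.01, 0.0, 0.01, 0.02, 0.02, 0.03, 0.04, 0.05]
var_10 = empirical_var(sample, alpha=0.10)
cvar_10 = empirical_cvar(sample, alpha=0.10)
```

By construction the CVaR is never above the VaR, since it averages only the returns in the tail that the VaR delimits.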

5 IMPLEMENTATION DETAILS

5.1 Online Algorithms

In this work, we tested a direct implementation of the Online Gradient Descent algorithm and also the Online Gradient Descent algorithm with constraints on the Beta of the portfolio.

As defined in equation (3.3), the Regret is given by

$$\mathrm{Regret}(\mathrm{Alg}) = \sum_{t = 1}^{T} \log ((b^{*})^{\top} r_{t}) - \sum_{t = 1}^{T} \log (b_{t}^{\top} r_{t}), \tag{5.1}$$

where $b_{t}$ is the decision variable (allocation) at time t. Taking the derivative with respect to $b_{t}$, we get

$$\nabla_{b_{t}} \mathrm{Regret} = - \frac{1}{r_{t}^{\top} b_{t}} r_{t}, \tag{5.2}$$

which shows that the gradient direction is a normalized version of the observed returns for each asset. Moreover, we define the set of possible allocations ℬ as

$$ℬ = \{b \in \mathbb{R}^{n} \mid \sum_{i = 1}^{n} b_{i} = 1, \; b \geq 0\}. \tag{5.3}$$

Thus, we do not allow short positions or leverage, and all the wealth must be allocated in every period. We highlight that the algorithm always has the option of allocating wealth to the risk-free bonds, so requiring all the wealth to be allocated in every period is not as restrictive as it may seem at first.

Algorithm 1

The algorithm works as follows. At each step, we compute the gradient of the Regret function, update the allocation vector and project it onto the feasible set using an appropriate metric, in this case the Euclidean norm. The pseudo-code is presented in Algorithm 1.

Note that at each time t, we update the current allocation by taking one step in the direction opposite to the gradient of that period’s regret, i.e., we take one step towards minimizing $\log (r_{t}^{\top} b^{*}) - \log (r_{t}^{\top} b_{t})$ with respect to $b_{t}$. In the next period, we are interested in minimizing the loss $\log (r_{t + 1}^{\top} b^{*}) - \log (r_{t + 1}^{\top} b_{t + 1})$, which has a different shape, since the returns change over time and are revealed in a nonanticipative fashion. Hence, it is not clear that at each time t we should take as many steps as possible in the descent direction. It is actually the balance between the changes in $r_{t}$ over time and the fact that we take only one step in the descent direction that allows the investor to achieve sublinear regret over time. For an in-depth discussion of the convergence rates of OGD applied to online convex optimization, we refer to [15, 25].
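The update described above can be sketched as follows. This is a minimal illustration rather than a transcription of Algorithm 1: the constant step size `eta`, the function names and the data are assumptions, the returns are taken to be gross returns, and the Euclidean projection onto the simplex uses the standard sort-based method.

```python
def project_simplex(v):
    """Euclidean projection of v onto {b : b >= 0, sum(b) = 1}."""
    cumulative, theta = 0.0, 0.0
    for j, u in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u
        shift = (cumulative - 1.0) / j
        if u - shift > 0:
            theta = shift            # keep the shift of the last valid index
    return [max(x - theta, 0.0) for x in v]

def ogd(returns, eta=0.05):
    """Allocations chosen by projected online gradient descent on -log(b^T r_t)."""
    n = len(returns[0])
    b = [1.0 / n] * n                # start from the uniform allocation
    allocations = []
    for r in returns:
        allocations.append(b)        # b_t is fixed before r_t is observed
        growth = sum(bi * ri for bi, ri in zip(b, r))
        gradient = [-ri / growth for ri in r]
        b = project_simplex([bi - eta * gi for bi, gi in zip(b, gradient)])
    return allocations

# Invented toy market: asset 1 is flat, asset 2 alternately doubles and halves.
returns = [(1.0, 2.0), (1.0, 0.5)] * 5
allocations = ogd(returns)
```

Each allocation is recorded before the corresponding return is used, so the policy is nonanticipative, and only one gradient step is taken per period.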

For the Online Gradient Descent that controls the risk using the β of the portfolio, we assume an interval $[β_{min}, β_{max}]$ in which the β of the portfolio must lie. Since the CAPM model is linear with respect to the β of the assets, the β of the whole portfolio is a linear combination of the β of each asset weighted by the respective invested amount. Therefore, the feasible set becomes

$$ℬ_{β} = \{b \in \mathbb{R}^{n} \mid \sum_{i = 1}^{n} b_{i} β_{t}^{i} \in [β_{min}, β_{max}], \; \sum_{i = 1}^{n} b_{i} = 1, \; b \geq 0\}, \tag{5.4}$$

where $β_{t}^{i}$ is the time-varying Beta of the i-th asset at time t. It is important to note that $ℬ_{β}$ is not guaranteed to be nonempty. For instance, $ℬ_{β}$ is empty if there is no asset i at period t such that $β_{t}^{i} \geq β_{min}$, or if there is no asset i at period t such that $β_{t}^{i} \leq β_{max}$, since then no allocation in ℬ lies in both half-spaces $\sum_{i} b_{i} β_{t}^{i} \leq β_{max}$ and $\sum_{i} b_{i} β_{t}^{i} \geq β_{min}$.
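The emptiness condition and the projection onto the Beta-constrained set can be sketched as follows. How the projection is computed in this work is not stated in the text, so the code swaps in Dykstra’s alternating projection method as one possible choice; the function names, the iteration count and the data are assumptions. The emptiness check uses the fact that the Beta of a long-only, fully invested portfolio is a weighted average of the asset Betas.

```python
def beta_set_is_nonempty(betas, beta_min, beta_max):
    """A long-only portfolio beta ranges over [min(betas), max(betas)]."""
    return min(betas) <= beta_max and max(betas) >= beta_min

def project_simplex(v):
    """Euclidean projection of v onto {b : b >= 0, sum(b) = 1}."""
    cumulative, theta = 0.0, 0.0
    for j, u in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u
        shift = (cumulative - 1.0) / j
        if u - shift > 0:
            theta = shift
    return [max(x - theta, 0.0) for x in v]

def project_beta_band(v, betas, beta_min, beta_max):
    """Euclidean projection of v onto {b : beta_min <= betas^T b <= beta_max}."""
    level = sum(beta * vi for beta, vi in zip(betas, v))
    norm_sq = sum(beta * beta for beta in betas)
    target = min(max(level, beta_min), beta_max)
    if norm_sq == 0.0 or target == level:
        return list(v)
    step = (target - level) / norm_sq
    return [vi + step * beta for vi, beta in zip(v, betas)]

def project_beta_simplex(v, betas, beta_min, beta_max, iterations=500):
    """Dykstra's alternating projections onto the band and the simplex."""
    if not beta_set_is_nonempty(betas, beta_min, beta_max):
        raise ValueError("no long-only portfolio has a beta in the requested band")
    x = list(v)
    p = [0.0] * len(v)   # correction term for the band
    q = [0.0] * len(v)   # correction term for the simplex
    for _ in range(iterations):
        shifted = [xi + pi for xi, pi in zip(x, p)]
        y = project_beta_band(shifted, betas, beta_min, beta_max)
        p = [si - yi for si, yi in zip(shifted, y)]
        shifted = [yi + qi for yi, qi in zip(y, q)]
        x = project_simplex(shifted)
        q = [si - xi for si, xi in zip(shifted, x)]
    return x

# Invented betas; the first asset plays the role of the risk-free asset.
betas = [0.0, 0.8, 1.6]
b = project_beta_simplex([1 / 3, 1 / 3, 1 / 3], betas, beta_min=1.0, beta_max=1.5)
portfolio_beta = sum(beta * bi for beta, bi in zip(betas, b))
```

In this example the uniform allocation has a Beta of 0.8, below the requested band, so the projection shifts weight from the zero-Beta asset to the high-Beta asset until the lower bound is met. A quadratic programming solver would be a natural alternative to the iterative scheme.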

The OGD algorithm with risk control is presented in Algorithm 2.

In order to remove any kind of look-ahead bias, the estimate of $β_{t}^{i}$ used for the investment at t + 1 can use only the information available from periods 1, ..., t. Therefore, for each asset i at time t, we use only the information given by $\{r_{i, τ}\}_{τ = 1}^{t}$ when running the Kalman filter. Moreover, we used the Kalman filter with exact diffuse initialization available in the package KFAS in the CRAN repository for R. This allows us to deal with the issue of initializing the parameters of the filter in an efficient way. For a discussion of Kalman filter algorithms, we refer to [12].

As a final comment, we start each algorithm with the UCRP allocation, i.e., $b_{i, 0} = \frac{1}{n}$, so that an equal amount of wealth is allocated to every available asset. This initial allocation was chosen because it does not use any information that might create bias in the analysis, and because the UCRP is a strong benchmark by itself, since it eliminates unsystematic risk through diversification.

Algorithm 2

5.2 Dataset

To evaluate the performance of the proposed algorithm, we used data from the Brazilian stock market, from companies that were part of the theoretical portfolios of the Bovespa (IBOVESPA) and Small Caps indices, collected from the Economática software database. The IBOVESPA database contains observations of the stock returns of 59 companies from January 1, 1998 to December 28, 2017; this number includes companies that are no longer part of the index, as well as companies that were added to the index during that period. Since the Small Caps index had no information for the beginning of that period, data were collected for 64 companies from January 2, 2009 to December 28, 2017.

The IBOVESPA is the most traditional Brazilian index, having in its theoretical portfolio stocks of companies with high trading volume. The Small Caps index is composed of shares of low capitalization companies, having a low trading volume and consequently less liquidity.

To estimate the CAPM model for both the IBOVESPA and Small Caps shares, the weighted arithmetic average of the intrinsic yield of the federal debt securities issued by the National Treasury and held by the Central Bank of Brazil was taken as the risk-free rate. As the market return, we consider the Ibovespa index for the Small Caps index shares, and the IBrX-100 for the IBOVESPA shares.

6 RESULTS

6.1 Small Caps - full period

As an initial analysis, we tested the performance of the OGD algorithm without controlling for the Beta of the portfolio and compared it with the small caps index (SMALL) and the uniform CRP (UCRP). Only the assets that belong to the index and the risk-free asset are available for investment. Similarly, the UCRP portfolio is constructed by fixing a uniform weight for each asset that belongs to the small caps index, including the risk-free asset. The results are shown in Figure 1.

Figure 1:
a) Comparison among OGD portfolio, small caps Index and UCRP; b) Time varying Beta of the OGD portfolio.

We can see that the OGD algorithm outperformed both the index of small caps and the UCRP, whereas the index had the worst performance. However, most of the time the performance of the OGD algorithm is similar to the performance of the UCRP. In addition, we can see that the Beta of the portfolio is correlated with the index IBrX-100 used as proxy for the market return, varying between approximately 0.5 and 1.5 for the period of the investment.

In terms of risk metrics, we can see the comparison between the VaR and the CVaR of the OGD portfolio and the benchmarks in Table 1. The OGD portfolio and the index small caps have a similar risk profile, both higher than the UCRP portfolio.


Table 1:
Comparison among risk metrics for the OGD portfolio.

In Figure 2, we can see the results for the OGD algorithm when we control for the Beta of the portfolio. In this figure, the Beta was chosen to be between −0.3 and 0.1, which most of the time forces a position against the small caps index. As a result, especially between 2010 and 2012, a period of strong appreciation of the small caps index, the OGD algorithm with positions against the market struggles to make gains while the index rises rapidly. However, during the period of losses between 2012 and 2016, the OGD portfolio positioned against the market is able to maintain its gains and has a higher accumulated return over the whole period.

Figure 2:
a) Comparison among OGD portfolio with Beta between -3 and 0.1 and small caps Index; b) Time varying Beta of the OGD portfolio with Beta control.

Looking at the behavior of the Beta in Figure 2 (b), we can see that most of the time the Beta is in fact positive. Since we do not allow the algorithm to take short positions, it is difficult to build a portfolio with negative Beta. The reason is that the assets that belong to the small caps index have a positive correlation with the market most of the time, which makes the feasible set $ℬ_{β}$ empty in some periods if we require strictly negative Betas.

Therefore, in order to reduce the positive correlation with the market, the algorithm invests most of the money in the risk free asset, as we can see in Figure 3. We conjecture that allowing the algorithm to assume short positions should allow it to achieve even better results during bear markets.

Figure 3:
Portfolio evolution OGD Beta.

Table 2 shows that the risk metrics are substantially lower than those of the small caps index, even though the returns were also higher over the whole period. This result is expected given the high concentration of the risk-free asset in the portfolio.


Table 2:
Comparison among risk metrics for the OGD with Beta between -3 and 0.1.

On the other hand, when forcing a positive correlation with the market, it is possible to follow the growth periods of the index, but when the index goes down, the OGD portfolio with high positive Beta cannot adapt and suffers with the bear market, as we can see in Figure 4. This suggests that, in order to achieve substantial gains with respect to the market, it is important to adapt the accepted interval for the Beta of the portfolio to capture both bull and bear periods.

Figure 4:
a) Comparison among OGD portfolio with Beta between 1.5 and 3 and small caps Index; b) Time varying Beta of the OGD portfolio with Beta control.

Since we are forcing a Beta larger than one, one could also expect an increase in the portfolio risk, which indeed happened, as we can observe in Table 3.

Table 3:
Comparison among risk metrics for the OGD with Beta between 1.5 and 3.

6.2 Small Caps - Results for specific periods of time

In this subsection we show that a suitable choice of the Beta interval improves performance in periods of bull or bear markets.

In Figure 5 we can see that forcing a Beta larger than that of the market improves the cumulative return of the OGD with respect to both the small caps index and the UCRP benchmark. This result is consistent with the positive correlation between the small caps index and the market return used as reference (IBrX-100), and shows that a Beta greater than one indeed leverages the market returns in a bull market.

Figure 5:
a) Comparison among the OGD portfolio with Beta between 1.9 and 3, the small caps index and the UCRP; b) Time varying Beta of the OGD portfolio with Beta control.

Forcing the Beta of the portfolio to be high during the projection step speeds up the process of investing in companies that are riskier but correlated with the market. This allows the portfolio to achieve greater returns than the OGD portfolio without Beta control.

Since the OGD portfolio without Beta control has a Beta between 0.5 and 1.5 during the whole period (Figure 1), the projection step tends to find portfolios at the lower bound of the allowed interval (Figure 5b). This leverage with respect to the market return indeed paid off during the bull market, which also highlights that the time varying Beta of each asset captures the correlation with the market and is significant for predicting future correlation.
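
The tendency of the projection to land on the lower bound can be reproduced with a small numerical sketch. It computes the Euclidean projection of a long-only portfolio onto the set of weights that are non-negative, sum to one and have a Beta inside a given interval. The solver used here, Dykstra's alternating projections, is a choice made for this illustration and is not claimed to be the method behind Algorithm 2; the asset Betas and the function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project_simplex(v):
    """Euclidean projection onto {w : w >= 0, sum(w) = 1} (sort-based rule)."""
    ordered = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / j
        if value - candidate > 0.0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def project_beta_band(v, betas, lower, upper):
    """Euclidean projection onto {w : lower <= betas . w <= upper}."""
    value = dot(betas, v)
    if value < lower:
        step = (lower - value) / dot(betas, betas)
    elif value > upper:
        step = (upper - value) / dot(betas, betas)
    else:
        return list(v)
    return [x + step * b for x, b in zip(v, betas)]


def project_feasible_set(v, betas, lower, upper, iterations=2000):
    """Dykstra's alternating projections onto the simplex and the Beta band."""
    x = list(v)
    p = [0.0] * len(v)  # correction term attached to the simplex
    q = [0.0] * len(v)  # correction term attached to the Beta band
    for _ in range(iterations):
        y = project_simplex([xi + pi for xi, pi in zip(x, p)])
        p = [xi + pi - yi for xi, pi, yi in zip(x, p, y)]
        x = project_beta_band([yi + qi for yi, qi in zip(y, q)], betas, lower, upper)
        q = [yi + qi - xi for yi, qi, xi in zip(y, q, x)]
    return x


# Hypothetical asset Betas; the first asset plays the role of the risk free asset.
betas = [0.0, 0.6, 1.0, 1.6, 2.4]
unconstrained = [0.2] * 5  # stand-in for the OGD iterate before projection
projected = project_feasible_set(unconstrained, betas, lower=1.9, upper=3.0)

print(dot(betas, unconstrained))  # Beta before the projection, below 1.9
print(dot(betas, projected))      # Beta after the projection
print(projected)
```

Since the starting portfolio has a Beta below the interval, the closest feasible portfolio lies on the face where the Beta equals the lower bound, which matches the behaviour described for Figure 5b.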

According to the CAPM, if the portfolio return is a linear function of the market return with slope β, then the variance of the portfolio should be β² times the variance of the market. Since we are forcing a high absolute value for the Beta of the portfolio, we should expect an increase in the risk metrics used. This result is shown in Table 4.
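
The variance statement can be written out as a short derivation. It assumes a single-index form of the CAPM with a constant risk free rate, a constant β over the window considered and a residual uncorrelated with the market; the symbols r_p, r_m, r_f and ε_p are notation introduced here for illustration.

```latex
\begin{align*}
r_p - r_f &= \beta\,(r_m - r_f) + \varepsilon_p,
  \qquad \operatorname{Cov}(r_m, \varepsilon_p) = 0, \\
\operatorname{Var}(r_p) &= \beta^{2}\,\operatorname{Var}(r_m) + \operatorname{Var}(\varepsilon_p), \\
\varepsilon_p \equiv 0 \;&\Longrightarrow\;
  \operatorname{Var}(r_p) = \beta^{2}\,\operatorname{Var}(r_m),
  \qquad \sigma_p = |\beta|\,\sigma_m .
\end{align*}
```

For instance, a portfolio held exactly at β = 1.9 would have 1.9² = 3.61 times the variance of the market and 1.9 times its standard deviation; any residual variance adds to this.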

Table 4:
Comparison among risk metrics for the OGD with Beta between 1.9 and 3.

In Figure 6, a negative correlation with the market was forced. Since the small caps index and the Ibovespa have a positive correlation, the OGD portfolio is expected to move most of the investment to the risk free asset, avoiding the bear market.

Figure 6:
a) Comparison among the OGD portfolio with Beta between -3 and 0.1, the small caps index and the UCRP; b) Time varying Beta of the OGD portfolio with Beta control.

Moreover, as discussed previously, since the algorithm is not allowed to take short positions, in order to satisfy the constraints of the projection step of Algorithm 2 it has to speed up the process of exiting positions in the market and increasing the position in the risk free asset, which reduces the risk and avoids the bear market. This behaviour is reflected in the Beta of the portfolio, which goes toward zero, as shown in Figure 6 (b).

In terms of risk, the result is similar to the one presented for the whole period. The allocation to the risk free asset considerably reduced the risk of the portfolio, which outperformed the small caps index and the UCRP portfolio in returns and had risk metrics several times smaller.

Table 5:
Comparison among risk metrics for the OGD with Beta between -3 and 0.1.

6.3 IBOV - General Results

Next we present the results using the stocks of the Ibovespa index (IBOV) as possible assets. Again, we use the UCRP and the IBOV itself as benchmarks. We kept the IBrX-100 as a proxy for the market return.

In Figure 7 (a) we can see that both the UCRP and the OGD portfolio outperformed the IBOV index. Moreover, the UCRP portfolio performed better than the OGD portfolio. Most of the time the OGD portfolio tracks the UCRP returns, which indicates that the method is not able to discriminate among the assets: on average, the gradient has zero mean in every coordinate, leading to an almost constant position in the available assets.
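
One way to read this remark is sketched below. It assumes an OGD update on the log-wealth gain followed by a Euclidean projection onto the simplex, which may differ in detail from the paper's implementation; the price relatives, the learning rate and the function names are hypothetical. When no asset is persistently better than the others, the gradient steps cancel out over time and the weights stay close to the uniform portfolio.

```python
def project_simplex(v):
    """Euclidean projection onto {w : w >= 0, sum(w) = 1} (sort-based rule)."""
    ordered = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / j
        if value - candidate > 0.0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def ogd_step(weights, price_relatives, learning_rate):
    """One OGD step on the log-wealth gain log(weights . price_relatives)."""
    growth = sum(w * x for w, x in zip(weights, price_relatives))
    gradient = [x / growth for x in price_relatives]
    moved = [w + learning_rate * g for w, g in zip(weights, gradient)]
    return project_simplex(moved)


# Hypothetical price relatives: each asset takes turns being the best one,
# so no asset is better on average.
rounds = [
    [1.02, 1.00, 0.98],
    [0.98, 1.02, 1.00],
    [1.00, 0.98, 1.02],
]

weights = [1.0 / 3.0] * 3
largest_deviation = 0.0
for t in range(300):
    weights = ogd_step(weights, rounds[t % 3], learning_rate=0.05)
    deviation = max(abs(w - 1.0 / 3.0) for w in weights)
    largest_deviation = max(largest_deviation, deviation)

print(weights)
print(largest_deviation)
```

In this toy run the weights never move far from one third each, so the resulting portfolio is almost indistinguishable from the UCRP.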

Figure 7:
a) Comparison among the OGD portfolio, the Ibovespa index and the UCRP; b) Time varying Beta of the OGD portfolio.

In Figure 7 (b) we can see that the Beta of the OGD portfolio oscillates approximately between 0.5 and 1.3. On average, the Beta of the portfolio is close to one, which is reflected in a similar risk for the OGD portfolio and the market itself. Since the IBOV is also a proxy for the market return, we can see in Table 6 that the risk is similar for the OGD portfolio and the benchmarks.

Table 6:
Comparison among risk metrics for the OGD portfolio.

Next we evaluate the performance of our approach when controlling the level of risk of the portfolio, to see whether it is possible to outperform the UCRP portfolio solely by controlling the risk of the assets.

In Figure 8 we can see the effectiveness of leveraging the market return by forcing a high Beta for the OGD algorithm with Beta control. The algorithm was able to provide higher returns than the UCRP or the IBOV index during the bull market and was the dominant strategy during almost the whole period.

As expected, we can see in Figure 8 (b) that most of the time the projection step matches the lower bound of the Beta interval used for the constraints on the portfolio.

Figure 8:
a) Comparison among the OGD portfolio with Beta between 1.4 and 3, the Ibovespa index and the UCRP; b) Time varying Beta of the OGD portfolio with Beta control.

As shown in Table 7, forcing the Beta to be above 1.4 led to an increase in the risk metrics of the portfolio. The OGD portfolio was approximately twice as risky as the benchmarks. However, as in the small caps case, the portfolio presented higher returns throughout the selected bull period.

Table 7:
Comparison among risk metrics for the OGD with Beta between 1.4 and 3.

During the bear market period, restricting the Beta of the portfolio to between -3 and 0.1 makes the portfolio avoid losses, mainly by investing in the risk free asset, since short positions were not allowed (Figure 10). The results with the respective Betas are shown in Figure 9.

Figure 9:
a) Comparison among the OGD portfolio with Beta between -3 and 0.1, the Ibovespa index and the UCRP; b) Time varying Beta of the OGD portfolio with Beta control.

Figure 10:
Portfolio evolution OGD Beta.

Similarly to the small caps case, the main consequence is the decrease of the risk metrics of the portfolio, shown in Table 8.

Table 8:
Comparison among risk metrics for the OGD with Beta between -3 and 0.1.

7 CONCLUSION

In this work we explored the benefits of combining risk control of the portfolio with the OGD algorithm. Working with time-varying Betas was fundamental to properly capture the correlation of each asset with the market.

There is not much gain in fixing the Beta of the portfolio for long periods of time. However, forcing a Beta greater than one in bull markets or less than one in bear markets proved to be an efficient way to improve the returns of the OGD algorithm, since the time-varying Beta is capable of predicting short-term correlations between the assets and the market.

During bear periods, forcing a negative or slightly positive Beta forces the algorithm to invest most of the capital in the risk free asset, which avoids losses during this period.

The empirical results demonstrated the robustness of the strategies when the set of possible assets was built from the small caps index or from assets that belong to the Ibovespa index.

As a direction of future research, one could propose time varying intervals for the allowed Beta of the portfolio and extend the feasible set to allow for short positions.

Acknowledgments

The authors gratefully acknowledge CNPq, CAPES and FAPEMIG for financial support.

REFERENCES

[1] C. Acerbi & D. Tasche. Expected shortfall: a natural coherent alternative to value at risk. Economic notes, 31(2) (2002), 379-388.
[2] A. Agarwal, E. Hazan, S. Kale & R.E. Schapire. Algorithms for portfolio management based on the newton method. In “Proceedings of the 23rd international conference on Machine learning”. ACM (2006), p. 9-16.
[3] P.H. Algoet & T.M. Cover. Asymptotic optimality and asymptotic equipartition properties of log-optimum investment. The Annals of Probability, (1988), 876-898.
[4] P. Artzner, F. Delbaen, J.M. Eber & D. Heath. Coherent measures of risk. Mathematical finance, 9(3) (1999), 203-228.
[5] R. Bell & T.M. Cover. Game-theoretic optimal portfolios. Management Science, 34(6) (1988), 724-733.
[6] R.M. Bell & T.M. Cover. Competitive optimality of logarithmic investment. Mathematics of Operations Research, 5(2) (1980), 161-166.
[7] R. Carmona. “Statistical analysis of financial data in R”, volume 2. Springer (2014).
[8] T.M. Cover. Universal portfolios. Mathematical finance, 1(1) (1991), 1-29.
[9] T.M. Cover & E. Ordentlich. Universal portfolios with side information. IEEE Transactions on Information Theory, 42(2) (1996), 348-363.
[10] P. Das, N. Johnson & A. Banerjee. Online lazy updates for portfolio selection with transaction costs. In “Twenty-Seventh AAAI Conference on Artificial Intelligence” (2013).
[11] R. Dochow. “Online algorithms for the portfolio selection problem”. Springer (2016).
[12] J. Durbin & S.J. Koopman. “Time series analysis by state space methods”, volume 38. OUP Oxford (2012).
[13] A.A. Gaivoronski & F. Stella. Stochastic nonstationary optimization for finding universal portfolios. Annals of Operations Research, 100(1) (2000), 165-188.
[14] Y. Ha & H. Zhang. Liquidity risks, transaction costs and online portfolio selection. Transaction costs and Online Portfolio Selection (April 26, 2019), (2019).
[15] E. Hazan. The convex optimization approach to regret minimization. Optimization for machine learning, (2012), 287.
[16] E. Hazan et al. Introduction to online convex optimization. Foundations and Trends in Optimization, 2(3-4) (2016), 157-325.
[17] D.P. Helmbold, R.E. Schapire, Y. Singer & M.K. Warmuth. On-Line Portfolio Selection Using Multiplicative Updates. Mathematical Finance, 8(4) (1998), 325-347.
[18] J.L. Kelly. A new interpretation of information rate. Bell Labs Technical Journal, 35(4) (1956), 917-926.
[19] B. Li & S.C. Hoi. Online portfolio selection: A survey. ACM Computing Surveys (CSUR), 46(3) (2014), 35.
[20] B. Li, J. Wang, D. Huang & S.C. Hoi. Transaction cost optimization for online portfolio selection. Quantitative Finance, 18(8) (2018), 1411-1424.
[21] J. Lintner. The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets. The review of economics and statistics, (1965), 13-37.
[22] H. Luo, C.Y. Wei & K. Zheng. Efficient online portfolio with logarithmic regret. In “Advances in Neural Information Processing Systems” (2018), p. 8235-8245.
[23] H. Markowitz. Portfolio selection. The journal of finance, 7(1) (1952), 77-91.
[24] W.F. Sharpe. Capital asset prices: A theory of market equilibrium under conditions of risk. The journal of finance, 19(3) (1964), 425-442.
[25] M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In “Proceedings of the 20th International Conference on Machine Learning (ICML-03)” (2003), p. 928-936.

Publication Dates

Publication in this collection
06 Sept 2021
Date of issue
Jul-Sep 2021

History

Received
23 Jan 2020
Accepted
03 Mar 2021

This is an open-access article distributed under the terms of the Creative Commons Attribution License

Table 1 (values):
Metric	OGD (1%)	OGD (5%)	I_SMALL (1%)	I_SMALL (5%)	UCRP (1%)	UCRP (5%)
VaR	-0.0465759	-0.0288123	-0.0482145	-0.0285919	-0.0451722	-0.0278665
CVaR	-0.0706191	-0.0422679	-0.0707091	-0.0424577	-0.0666187	-0.0399955

Table 2 (values):
Metric	OGD_BETA (1%)	OGD_BETA (5%)	I_SMALL (1%)	I_SMALL (5%)
VaR	-0.0066861	-0.0038280	-0.0482145	-0.0285919
CVaR	-0.0144839	-0.0066819	-0.0707091	-0.0424577

Table 3 (values):
Metric	OGD_BETA (1%)	OGD_BETA (5%)	I_SMALL (1%)	I_SMALL (5%)
VaR	-0.077088	-0.0467464	-0.0482145	-0.0285919
CVaR	-0.103223	-0.0673211	-0.0707091	-0.0424577

Table 4 (values):
Metric	OGD_BETA (1%)	OGD_BETA (5%)	I_SMALL (1%)	I_SMALL (5%)	UCRP (1%)	UCRP (5%)
VaR	-0.0951314	-0.0525882	-0.0486287	-0.0237267	-0.0476879	-0.0240799
CVaR	-0.1842483	-0.0909156	-0.1001490	-0.0445529	-0.1030247	-0.0447389

Table 5 (values):
Metric	OGD_BETA (1%)	OGD_BETA (5%)	I_SMALL (1%)	I_SMALL (5%)	UCRP (1%)	UCRP (5%)
VaR	-0.0082738	-0.0050617	-0.0478774	-0.0335658	-0.0477659	-0.0320533
CVaR	-0.0137089	-0.0077418	-0.0722964	-0.0440381	-0.0727766	-0.0439273

Table 6 (values):
Metric	OGD_BETA (1%)	OGD_BETA (5%)	IBOV (1%)	IBOV (5%)	UCRP (1%)	UCRP (5%)
VaR	-0.0554984	-0.0332847	-0.0536764	-0.0308136	-0.0576220	-0.0333812
CVaR	-0.0853633	-0.0502624	-0.0795072	-0.0468622	-0.0886123	-0.0514335

Table 7 (values):
Metric	OGD_BETA (1%)	OGD_BETA (5%)	IBOV (1%)	IBOV (5%)	UCRP (1%)	UCRP (5%)
VaR	-0.0848201	-0.0524012	-0.0429216	-0.0284614	-0.0521693	-0.0368487
CVaR	-0.1381093	-0.0788993	-0.0778647	-0.0428889	-0.0881920	-0.0542163

Table 8 (values):
Metric	OGD_BETA (1%)	OGD_BETA (5%)	IBOV (1%)	IBOV (5%)	UCRP (1%)	UCRP (5%)
VaR	-0.0076789	-0.0053694	-0.0385298	-0.0260806	-0.0541532	-0.0362293
CVaR	-0.0135781	-0.0074571	-0.0695045	-0.0362081	-0.0905884	-0.0499007