This is the companion notebook for the course on the LASSO and its application for sparse portfolios and penalised predictive regressions.

\(\rightarrow\) ABOUT THE NOTEBOOK:

the code chunks are sequential. They must be executed in the correct order.
text between stars appears in bold in html format.

1 Preliminaries

First, we load the packages and import the data. The steps are the same as before. Notably, we immediately add the future/forward returns (F_Return) in addition to the past/current returns (P_Return).

if(!require(glmnet)){install.packages("glmnet", "tictoc", "scales", "ggrepel", "ggcorrplot")}
library(glmnet)                                     # This is THE package for penalised regressions
library(tidyverse)                                  # ... the usual core packages
library(lubridate)                                  # Package for date management
library(ggrepel)                                    # Package for ggplot annotations
library(scales)                                     # Package for scales in ggplot
library(tictoc)                                     # Package for computation time
library(ggcorrplot)                                 # Visualizing the correlation matrices
load("data.RData")                                  # Loading the data
data <- data %>% arrange(Date,Tick)                 # Just making sure all is in order
tick <- unique(data$Tick)                           # Set of assets
data <- data  %>% 
    select(-ESG_rank) %>%                           # Remove ESG: too many missing points
    group_by(Tick) %>%                              # Group asset by asset
    mutate(P_Return = Close / lag(Close) - 1) %>%   # Adding past returns
    mutate(F_Return = lead(P_Return)) %>%           # Adding forward returns: could to better
    na.omit()                                       # Remove missing points

Let’s have a look at correlations.

data[,4:10] %>% cor() %>% round(3) %>%              # Correlation matrix
  ggcorrplot(type = "upper", lab = TRUE)

There seems to be a positive link between Mk_Cap and Profitability. Let’s have a closer look. We build a pivot-table of averages over all firms and plot the result.

data %>%
  group_by(Tick) %>%                                # Building a pivot table
  summarise(avg_cap = mean(Mkt_Cap),
            avg_prof = mean(Prof_Marg)) %>%
  ggplot(aes(x = avg_cap, y = avg_prof)) + geom_point() +
  geom_text_repel(aes(label = Tick)) + theme_light() +
  ylab("Average Profitability Margin (%)") + xlab("Average Market Capitalization (K$)") +
  geom_smooth(method = "lm", se = FALSE)

The coefficients are available at:

data %>%
  group_by(Tick) %>%                                # Building a pivot table
  summarise(avg_cap = mean(Mkt_Cap),
            avg_prof = mean(Prof_Marg)) %>%
  lm(avg_prof ~ avg_cap, data = .)                  # Compute the regression

## 
## Call:
## lm(formula = avg_prof ~ avg_cap, data = .)
## 
## Coefficients:
## (Intercept)      avg_cap  
##   6.328e+00    3.378e-05

Indeed, even at the aggregate level, the positive relationship seems to hold! The blue line shows the best linear fit for the data.

2 Simple predictions

Next, we look at the impact of the regularisation intensities on the outcome of penalised regressions.

2.1 First, a linear model

We choose to work with a simple example. We investigate if the cross-section of characteristics can help understand the cross-section of future returns (directly, without resorting to the definition of asset-pricing factors). To do so, we normalise the firm characteristics using the scale() function. The scaling is performed each point in time. Finally, we estimate the following regression:

\[R_{t+1}=a + b^{cap}cap_t+\dots +b^{P2B}P2B_t+\epsilon_t.\]

Note that characteristics are lagged compared to returns: the scheme is clearly predictive.
Note: below, we scale returns, which may not be a great idea… In fact, it’s a pooled panel model.

data %>% 
    group_by(Date) %>%                           # Grouping to normalise on a date basis
    mutate_if(is.numeric, scale) %>%             # Scaled chars
    ungroup() %>%                                # Ungroup: global variable
    select(-Date, -Tick, -Close) %>%             # Remove irrelevant columns 
    lm(F_Return ~ ., data = .) %>%               # Perform the regression
    summary() %>%                                # Get the summary
    "$"(coef) %>%                                # Keeping only the coefs & stats
    round(3) %>%                                 # Round up to 3 digits
    data.frame()                                 # Convert to dataframe

##             Estimate Std..Error t.value Pr...t..
## (Intercept)    0.000      0.011   0.000    1.000
## Vol_1M         0.015      0.012   1.292    0.196
## Mkt_Cap       -0.044      0.012  -3.586    0.000
## P2B           -0.002      0.012  -0.164    0.870
## D2E           -0.023      0.012  -1.939    0.052
## Prof_Marg      0.024      0.012   1.967    0.049
## P_Return      -0.012      0.012  -1.021    0.307

Given our sample, the most impactful variables are the market capitalisation, profitability margin and the debt-to-equity ratio. Those are the variables for which the last column (p-value) is smaller than 5%. If we adopt a more conservative threshold (1%), only the Mkt_Cap remains significant. Notably, the past returns have little predictive power over future returns.

2.2 First step with the LASSO

We start with the general definition of penalised regression proposed by the elasticnet (Zou & Hastie (2005)). It can be characterised by its Lagrange form:

\[\underset{\mathbf{\beta}}{\text{min}} \, \left\{(\mathbf{Y}- \mathbf{X} \mathbf{\beta})'(\mathbf{Y}- \mathbf{X} \mathbf{\beta})+\lambda \alpha || \mathbf{\beta} ||_1+\lambda (1-\alpha)||\mathbf{\beta}||_2^2\right\},\] where \(\lambda>0\), \(\alpha\in[0,1]\). When \(\alpha=1\), we recover the so-called LASSO (Tibshirani (1996)) and when \(\alpha=0\), we get the classical ridge regression, which is a very smooth regularisation of the usual regression. In our case, the predicted value \(\mathbf{Y}\) will be the future return \(\mathbf{R}_{t+1}\) and \(\mathbf{X}\) the other scaled variables.

We start with the case of the LASSO. When running, glmnet() performs a series of LASSO estimations with many possible values for the penalisation constant (lambda in the function’s arguments).

data_lasso <-  data %>% 
    group_by(Date) %>%                                      # Grouping to normalise, date-by-date
    mutate_if(is.numeric,scale) %>%                         # Scaled chars
    ungroup()                                               # Ungroup: global variable
y <- data$F_Return                                          # Dependent variable: ORIGINAL returns!
x <- data_lasso %>%                                         # Independent variables
    select(-Tick, -Date, - Close, -F_Return) %>%            # Removing irrelevant columns
    as.matrix()                                             # Transform in base matrix
fit <- glmnet(x,y, alpha = 1)                               # Performing the LASSO: 1 = Lasso, 0 = Ridge
# Below, we format the results
var_names <- colnames(data)[4:10]                           # Names of independent variables
res <- summary(fit$beta)                                    # Summary of the series of LASSO regressions
lambda <- fit$lambda                                        # Values of the penalisation constant
res$Lambda <- lambda[res$j]                                 # Putting the labels where they belong
res$Char <- var_names[res$i] %>% as.factor()                # Adding names of variables to the output
res %>% ggplot(aes(x = Lambda, y = x, color = Char)) +      # Plot!
  geom_line() + theme_light()

The value of coefficients (y-axis) is a function of the penalisation intensity (lambda). As the latter increases, the magnitude of coefficient decreases. Mkt_Cap dominates and the convergence to zero of the other variables is somewhat rapid. The slope for the past return (in green) is the slowest compared to other features.

Overall, the curves are angular.

2.3 Ridge regression

We then switch to ridge regressions. The penalisation in this case is a multiple of the \(L^2\) norm of the coefficients.

fit <- glmnet(x,y, alpha = 0)                          # Performing ridge regression: 1 = Lasso, 0 = Ridge
res <- summary(fit$beta)                               # Summary of the series of ridge regressions
lambda <- fit$lambda                                   # Values of the penalisation constant
res$Lambda <- lambda[res$j]                            # Putting the labels where they belong
res$Char <- var_names[res$i] %>% as.factor()           # Adding the names of variables to the output
res %>% ggplot(aes(x = Lambda, y = x, color = Char)) + # Plot!
  geom_line() + theme_light()

res %>% ggplot(aes(x = Lambda, y = x, color = Char)) + # Logscale plot
  geom_line() + scale_x_log10() + theme_light()

On the above graph, we used a logarithmic scale to discern the relative convergence speed of each coefficient (the bulk of the first graph is unclear). In contrast to the LASSO, the convergence is very smooth.

2.4 Elasticnet: the best of two worlds?

Because the LASSO (harsh convergence) and the ridge regression (smooth convergence) have very different outcomes, there was a natural gap between the two that was filled by the elasticnet (Zou and Hastie 2005). The penalisation in this case is a linear convex combination of both types of norms: \(a||\beta||_1+(1-a)||\beta||_2^2\).

Below, we provide the corresponding plot for \(a=0.005\). The relationship is not linear: a = 0.5 is not the middle point between LASSO and ridge regression.

fit <- glmnet(x,y, alpha = 0.01)                       # The elasticnet: 1 = Lasso, 0 = Ridge
res <- summary(fit$beta)                               # Summary of elasticnet regressions
lambda <- fit$lambda                                   # Values of the penalisation constant
res$Lambda <- lambda[res$j]                            # Putting the labels where they belong
res$Char <- var_names[res$i] %>% as.factor()           # Adding the names of variables to the output
res %>% ggplot(aes(x = Lambda, y = x, color = Char)) + # Plot
  geom_line() + theme_light()

Indeed, the convergence pattern of the coefficients is somewhere between the two extremes.
It is smoother compared to the LASSO and slower compared to the ridge regression.

3 Sparse portfolios

We now turn to the application of the LASSO to sparse portfolios (see Goto & Xu 2015 for the full implementation and Stevens 1998 for the original idea).

3.1 Setup

We first initialise the variables. Sparse portfolios are based on returns only; we thus build a dedicated variable (in matrix/rectangular format).

sep_date <- as.Date("2010-01-01")           # This date separates in-sample vs out-of-sample
t_oos <- data$Date[data$Date>sep_date] %>%  # Out-of-sample dates (i.e., testing set)
    unique() %>%                            # Remove duplicates
    as.Date(origin = "1970-01-01")          # Transform in date format
returns <- data %>%                         # Computing returns, in matrix format, in 2 steps:
    select(Date, Tick, P_Return) %>%        # 1. Keep returns along with dates & firm names
    pivot_wider(names_from = Tick,          # 2. Put in matrix shape
                values_from = P_Return)    
# Below, we initialise the variables used in the backtesting loop
portf_weights <- matrix(0, nrow = length(t_oos), ncol = length(tick)) 
portf_returns <- vector(mode = "numeric", length = length(t_oos))                                                
returns                                     # A look at the returns

## # A tibble: 253 x 31
##    Date           AAPL      BA     BAC       C    CSCO      CVS      CVX       D
##    <date>        <dbl>   <dbl>   <dbl>   <dbl>   <dbl>    <dbl>    <dbl>   <dbl>
##  1 2000-01-31  0.00885  0.0739 -0.0349  0.0236  0.0222 -0.122   -0.0346   0.0637
##  2 2000-02-29  0.105   -0.167  -0.0503 -0.0896  0.207   0.00179 -0.0993  -0.106 
##  3 2000-03-31  0.184    0.0237  0.152   0.157   0.170   0.0732   0.238    0.0477
##  4 2000-04-28 -0.0862   0.0496 -0.0656 -0.0121 -0.103   0.160   -0.0791   0.171 
##  5 2000-05-31 -0.323   -0.0121  0.142   0.0540 -0.179   0        0.0936   0.0313
##  6 2000-06-30  0.248    0.0704 -0.224  -0.0312  0.116  -0.0805  -0.0825  -0.0628
##  7 2000-07-31 -0.0298   0.167   0.102   0.170   0.0295 -0.0112  -0.0687   0.0598
##  8 2000-08-31  0.198    0.102   0.142   0.107   0.0487 -0.0596   0.0785   0.179 
##  9 2000-09-29 -0.577    0.202  -0.0225 -0.0742 -0.195   0.247    0.00864  0.0982
## 10 2000-10-31 -0.240    0.0513 -0.0823 -0.0266 -0.0249  0.144   -0.0367   0.0255
## # … with 243 more rows, and 22 more variables: DIS <dbl>, F <dbl>, GE <dbl>,
## #   HD <dbl>, IBM <dbl>, INTC <dbl>, JNJ <dbl>, JPM <dbl>, K <dbl>, MCK <dbl>,
## #   MRK <dbl>, MSFT <dbl>, ORCL <dbl>, PFE <dbl>, PG <dbl>, T <dbl>, UNH <dbl>,
## #   UPS <dbl>, VZ <dbl>, WFC <dbl>, WMT <dbl>, XOM <dbl>

The core of the engine is the weighting function. Inside this function, the user will have two degrees of freedom: \(alpha\), which is the intensity between LASSO and ridge in the glmnet() function and \(\lambda\) which indicates the strength of the constraint. We recall that the approach is based on \(n\) regressions of the type: \[ \underset{\beta_{i|}}{\text{argmin}}\, \left\{\sum_{t=1}^T\left( r_{i,t}-a_i-\sum_{n\neq i}^N\beta_{i|n}r_{n,t}\right)^2+\lambda \alpha || \beta||_1+\lambda (1-\alpha)||\beta||_2^2\right\},\] where each asset’s returns are regressed against those of all other assets. The purpose is to hedge the risk of the asset thanks to position in the other assets. Due to estimation errors, the unconstrained regression performs poorly and is often very leveraged. The penalisation is intended to solve these two problems.

weights_lasso <- function(returns, alpha, lambda){  # The parameters are defined here
    w <- 0                                          # Initiate weights
    for(i in 1:ncol(returns)){                      # Loop on the assets
        y <- returns[,i]                            # Dependent variable
        x <- returns[,-i]                           # Independent variable
        fit <- glmnet(x,y, family = "gaussian", alpha = alpha, lambda = lambda)
        err <- y-predict(fit, x)                    # Prediction errors
        w[i] <- (1-sum(fit$beta))/var(err)          # Output: weight of asset i
    }
    return(w / sum(w))                              # Normalisation of weights
}

3.2 First results

Finally, we move to the main loop, which has the same structure as those seen previously.

for(t in 1:length(t_oos)){
    temp_data <- returns %>% filter(Date < t_oos[t]) %>%    # Extracting past data: expand. window 
        select(-Date) %>%                                   # Take out the date
        as.matrix()                                         # Into matrix: glmnet requires matrices
    portf_weights[t,] <- weights_lasso(temp_data, 0.01, 0.1)# Hard-coded parameters! User specified!
    realised_returns <- returns %>%                         # Realised returns:
        filter(Date == t_oos[t]) %>%                        # Filtered by date
        select(-Date)                                       # With date removed
    portf_returns[t] <- sum(portf_weights[t,] * realised_returns) # Portfolio returns
}

We then recall the perf_met() function, to which we added the turnover.

perf_met <- function(portf_returns, weights, asset_returns){
    avg_ret <- mean(portf_returns, na.rm = T)                     # Arithmetic mean 
    vol <- sd(portf_returns, na.rm = T)                           # Volatility
    Sharpe_ratio <- avg_ret / vol                                 # Sharpe ratio
    VaR_5 <- quantile(portf_returns, 0.05)                        # Value-at-risk
    turn <- 0                                                     # Initialisation of turnover
    for(t in 2:dim(weights)[1]){
        realised_returns <- asset_returns %>% filter(Date == t_oos[t]) %>% select(-Date)
        prior_weights <- weights[t-1,] * (1 + realised_returns)
        turn <- turn + apply(abs(weights[t,] - prior_weights/sum(prior_weights)),1,sum)
    }
    turn <- turn/(length(t_oos)-1)                                # Average over time
    met <- data.frame(avg_ret, vol, Sharpe_ratio, VaR_5, turn)    # Aggregation of all of this
    rownames(met) <- "metrics"
    return(met)
}

asset_returns <- filter(returns, Date>sep_date)                   # Keep out-of-sample returns
perf_met(portf_returns, portf_weights, asset_returns)             # Compute perf metrics

##            avg_ret        vol Sharpe_ratio       VaR_5       turn
## metrics 0.00919831 0.03194172    0.2879716 -0.04688039 0.04399418

Let’s have a look at the weights.

portf_weights <- bind_cols(t_oos, as_tibble(portf_weights))
colnames(portf_weights) <- c("date", tick)

portf_weights %>%
  pivot_longer(-date, names_to = "stock", values_to = "weight") %>%
  ggplot(aes(x = date, y = weight, color = stock)) + 
  geom_line() + theme_light() + 
  geom_text_repel(aes(label = stock), hjust = -0.2,
            data = portf_weights %>% 
               pivot_longer(-date, names_to = "stock", values_to = "weight") %>%
               filter(date == max(date))) +
  annotate("text", x = as.Date("2021-06-01"), y = 0.14, label = "Ticker") +
  theme(legend.position = "none")

3.3 Comparing strategies

The purpose of the sparse portfolios is to reduce the volatility. Let’s see how it compares to the other classical schemes. First, we build the weighting function. It’s just a concatenation of functions we saw previously.

weights_multi <- function(returns, j, alpha, lambda){
    N <- ncol(returns)
    if(j == 1){                     # j = 1 => EW
        return(rep(1/N,N))
    }
    if(j == 2){                     # j = 2 => Minimum Variance
        sigma <- cov(returns)
        w <- solve(sigma) %*% rep(1,N)
        return(w / sum(w))
    }
    if(j == 3){                     # j = 3 => Maximum Sharpe Ratio
        m <- apply(returns, 2, mean)
        sigma <- cov(returns)
        w <- solve(sigma) %*% m
        return(w / sum(w))
    }
    if(j == 4){                     # j = 4 => Penalised / elasticnet
        w <- weights_lasso(returns, alpha, lambda)
    }
}

Note that this does take some space in the script. A helpful shortcut is to create a separate file where you store the functions only. Below, we proceed to initialisation of variables and jump directly the main loop. Given that we compute the weights and returns for 4 portfolios, it takes a bit longer.

Tt <- length(t_oos)                                             # Nb of dates, avoid T = TRUE
nb_port <- 4                                                    # Nb of portfolios
portf_weights <- array(0, dim = c(Tt, nb_port, length(tick)))   # Store weights
portf_returns <- matrix(0, nrow = Tt, ncol = nb_port)           # Store returns

for(t in 1:length(t_oos)){
    temp_data <- returns %>% 
        filter(Date < t_oos[t]) %>% 
        select(-Date) %>%
        as.matrix()
    realised_returns <- returns %>% 
        filter(Date ==  t_oos[t]) %>% 
        select(-Date)
    for(j in 1:nb_port){                                     
        portf_weights[t,j,] <- weights_multi(temp_data, j, 0.1, 0.01)  # Hard-coded parameters!
        portf_returns[t,j] <- sum(portf_weights[t,j,] * realised_returns)
    }
}

Finally, we compute the corresponding performance metrics. Because of the complex nature of perf_met(), we use a small loop.

port_names <- c("EW", "MV", "MSR", "LASSO") # Portfolio names
met <- c()                                  # Initialise metrics
for(i in 1:nb_port){
    met <- rbind(met, perf_met(portf_returns[,i], portf_weights[,i,], asset_returns)) # Ugly!
}
met %>% data.frame(row.names = port_names)  # Display results

##           avg_ret        vol Sharpe_ratio       VaR_5       turn
## EW    0.010769125 0.03982011    0.2704444 -0.05507966 0.03872154
## MV    0.009737534 0.03241443    0.3004074 -0.04274005 0.10221530
## MSR   0.011078110 0.04361729    0.2539844 -0.06104447 0.29806042
## LASSO 0.009225296 0.03148715    0.2929860 -0.04077635 0.07107574

The LASSO portfolio has the lowest volatility! a close vol value is obtained by the Minimum Volatility (MV) model. The latter has a turnover much smaller compared to the MSR portfolio but is beaten by the LASSO strategy on this metric. Moreover, in terms of extreme risk (VaR), the LASSO portfolio is the best!

The parameters of the LASSO portfolio play a decisive role in the performance of the strategy: they should be chosen carefully after sensitivity tests show how they impact the results. The \(\lambda\) will dictate the amount of constraint on the coefficients. A strict constraint will build more stable and sparse portfolios, but the weights will be further from those of the MV portfolio. As usual, it’s all a matter of tradeoff.

4 Pure LASSO predictions

Another use of the LASSO would simply be to use it as predictive tool. In this case, the explanatory variables would be, e.g., the firms’ characteristics. Below, we show how to implement the predictive regression approach. More precisely, it is a pooled panel estimation.

We start by the weighting function. We use the predictive regression to forecast future returns (probably not the best choice). Based on the prediction, we form equally-weighted portfolios of assets with positive forecasts (in fact, above median forecasts would be more stable and a better choice: fixed number of assets in the portfolio).

weights_pure_lasso <- function(past_data, current_data, alpha, lambda){
    y <- past_data$F_Return                                 # Dependent variable
    x <- past_data %>%                                      # Independent variables
        select(-Tick, -Date, -Close, -F_Return) %>%         # Remove irrelevant columns
        as.matrix()                                         # Format to matrix shape
    fit <- glmnet(x,y, alpha = alpha, lambda = lambda)      # Performing the glmnet regression
    newx <- current_data %>%                                # Preparing the new data
        select(-Tick, -Date, -Close, -F_Return) %>%         # Remove irrelevant columns
        as.matrix()                                         # Format to matrix shape
    pred <- predict(fit, newx = newx)                       # Launching the prediction
    w <- pred > median(pred)                                # Invests only if positive prediction
    return(w/sum(w))                                        # Then, equal weights
}

Below, we launch the main loop. Because of the forward return we must go one step backward in the out-of-sample dates. Indeed, before, we took data up to \(t-1\) to decide the allocation at time \(t-1\) and apply the return observed at time \(t\). Since we resort to the forward return, we must go back to time \(t-2\). The decision is made with the values at time \(t-1\):

past up to t-2: training data
t-1: data used for the current prediction
the return from t-1 to t is the future return applied to the strategy

To ‘simplify’ the architecture, we simply define a new vector of out-of-sample dates.

t_oos2 <- data$Date[data$Date>sep_date-20] %>%          # New dates, we take one more (prev. month)!:: 
    unique() %>%                                        # Remove duplicates
    as.Date(origin = "1970-01-01")                      # Transform in date format
portf_weights <- matrix(0, nrow = length(t_oos), ncol = length(tick))   # Initialisation
portf_returns <- c()                                                    # Initialisation

for(t in 2:length(t_oos2)){                                             # Current time is t-1
    past_data <- data_lasso %>% filter(Date < t_oos2[t-1])              # Past data: expanding window
    current_data <- data_lasso %>% filter(Date == t_oos2[t-1])          # Extracting current data
    portf_weights[t-1,] <- weights_pure_lasso(past_data,current_data, 0.1, 0.01)  
    # Hard-coded parameters above! User specified!
    realised_returns <- returns %>%                     # Realised returns
        filter(Date ==  t_oos2[t]) %>%                  # Note: starts at t = 2, equal to t_oos[1]
        select(-Date)                                   # Take out date column
    portf_returns[t-1] <- sum(portf_weights[t-1,] * realised_returns) 
    # Note: t-1, like for the portfolios !!!
}
asset_returns <- filter(returns, Date %in% t_oos) # And not t_oos2!
perf_met(portf_returns, portf_weights, asset_returns)

##            avg_ret        vol Sharpe_ratio      VaR_5      turn
## metrics 0.01048623 0.04142142    0.2531597 -0.0592228 0.3806262

The performance is reasonably good compared to our previous results. The average return is quite high, but the risk has also increased (the VaR, notably). The turnover is nonetheless much too high.

5 Exercises

5.1 Simple tests

Try different values for alpha and lambda in the sparse portfolios and see if it is possible to reduce the out-of-sample volatility.

5.2 LASSO tuning

Create a weighting function and the corresponding backtesting loop that tests several parameter values for:
- the LASSO-based sparse portfolios
- the predictive LASSO strategies

6 BONUS: sensitivity analysis of LASSO - Min Var portfolios

Is it possible to improve the volatility of the penalized minimum variance portfolio?
Given the computation time of only one backtest, this will take some time.

# Below, we create a function that depends on the two parameters
lasso_sensi <- function(alpha, lambda, t_oos, tick, returns){
    portf_weights <- matrix(0, nrow = length(t_oos), ncol = length(tick)) 
    portf_returns <- c()       
    for(t in 1:length(t_oos)){
        temp_data <- returns %>% filter(Date < t_oos[t]) %>%    # Extracting past data: expnd. wndw 
            select(-Date) %>%                                   # Take out the date
            as.matrix()                                         # Into matrix
        portf_weights[t,] <- weights_lasso(temp_data, alpha, lambda)
        realised_returns <- returns %>%                         # Realised returns:
            filter(Date ==  t_oos[t]) %>%                       # Filtered by date
            select(-Date)                                       # With date removed
        portf_returns[t] <- sum(portf_weights[t,] * realised_returns) # Portfolio returns
    }
    return(sd(portf_returns))
}

alpha <- c(0.001, 0.01, 0.1)             # alpha values
lambda <- c(0.001, 0.01, 0.1, 1)         # lambda values
pars <- expand.grid(alpha, lambda)       # Exploring all combinations!
alpha <- pars[,1]
lambda <- pars[,2]
tic()                                    # Launch the clock
vols <- pmap(list(alpha, lambda),        # Parameters for the grid search
             lasso_sensi,                # Function on which to apply pmap
             t_oos = t_oos,              # The other inputs below
             tick = tick,
             returns = returns) 
toc()                                    # Stop the clock!

## 116.769 sec elapsed

vols <- vols %>% 
    unlist() %>%
    cbind(as.matrix(pars)) %>%
    data.frame()
colnames(vols) <- c("vol", "alpha", "lambda")
vols %>% ggplot(aes(x = as.factor(alpha), y = vol, fill = as.factor(alpha))) + 
  geom_col() + facet_grid(~lambda) + theme_light() +
  scale_y_continuous(limits = c(0.02, 0.04), oob = rescale_none)

The original parameters were a rather good choice.

7 Full code for a neat regression graph.

First, let’s extract the coefficients of the regression.

data %>%
  mutate(Year = year(Date)) %>%                # Adding a column thanks to year() from lubridate
  group_by(Tick) %>%                           # Building a pivot table
  summarise(avg_cap = mean(Mkt_Cap),
            avg_prof = mean(Prof_Marg)) %>%
  lm(avg_prof ~ avg_cap, data = .)

## 
## Call:
## lm(formula = avg_prof ~ avg_cap, data = .)
## 
## Coefficients:
## (Intercept)      avg_cap  
##   6.328e+00    3.378e-05

Next, we can move forward to the plot.

data %>%
  group_by(Tick) %>%                                # Building a pivot table
  summarise(avg_cap = mean(Mkt_Cap),                # The average cap
            avg_prof = mean(Prof_Marg)) %>%         # The average prof
  mutate(pred = 6.302 + 3.368e-05 * avg_cap,        # The model values (see above)
         col = if_else(pred > avg_prof, 
                       "Positive error",
                       "Negative error")) %>%    
  ggplot(aes(x = avg_cap, y = avg_prof)) +          # The plot!
  geom_point() +
  geom_text_repel(aes(label = Tick)) + theme_light() +
  ylab("Average Profitability Margin (%)") + xlab("Average Market Capitalization (K$)") +
  stat_smooth(geom = 'line', method = "lm", alpha = 0.4, se = F, linetype = 2) +
  geom_point(aes(x = avg_cap, y = pred), shape = 1, color = "#1166EE") +
  geom_segment(aes(xend = avg_cap, yend = pred, color = col), alpha = .4) +
  annotate(geom = "text", x = 320000, y = 18, label = "prediction line", color = "#1166EE", angle = 18) +
  annotate(geom = "text", x = 229000, y = 9.5, label = "positive error", color = "#99BBBB", angle = 90) +
  annotate(geom = "text", x = 334000, y = 12, label = "positive error", color = "#99BBBB", angle = 90) +
  annotate(geom = "text", x = 420000, y = 24.5, label = "negative error", color = "#DD8888", angle = 90) +
  annotate(geom = "text", x = 125000, y = 15, label = "negative error", color = "#DD8888", angle = 90) +
  theme(legend.position = "none") +
  annotate(geom = "text", x = 335000, y = 16.8, label = "prof = 7.15+2.16e-05 * mkt_cap", color = "#1166EE", angle = 18)

# ggsave("reg3.pdf", width = 8, height = 5)

Lasso & sparse portfolios