# Risk Management Of Stocks Using Python

## Learn different ways to evaluate the Value at Risk (VaR) of a stock

---

**What is Value at Risk (VaR)?**

Value at Risk (VaR) is the most common metric used by individuals, firms, and banks to determine the extent of potential financial loss over a given period of time. Financial institutions use VaR to assess the risk associated with an investment and to evaluate whether they have sufficient funds to cover potential losses; it also helps risk managers re-evaluate and adjust their investments to reduce the risk of larger losses. The typical questions that VaR helps answer are: what is the maximum loss on an investment of 100,000 over one year at a confidence level of 95%? Or, what is the probability that losses will exceed 3% next year? The standard way of representing these scenarios is as a normal distribution, which makes the analysis and interpretation much easier.

There are three important factors at play: the confidence level, the time period, and the investment amount or expected percentage of loss.

In this blog, we will cover the topics below:

- Exploratory data analysis (EDA)
- Calculation of returns
- Methodologies to calculate VaR
  - Historical method
  - Bootstrap method
  - Decay factor method
  - Monte Carlo simulation method

# Exploratory data analysis (EDA)

**Load the libraries**

These are standard libraries for data loading, analysis, and visualization. The key point to note is the use of nsepy for fetching the stock data from the NSE; an alternative is to use Yahoo Finance.

```python
import numpy as np
import pandas as pd
import warnings
import matplotlib.pyplot as plt
from datetime import date
from tabulate import tabulate
from nsepy import get_history as gh
```

**Load the data and initial settings**

We will use the last year of data for TATA MOTORS. This can be any stock of interest, such as ICICI or DABUR. The data gives the daily closing price of the stock.

```python
initial_investment = 100000
start_date = date(2022,2,2)
end_date = date(2023,2,2)
stocksymbols = ['TATAMOTORS']  # This can be any stock

def load_stock_data(self):
    df = pd.DataFrame()
    for i in range(len(self.ticker)):
        ....
    return df
```

OUTPUT:

```
            TATAMOTORS
Date
2022-10-14      396.25
2022-10-17      396.10
2022-10-18      404.25
2022-10-19      399.05
2022-10-20      398.10
...                ...
2023-01-27      445.60
2023-01-30      443.65
2023-01-31      452.10
2023-02-01      446.65
2023-02-02      444.80
```

**Calculate the returns**

The calculation of returns is a simple process: we take the percentage change between the current value and the previous value.

(P(t+1) - P(t)) / P(t)

This can be achieved easily in Python using the *pct_change* method. Let's compute and visualize the returns.

```python
def stock_returns(self):
    df = self.load_stock_data()
    df.columns = ['Stock']
    returns = df.pct_change()
    returns.dropna(inplace=True)
    return returns
```

OUTPUT:

```
               Stock
Date
2022-10-17 -0.000379
2022-10-18  0.020576
2022-10-19 -0.012863
2022-10-20 -0.002381
2022-10-21 -0.000126
```
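As a quick sanity check, the *pct_change* output can be verified against the formula by hand. The short sketch below uses the first few closing prices from the sample price table above:

```python
import pandas as pd

# A handful of closing prices, taken from the sample output above
prices = pd.Series([396.25, 396.10, 404.25, 399.05, 398.10])

# pct_change() computes (P(t+1) - P(t)) / P(t) for each consecutive pair
returns = prices.pct_change().dropna()

# Manual check of the first return against the formula
manual = (396.10 - 396.25) / 396.25
assert abs(returns.iloc[0] - manual) < 1e-12

print(returns.round(6).tolist())
```

The rounded values match the first rows of the returns output above.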

We have accessed the data, loaded it, calculated the returns, and visualized the trends. Let us now try the four methods to calculate the VaR in the next sections.

# Historical method

This is the simplest and most basic of the methods, as it gives no weight to the shape of the distribution or its tails. In this method, we take the return values from the previous section and sort them. As a standard, we take a confidence level of 95%, i.e., our focus shifts to the bottom 5%. There are 252 trading days in a year, so all we need to do is calculate 5% of 252, which is 12.6, meaning we take the 13th-lowest return, which turns out to be **-0.039306** as shown below.

```python
returns.sort_values('Stock').head(13)
```

OUTPUT:

```
2022-02-24   -0.102830
2022-09-26   -0.060506
2022-03-07   -0.055722
2022-02-14   -0.054926
2022-06-16   -0.051075
2022-06-13   -0.049877
2022-11-10   -0.048367
2022-03-04   -0.045413
2022-05-06   -0.041637
2022-05-12   -0.040835
2022-12-23   -0.040816
2022-05-19   -0.039745
2022-10-10   -0.039306
```

Python has a more elegant way to get this value using NumPy's percentile function (note that in newer NumPy versions, the `interpolation` argument has been renamed to `method`).

```python
np.percentile(returns['Stock'], 5, interpolation='lower')
```

OUTPUT:

```
-0.039306077884265433
```
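To make the 13th-lowest-return logic concrete, here is a minimal, self-contained sketch on synthetic returns. The mean and standard deviation are illustrative assumptions, and `method='lower'` is the newer NumPy spelling of `interpolation='lower'`:

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic daily returns standing in for one year of stock data
returns = rng.normal(loc=0.0003, scale=0.0225, size=252)

# 5% of 252 trading days is 12.6, so the 13th-lowest return is the cutoff
k = int(np.ceil(0.05 * 252))            # 13
var_sorted = np.sort(returns)[k - 1]

# np.percentile with the 'lower' rule selects the same observation
var_pct = np.percentile(returns, 5, method='lower')

assert np.isclose(var_sorted, var_pct)
print(round(float(var_pct), 6))
```

Both approaches pick the same observation: the percentile function with the `'lower'` rule returns an actual data point rather than an interpolated value.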

Here is the complete method that loads the data and calculates the VaR. The end result is presented in a nice tabular format.

```python
def var_historical(self):
    returns = obj_loadData.df_returns.copy()
    ....
    returns.sort_values('Stock').head(13)
    var_hist = np.percentile(returns['Stock'], 5, interpolation='lower')
    print(tabulate([[self.ticker, avg_rets, avg_std, var_hist]],
                   headers=['Mean', 'Standard Deviation', 'VaR %'],
                   tablefmt='fancy_grid', stralign='center',
                   numalign='center', floatfmt=".4f"))
    return var_hist

def plot_shade(self, var_returns):
    ....
    plt.text(var_returns, 25, f'VAR {round(var_returns, 4)} @ 5%',
             horizontalalignment='right',
             size='small',
             color='navy')
    ....
    plt.gca().add_patch(rect)
```

OUTPUT:

```
╒════════════════╤════════╤══════════════════════╤══════════╕
│                │  Mean  │  Standard Deviation  │   VaR    │
╞════════════════╪════════╪══════════════════════╪══════════╡
│ ['TATAMOTORS'] │ 0.0003 │        0.0225        │ -0.039306│
╘════════════════╧════════╧══════════════════════╧══════════╛
```

The value at risk of -0.039306 indicates that at a 95% confidence level, there will be a maximum loss of 3.9%, or there is a 5% probability that the losses will exceed 3.9%. In monetary terms, for an investment of 100,000, we are 95% confident that the maximum loss will be 3,930.
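The conversion from a percentage VaR to a currency figure is a single multiplication; here is a small sketch using the investment amount and historical VaR from above:

```python
# Convert the percentage VaR into a currency amount for the stated investment
initial_investment = 100000
var_hist = -0.039306  # 5th-percentile return from the historical method

max_loss = initial_investment * abs(var_hist)
print(f'At 95% confidence, the maximum expected loss is {max_loss:,.0f}')
```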

# Bootstrap method

The bootstrap method is similar to the historical method, but here we resample the returns many times (100, 1000, or more), calculate the VaR on each sample, and in the end take the average VaR. This is similar to resampling in the data science space, where a dataset is resampled many times and the model is retrained on each sample.

```python
def var_bootstrap(self, iterations: int):

    def var_boot(data):
        ...
        return np.percentile(dff, 5, interpolation='lower')

    def bootstrap(data, func):
        sample = np.random.choice(data, len(data))
        return func(sample)

    def generate_sample_data(data, func, size):
        bs_replicates = np.empty(size)
        ....

    returns = obj_loadData.df_returns.copy()
    ....
    return np.mean(bootstrap_VaR)
```
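Since some steps in the method above are elided, here is a self-contained sketch of the same idea on synthetic returns. The sample size, iteration count, and distribution parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic daily returns standing in for the loaded stock data
returns = rng.normal(0.0003, 0.0225, 252)

def var_boot(sample):
    # 5th-percentile ('lower') VaR of one resample
    return np.percentile(sample, 5, method='lower')

iterations = 500
# Resample with replacement and compute the VaR of each resample
bootstrap_VaR = [var_boot(rng.choice(returns, len(returns)))
                 for _ in range(iterations)]

print(f'The Bootstrap VaR measure is {np.mean(bootstrap_VaR):.4f}')
```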

We have resampled the data over 500 iterations and calculated the VaR on each run. As the process is random, the values are expected to be in the same vicinity as the historical estimate of -0.039306. Let us plot the distribution to validate our understanding.

The gray rectangle represents the bootstrapped VaR values, which fall in the range of -0.033 to -0.042, close to the historical estimate. Now, let's take the mean of these values to arrive at the VaR and visualize the significance level by highlighting the area.

```python
var_bootstrap = np.mean(bootstrap_VaR)
print(f'The Bootstrap VaR measure is {np.mean(bootstrap_VaR)}')
return np.mean(bootstrap_VaR)
```

OUTPUT:

```
╒════════════════╤═════════════╕
│     Stock      │  Bootstrap  │
╞════════════════╪═════════════╡
│ ['TATAMOTORS'] │   -0.0369   │
╘════════════════╧═════════════╛
```

The value at risk of -0.0369 indicates that at a 95% confidence level, there will be a maximum loss of 3.69%, or there is a 5% probability that the losses will exceed 3.69%. This is slightly smaller in magnitude than the historical estimate, possibly due to the randomness introduced by resampling.

# Decay factor method

In both the previous methods, there is no consideration for highs, lows, or market fluctuations, which means there is an inherent assumption that the future trend will be similar to the past year. The decay method addresses this issue by placing a higher weightage on recent data. We will use a decay factor between 0 and 1, i.e., assign the lowest weightage to the farthest data point and the highest weightage to the most recent data point.

```python
decay_factor = 0.5  # we're picking this arbitrarily
n = len(returns)
wts = [(decay_factor**(i-1) * (1-decay_factor))/(1-decay_factor**n)
       for i in range(1, n+1)]
```

OUTPUT:

```
[0.5,
 0.25,
 0.125,
 0.0625,
 0.03125,
 ....
 2.210859150104178e-75,
 1.105429575052089e-75,
 5.527147875260445e-76]
```
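A useful property of this weighting scheme is that the denominator normalises the weights so they sum to exactly 1, forming a proper probability distribution. A quick check, assuming a series length of 252:

```python
decay_factor = 0.5
n = 252  # assumed number of return observations

wts = [(decay_factor**(i-1) * (1-decay_factor)) / (1-decay_factor**n)
       for i in range(1, n+1)]

# The geometric series sums to (1 - decay_factor**n), which the
# denominator cancels, so the weights sum to 1
assert abs(sum(wts) - 1.0) < 1e-12
print(wts[:3])  # → [0.5, 0.25, 0.125]
```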

Next, we will create a data frame that assigns a weight to each data point.

```python
wts_returns = pd.DataFrame(returns_recent_first['Stock'])
wts_returns['wts'] = wts
```

OUTPUT:

```
               Stock           wts
Date
2023-02-03  0.001461       0.50000
2023-02-02 -0.004142       0.25000
2023-02-01 -0.012055       0.12500
2023-01-31  0.019047       0.06250
2023-01-30 -0.004376       0.03125
...              ...           ...
2022-02-07 -0.011986  2.210859e-75
2022-02-04 -0.007730  1.105430e-75
2022-02-03 -0.003752  5.527148e-76
```

In the previous methods, we sorted the return values in ascending order and took the 13th-lowest return value as the VaR. We could do that because every data point carried the same weight. In the decay factor method, each point has a different weight, so we cannot simply take the 13th-lowest return value. Instead, we sum the weights until we hit the 0.05 mark, which is the 5% significance level; to make this easier, we will use a cumulative sum.

```python
sort_wts = wts_returns.sort_values(by='Stock')
sort_wts['Cumulative'] = sort_wts.wts.cumsum()
sort_wts

sort_wts = sort_wts.reset_index()
idx = sort_wts[sort_wts.Cumulative <= 0.05].Stock.idxmax()
sort_wts.filter(items=[idx], axis=0)
```

OUTPUT:

```
          Date     Stock           wts    Cumulative
63  2022-06-02 -0.012258  6.681912e-52  7.488894e-04
64  2023-02-01 -0.012055  1.250000e-01  1.257489e-01
```

We find that the cumulative value of 0.05 falls between rows 63 and 64. We interpolate to get the value, which turns out to be approximately -0.0122.

```python
xp = sort_wts.loc[idx:idx+1, 'Cumulative'].values
fp = sort_wts.loc[idx:idx+1, 'Stock'].values
var_decay = np.interp(0.05, xp, fp)
```

OUTPUT:

```
-0.01217808614447785
```

Here is the complete method that loads the data, generates and assigns weights, interpolates, and calculates the VaR.

```python
def var_weighted_decay_factor(self):
    returns = obj_loadData.df_returns.copy()
    decay_factor = 0.5  # we're picking this arbitrarily
    n = len(returns)
    wts = [(decay_factor**(i-1) * (1-decay_factor))/(1-decay_factor**n)
           for i in range(1, n+1)]
    ....
    return var_decay
```

OUTPUT:

```
╒════════════════╤═════════════╕
│     Stock      │    Decay    │
╞════════════════╪═════════════╡
│ ['TATAMOTORS'] │   -0.0122   │
╘════════════════╧═════════════╛
```

The decay method indicates that at a 95% confidence level, there will be a maximum loss of 1.22%, or there is a 5% probability that the losses will exceed 1.22%. This is significantly lower than the other two methods, due to the assignment of weights. The decay rate is set to 0.5; we can increase or decrease it to check for the most reasonable VaR. One approach would be to take a range of decay rates and run a simulation to get a range of VaR values.
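The suggested decay-rate sweep can be sketched as follows. This is a minimal, self-contained illustration on synthetic returns: `var_decay_at` is a hypothetical helper that repeats the weight, cumulative-sum, and interpolation steps above for a given decay factor.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Synthetic returns, most recent first, standing in for the stock data
returns = pd.Series(rng.normal(0.0003, 0.0225, 252))

def var_decay_at(returns, decay_factor):
    n = len(returns)
    wts = [(decay_factor**(i-1) * (1-decay_factor)) / (1-decay_factor**n)
           for i in range(1, n+1)]
    s = (pd.DataFrame({'Stock': returns.values, 'wts': wts})
           .sort_values('Stock').reset_index(drop=True))
    s['Cumulative'] = s.wts.cumsum()
    below = s[s.Cumulative <= 0.05]
    if below.empty:
        # The lowest return already carries more than 5% of the weight
        return s.loc[0, 'Stock']
    idx = below.index.max()
    # Interpolate the return at the 0.05 cumulative-weight mark
    xp = s.loc[idx:idx+1, 'Cumulative'].values
    fp = s.loc[idx:idx+1, 'Stock'].values
    return np.interp(0.05, xp, fp)

for d in (0.5, 0.9, 0.99):
    print(f'decay_factor={d}: VaR={var_decay_at(returns, d):.4f}')
```

As the decay factor approaches 1, the weights become nearly uniform and the estimate converges toward the plain historical VaR.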

# Monte Carlo simulation method

This method is similar to the bootstrap method; the only difference is that instead of drawing data points from the existing set of return values, we generate a new set of return values from the same distribution. Let us understand it step by step.

We will calculate the mean and the standard deviation.

```python
returns_mean = returns['Stock'].mean()
returns_sd = returns['Stock'].std()
```

We will write a method to generate a set of values from a normal distribution with a mean of *returns_mean* and a standard deviation of *returns_sd*.

```python
iterations = 1000

def simulate_values(mu, sigma, iterations):
    try:
        result = []
        for i in range(iterations):
            tmp_val = np.random.normal(mu, sigma, len(returns))
            var_hist = np.percentile(tmp_val, 5, interpolation='lower')
            result.append(var_hist)
        return result
    except Exception as e:
        print(f'An exception occurred while generating simulation values: {e}')
```
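For reference, here is a runnable, standalone version of the same simulation that does not depend on the loaded stock data; the mean and standard deviation are assumed placeholder values for those computed in the EDA step:

```python
import numpy as np

rng = np.random.default_rng(7)
returns_mean, returns_sd = 0.0003, 0.0225  # assumed values from the EDA step
n_days, iterations = 252, 1000

def simulate_values(mu, sigma, iterations):
    # Draw a fresh year of normal returns per iteration and record its 5% VaR
    result = []
    for _ in range(iterations):
        tmp_val = rng.normal(mu, sigma, n_days)
        result.append(np.percentile(tmp_val, 5, method='lower'))
    return result

sim_val = simulate_values(returns_mean, returns_sd, iterations)
print(f'The mean VaR is {np.mean(sim_val):.4f}')
```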

Let us now execute the method and calculate the VaR for each of the 1000 iterations.

```python
import statistics  # needed for statistics.mean below

sim_val = simulate_values(returns_mean, returns_sd, iterations)
tmp_df = pd.DataFrame(columns=['Iteration', 'VaR'])
tmp_df['Iteration'] = [i for i in range(1, iterations+1)]
tmp_df['VaR'] = sim_val
tmp_df.head(50)
print(f'The mean VaR is {statistics.mean(sim_val)}')
```

OUTPUT:

```
 Iteration       VaR
         1 -0.034532
         2 -0.035278
         3 -0.034831
         4 -0.033859
       ...       ...
       997 -0.035699
       998 -0.038877
       999 -0.038362
      1000 -0.035165
```

```
╒════════════════╤═══════════════╕
│     Stock      │  Monte Carlo  │
╞════════════════╪═══════════════╡
│ ['TATAMOTORS'] │   -0.03716    │
╘════════════════╧═══════════════╛
```

The Monte Carlo VaR is smaller in magnitude than the historical estimate (-0.0393) and larger than the decay estimate (-0.0122). The result is close to that of the bootstrap method (-0.0366).

It would be good to have a function that gives consolidated results from all the approaches in tabular form.

```python
def show_summary(self):
    try:
        var_hist = self.var_historical()
        var_bs = self.var_bootstrap()
        var_decay = self.var_weighted_decay_factor()
        var_MC = self.var_monte_carlo()
        print(tabulate([[self.ticker, var_hist, var_bs, var_decay, var_MC]],
                       headers=['Historical', 'Bootstrap',
                                'Decay', 'Monte Carlo'],
                       tablefmt='fancy_grid', stralign='center',
                       numalign='center', floatfmt=".4f"))
    except Exception as e:
        print(f'An exception occurred while executing show_summary: {e}')
```

OUTPUT:

```
╒════════════════╤══════════════╤═════════════╤═════════╤═══════════════╕
│     Stock      │  Historical  │  Bootstrap  │  Decay  │  Monte Carlo  │
╞════════════════╪══════════════╪═════════════╪═════════╪═══════════════╡
│ ['TATAMOTORS'] │   -0.0393    │   -0.0366   │ -0.0122 │    -0.0373    │
╘════════════════╧══════════════╧═════════════╧═════════╧═══════════════╛
```

The complete code can be found on GitHub.

## Advantages of VaR

- It is easy to understand and interpret as a single metric for risk assessment, and it is used as a first-line analysis to gauge the pulse of an investment over a given period.
- It can be used for risk assessment of bonds, shares, or similar asset classes.

## Disadvantages of VaR

- The accuracy of the assessment depends on the quality of the data and the assumptions. eg: it is assumed that the data is normally distributed, the economic factors are not considered, etc.
- VaR gives the maximum potential loss for a given period at a specific confidence level, but it doesn't tell us how large the losses beyond that point may be.
- It is difficult to use VaR for large portfolios as risk calculation should be done for each of the assets and there is a correlation angle to be factored in as well.

**Conclusion**

VaR is a widely used metric for buying, selling, and recommending assets because of its simplicity. There are multiple approaches to calculating VaR, and we covered four basic methods using Python. As there is no single standard for calculating VaR, different methods give different results, as we saw in this blog. It is a good first-line tool to gauge the risk of an asset class, and it is most effective when coupled with more sophisticated methods that factor in market trends, economic conditions, and other financial factors.

I hope you liked the article and found it helpful.

You can connect with me on **LinkedIn** and **GitHub**.

## Disclaimer

The blog is only for educational purposes and should not be used as the basis for making any real-world financial decisions.