01 - Histograms and scatterplots
data:image/s3,"s3://crabby-images/6a45b/6a45b69c7a25ad72f6c2b8cea4196ec742761895" alt=""
- One of the most informative ways to consider daily returns is when we compare the returns of one stock with another.
Time: 00:00:19
02 - A closer look at daily returns
data:image/s3,"s3://crabby-images/ce90a/ce90a6978653bf0f0dd8f1d467121dadd0b3a5d8" alt=""
- starting with a price time series.
- we build daily returns, this daily return data is not too revealing as time-series.
histograms
- A histogram is a kind of bar chart where we plot the number of occurrences of each item versus the value.
- split up the range of data into lots of little bins.
- and count up how many times the data matches the range across that bin.
- a bar of the appropriate height in the histogram that represents how many times the data matched that value.
Time: 00:02:12
03 - Quiz What would it look like
What the histogram of S&P 500 daily return over many years look like?
data:image/s3,"s3://crabby-images/94748/94748ea6ef9f2965749374b0c4757001d2d7da9a" alt=""
The correct answer: bell curve.
Time: 00:00:16
04 - Histogram of daily returns
data:image/s3,"s3://crabby-images/e8ff0/e8ff0ca85950d4a4785685774d0258b662608c86" alt=""
Statistics we can run on it to characterize histograms.
- mean.
- standard deviation: how far do individual measurements deviate from the mean.
- Kurtosis (means curved or arching): it tells us about the tails of the distribution.
The measure of kurtosis tells us how much different our histogram from that traditional Gaussian distribution.
- Positive Kurtosiswe indicate fat tails, Meaning that there are more occurrences out in these tails than would be expected if it were a normal distribution.
- Negative kurtosis indicates skinnytails, meaning that there are many fewer occurrences than would be expected if it were a normal distribution on the tails.
Time: 00:02:25
05 - How to plot a histogram
data:image/s3,"s3://crabby-images/035bb/035bb1ece3087560923ab16c340e3552346ed7a5" alt=""
daily_returns.hist(bin=20)
will plot daily_return as histogram with 20 bins. the default bin
parameter is 10.
data:image/s3,"s3://crabby-images/2b541/2b5410ebc3cd16cc84ac55938f3c900f66575884" alt=""
Time: 00:02:03
06 - Computing histogram statistics
Calculate mea and deviation and kurtosis:
mean = daily_returns['SPY'].mean()
std = daily_returns['SPY'].std()
kurtosis = daily_returns.kurtosis()
Plot mean and diviation using axvline()
in the Matplotlib library .
plt.axvline(mean, color='w', linestyle='dashed', linewidth=2)
plt.axvline(std, color='r', linestyle='dashed', linewidth=2)
plt.axvline(-std, color='r', linestyle='dashed', linewidth=2)
plt.show()
data:image/s3,"s3://crabby-images/2c970/2c97073b7b8d9594026a4a1686910ed0831cb875" alt="image.png"
positive kurtosis for the SPY stock, which means we have fat tails.
Note: bincounts()
using numpy.histogram
function.
Time: 00:02:11
07 - quiz: Compare two histograms
data:image/s3,"s3://crabby-images/0965b/0965bff0d60142f372439b00ad27003e8a86ef23" alt=""
Quiz: Select the option that best describes the relationship between XYZ and SPY.
Note:
- These are histograms of daily return values, i.e. X-axis is +/- change (%), and Y-axis is the number of occurrences.
- We are considering two general properties indicated by the histogram for each stock: return and volatility (or risk).
correct answer: XYZ has a lower return and higher volatility than SPY.
- mean of XYZ, is lower than the mean of SPY.
- XYZ got a larger standard deviation (broader shoulders), therefore, higher volatility.
8 - Plot two histograms together
data:image/s3,"s3://crabby-images/c5a65/c5a6584d758b28e5bc5bfa62dfa722f1163b7880" alt=""
Since the daily_returns data frame has data for two stocks, daily_returns.hist(bin=20)
will plot the data in two subplots.
data:image/s3,"s3://crabby-images/acced/acced3fa4f9145ae49bafd3b8edc8415bad1daed" alt=""
daily_returns['SPY'].hist(bin=20,label="SPY")
daily_returns['XOM'].hist(bin=20,label="XOM")
...
- To get two histograms on the same x and y axis, call the histogram functions separately on each of the stocks daily return values.
- also add the label parameter so that we can differentiate between the histogram of the SPY and XOM.
Time: 00:01:31
9 - Scatterplots
data:image/s3,"s3://crabby-images/6149e/6149e28c56264dc9e66216c9e3a133b72eb7a765" alt=""
A scatterplot is another way to visualize the differences between daily returns of individual stocks. The left graph is daily return of two stocks. S&P 500 and XYZ.
- On a scatterplot, there are a number of individual points or dots represents the daily returns of two stocks that happened on a particular day.
- the dots are somewhat scattered. They don't form a perfect line.
Time: 00:02:02
10 - Fitting a line to data points
data:image/s3,"s3://crabby-images/be53e/be53e08d9f5350d86018d2573f3b3c7467c59469" alt=""
- we can fit a line to it using linear regression.
slope, in financial terminology, is usually referred to as beta which means is how reactive is the stock to the market.
- e.g. Beta = 1 then on average, when the market goes up 1%, that particular stock also goes up 1%.
- if beta = 2, then if the market were to go up 1%, we'd expect on average for that stock to go up 2%.
intercepts, also called alpha. Positive alpha means that this stock is actually on average performing a little bit better than the S&P 500 every day. If it's negative, it means on average it's returning a little bit less than the market overall.
Time: 00:01:53
11 - Slope does not equal correlation
data:image/s3,"s3://crabby-images/827de/827de9fe305016c680f1cab7144bda9d9a6ef5ec" alt=""
- The slope is no correlation.
- Correlation is a measure of how tightly do these individual points fit that line. the range of correlation is from 0 to 1.
Time: 00:01:15
12 - Quiz: Correlation vs slope
data:image/s3,"s3://crabby-images/7aa1f/7aa1fbbe2fd22c42846082df772dcab0f5b459c5" alt="quiz"
Select the option that best compares ABC against XYZ, in terms of beta (slope of linear fit) and correlation with the market (represented by SPY).
13 - Scatterplots in python
Key codes
daily_returns.plot(kind='scattr',x='SPY', y='XOM') # scatterplot
beta_XOM,alpha_XOM=np.polyfit(daily_returns['SPY'],daily_returns['XOM'], 1)
plt.plot(daily_returns['SPY'],beta_XOM*daily_returns['SPY'] + alpha_XOM, '-',color='r')
plt.show()
- Kind parameter of the plot function of the data frame will help us plot scatterplots.
- NumPy's
ployfit()
function can fit a line to scatterplots and get alpha and beta of the regression line. the parameter "1" means the fitting is linear, y = mx + b.Here m is the coefficient and b is the intercept.
data:image/s3,"s3://crabby-images/1e9df/1e9df65ec569fa9b04594b17525fdbc7cdc1e35d" alt=""
- beta values for the XOM is greater as compared to that of GLD so that XOM is more reactive to market as compared to GLD.
- the alpha values denote how well it performs with respect to SPY and Numbers indicate that GLD performed better.
One last thing is to find the correlation yet again.
daily_returns.corr(method='pearson')
will output in the correlation matrix with the correlation of each column with each other column.
- high correlation means the dots fit the line closely.
Time: 00:04:45
14 - Real world use of kurtosis
data:image/s3,"s3://crabby-images/291f6/291f62ad6b7e9a165df7f8516470fa52c188ed08" alt=""
- the distribution of daily returns for stocks and the market looks very similar to a Gaussian.
- but it is dangerous to assume that financial returns are normal distributions because it ignores kurtosis or the probability in the tails.
- In the early 2000s investment banks built bonds based on mortgages and assumed that the distribution of returns for these mortgages was normally distributed.
- Their model failed because of the assumption of normal distribution
Time: 00:01:06
Total Time: 00:24:11
2019-01-12 初稿