# Visualising Intraday Market Correlation

I stumbled across a great post on MKTSTK about visualising volatility and correlations of multiple timeseries with streamgraphs, and it got me thinking about where else a streamgraph might be useful to visualise financial data.

Rather than looking at an individual assets, I thought it might be interesting to explore the behaviour of the market at times throughout the day, and in turn see how this changes over time. To do this, I select one month of data (Feb 01 '17 to Mar 03 '17), consisting of 1-minute frequency closing prices for all equities in the ASX200.

The packages required are: `dplyr`

,`tidyr`

,`PerformanceAnalytics`

,`streamgraph`

.

Specifically, I want to use a 5 minute rolling window to find the average of all pairwise correlations across the ASX200. Then I want to downsample that into 10-minute blocks of average correlation (10:00, 10:10, 10:20, ... 3:40, 3:50, 4:00), and see how the correlation value in each block changes with time. The motivation is that it should be clear to see which times exhibit high correlation.

I'm using R for this instead of my usual go-to choice of Python, because graphing tends to be a bit nicer, and there's a great streamgraph library available.

The input data is a dataframe of the format:

```
25 2017-02-02 09:54:00 A2M 2.18000
26 2017-02-02 09:55:00 A2M 2.18500
27 2017-02-02 09:56:00 A2M 2.18999
28 2017-02-02 09:57:00 A2M 2.18500
29 2017-02-02 09:58:00 A2M 2.18999
... ... ... ...
1615048 2017-03-03 15:11:00 WSA 2.39000
1615049 2017-03-03 15:12:00 WSA 2.39000
1615050 2017-03-03 15:13:00 WSA 2.39500
1615051 2017-03-03 15:14:00 WSA 2.39000
1615052 2017-03-03 15:15:00 WSA 2.39000
```

*(Note my data being in the wrong timezone. Always store as UTC, so that doesn't happen!)*

First, I need to find the 5-minute rolling mean correlation over all the stocks.

```
# Spread from long format to wide format, then find returns.
prices <- spread(df, symbol, close)
# Convert to an XTS timeseries.
prices <- xts(prices[,-1], order.by=as.POSIXct(prices$end_timestamp))
returns <- Returns.calculate(prices, method='log')
# Func to find mean correlation of a correlation matrix.
# Takes lower triangle of correlation matrix, which contains
# all pairwise correlations, and calculate the mean.
meanCor = function(returnMatrix=NULL) {
corMatrix = cor(returnMatrix)
meanCorValue = mean(corMatrix[lower.tri(corMatrix)],na.rm=TRUE)
return(meanCorValue)
}
# Calculate mean correlation on a moving window
rollingMeanCorrelations <- rollapply(returns, width=5, FUN=function(x) meanCor(x), by=1, fill=NA, align="right", by.column=FALSE
```

Plotting `rollingMeanCorrelations`

(but with a 30 minute rolling window to reduce noise) gives us this;

*Click for fullsize*

Just by eyeballing, it's easy to see that spikes seem to occur most often at the start of each trading day.

To use streamgraph, we need to munge our timeseries slightly. `rollingMeanCorrelations`

is simply a list of times and correlation values, but we need to turn this into bins by time of day in order to create the 'streams' in the streamgraph. It's simply a matter of turning the XTS object back into a dataframe.

```
# Need dataframe with date column (X axis), minute-bin (Y axis), meanCorr (stream width)
binnedCorrelations <- period.apply(rollingMeanCorrelations, endpoints(rollingMeanCorrelations, "minutes", 10), mean)
corrDf <- data.frame(time=index(binnedCorrelations), coredata(binnedCorrelations))
corrDf$date <- as.Date(corrDf$time, format="%m-e")
corrDf$time <- strftime(corrDf$time, format="%H:%M")
```

Then, to graph:

```
corrDf %>%
streamgraph("time", "coredata.chunkedCorrelations.", "date", interpolate="cardinal") %>%
sg_axis_x(1, "date", "%b-%e") %>%
sg_fill_brewer("Spectral")
```

*Click for fullsize*

So what are we actually looking at?

Each stream in the graph represents the average pairwise correlation over each of the 10-minute blocks that can be fit into the trading day. Market open is at the bottom, and market close is at the top. Along the X-axis, we have the dates, so we can see exactly how correlation in each intraday block evolves over time.

While the sample size of roughly 20 odd trading days is far too small to extrapolate anything meaningful, it's interesting to see that an unusually highly correlated market open seems to be often followed by a highly correlated market close. It's tempting to reason that this could be partially caused by large money managers using the extra liquidity available near market close to adjust any positions that were caught offside during a violent open, or speculate about the following overnight EU/US market movement.

If it did turn out that highly correlated opens led to correlated closes with some non-random expectancy, and if there exists a relation between the implied volatility of an index and the correlation of all constituents of that index, this idea could form part of a simple index option strategy.