One of the key concepts used by many successful investors is diversification. In this post, I’ll define diversification and explain how it works conceptually. I also explain different ways you can diversify your investments and illustrate the benefits.

# What is Diversification?

Diversification is the reduction of risk (defined in my post a couple of weeks ago) through investing in a larger number of financial instruments. It is based on the concept of the Law of Large Numbers in statistics. That “Law” says that the more times you observe the outcome of a random process, the more closely the observed results are likely to reflect the process’s true properties.

## Coin Flip Illustration

For example, if you flip a fair coin twice, there are four sets of possible results:

| First flip | Second flip |
| --- | --- |
| Heads | Heads |
| Heads | Tails |
| Tails | Heads |
| Tails | Tails |

## Estimating the True Probability of Heads

The true probability of getting heads is 50%. In two of the four rows (i.e., two of the possible results), there is one heads and one tails. These two results match the true probability of a 50% chance of getting heads. In the other two possible results, heads appears either 0% or 100% of the time.
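For readers who like to check this by hand, here is a minimal Python sketch that lists the four two-flip results and the share of heads in each. The variable names are my own and are only for illustration.

```python
from itertools import product

# Enumerate every equally likely result of flipping a fair coin twice.
outcomes = list(product(["Heads", "Tails"], repeat=2))

# Share of heads observed in each two-flip result.
heads_fraction = [flips.count("Heads") / len(flips) for flips in outcomes]

for flips, fraction in zip(outcomes, heads_fraction):
    print(flips, f"{fraction:.0%}")
```

Two of the four results show heads 50% of the time, one shows 100% and one shows 0%.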

If you repeatedly flip the coin 100 times, you will see heads between 40% and 60% of the time in about 96% of the sets of 100 flips. If you increase the number of flips to 1,000 per set, you will see heads between 46.8% and 53.2% of the time in about 96% of the sets. Because the range from 40% to 60% with 100 flips is wider than the range of 46.8% to 53.2% with 1,000 flips, you can see that the range around the 50% true probability gets smaller as the number of flips increases. This narrowing of the range is the result of the Law of Large Numbers.
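The percentages above can be checked with an exact binomial calculation. The sketch below is one way to do it, assuming a fair coin and independent flips; the function name `prob_heads_share_between` is hypothetical, not something from a library.

```python
from math import comb


def prob_heads_share_between(n_flips, low, high):
    """Exact probability that the share of heads in n_flips fair flips lies in [low, high]."""
    # Convert the shares to whole numbers of heads; round() guards against float error.
    min_heads = round(low * n_flips)
    max_heads = round(high * n_flips)
    # Count the equally likely flip sequences with a qualifying number of heads.
    favorable = sum(comb(n_flips, k) for k in range(min_heads, max_heads + 1))
    return favorable / 2 ** n_flips


p_100 = prob_heads_share_between(100, 0.40, 0.60)
p_1000 = prob_heads_share_between(1000, 0.468, 0.532)
print(f"100 flips:  {p_100:.1%}")
print(f"1000 flips: {p_1000:.1%}")
```

Both probabilities come out at roughly 96%, even though the second range is much narrower than the first.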

Following this example, the observed result from only one flip of the coin would not be diversified. That is, our estimate of the possible results from a coin flip would depend on only one observation, which is equivalent to having all of our eggs in one basket. By flipping the coin many times, we are adding diversification to our observations and narrowing the difference between the observed percentage of times we see heads and the true probability (50%). Next week, I’ll apply this concept to investing where, instead of narrowing the range around the true probability, we will narrow the volatility of our portfolio by investing in more than one financial instrument.

# What is Correlation?

As discussed below, the diversification benefit depends on how much correlation there is between the random variables (or financial instruments). Before I get to that, I’ll give you an introduction to correlation.

Correlation is a measure of the extent to which two variables move proportionally in the same direction. In the coin toss example above, each flip was independent of every other flip.

### 0% Correlation

When variables are independent, we say they are uncorrelated or have 0% correlation. The graph below shows two variables that have 0% correlation.

In this graph, there is no pattern that relates the value on the x-axis (the horizontal one) with the value on the y-axis (the vertical one) that holds true across all the points.

### 100% Correlation

If two random variables always move proportionally and in the same direction, they are said to have +100% correlation. For example, the amount of interest you will earn in a savings account and the account balance are 100% correlated. If two variables move proportionally but in opposite directions, they have -100% correlation. For example, how much you spend at the mall and how much money you have left for savings or other purchases have -100% correlation.

The two charts below show variables that have 100% and -100% correlation.

In these graphs, the points fall on a line because the y values are all proportional to the x values. With 100% correlation, the line goes up, whereas the line goes down with -100% correlation. In the 100% correlation graph, the x and y values are equal; in the -100% graph, the y values equal one minus the x values. 100% correlation exists with any constant positive proportion. For example, if the y values were all one half or twice the x values, there would still be 100% correlation.
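To make the point about proportionality concrete, here is a small sketch that computes the standard (Pearson) correlation for a few made-up x values. The data and the helper function `correlation` are illustrative assumptions, not the values behind the charts.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unscaled covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


xs = [0.1, 0.25, 0.4, 0.7, 0.9]  # made-up values for illustration

same = correlation(xs, xs)                       # y equals x
mirrored = correlation(xs, [1 - x for x in xs])  # y equals one minus x
doubled = correlation(xs, [2 * x for x in xs])   # y equals twice x

print(f"y = x:     {same:+.0%}")
print(f"y = 1 - x: {mirrored:+.0%}")
print(f"y = 2x:    {doubled:+.0%}")
```

The first and third cases give +100% and the second gives -100%, which matches the description of the charts.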

### 50% Correlation

The graphs below give you a sense of what 50% and -50% correlation look like.

The points in these graphs don’t align as clearly as the points in the 100% and -100% graphs, but aren’t as randomly scattered as in the 0% graph. In the 50% correlation graph, the points generally fall in an upward band with no points in the lower right and upper left corners. Similarly, in the -50% correlation graph, the pattern of the points is generally downward, with no points in the upper right or lower left corners.

# How Correlation Impacts Diversification

The amount of correlation between two random variables determines the amount of diversification benefit. The table below shows 20 possible outcomes of a random variable. All outcomes are equally likely.

The average of these observations is 55 and the standard deviation is 27. This standard deviation measures the volatility with no diversification and will be used as a benchmark when this variable is combined with other variables.

### +100% Correlation

If I have two random variables with the same properties and they have 100% correlation, the outcomes would be:

Remember that 100% correlation means that the variables move proportionally in the same direction. If I take the average of the outcomes for Variable 1 and Variable 2 for each observation, I would get results that are the same as the original variable. As a result, the process defined by the average of Variable 1 and Variable 2 is the same as the original variable’s process. There is no reduction in the standard deviation (our measure of risk), so there is no diversification when variables have +100% correlation.

### -100% Correlation

If I have a third random variable with the same properties but the correlation with Variable 1 is -100%, the outcomes and averages by observation would be:

The average of the averages is 0 and so is the standard deviation! By taking two variables that have -100% correlation, all volatility has been eliminated.

### 0% Correlation

If I have a fourth random variable with the same properties but it is uncorrelated with Variable 1, the outcomes and averages by observation would be:

The average of the averages is 54 and the standard deviation is 17. By taking two variables that are uncorrelated, the standard deviation has been reduced from 27 to 17.
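The same effect shows up in a quick simulation. The sketch below is only an illustration under my own assumptions: it draws two independent variables uniformly between 0 and 100 (not the 20 outcomes in the table), averages them observation by observation, and compares the standard deviations.

```python
import random
from statistics import pstdev

random.seed(0)  # fixed seed so the run is repeatable

n = 100_000
# Two independent (and therefore uncorrelated) variables with the same properties.
var1 = [random.uniform(0, 100) for _ in range(n)]
var4 = [random.uniform(0, 100) for _ in range(n)]

# Average the two variables observation by observation.
averages = [(a + b) / 2 for a, b in zip(var1, var4)]

# Share of the original volatility that remains after averaging.
ratio = pstdev(averages) / pstdev(var1)
print(f"Standard deviation of Variable 1: {pstdev(var1):.1f}")
print(f"Standard deviation of averages:   {pstdev(averages):.1f}")
print(f"Ratio: {ratio:.2f}")
```

With this many draws, the ratio should land near 0.71, in the same ballpark as the 17 / 27 ≈ 0.63 from the table. A sample of only 20 outcomes can easily differ from the long-run figure.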

### Other Correlations

The standard deviation of the average of the two variables increases as the correlation increases. When the variables have between -100% and 0% correlation, the standard deviation will be between 0 and 17. If the correlation is between 0% and +100%, the standard deviation will be between 17 and 27. This relationship isn’t quite linear, but is close. The graph below shows how the standard deviation changes with correlation using random variables with these characteristics.
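For two variables with the same standard deviation, the textbook formula for the standard deviation of their average is the original standard deviation times the square root of (1 + correlation) / 2. The sketch below applies that formula to the benchmark of 27; the function name is hypothetical. It is a theoretical calculation, so it will not exactly reproduce numbers taken from a sample of 20 outcomes.

```python
from math import sqrt


def stdev_of_average(stdev, correlation):
    """Standard deviation of the average of two variables that share the same
    standard deviation and have the given correlation (between -1 and +1)."""
    # Var((X + Y) / 2) = (s^2 + s^2 + 2 * rho * s^2) / 4 = s^2 * (1 + rho) / 2
    return stdev * sqrt((1 + correlation) / 2)


for rho in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"correlation {rho:+.0%}: standard deviation {stdev_of_average(27, rho):.1f}")
```

The formula gives 0 at -100% correlation and 27 at +100%, matching the two end points above. At 0% it gives about 19, a little higher than the 17 observed in the sample. The square root is also why the relationship is not a straight line.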

# Key Take-Aways

Here are the key take-aways from this post.

- Correlation measures the extent to which two random processes move proportionally and in the same direction. Positive values of correlation indicate that the processes move in the same direction; negative values, the opposite direction.
- The lower the correlation between two variables, the greater the reduction in volatility and risk. At 100% correlation, there is no reduction in risk. At -100% correlation, all risk is eliminated.
- Diversification is the reduction in volatility and risk generated by combining two or more variables that have less than 100% correlation.
