
Detecting Linked Behavior in Stocks

Jeff Augen demonstrates a simple data visualization strategy that can be used by investors to identify linked behavior among groups of stocks.

Stocks that seem to display relatively strong parallel behavior across extended timeframes can have dramatically different responses to specific market conditions. Tools for predicting these differences can become a source of value creation for anyone attempting to balance risk by hedging one investment against another. Option traders can use the information to identify trading candidates for complex positions. This article reviews a simple data visualization strategy that can be used by investors to identify linked behavior among members of a large, randomly chosen group of stocks.

A meaningful experiment can be constructed by comparing the responses of individual stocks to two different sets of market conditions. For this discussion, we'll select two days when the markets experienced dramatic moves in opposite directions. The goal is to group stocks according to their responses to both events.

The first day (7/24/2007) was characterized by the steepest equities market decline in four years: a collapse driven by fears that a subprime lending crisis was beginning to spill over into the broad economy. Countrywide Financial Corp., the largest independent U.S. mortgage lender, reported that quarterly profits fell 33%. The company also slashed its full-year earnings outlook as a result of rising default rates caused by a housing market slump. At the same time, shares of E. I. du Pont de Nemours and Company, the third-largest American chemical manufacturer, suffered their steepest decline in two years after sliding home sales reduced demand for paint and countertops.

The second day (9/18/2007) was a mirror image of the first. U.S. equity markets rallied sharply, with the Dow Jones Industrial Average rising more than 200 points after the Federal Open Market Committee reduced the federal funds rate a surprising 50 basis points. Financial analysts and economists had generally anticipated a 25 basis point cut; the larger-than-expected adjustment was widely viewed as positive for both the economy and financial markets. Moreover, the impact on financial markets was dramatic because the federal funds rate had not been reduced in more than four years. Coincident with the rate change surprise was news that the House of Representatives had just voted to allow the Federal Housing Administration, which insures mortgages for low- and middle-income borrowers, to back refinanced loans for tens of thousands of borrowers who were delinquent on payments. The delinquencies resulted from mortgage resets to sharply higher rates. Both news items were clearly inflationary—the dollar fell sharply and gold futures soared to a 27-year high. Finally, the positive news of the day was capped by stronger-than-expected earnings from Lehman Brothers.

The following table contains price change data for 17 stocks for both days. Although comparisons will be based on standard deviations, percent change for each record is also displayed. As we'll see, the stocks can be segmented into four distinct subgroups.

Price change responses of 17 stocks/exchange traded funds to two sets of market conditions. Changes are displayed in both standard deviations and percent.

[Table not reproduced here: for each of the 17 symbols, the 7/24/2007 and 9/18/2007 price changes were listed in both standard deviations and percent (%) change.]
We can create a visual representation of this table by mapping the data to a two-dimensional grid. The results are displayed in Figure 1. For each row in the table, the 7/24/2007 price change is measured on the x axis and the 9/18/2007 price change on the y axis. Intersections are recorded as points on the grid.

Scatter plot comprising price changes for 17 stocks. Each point represents the intersection of an x-axis value (7/24/2007 price change) and a y-axis value (9/18/2007 price change). All values are measured in standard deviations against the most recent 20-day volatility window.

Four distinct clusters are visible in Figure 1. More than half of the records fall in the upper-left corner, where the 7/24/2007 move was a price decrease greater than 2 standard deviations and the 9/18/2007 move was an equivalently large increase. The lower-left corner contains two symbols (Apple Computer and Marathon Oil) that experienced larger decreases on 7/24/2007 but only modest increases (less than 1 StdDev) on 9/18/2007. These stocks significantly underperformed the market during the July-September timeframe. More significant are the four stocks in the upper-right corner that were immune to the large 7/24/2007 drawdown and rallied substantially on 9/18/2007 with the rest of the market (Google, Eli Lilly, Wal-Mart, and United Parcel Service). Bullish investors would be most interested in these stocks, which apparently required little hedging during this timeframe. Finally, the two stocks in the lower-right corner (Boeing and United Health Group) were relatively immune to the 7/24/2007 drawdown but also unresponsive in the 9/18/2007 market rally. Short straddles placed on these stocks during the 7/24/2007 market volatility spike would have been highly profitable.
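The quadrant grouping described above can be sketched in a few lines of Python. The ticker symbols and response values below are hypothetical placeholders (not the article's table data), and the quadrant cutoffs are assumptions chosen purely for illustration.

```python
# Map each stock's two event responses (in standard deviations) to a
# point on a 2-D grid and bucket it by quadrant, as in Figure 1.
# All symbols and values here are illustrative placeholders.

responses = {
    # ticker: (7/24/2007 change in StdDev, 9/18/2007 change in StdDev)
    "AAA": (-2.5, 2.1),   # fell hard, rallied hard   -> upper-left
    "BBB": (-3.0, 0.4),   # fell hard, weak rally     -> lower-left
    "CCC": (-0.3, 2.4),   # held up, rallied hard     -> upper-right
    "DDD": (-0.2, 0.1),   # held up, flat in rally    -> lower-right
}

def quadrant(x, y, x_cut=-1.0, y_cut=1.0):
    """Classify a point relative to cutoffs (assumed, not from the article)."""
    horiz = "left" if x < x_cut else "right"
    vert = "upper" if y > y_cut else "lower"
    return f"{vert}-{horiz}"

clusters = {ticker: quadrant(x, y) for ticker, (x, y) in responses.items()}
print(clusters)
```

In practice the cluster boundaries emerge visually from the scatter plot rather than from fixed cutoffs; the hard-coded thresholds here simply make the four-way split explicit.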

The importance of comparing price changes in standard deviations rather than the more popular percent change is underscored by the image in Figure 2. All other parameters being equal, the figure displays the intersection of the 7/24/2007 and 9/18/2007 price changes measured in percent. Gone are the four distinct clusters of Figure 1; in their place is a relatively diffuse continuum of points spanning all areas of the chart.

Repeat of Figure 1, with price changes measured as a percentage of the previous day's closing price.

This effect is a reflection of the disparities in volatility that are unavoidable in a large population of stocks. A large price change for a volatile stock can be less meaningful than a comparatively smaller price change for a less volatile stock. For example, a 5% daily price change for a $100 stock with 40% annualized volatility is much less significant (roughly 2 StdDev) than a 3% daily price change for a $100 stock that exhibits only 10% annualized volatility (roughly 4.8 StdDev). Ignoring underlying volatility effectively scrambles the results by destroying the meaning of the magnitude of the change. Conversely, price changes measured in standard deviations are, in effect, normalized so that direct comparisons have meaning. Readers are encouraged to build their own examples using heterogeneous mixtures of stocks that display widely varying volatilities.
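The arithmetic in the example above can be verified with a short sketch. It assumes the quoted volatilities are annualized and uses a 252-trading-day convention to convert them to one-day standard deviations; the article itself standardizes against a trailing 20-day volatility window, which the same division would apply to directly.

```python
import math

TRADING_DAYS = 252  # assumed annualization convention

def daily_sigma(annual_vol):
    """Convert an annualized volatility to a one-day standard deviation."""
    return annual_vol / math.sqrt(TRADING_DAYS)

def move_in_stdev(pct_change, annual_vol):
    """Express a one-day percent move in standard deviations."""
    return pct_change / daily_sigma(annual_vol)

# The article's example: a 5% move at 40% volatility vs. a 3% move at 10%.
print(round(move_in_stdev(0.05, 0.40), 1))  # ~2.0 StdDev
print(round(move_in_stdev(0.03, 0.10), 1))  # ~4.8 StdDev
```

The same normalization applied to every record is what produces the clean clusters of Figure 1.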

Event-based clustering is especially relevant when the goal is to develop a database of historical correlations for predictive purposes. This approach stands in sharp contrast to more traditional methods such as fundamental analysis or technical charting. An interesting comparison can be made with the science of weather prediction, where two basic strategies dominate. The first involves analyzing basic physical principles—cloud physics, thermals, temperature gradients, etc. The second involves building a database containing historical information about atmospheric parameters and the weather conditions that followed. Using this approach to predict the weather involves searching the database for a set of parameters that correspond closely to those currently being observed. If the theory is correct, the weather will follow the previously observed pattern. Both techniques have some relevance to predicting the performance of stocks. Proponents of the first method often refer to financial metrics, price-earnings ratios, 50-day moving averages, relative strength, stochastics, and the like. The second approach typically employs data mining strategies to identify repeating patterns in historical stock market data.

The previous examples were simplified for clarity. Using the same approach, we could have added many more stocks and at least one additional event. Higher-dimensional models representing more than three events cannot be visualized easily, and more complex analytical tools must be used to measure the geometric distances between data points. However, it's wise to limit the scope to a small number of well-characterized events and to construct a database containing a large number of distinct experiments. This approach optimizes the effectiveness of the technique for creating hedges and well-structured options positions.
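The higher-dimensional case can be sketched as follows: each stock becomes a vector of its responses (in standard deviations) to N events, and pairwise Euclidean distance measures how similarly two stocks responded. The event-response vectors below are hypothetical placeholders, not measured data.

```python
import math

# Each stock's profile is its response (in StdDev) to four events.
# With more than three events the geometry cannot be plotted, but
# pairwise distances still quantify similarity of behavior.
# All vectors below are illustrative placeholders.

profiles = {
    "AAA": [-2.5, 2.1, -1.8, 0.9],
    "BBB": [-2.4, 2.0, -1.9, 1.1],   # nearly identical to AAA
    "CCC": [-0.2, 0.3, 0.1, -0.4],   # very different behavior
}

def distance(a, b):
    """Euclidean distance between two event-response vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(distance(profiles["AAA"], profiles["BBB"]))  # small: linked behavior
print(distance(profiles["AAA"], profiles["CCC"]))  # large: unlinked behavior
```

A small distance flags a candidate pair for hedging or for a paired options position; clustering algorithms generalize this pairwise comparison to the whole population.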
