Correlation doesn’t always equal causation, but correlation can often serve as a signal. The collection and analysis of data in some areas of the world are messy and slow. Often this means the data can only tell us what happened in the past. What we would ideally like is a snapshot of events and trends as they unfold. Many opportunities are missed because we simply can’t cut through the noise of the real world in a timely fashion.
Some folks are leveraging newly available data in urban areas as analytical shortcuts:
Ted Egan, chief economist in the San Francisco Controller’s Office, said he could wait six months for California to release the detailed sales-tax data he needs for city revenue projections. But it’s quicker to look at passenger tallies from the station closest to the Union Square shopping district, which generates roughly 10% of the city’s sales-tax revenue. The Bay Area Rapid Transit District releases the data within three days, he said: “Why should I have to wait?”
Mr. Egan is among a growing number of economists and urban planners who scour for economic clues in unconventional urban data—oddball measures of how people are moving, spending and working.
Egan has essentially found an analytical shortcut that lets him see the world as it is more quickly. Rather than wait for the actual tax receipts, he can look at a related measure and extrapolate tax revenues from it. Notice that Egan isn’t interested in why tax revenues and sales activity are low or high; he’s just trying to quickly get a handle on what the amount is.
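To make the idea of extrapolating from a related measure concrete, here is a minimal sketch in Python. It is not Egan's actual method, which the article doesn't describe; it simply fits a straight line to past pairs of ridership and tax receipts and uses that line to estimate receipts from a fresh ridership count. The numbers and the names `fit_line` and `estimate_receipts` are made up for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of ys = intercept + slope * xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical history (not real data): monthly station exits in thousands,
# and the sales-tax receipts in $ millions that were eventually reported.
riders = [400, 420, 450, 480, 500]
receipts = [10.0, 10.5, 11.25, 12.0, 12.5]

intercept, slope = fit_line(riders, receipts)

def estimate_receipts(rider_count):
    """Estimate receipts from a ridership figure using the fitted line."""
    return intercept + slope * rider_count

# A new ridership number arrives within days; the tax data is months away.
print(round(estimate_receipts(460), 2))
```

The fit says nothing about why receipts move with ridership. It only exploits the fact that, in the past, they moved together, which is also why the estimate is only as good as that relationship remains stable.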
This illustrates the utility of correlational analysis and the promise of Big Data and data mining. The danger, of course, is that many of these new indicators may be untested and more volatile than established ones. On the other hand, new data may actually do a better job of capturing people’s patterns and motivations under certain circumstances, and such a preview allows businesses and policymakers to adjust more quickly to a rapidly changing environment. For example:
Mr. Leamer discovered that truckers’ diesel purchases on Interstate Highway 5 from California to Oregon, a major timber-trucking route, are a leading indicator of construction employment in California. Diesel sales on Interstate Highway 80 from Sacramento to Salt Lake City, a trucking route for the San Francisco Bay area’s manufactured goods, can help predict California’s manufacturing employment, he said.
If only he had the diesel-fuel data in the first half of 2008, when major government-issued indicators failed to hint at the U.S. economy’s impending downward spiral. At the time, Mr. Leamer said, UCLA forecasters chose not to announce a recession because GDP was still growing and the Bureau of Labor Statistics was reporting relatively mild job losses.
Bad call. The government later revised the GDP and jobs data downward, and the National Bureau of Economic Research concluded that the recession started in December 2007. The jobs data are unreliable because they are based on sample surveys and don’t adequately capture company openings and closings, Mr. Leamer said in hindsight.
When the UCLA economists reviewed the fuel-purchases data late last year, they saw diesel buying had peaked in mid-2007, indicating that fewer goods were being made and moved across the country in the months after. “Had we been aware of that data in 2008,” Mr. Leamer said, “we would have made a different call.”
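A "leading indicator" like the diesel data can be checked with a lagged correlation: shift one series against the other and see which shift lines them up best. The sketch below is an illustration only, not the UCLA team's analysis; the two series are invented, with the employment figures deliberately constructed to echo the diesel figures two periods later, and `lagged_correlation` is a hypothetical helper.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def lagged_correlation(indicator, target, lag):
    """Correlate indicator[t] with target[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(indicator, target)
    return pearson(indicator[:-lag], target[lag:])

# Invented series (not real data). From the third period on, employment
# equals 10x the diesel figure from two periods earlier.
diesel = [5, 6, 8, 9, 7, 4, 3, 4, 6, 7]
employment = [50, 52, 50, 60, 80, 90, 70, 40, 30, 40]

# Try a few lags and keep the one with the strongest correlation.
best_lag = max(range(4), key=lambda k: lagged_correlation(diesel, employment, k))
print(best_lag)
```

With real data the peak would be far less clean, and a strong lagged correlation over one stretch of history is no guarantee that it will hold over the next.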
In business and government, sometimes quickly knowing “what” is happening is more valuable than waiting to know precisely “why” it is happening.