With college football in full swing, it is worth a look to see how the teams stack up, not based on rankings, but based on fan loyalty. Big Data and Data Visualization help us with this and lead to some interesting insights. College football also offers us a great opportunity to look at the Big Data of fandom, the business of college football, and even the business of universities.
Building on my post that looked at NFL fan allegiance by location (see first graphic above), I thought looking at some data visualization of college football allegiance would tell us a bit about who roots for which team and a bit more about how Big Data, Data Science, and data visualization are helping us understand complex problems in life and business.
The above graphic comes from the New York Times, using Facebook data to determine the dominant college football team by location. Lighter shades of the same color suggest a lower level of fan following than darker shades (and thus the presence of some fan mixing). There are many interesting revelations. Fans are quite loyal to state boundaries. Schools like Auburn, Texas A&M, and even the University of Virginia have very small dominant footprints, in spite of their fiercely loyal fans and storied histories. The Ivy League and lower-tiered football teams do not even register. Perhaps academic allegiance is different from and unrelated to football allegiance – but maybe not entirely. More generally, each state is dominated by one team, except for California, Florida, and perhaps Texas (three of the most populous states, by the way). Although this data looks only at football fan allegiance, it says something about the recognition of universities by location and even their prowess in athlete and student recruiting. It also says something (in part) about who might want to attend a university. That is big business, too!
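To make the idea of a "dominant team" and its shade concrete, here is a minimal sketch of how such a map value could be derived from raw likes. It is not the New York Times method, which is not described here. The function name `dominant_team`, the area labels, and the like counts are all hypothetical and invented purely for illustration.

```python
from collections import Counter

def dominant_team(likes):
    """Return (team, share) for the most-liked team in one area.

    `likes` is a list of team names, one entry per fan "like".
    The share could drive the shade: a lower share means a lighter color.
    """
    counts = Counter(likes)
    team, top = counts.most_common(1)[0]
    return team, top / len(likes)

# Hypothetical toy data, invented for illustration only.
likes_by_area = {
    "Area A": ["Alabama"] * 8 + ["Auburn"] * 2,
    "Area B": ["Florida"] * 4 + ["Florida State"] * 3 + ["Miami"] * 3,
}

for area, likes in likes_by_area.items():
    team, share = dominant_team(likes)
    print(f"{area}: {team} leads with {share:.0%} of likes")
```

In this toy data, Area A would be drawn in a dark shade (one team holds 80% of likes), while Area B would be drawn in a light shade (the leader holds only 40%), which is the kind of fan mixing the lighter colors on the map suggest.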
Another attempt to measure fan interest was made with a survey a few years ago. The following graphic was produced from that survey via commoncensus.org.
Although some similarities are found in the two graphics, the sampling by commoncensus clearly included some really passionate fans of Michigan State and even Tulane fans living in (or were they just visiting?) upstate NY. And nobody seemed to awaken the ‘Bama fans from their National Championship hangover of a few years ago to take the survey, which shows the state of Alabama as shared among other schools. That is surely a huge error.
Samples pose problems, often very big ones, in Data Science. Increasing sample size is not always the answer, nor does it necessarily reduce the problems with the data collected. We must look at how data is gathered and what our sample really measures. There are many Big Data lessons in this.
First, active sampling is inherently flawed in very dangerous ways. I consider active sampling or measurement to be the act of asking a person to respond. It is hard to get a broad and fair representation of the population this way. Many people do not participate because of the effort involved; reaching people can be hard; participants may not trust the process (and thus change their answers in various ways); and answers are often skewed by a few enthusiastic or motivated participants. As a result, the data is not representative of the overall population. The sample reflects those who heard of the survey and cared to participate.
The Facebook data did not involve asking people to state their favorite team. Instead, it simply looked at “likes” and what millions of Facebook users were doing in the normal course of life and business. Nobody even knew they were in a sample. I call such measurement passive data capture. Facebook passively measures college football fan allegiance, and the result is not only a bigger sample but also more confidence in the measures and a better understanding of the phenomenon of fan allegiance, because fan allegiance is not sought but found. Although it is also a sample, passively captured data is not contaminated by the act of asking and the human reactions to it.
Systems that passively measure our behavior are now commonplace in our lives. Smart phones measure our location without us barking it out. Smart phones also measure, within a few minutes, the time we wake (as many people check their smart phones within minutes of waking up). Nest thermostats passively measure our presence in our homes, and the list is growing with wearables, cameras, and sensors in more of the things we operate and use. This is creating not just bigger data but more powerful data, for the same reasons we saw in the above graphics on football allegiance.
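The contrast between active and passive capture, and the earlier point that a bigger sample does not fix a biased one, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function `simulate_survey`, the 30% true fan share, and the response rates (50% for fans of the team, 5% for everyone else) are hypothetical numbers chosen for illustration, and passive capture is idealized as observing everyone.

```python
import random

def simulate_survey(n_people, true_share=0.30, fan_response=0.50,
                    other_response=0.05, seed=0):
    """Compare a passive and an active estimate of one team's fan share.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # True allegiance of each person: True if they follow the team.
    population = [rng.random() < true_share for _ in range(n_people)]

    # Passive capture (idealized assumption): behavior is observed for
    # everyone, whether or not they would care to answer a survey.
    passive_estimate = sum(population) / n_people

    # Active survey: each person decides whether to respond, and fans of
    # the team are assumed to be far more motivated to do so.
    responses = [
        is_fan for is_fan in population
        if rng.random() < (fan_response if is_fan else other_response)
    ]
    active_estimate = (
        sum(responses) / len(responses) if responses else float("nan")
    )
    return passive_estimate, active_estimate

for n in (1_000, 100_000):
    passive, active = simulate_survey(n)
    print(f"n={n:>7}: passive={passive:.1%}  active={active:.1%}")
```

With these made-up response rates, the active estimate should settle near 0.15 / (0.15 + 0.035), or roughly 81%, no matter how large the sample gets, while the passive estimate stays close to the assumed 30%. The error comes from who chooses to answer, not from how many answers are collected.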
Dubs has Washington wrapped up. That’s right, the University of Washington dominates the Northwest! I don’t think those non-UW fans are even from Washington State. I believe they might be ‘Bama fans. See below:
In the Bay Area, it is very complicated, with big schools just miles from each other. Maybe you should keep a Stanford and a Berkeley sweater ready, just to play it safe and not create a trigger event when you cross the bay.
So, what is up with Chicago?
College football allegiance in the Chicago area is complicated. Check out this graphic.
Even dear Northwestern University struggles to dominate among fans west of US 41, or maybe that is Sheridan Road (not good news!). Michigan fans are even on the doorstep of Ryan Field! It does not help that Northwestern has a large graduate student body, bringing fans with undergraduate fan baggage. Notice how Michigan, Notre Dame, and even Wisconsin have encroached on not just Northwestern but also Illinois. Chicago is a tough place to grow a collegiate fan allegiance, given the demographics of the people who move to Chicago. The hodgepodge of allegiance on the South Side of Chicago is even more perplexing. Maybe the fans in Hyde Park are still missing the University of Chicago Maroons football program.
Looking at college football fan allegiance shows there is more to the problem than just beating Michigan and winning over fans with on-the-field performance. The problem is more complicated and is related to overall college football interest in Chicago.
The above graphic shows us where college football is popular overall. The deepest colors correspond to about 30-35% of the Facebook users in that area saying they like college football (of any school). This graphic is built on passively captured data, too. Although it is not perfect (and no sample ever is), the limitations and manipulation issues inherent in active data capture are significantly reduced. Notice that college football fever has a hold over Sweet Alabama and the fever has apparently spread across the Red River to Oklahoma and north to the Cornhuskers of Nebraska, too. The Oregon Duck fans are rabid with it, showing the biggest interest in college football in all of the West. The hotspots jump out and show where college football is most popular. Amazing. Notice the big void around Chicago. Amazing, too!
College football is just not as popular in the Chicago area as it is in the South. Is it cultural? Might the highly international and cosmopolitan cities of San Francisco, Chicago, New York, and even Miami have something in common about a lower interest in college football? Probably. Advertisers, take note! The people in these cities are from all over the world and college football is new to many of them. This is an opportunity as much as a perceived weakness. So, this last graph puts things into perspective.
These ideas on the importance of passive data capture over active data capture and the use of Data Science to create value from Big Data are some of those that I explore in great detail in my recent book, From Big Data to Big Profits: Success with Data and Analytics.
Professor Walker provides keynote talks, seminars, presentations, executive training programs, and executive briefings.
About Russell Walker, Ph.D.
Professor Russell Walker helps companies develop strategies to manage risk and harness value through analytics and Big Data.
His most recent book, From Big Data to Big Profits: Success with Data and Analytics (Oxford University Press, 2015), explores how firms can best monetize Big Data. He is also the author of Winning with Risk Management (World Scientific Publishing, 2013), which examines the principles and practice of risk management through business case studies.
He greatly enjoys the college football season and roots for the Florida teams (Florida being his home state), UW, Cornell, and UVA.