When Good Data Goes Bad

In Big Data, Data Governance, Data Management, Data Quality, Data Warehouse by IRM UK


Data doesn’t really “go” bad, of course. At least, not in the way that leftovers in the refrigerator do. Sometimes data just starts off bad and gets worse. Other times, it’s people or processes that do bad things to data or with it. In contrast to all the articles that celebrate data goodness, this series points out what can go wrong with data—and what to do about it.

Dr. Barry Devlin, Founder and Principal, 9sight Consulting
Barry will be presenting the following courses via live streaming:
‘Essentials of Data Warehouses, Lakes and BI in Digital Business’, 1-2 November 2021
and ‘From Analytics to AI: Transforming Decision Making in Digital Business’, 29 November 2021.
Barry will also be speaking at the IRM UK Virtual Data Governance & Master Data Management Conference, 15-17 November 2021, where he will deliver the plenary keynote, ‘When Good Data Goes Bad’.

Published by Cutter Consortium, March 2, 2021, as the first part of a series. See link for full article and links to further parts of the series. Reprinted with permission.

“Don’t be evil” was Google’s original motto, now deprecated like some past feature of a previous version of corporate software. As the quintessential data company, Google offers a good starting point for an exploration of data at its worst and, indeed, its best. Like many quotable mottoes, its origins are disputed but its most memorable moment came in Google’s founders’ letter[1] for its 2004 IPO, “Don’t be evil. We believe strongly that in the long term, we will be better served—as shareholders and in all other ways—by a company that does good things for the world even if we forgo some short-term gains.”

The problem for Google and, since 2015, its parent Alphabet is that, in the eyes of many observers, its use of data strays far from good. Larry Page and Sergey Brin’s lofty IPO ideals (“[our search results] are unbiased and objective, and we do not accept payment for them or for inclusion or more frequent updating”) may be well met in principle. However, they hide a deeper philosophical dilemma for a company whose major source of income is advertising and whose unique selling proposition (USP) is that it knows its audience far better than any other company. That USP comes courtesy of the treasure trove of behavioral data it has amassed on the vast majority of internet users.

The Extractive Imperative

As big data proliferated, the phrase “data is the new oil” became wildly popular with marketing executives. Bernard Marr explains the many ways in which the phrase is “lazy and inaccurate” in his 2018 article[2], while admitting it’s also a good analogy: “it’s easy to draw parallels due to the way information (data) is used to power much of the transformative technology we see today.” However, as the climate emergency has grown and oil has come to be seen as one of the main culprits, a darker side of the analogy emerges. The drive to squeeze every last viable drop of oil from every reservoir and tar sand has resulted in enormous climate and environmental impacts. This profit-driven extractive imperative means that natural resource extraction must continue and expand regardless of changing circumstances, ultimately undermining both civilization and the planet.

Now, in the 21st century, this same extractive imperative has been applied to data, with Google as its first and major proponent, followed by Facebook and others such as Amazon, Oracle, Apple, and Microsoft. Behavioral data, mined from every online and, increasingly, offline interaction and transaction, has become the main, mandatory raw resource for such businesses.

The old saw that “if you’re not the paying customer, you’re the product” is, in fact, too generous. You are, in reality, the pay dirt from which is extracted the raw resource that drives industrial scale prediction and prescription of your own future behavior for the real customers of Google and its cohorts: advertisers. Scary.

Google’s original rationale for collecting and analyzing search behaviors was to benefit users of its search product by delivering intuitively correct results. Its subsequent expansion of collection efforts emerged from the realization that its main and, perhaps, only business was as an advertisement broker, enabling it to launch itself as a very profitable public company. What better illustration of good data gone bad?

And yet, good vs. bad is not always so clear. In the outrage felt by some of us about the alleged use of Facebook data (via Cambridge Analytica) by the Donald Trump and Brexit campaigns in 2016[3], it is often forgotten that analytics of vast troves of personal data, supported by Eric Schmidt (Google’s CEO until 2011 and executive chairman thereafter), was credited as a major contribution to Barack Obama’s success in 2008 and 2012[4].

Conclusion

The data didn’t go bad, but the use to which it was put changed dramatically, and that new use drove the collection of even more categories and volumes of personal behavior data. In any digital transformation program, the lesson is to define the purpose of your data collection early on, consider and approve its ethics, and then strongly resist any temptation to expand the scope of use without a comprehensive ethical review.

Dr. Barry Devlin is among the foremost authorities on business insight and one of the founders of data warehousing, having published the first architectural paper in 1988. With over 30 years of IT experience, including 20 years with IBM as a Distinguished Engineer, he is a widely respected analyst, consultant, lecturer and author of the seminal book, “Data Warehouse—from Architecture to Implementation” and numerous White Papers. His 2013 book, “Business unIntelligence—Insight and Innovation beyond Analytics and Big Data” is available in both hardcopy and e-book formats. As founder and principal of 9sight Consulting (www.9sight.com), Barry provides strategic consulting and thought-leadership to buyers and vendors of BI solutions. He is continuously developing new architectural models for all aspects of decision-making and action-taking support. Now returned to Europe, Barry’s knowledge and expertise are in demand both locally and internationally.

Copyright Dr. Barry Devlin, Founder and Principal, 9sight Consulting


[1] Ovide, Shira, “What Would 2004 Google Say About Antitrust Probe?”, 23 June 2011, The Wall Street Journal

[2] Marr, Bernard, “Here’s Why Data Is Not The New Oil”, 5 March 2018, Forbes

[3] Cadwalladr, Carole & Graham-Harrison, Emma, “How Cambridge Analytica turned Facebook ‘likes’ into a lucrative political tool”, 17 March 2018, The Guardian

[4] Scherer, Michael, “Inside the Secret World of the Data Crunchers Who Helped Obama Win”, 7 November 2012, Time

