All this hoopla about data, data science, data lakes and data protection leaves the impression that data is a new thing. That data somehow popped up at the beginning of the digital revolution and became the most important thing in the world. This misperception prevents us from seeing and understanding the big picture. Including scams lurking just around the corner…
We have always had data. Even before humans could read and write, we had data – in our personal ‘local memory’ and communicated between us. We even had backups by means of community sharing.
Fast forward to ancient times: the Greeks and Assyrians had ‘data lakes’ (libraries), the most famous being the Great Library of Alexandria (approximately 280 BC). These (analog) data collections were small in data volume but huge in physical size by our metrics, and indisputably the data lakes of their time.
Technology advanced – sheepskin, parchment and papyrus became paper, the flow of data increased, data lakes grew and proliferated – as did our dependency on them. The Gutenberg printing press accelerated the growth immensely. Then came computers (aka ‘data processors’), which broke a formerly continuous chain of development: data went non-physical and custody changed hands.
An easily ignored but very important point: Change of custody from those who cared about the data to those who cared about the machines – the calculators (computers) and the storage (disk drives). The new custodians had different priorities.
The high priests behind the glass walls of the early computer rooms knew what made them powerful – and it wasn’t data. The computer age had started and the new kings played their hand well. Their machines were big, expensive and impressive. The data was just a bunch of cards and tape reels, not impressive at all, more like a given. Focus is always on the money.
They – I might as well say ‘we’ since I was one of them – managed to keep it that way for several decades, until technology became cheap and commonplace. The kings of the humming server room lost their status and data was rediscovered. The world realized that technology is just a means, that the real value is and always has been the data.
With data back in focus, we discovered that the difference between the old (pre-digital) and the digital world was much smaller than we’d thought, the relation between data and business in particular. Not even our dependency on data is very different from 100 years ago. No data, no business – then and now. What is vastly different is the volume, the visibility and the fact that many products and services ARE data, making data not only the enabler but the product.
There are many interesting paths of discussion from this point on – all very tempting. Such as the wide-ranging consequences of data losing their physical representation, becoming virtual and no longer obeying the laws of physics. Or how data have – or lack – value depending on their associated metadata. These and many other elements of the digital transformation have been analyzed in detail in books and articles over the years – including my recent post Corporate Survival: Delete MORE Data, about the real and present danger of businesses literally drowning in their own data, most of which is garbage. A poorly understood challenge with potentially severe consequences.
Along an entirely different axis is an old favorite, Nicholas Carr’s The Big Switch, which isn’t about data but about disruption and the bigger picture: A great read for understanding the dramatic and inevitable switch from the old, analog world to the current almost-entirely-digital world.
My point this time, though, is dataflow and data brokers – the middlemen enabling a huge and almost invisible part of the digital economy. Like data, the brokers have been around for hundreds of years, but their role has recently changed dramatically. Today’s data brokers do much more than connect buyers and sellers of data, negotiate prices and payment, and simplify business. More often than not, they also offer data storage and manipulation: adaptation, conversion, mixing, blending, compressing, filtering and much more.
These activities change the brokers’ role from passive conduits to active proxies, which effectively promotes them from assistants to business partners and service providers. The change has turned ‘data broker’ into a business segment of its own, known by many different names such as ‘data providers’, ‘information brokers’ and more – interestingly discussed by Michal Wlosik of Clearcode in the post What is a Data Broker and How Does It Work?
Some years ago I worked with an apartment rental agency (a brokerage in its own right, by the way) to move them into the digital world. While mapping out the dataflow they already had and the one they needed, we identified more than 20 external data sources required to create a smoothly running operation. That was beyond what this small company could handle in terms of contracts, negotiations, follow-up and complexity. Eventually we located two brokers that together covered more than 80% of the resources (‘feeds’) we needed – problem solved.
What we didn’t realize at the time was that some data brokers are ‘broken’. Not physically or technically, but ethically and practically (as in security and data protection). In particular, but not solely, in their handling of personal information. Unsurprising in hindsight, but the extent to which not-so-serious data brokerages have proliferated over the past 5-8 years is outright scary – a piece of reality that has been revealed by law enforcement, cyber security companies and interested parties over the last few years (including professional data brokers who see their business threatened by rogue actors).
To get an idea of the ‘what and how’ in this shady segment, read Bob Sullivan’s post Data brokers, in bed with scammers, aimed their algorithms at millions of elderly, vulnerable on The Red Tape Chronicles (via Substack). The creativity and tenacity of the scammers are either scary or impressive, depending on your point of view. Which shouldn’t surprise us, but still does (for a different example of the same type of creativity, check out the post The Customer Service Disaster).
My message is this: Don’t implicitly trust data brokers. They may have been generally trustworthy at some point, but no more. Check out their reputation, routines, practices, sources, references, history, certifications (if applicable) etc., as you would for any other business partner you will be relying on.
Quality has many facets. Digital ethics, transparency and reliability may not be on our old checklist. They should be on the new one. Too many data brokers are broken.