Some basic thoughts on open data
The founder of all that we survey on the web, Tim Berners-Lee, said on this site a few years ago that “Data is the New Links.”
He argued that with the ever-expanding world of data-driven products, and the explosion of graphs and social media, the benefits would only be realized by a positive attitude to sharing the data in an accessible way, without expecting too much in return.
I’ve often been reminded of a quote from the film Threads, which sums up the topic extremely well:
“In a modern society, everything connects. Each person’s needs are fed by the skills of many others. Our lives are woven together in a fabric — but the connections that make society strong, also make it vulnerable.”
Wind forward seven years from the Berners-Lee interview. We now live in an advanced Internet economy where developers are strongly encouraged to participate in the “hackathon” scene — such as the excellent Disrupt series — and adopt APIs offered by many and varied companies, bringing their 48-hour products to life.
This modern Internet trend depends on the principle of linking and sharing, but further to that, large-scale API providers — either part of the Four Horsemen or aspiring to that level — are now guardians of a fast-growing economy entirely dependent on safe, continued delivery.
Just as the world has feared the oncoming of “peak oil” and tried to defend itself against the day society and machinery grind to a halt — or wars over dwindling resources — we must defend ourselves against a notion of “peak data,” where the data powering millions of aspiring companies is so valuable it has to be kept inside walled gardens.
I’m reminded of a talk by Parse last year that referred back to a Mark Zuckerberg F8 U-turn about Facebook’s attitude toward rapid growth and service provision.
After Facebook initially adopted the “move fast and break things” mantra (a behaviour encouraged and well described in a Ben Horowitz article about “Programming Your Culture”), it quickly became apparent that this would no longer be tolerated once the platform grew to underpin more than 500,000 developers, from fledglings to established businesses.
They are now working under the banner “move fast with stable infra[structure].” Whilst it’s not as sexy as the old method, it’s a sign of the times — and a very positive step many would do well to follow.
The Internet economy is littered with examples of bad citizenship, from APIs that are staggeringly hard to understand to the land of mobile apps, where closed-source SDKs often mask defects that cause build failures, crashes and instability for users. These are especially difficult to recover from in the land of the snap-judgement 1-star review.
If an organization provides access to data, but closes off the means of understanding and processing that data (such as a good documentation suite, self-service tools and SDKs), the value of the original access is lost.
It would be understandable if there were some IP in how the data was collected or aggregated, but that shouldn’t be exposed via an API or SDK anyway. Without a good understanding of what the data is, why it matters and how to use it, there’s not going to be much faith in its supplier.
Jeff Bezos once sent a memo to Amazon developers stating that “all service interfaces, without exception, must be designed from the ground up to be externalizable… the team must plan and design to be able to expose the interface to developers in the outside world.” But it’s something you can positively choose rather than being railroaded into doing.
Companies such as Stripe and Layer have taken this principle and really worked hard to make themselves both extensible and available. They might not be amongst the Four Horsemen of the Internet just yet, but their expansion depends on adoption, so clearly they understand how to get there.
It’s all about getting the basics right — understandable but highly functional APIs, reusable quick-start UI components and an effective support portal. Elsewhere, app analytics provider AdJust set out with an open-source SDK — not as a reaction, but as a positive step toward trusted adoption.
I was delighted when Parse extended its principle of stable infrastructure by open-sourcing all its SDKs, and even more delighted when the entire platform was open-sourced after being shut down. Removing the mystery of what something’s doing inside your own product, together with the added benefit of half a million people making the product better, proves that Sir Tim’s goal of sharing and trust is still possible in a fragmented age.
I would encourage all data publishers to go down this road and drive adoption of their products through trust and community. In the end, everyone wins.
Originally published at techcrunch.com on September 25, 2015. Minor edits made for context since the original publish date.