Whether it's through active participation in advisory groups and in-person events, sharing our work with their wider networks, or helping us connect with industry experts, our constellation members are an integral part of Icebreaker One.
In keeping with our ethos of collaboration, 'to go far, we go together', they contribute to our mission of making data work harder to reach net zero. Now we want to highlight some of the important work they do for both people and the planet.
Icebreaker One employees make up the foundation of our constellation and in this week’s Q&A, I speak with our very own Chris Pointon, Product Manager, Data Services. We delve into the world of Open Net Zero, before gaining a deeper understanding of assurance and its impact in securing confidence and trust along the net zero value chain.
Ross: Hi Chris, thanks for taking the time to do this. It would be great to start by talking about Open Net Zero. Could you explain what it is and how it’s developed over the last year?
Chris: Of course. So Open Net Zero is an index. You could think of it like Google in the sense that Google doesn’t contain all of the content of all of the websites. It simply turns them into an index so you can find them. And that’s exactly what Open Net Zero does. It doesn’t store any data, it just finds it and points to it.
Last year, the IB1 data services team spent a lot of time on Open Net Zero, dramatically increasing the number of organisations and datasets it covers. This was because we added capabilities to our indexing that allowed us to harvest the data catalogues of other people's data portals and add them to our index, so you can find them all in one place. It meant we went from a few hundred datasets at the beginning of 2023 to about 55,000 datasets currently. Now one of the transitions we're focused on is looking at how we join up these 55,000 datasets. One dataset is not usually the answer to a question. It usually takes several datasets that are connected together, for instance because they're in the same geography or they're related to the same theme.
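To make the harvesting idea a little more concrete, here is a minimal sketch of what pulling a DCAT-style JSON catalogue into a lightweight index might look like. The URL and field names are illustrative assumptions, not the actual Open Net Zero harvester, and real portals may also publish DCAT as RDF/XML or Turtle rather than JSON.

```python
# Minimal sketch of harvesting a DCAT-style JSON catalogue into a local index.
# The catalogue URL and field names are illustrative; real portals vary.
import json
import urllib.request

CATALOGUE_URL = "https://example.org/catalogue.json"  # hypothetical endpoint

def harvest(url: str) -> list[dict]:
    """Fetch a catalogue and return lightweight index entries (no data is stored)."""
    with urllib.request.urlopen(url) as response:
        catalogue = json.load(response)

    index = []
    for dataset in catalogue.get("dataset", []):  # data.json-style DCAT lists 'dataset' entries
        index.append({
            "title": dataset.get("title"),
            "publisher": (dataset.get("publisher") or {}).get("name"),
            "landing_page": dataset.get("landingPage"),
            "keywords": dataset.get("keyword", []),
        })
    return index

if __name__ == "__main__":
    for entry in harvest(CATALOGUE_URL):
        print(entry["title"], "->", entry["landing_page"])
```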
Ross: Why is it so important to have these datasets in a machine-readable format?
Chris: We've still got a backlog of dozens of really good data websites that we can't index using standard mechanisms. The reason we can't index them is that they've built a website with data on it rather than building a data portal. So when it comes to best practice in publishing data, organisations should provide a machine-readable catalogue, meaning a file that a computer can parse which describes the data in their data portal or on their website. Typical formats for this file include DCAT, INSPIRE and GEMINI, but there are other options. When they publish that, we have the simple job of including the URL of the catalogue in the index and keeping it up to date over time. We've been prioritising finding websites that publish data relevant to people who want to get to net zero, but also have this well-structured publishing technique.
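For publishers, the other side of the same coin is writing that catalogue file in the first place. Below is an illustrative sketch of a minimal DCAT-style catalogue in the common data.json flavour; the dataset, licence and URLs are invented, and INSPIRE or GEMINI publishers would use their own XML schemas instead.

```python
# Illustrative sketch of the publisher side: writing a minimal DCAT-style
# catalogue file alongside a website so harvesters can index it.
# All dataset details and URLs below are placeholders.
import json

catalogue = {
    "conformsTo": "https://project-open-data.cio.gov/v1.1/schema",  # one common DCAT profile
    "dataset": [
        {
            "title": "Half-hourly substation load, 2023",   # invented example dataset
            "description": "Aggregated half-hourly demand per substation.",
            "license": "https://creativecommons.org/licenses/by/4.0/",
            "landingPage": "https://example.org/data/substation-load",
            "keyword": ["electricity", "net zero"],
        }
    ],
}

with open("catalogue.json", "w") as f:
    json.dump(catalogue, f, indent=2)  # publish this file at a stable URL
```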
Ross: You also launched ‘assurance’ last year, can you explain exactly what this is and why it’s important?
Chris: Assurance gives confidence to people inside companies that they’re allowed to share data externally. It also gives confidence to external users, as they know that they have permission to access what is being supplied, and can rely on the data management practices of the publisher. There are two types of assurance. First, we assure organisations. As a minimum, we ask, does this organisation meet some basic identity requirements and have they signed an agreement that means people can rely on the assurance they provide? Basically, it means that they’ve got some skin in the game.
The second level of organisational assurance is about entering into a secure data publishing environment. Levels three and four are about indicating greater certainty about the identity of who's involved. So instead of them self-asserting who they are and providing their own assurance measures, you bring in third-party auditors to provide additional assurance about how the organisation is managing its data.
Ross: What about assurance on the data side?
Chris: This is the other type of assurance. Again we have multiple levels. At level one, the data's got to have a licence. The metadata has to be machine-readable, publicly available on the web and free of personal data, and for open data there needs to be a clear open data licence. Level two starts to bring in tighter requirements on the licensing: the licence has to have certain features that spell out people's rights to reuse. Level two also requires additional metadata to cover the timeframe and geographical aspects of the datasets, and the publication of documentation for them.
Level three builds on the above by requiring documentation on the data provenance and processing, open standards for data formats and machine-readable definitions of the fields in the data. Finally, at level four, we require that all of the information above – licences, provenance, quality parameters – is provided in machine-readable formats, so it can be passed along a chain.
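Laid out as a checklist, the dataset assurance levels Chris describes might look something like the sketch below. This paraphrases the interview for illustration only; it is not a normative encoding of Icebreaker One's published assurance scheme.

```python
# The dataset assurance levels above, expressed as cumulative checklists.
# Wording is paraphrased from this interview and is illustrative only.
DATA_ASSURANCE_LEVELS = {
    1: [
        "dataset has a licence",
        "metadata is machine-readable and publicly available on the web",
        "no personal data",
        "open data carries a clear open data licence",
    ],
    2: [
        "licence spells out reuse rights",
        "metadata covers timeframe and geography",
        "documentation is published",
    ],
    3: [
        "provenance and processing are documented",
        "data uses open-standard formats",
        "field definitions are machine-readable",
    ],
    4: [
        "licences, provenance and quality parameters are themselves machine-readable",
    ],
}

def requirements_for(level: int) -> list[str]:
    """Levels are cumulative: level n includes everything from the levels below it."""
    return [req for lvl in range(1, level + 1) for req in DATA_ASSURANCE_LEVELS[lvl]]

print(requirements_for(2))
```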
We had been consulting about assurance over the summer last year and came up with these definitions. Then in September we launched an update to Open Net Zero that showed the assurance level of assured datasets. At roughly the same time, SSE launched their open data portal, showing assurance on their site. And so when we index their data portal, it shows their organisational assurance and their dataset assurance. From a technical perspective this is just a little data field, but from an organisational and internal processes point of view, it’s really an important step.
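That 'little data field' could be as simple as an extra attribute or two on each harvested index entry, as in this illustrative sketch (the dataset name, URL and levels are all invented):

```python
# Assurance shown in the index is just extra metadata on each harvested entry.
# All values here are placeholders for illustration.
entry = {
    "title": "Example network capacity dataset",
    "landing_page": "https://example.org/dataset",
    "organisation_assurance_level": 2,  # hypothetical level
    "dataset_assurance_level": 3,       # hypothetical level
}
```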
There will be a new version of these assurance levels coming soon. We've had some feedback and we'll incorporate that before putting it out for another public consultation. It's important to note that the purpose of assurance is not 'we've got assurance and everyone has to put up with it'. It's that we've got assurance and we want to know whether it works for people, whether it actually helps them feel more confident and have greater trust in data, and to what extent it can be improved. We're all ears for ideas to make it better and better over time.
Ross: So how does assurance tie into the wider goal of net zero?
Chris: One of the fundamental things we talk about is the data value chain. And the data value chains we want to bolster are the ones where the data becomes more and more valuable for getting to net zero. The way the data value chain works is that data gets generated from all sorts of activities and sources and, once it's published, it's then available for people to process and turn into information.
Take our Perseus project, for instance. It covers the data governance needed to access half-hourly electricity readings and turn them into the carbon footprint of your electricity consumption using grid factors. So in that journey from electricity consumption to carbon emissions, you're moving along the value chain. Next up in the chain, the banks say 'we're going to use this information to prioritise lending to organisations that take steps to reduce their carbon footprint', and that data has become more valuable because it's been aggregated by a lending institution. Then finally the banks use all of that, and another level of aggregation, to say to the government: 'we're on track as a public limited company in the UK to meet the UK's net zero requirements'.
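The first step of that chain is essentially arithmetic: multiply each half-hourly reading by the grid carbon-intensity factor for that period and sum the results. Here is an illustrative sketch with invented numbers; real grid factors vary by half-hour and by region, and Perseus itself defines the data governance around this rather than the calculation.

```python
# Illustrative first step of the value chain: turning half-hourly electricity
# readings (kWh) into carbon emissions using grid carbon-intensity factors
# (kgCO2e per kWh). All numbers are invented.
half_hourly_kwh = [1.2, 0.9, 1.5, 2.1]             # metered consumption per half-hour
grid_factor_kg_per_kwh = [0.18, 0.16, 0.21, 0.25]  # matching carbon-intensity factors

emissions_kg = sum(
    kwh * factor for kwh, factor in zip(half_hourly_kwh, grid_factor_kg_per_kwh)
)
print(f"Estimated emissions: {emissions_kg:.2f} kgCO2e")
```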
Now going back to the question of assurance, you need to have confidence at each step of the value chain that the data is being handled properly and that the organisations are reputable, and by reputable I also mean that they are part of a legal framework that allows you to hold them to account. You can audit them too, which gives everybody along the value chain confidence. Compare this to the current reality, in which the government asks the banks 'what are you doing to reduce the carbon in your lending portfolio?' and they can only answer in sectoral language, such as 'we've got 50,000 small office-based businesses on our books, and the average footprint for an office-based business based on ONS figures is this', so they provide a rough estimate.
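The gap between those two pictures is easy to see in numbers. The sketch below contrasts a sector-average estimate with the metered, data-driven approach described above; every figure is invented for illustration.

```python
# Contrast: a sectoral-average estimate versus a metered, data-driven figure.
# All numbers are invented for illustration only.
businesses = 50_000
avg_office_footprint_tco2e = 10.0  # hypothetical sector-average footprint per business

sector_estimate = businesses * avg_office_footprint_tco2e
print(f"Rough sectoral estimate: {sector_estimate:,.0f} tCO2e")

# With assured, connected data, the same figure could instead be built up from
# each business's actual metered consumption, as in the half-hourly sketch above.
```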