“We’d like to develop a standard for our data”


Image: https://xkcd.com/927/

As always, the thing about standards is that there are so many to choose from. And, in the context of the work we do at Icebreaker One, these span domains and sectors. 

So, instead of developing standards, we help people come together (through our Icebreaking process) to choose the standards that work for their use cases, then help align and implement rules that mandate their use as part of a Trust Framework. Based on patterns in those implementations, we look for what is repeatable and can scale.

Below is a primer on some of the (many) ways to think about the words ‘data’ and ‘standards’.

  • Classification Standards define a systematic way of categorizing data or objects based on predefined criteria, facilitating consistency and organization.
  • Data File Format Standards specify the structure and encoding of files to ensure compatibility and interoperability between different software applications and platforms.
  • Data Format Standards encompass guidelines for structuring data, including the arrangement of data elements and their representation, ensuring uniformity in data representation.
  • Data Inflow Standards establish protocols for the reception and integration of data from external sources into an organization’s systems or databases.
  • Data Management Standards provide best practices and guidelines for the effective handling, storage, and maintenance of data throughout its lifecycle.
  • Data Organization Standards outline principles for arranging data elements, records, or datasets in a structured and logical manner for easy access and retrieval.
  • Data Provider Standards set expectations and requirements for entities or systems that supply data to ensure data quality, consistency, and reliability.
  • Data Sharing Standards define protocols and rules for securely exchanging data between organizations or systems, often emphasizing privacy, security, and consent.
  • Geospatial Data Standards govern the format, content, and interoperability of geographic or spatial data, enabling consistent representation and analysis of location-based information.
  • Governance Standards encompass a framework of policies, processes, and controls that guide and oversee data-related activities to ensure compliance, quality, and accountability.
  • Linking and Matching Standards provide guidelines for identifying and connecting related data records or entities, often used in data integration and deduplication processes.
  • Metadata Standards specify how descriptive information about data, such as data definitions, source details, and usage instructions, should be structured and formatted.
  • Standardized Variable Standards define conventions for naming, measuring, and representing data variables to enhance consistency and comparability in data analysis and reporting.
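
To make a couple of these categories concrete, here is a minimal sketch of how 'standardized variable' conventions and 'linking and matching' rules might work together in a single use case. Everything in it is assumed for illustration: the two providers, the field names, the units and the `FIELD_MAP` convention are hypothetical and are not drawn from any real standard or Trust Framework.

```python
# Hypothetical records from two data providers describing the same site.
provider_a = [{"site_id": "S-001", "energy_kwh": 1200.0}]
provider_b = [{"SiteID": "s-001", "energy_mwh": 1.2}]

# Assumed variable convention: source field -> (agreed name, multiplier to kWh).
# A multiplier of None marks an identifier rather than a measurement.
FIELD_MAP = {
    "site_id": ("site_id", None),
    "SiteID": ("site_id", None),
    "energy_kwh": ("energy_kwh", 1.0),
    "energy_mwh": ("energy_kwh", 1000.0),
}


def standardise(record):
    """Rename fields to the agreed names and convert measurements to kWh."""
    out = {}
    for name, value in record.items():
        std_name, factor = FIELD_MAP[name]
        if factor is None:
            # Identifiers: trim whitespace and upper-case so they compare equal.
            out[std_name] = str(value).strip().upper()
        else:
            out[std_name] = float(value) * factor
    return out


def link(records_a, records_b, key="site_id"):
    """Pair up records from the two sources that share the same key value."""
    index = {r[key]: r for r in records_a}
    return [(index[r[key]], r) for r in records_b if r[key] in index]


a = [standardise(r) for r in provider_a]
b = [standardise(r) for r in provider_b]
matches = link(a, b)
print(matches)
```

The point is not the code itself but that the mapping only became possible once a specific use case (comparing energy readings per site) told us which names, units and identifiers needed to agree.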

As you can see, these raise many questions. It is easy to get lost trying to define each of them generically (although there are often holistic models that can be used), and it can turn into quite an academic exercise. Instead, focusing on use cases helps drive towards what can be made consistent out of the chaos.

Here’s another useful view on creating consistency before standards: https://blog.ldodds.com/2023/09/18/consistency-before-standards/

And this ODI guide: https://standards.theodi.org/introduction/types-of-open-standards-for-data/