A framework for data discoveries

Why we need data discoveries

We get a lot of requests for data standards.

A data discovery helps us to do 2 key things:

  • Understand the data need for a planning concern
  • Gather enough information to start working on a data model that satisfies that need

Understanding the data need for a planning concern allows us to determine if we are the right team to help. It also indicates whether a data standard is required and gives us a good ‘definition of done’ to work with.

We need to gather enough information to start working on a data model. We have a set of questions we try to answer in any data discovery. If we have the answers to these questions, we are confident we know enough to start modelling how the data should be shaped.

What we need to start a data discovery

Before we can start a data discovery we need:

  • to know who the point of contact is for the planning concern
  • to know if there is primary legislation and regulations for the planning concern, and where we can find them
  • access to any previous research work
  • access to any existing data related to the planning concern
  • to know if there are any friendly or interested publishers we should speak to

Having this gives us a good place to start but we’d like as much information upfront as we can get. We currently ask policy teams to fill in a template document for us. We’re working on an improved list of questions.

What a data discovery helps us to learn

We use data discoveries to help us learn more about a planning concern and whether there is a need for data. There are specific things we need to know to be able to design a suitable data solution.

A data discovery will help us answer these questions:

  • What is this planning concern?
  • What name do people use for it? Does that change in different contexts and for different users?
  • Is there a need for data?
  • Do we need a data standard?
  • Who is the user or users and what do they need from the data?
  • How do we expect them to use the data?
  • Will the data be used to make decisions? What are those decisions?
  • What will the data unlock?
  • Are there any risks to making the data available?
  • Who is the authoritative source? Who, in law, is responsible for the thing or makes decisions about the thing?
  • How many of these things are there?
  • How frequently do these things change?
  • What is the high-level lifecycle of the thing?
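
One way to keep track of these questions during a discovery is a simple checklist. The sketch below is illustrative only and is not a tool we use: the short keys and the function names `unanswered` and `ready_to_model` are invented for this example, and compound questions are shortened to their first part.

```python
# Illustrative only: the keys are invented labels for the questions above.
DISCOVERY_QUESTIONS = {
    "concern": "What is this planning concern?",
    "names": "What name do people use for it?",
    "data_need": "Is there a need for data?",
    "standard": "Do we need a data standard?",
    "users": "Who is the user or users and what do they need from the data?",
    "usage": "How do we expect them to use the data?",
    "decisions": "Will the data be used to make decisions?",
    "unlock": "What will the data unlock?",
    "risks": "Are there any risks to making the data available?",
    "authority": "Who is the authoritative source?",
    "count": "How many of these things are there?",
    "change_rate": "How frequently do these things change?",
    "lifecycle": "What is the high-level lifecycle of the thing?",
}


def unanswered(answers: dict[str, str]) -> list[str]:
    """Return the questions that do not yet have a non-empty answer."""
    return [
        question
        for key, question in DISCOVERY_QUESTIONS.items()
        if not answers.get(key, "").strip()
    ]


def ready_to_model(answers: dict[str, str]) -> bool:
    """True once every discovery question has an answer."""
    return not unanswered(answers)
```

Calling `unanswered(answers)` part-way through a discovery would list the questions still open, and `ready_to_model(answers)` mirrors the idea that we start modelling once every question has an answer.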

What we produce from a data discovery

We’ll produce a number of outputs during a data discovery. These will help us record what we have learned and will also help us during the data modelling phase if needed.

We will produce:

A summary report

This should include:

  • a description and high-level explanation of the planning concern
  • an answer to each of the questions under “What a data discovery helps us to learn”
  • a decision on what we think is needed

This summary should provide enough information for us to start modelling the data.

We should be able to share this summary publicly.

We should share this information in the GitHub issue thread for the planning concern.

A set of data needs

We will be able to use these needs to test any data model and data we produce.

These should make sense to the policy team that asked for a data standard and any potential users of the data.

We want to be able to share this list and refer back to individual needs.
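
As a rough sketch of what referring back to individual needs could look like, each need could carry a short reference. Everything here is hypothetical: the `DataNeed` class, the `ref` format and the helper functions are assumptions made for illustration, not an agreed structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataNeed:
    ref: str        # hypothetical identifier, so a need can be cited on its own
    statement: str  # the need, worded so policy teams and users can read it


def find_need(needs: list[DataNeed], ref: str) -> DataNeed:
    """Look up a single need by its reference."""
    for need in needs:
        if need.ref == ref:
            return need
    raise KeyError(ref)


def unmet_needs(needs: list[DataNeed], met_refs: set[str]) -> list[DataNeed]:
    """Return the needs that a data model or dataset has not yet been shown to meet."""
    return [need for need in needs if need.ref not in met_refs]


# Placeholder content only: real needs come out of the discovery itself.
example_needs = [
    DataNeed(ref="need-1", statement="Placeholder need statement"),
    DataNeed(ref="need-2", statement="Another placeholder need statement"),
]
```

When testing a data model, we could record which references it satisfies and use `unmet_needs` to see what is left.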

A fact sheet

Part of the data discovery process is collecting useful information and links. We should include this in an easily digestible form.

Specific questions about the planning concern

At the end of the data discovery we should have enough information to start modelling the data. However, we will still need to find out more domain-specific information. We will maintain a list of the questions we still need to answer.

Some of the activities we’ll do in a data discovery

The activities we’ll need to do during a data discovery will be different for each planning concern. We’ll choose the right activities to help us learn what we need to.

The activities we might do include:

Questionnaires and templates

Providing the policy teams with a pre-designed template to gather their insights on their particular concern. This will help us to understand how they see the scope of the work from the policy perspective and will help shape our discovery.

Desk research

This will be key in areas where there has not been any previous research or understanding of the data need. This should give us clarity on the concern and also generate further questions.

Interviews

Speaking with local authorities to discuss their experiences with the specified planning concern.

Field studies

Where applicable, looking at real data and real examples to help inform our understanding of the data need. This also gives us a better idea of what is currently happening in the specified planning area, including what is working well and where we need to make improvements.