Standard process
The trigger for this process should be a new Jira ticket on the data management board for adding data. This ticket should link to a Digital Land Service Desk ticket which contains the correspondence with the customer.
When you pick up the ticket, follow the steps below.
1. Validate endpoint
Follow the validate an endpoint process to check whether the data meets the specifications. If you find any issues you should respond to the data provider and ask whether they can fix them.
Before adding new data you should also check whether there is already data for this provision on the platform. You can do this using the LPA dashboard in the publish service, our config manager reports, or by using the search page on planning.data.
If there is existing data you may need to retire an old endpoint alongside adding the new one. The scenarios in the maintaining data tutorials will help you work out the right process to follow.
NOTE - If adding a national dataset:
The validation process is less standardised for national datasets. The ticket should include a description of the dataset from the data design team. Check that the data on the endpoint matches the description.
2. Add endpoint
Follow the add an endpoint process to set up the configuration for the new endpoint in the config repo.
Push your changes but do not merge them before moving on to the next step.
3. QA endpoint configuration
To make sure the configuration for the new endpoint is checked properly, raise a PR for your changes and fill out the template that is automatically generated. This will make it clear what sort of change is being made, and also give you a QA checklist to fill out.
You should share your PR with a colleague in the data management team to review, and they should follow the same checks in the checklist.
4. Merge changes (and run workflow)
Once your PR is approved the changes can be merged. At this point you could also run the action workflow to build the updated dataset (see the last step of the add an endpoint process).
NOTE - If adding a national dataset:
You should first run the action workflow in the development Airflow environment. This will publish the new data on the development planning data site so that the data design team can review it before going live.
Currently, the live Airflow workflow runs each night and will pick up your changes, so you should run the dev workflow as early as possible in the day and let design know they have the rest of the day to review them. If you need to make significant changes, revert your commit from the config main branch so the changes don’t go live.
5. Review new data on platform
Once the workflow has been run (either manually, or in the automated overnight process) you should carry out some last QA checks:
- Check that the expected number of new entities is on the platform (you can use the search page to do this, either with the location parameter or, if the new data doesn’t have a location, the organisation_entity parameter).
- Check if the new data is on the LPA dashboard in the publish service.
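The entity count check above can be sketched in Python. This is a minimal illustration, not the team's tooling: it assumes the search page has a JSON equivalent at https://www.planning.data.gov.uk/entity.json that accepts dataset, organisation_entity and limit query parameters and reports a count field in its response. The helper names and the organisation entity number 123 are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: JSON version of the planning.data search page.
BASE_URL = "https://www.planning.data.gov.uk/entity.json"


def entity_search_url(dataset, organisation_entity=None, limit=1):
    """Build a search URL for one dataset, optionally filtered by organisation."""
    params = {"dataset": dataset, "limit": limit}
    if organisation_entity is not None:
        params["organisation_entity"] = organisation_entity
    return f"{BASE_URL}?{urlencode(params)}"


def count_matches(response_json, expected):
    """Compare the count reported by the search response with the expected number.

    Assumption: the response carries the total number of matches under "count".
    Returns (matches, actual_count).
    """
    actual = response_json.get("count")
    return actual == expected, actual


# Hypothetical organisation entity number, for illustration only.
url = entity_search_url("article-4-direction-area", organisation_entity=123)
print(url)
```

Fetch the printed URL in a browser or with any HTTP client, then pass the parsed JSON to count_matches along with the number of records you expect from the endpoint.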
6. Close ticket
Once the final checks are complete you can close the tickets:
- Reply to the customer in Digital Land Service Desk using the canned responses to let them know data is live.
- Move the Jira tickets to done.
New endpoint scenarios
When adding data from a new endpoint you may need to follow slightly different steps based on the context, like exactly what data is being provided, how it’s being provided, and whether data for the provision already exists.
The sections below aim to explain some common scenarios to make it clear what steps should be followed, and what the expected outcome should be.
New endpoint for a single, new ODP provision
Scenario: Supplier has published an endpoint for a dataset for the first time.
E.g., Barnet has published their article-4-direction-area dataset.
Resolution:
- Follow the Validate Endpoint process.
- Follow the Add Endpoint process.
Outcome:
Configuration -
New entries in endpoint.csv, source.csv, and lookup.csv
New entries in pipeline files based on requirement.
Platform -
Entities associated with the endpoint appear on the site.
New endpoint for multiple, new ODP provisions
Scenario: Supplier has published an endpoint for multiple datasets for the first time.
E.g., Barnet has published their tree-preservation-order and tree datasets in one endpoint.
Resolution:
- Follow the Validate Endpoint process.
- Follow the Add Endpoint process.
- Carefully follow the Handling Combined Endpoints section in the Add Endpoint process, which covers configuring multiple endpoints on the same link.
Outcome:
Configuration -
New entries in endpoint.csv, source.csv, and lookup.csv
New entries in pipeline files based on requirement.
Platform -
Entities associated with the endpoint appear on the site.
New endpoint with geographical duplicates
Scenario: A provider has shared a new endpoint to be added to the platform. During the endpoint validation step, geographical duplicates with existing data on the platform are identified.
E.g., North Somerset LPA share an endpoint for conservation-area data for us to add to the site. The endpoint checker flags that there are geographical duplicates in this dataset with conservation-area entities from Historic England that are already on the platform.
Note: at the moment this is most likely to happen for the conservation-area dataset, where we have both authoritative (LPAs) and alternative (Historic England) data providers.
Resolution:
- Follow the Add Endpoint process as normal up to step 5.
- After step 5, once the new entries for the lookup.csv have been generated, use the outputs from the “Duplicates with different entity numbers” section of the endpoint checker to replace the newly generated entity numbers for any duplicates with the entity numbers of the existing entities that they match.
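The replacement step above can be sketched as a small Python helper. This is an illustration under stated assumptions, not part of the Add Endpoint process: it assumes each new lookup.csv row has an entity column, and that the endpoint checker output has been turned by hand into a mapping from newly generated entity numbers to existing ones. The function name, the reference column, and all entity numbers shown are hypothetical.

```python
def reuse_existing_entities(new_lookup_rows, duplicate_map):
    """Return copies of the new lookup rows, swapping duplicate entity numbers.

    new_lookup_rows: list of dicts with an "entity" key (assumed column name).
    duplicate_map: {newly generated entity number: existing entity number}.
    Rows whose entity is not in duplicate_map are returned unchanged.
    """
    updated = []
    for row in new_lookup_rows:
        row = dict(row)  # copy so the input rows are not modified
        existing = duplicate_map.get(row["entity"])
        if existing is not None:
            row["entity"] = existing
        updated.append(row)
    return updated


# Hypothetical values, for illustration only.
new_rows = [
    {"reference": "CA01", "entity": "44010001"},
    {"reference": "CA02", "entity": "44010002"},
]
duplicates = {"44010002": "44000123"}

for row in reuse_existing_entities(new_rows, duplicates):
    print(row["reference"], row["entity"])
```

Entity numbers are kept as strings here because that is how csv.DictReader would return them; review the updated rows before adding them to lookup.csv.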
Outcome:
Configuration -
New entries in endpoint.csv, source.csv, and lookup.csv
New entries in pipeline files based on requirement.
Platform -
Facts from the resources for the two separate records that are mapped to the same entity will appear on the platform under the same entity number.