digital_land.expectations.operations package

Submodules

digital_land.expectations.operations.csv module

digital_land.expectations.operations.csv.check_no_overlapping_ranges(conn, file_path: Path, min_field: str, max_field: str)

Checks that no ranges overlap between rows.

Two ranges [a_min, a_max] and [b_min, b_max] overlap if: a_min <= b_max AND a_max >= b_min

Parameters:
  • conn -- duckdb connection

  • file_path -- path to the CSV file

  • min_field -- the column name for the range minimum

  • max_field -- the column name for the range maximum
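The overlap rule above treats both ends of a range as inclusive, so two ranges that share only an endpoint count as overlapping. A minimal pure-Python sketch of that rule follows. It is not the package's DuckDB implementation; the helper names and sample rows are hypothetical.

```python
from itertools import combinations


def ranges_overlap(a_min, a_max, b_min, b_max):
    """Closed-interval overlap test: a_min <= b_max AND a_max >= b_min."""
    return a_min <= b_max and a_max >= b_min


def find_overlapping_rows(rows, min_field, max_field):
    """Return (i, j) index pairs of rows whose ranges overlap."""
    overlaps = []
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if ranges_overlap(a[min_field], a[max_field], b[min_field], b[max_field]):
            overlaps.append((i, j))
    return overlaps


# Hypothetical rows standing in for the CSV contents.
rows = [
    {"min": 1, "max": 10},
    {"min": 11, "max": 20},
    {"min": 20, "max": 30},  # shares the endpoint 20 with the previous row
]
overlapping = find_overlapping_rows(rows, "min", "max")
print(overlapping)
```

The pairwise comparison is quadratic in the number of rows, which is fine for illustration; a SQL self-join expresses the same condition.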

digital_land.expectations.operations.csv.check_no_shared_values(conn, file_path: Path, field_1: str, field_2: str)

Checks that no value appears in both field_1 and field_2.

Parameters:
  • conn -- duckdb connection

  • file_path -- path to the CSV file

  • field_1 -- the first column name

  • field_2 -- the second column name
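The idea behind this check can be sketched as a set intersection of the two columns. The snippet below is an illustration only, not the package's implementation: the helper name, the column names and the sample data are hypothetical, and ignoring empty cells is an assumption.

```python
import csv
import io


def shared_values(rows, field_1, field_2):
    """Return the sorted values that occur in both columns."""
    # Assumption: empty cells are ignored rather than treated as a shared value.
    values_1 = {row[field_1] for row in rows if row[field_1]}
    values_2 = {row[field_2] for row in rows if row[field_2]}
    return sorted(values_1 & values_2)


# Hypothetical CSV content; "200" appears in both columns.
sample_csv = io.StringIO(
    "old_entity,new_entity\n"
    "100,200\n"
    "200,300\n"
    "400,\n"
)
rows = list(csv.DictReader(sample_csv))
shared = shared_values(rows, "old_entity", "new_entity")
print(shared)
```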

digital_land.expectations.operations.csv.check_unique(conn, file_path: Path, field: str)

Checks that all values in a given field are unique.

Parameters:
  • conn -- duckdb connection

  • file_path -- path to the CSV file

  • field -- the column name to check for uniqueness
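A uniqueness check amounts to counting how often each value occurs and reporting any value seen more than once. The following is a minimal sketch under that assumption; the helper name and the sample rows are hypothetical and it does not use DuckDB.

```python
from collections import Counter


def duplicated_values(rows, field):
    """Return the sorted values that occur more than once in the given field."""
    counts = Counter(row[field] for row in rows)
    return sorted(value for value, count in counts.items() if count > 1)


# Hypothetical rows standing in for the CSV contents.
rows = [
    {"reference": "A1"},
    {"reference": "B2"},
    {"reference": "A1"},
]
duplicates = duplicated_values(rows, "reference")
print(duplicates)
```

An empty result means every value in the field is unique.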

digital_land.expectations.operations.csv.count_rows(conn, file_path: Path, expected: int, comparison_rule: str = 'greater_than')

Counts the number of rows in the CSV and compares against an expected value.

Parameters:
  • conn -- duckdb connection

  • file_path -- path to the CSV file

  • expected -- the expected row count

  • comparison_rule -- how to compare the actual row count against expected; defaults to 'greater_than'
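One way to picture comparison_rule is as a lookup from a rule name to a comparison function. The sketch below assumes this design. Only 'greater_than' is confirmed by the signature above; 'equals_to' is borrowed from the dataset module, and 'less_than' is a hypothetical rule name added for illustration.

```python
import operator

# Rule names mapped to comparison functions.
COMPARISONS = {
    "greater_than": operator.gt,  # default shown in the count_rows signature
    "equals_to": operator.eq,     # assumed: seen in the dataset module only
    "less_than": operator.lt,     # hypothetical rule name
}


def compare_count(actual, expected, comparison_rule="greater_than"):
    """Return True when the actual count satisfies the rule against expected."""
    try:
        compare = COMPARISONS[comparison_rule]
    except KeyError:
        raise ValueError(f"unknown comparison_rule: {comparison_rule!r}") from None
    return compare(actual, expected)


passed_default = compare_count(5, 3)
passed_equal = compare_count(3, 3, "equals_to")
print(passed_default, passed_equal)
```

Note that with 'greater_than' an actual count equal to expected does not pass, because the comparison is strict.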

digital_land.expectations.operations.dataset module

digital_land.expectations.operations.dataset.check_columns(conn, expected: dict)

digital_land.expectations.operations.dataset.count_deleted_entities(conn, expected: int, organisation_entity: int | None = None)

digital_land.expectations.operations.dataset.count_lpa_boundary(conn, lpa: str, expected: int, organisation_entity: int | None = None, comparison_rule: str = 'equals_to', geometric_relation: str = 'within')

A specialised count which, given a local planning authority (LPA) and a dataset, counts the entities related to the LPA boundary. The geometric relation defaults to 'within' but can be changed. Use this only on geographic datasets.

Parameters:
  • conn -- sqlite connection used to connect to the db; will be created by the checkpoint class

  • lpa -- the reference to the local planning authority (geography dataset) boundary to use

  • expected -- the expected count; must be a non-negative integer

  • organisation_entity -- optional additional filter, so that results are filtered by organisation_entity as well as boundary

  • comparison_rule -- how to compare the actual count against expected; defaults to 'equals_to'

  • geometric_relation -- how to decide whether the data is related to the LPA boundary; defaults to 'within'

digital_land.expectations.operations.dataset.duplicate_geometry_check(conn, spatial_field: str)

Compares the geometries or points of all entities in a dataset to find duplicates. Geometries are classed as duplicates if they have more than 95% intersection; points are classed as duplicates if they match exactly.

Parameters:
  • conn -- spatialite connection used to connect to the db; will be created by the checkpoint class

  • spatial_field -- the field to be used for comparison, either 'point' or 'geometry'
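The duplicate rule can be illustrated with axis-aligned rectangles in place of real geometries. This is a rough sketch only, not the SpatiaLite query the package runs. The helper names are hypothetical, and it assumes the 95% figure is measured against the area of each shape, with both ratios required to exceed the threshold; the source does not state which area the percentage refers to.

```python
def rect_area(rect):
    """Area of an axis-aligned rectangle (min_x, min_y, max_x, max_y)."""
    min_x, min_y, max_x, max_y = rect
    return max(0.0, max_x - min_x) * max(0.0, max_y - min_y)


def intersection_area(a, b):
    """Area shared by two rectangles; 0 when they do not intersect."""
    shared = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return rect_area(shared)


def is_duplicate_geometry(a, b, threshold=0.95):
    """Assumed rule: the shared area exceeds the threshold share of both shapes."""
    area_a, area_b = rect_area(a), rect_area(b)
    if area_a == 0 or area_b == 0:
        return False
    overlap = intersection_area(a, b)
    return overlap / area_a > threshold and overlap / area_b > threshold


def is_duplicate_point(a, b):
    """Points are duplicates only when they match exactly."""
    return a == b


square = (0, 0, 100, 100)
near_copy = (0, 0, 100, 98)     # covers 98% of the square
shifted = (50, 50, 150, 150)    # covers 25% of the square
print(is_duplicate_geometry(square, near_copy), is_duplicate_geometry(square, shifted))
```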

Module contents