digital_land.expectations.operations package
Submodules
digital_land.expectations.operations.csv module
- digital_land.expectations.operations.csv.check_no_overlapping_ranges(conn, file_path: Path, min_field: str, max_field: str)
Checks that no ranges overlap between rows.
Two ranges [a_min, a_max] and [b_min, b_max] overlap if: a_min <= b_max AND a_max >= b_min
- Parameters:
conn -- duckdb connection
file_path -- path to the CSV file
min_field -- the column name for the range minimum
max_field -- the column name for the range maximum
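The overlap rule above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation (which runs against a duckdb connection); the helper names `ranges_overlap` and `find_overlapping_rows`, and the sample column names `start` and `end`, are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def ranges_overlap(a_min, a_max, b_min, b_max):
    """Closed-interval overlap test: a_min <= b_max AND a_max >= b_min."""
    return a_min <= b_max and a_max >= b_min


def find_overlapping_rows(rows, min_field, max_field):
    """Return (i, j) index pairs of rows whose ranges overlap.

    Hypothetical helper: compares every pair of rows, which is fine for a
    small illustration but quadratic in the number of rows.
    """
    overlaps = []
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if ranges_overlap(a[min_field], a[max_field], b[min_field], b[max_field]):
            overlaps.append((i, j))
    return overlaps


# Sample data (made up): the second and third ranges share the endpoint 20.
rows = [
    {"start": 1, "end": 10},
    {"start": 11, "end": 20},
    {"start": 20, "end": 30},
]
overlapping = find_overlapping_rows(rows, "start", "end")
```

Because the rule uses `<=` and `>=`, ranges that merely touch at an endpoint count as overlapping, so the second and third rows are reported here.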
Checks that no value appears in both field_1 and field_2.
- Parameters:
conn -- duckdb connection
file_path -- path to the CSV file
field_1 -- the first column name
field_2 -- the second column name
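As a rough illustration of what "no value appears in both field_1 and field_2" means, the sketch below uses the standard `csv` module and set intersection instead of duckdb. The helper name `shared_values`, the sample column names, and the choice to ignore blank cells are all assumptions made for this example, not behaviour confirmed by the library.

```python
import csv
import io


def shared_values(rows, field_1, field_2):
    """Return the set of values that occur in both columns.

    Assumption for this sketch: empty strings are treated as missing
    values and are ignored.
    """
    values_1 = {row[field_1] for row in rows if row[field_1] != ""}
    values_2 = {row[field_2] for row in rows if row[field_2] != ""}
    return values_1 & values_2


# Made-up CSV content: "A1" appears in both columns, on different rows.
sample = io.StringIO("old_ref,new_ref\nA1,B1\nA2,A1\nA3,B3\n")
rows = list(csv.DictReader(sample))
common = shared_values(rows, "old_ref", "new_ref")
```

An empty result would correspond to the check passing; here the shared value `A1` would make it fail.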
- digital_land.expectations.operations.csv.check_unique(conn, file_path: Path, field: str)
Checks that all values in a given field are unique.
- Parameters:
conn -- duckdb connection
file_path -- path to the CSV file
field -- the column name to check for uniqueness
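The idea behind a uniqueness check can be sketched in plain Python by counting how often each value occurs. This is an illustration only; `duplicate_values` and the `reference` column are hypothetical names, and the real function operates on a CSV file through a duckdb connection.

```python
from collections import Counter


def duplicate_values(rows, field):
    """Return a mapping of value -> occurrence count for values seen more than once."""
    counts = Counter(row[field] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


# Made-up rows: "A1" is repeated, so the field is not unique.
rows = [{"reference": "A1"}, {"reference": "A2"}, {"reference": "A1"}]
duplicates = duplicate_values(rows, "reference")
is_unique = not duplicates
```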
- digital_land.expectations.operations.csv.count_rows(conn, file_path: Path, expected: int, comparison_rule: str = 'greater_than')
Counts the number of rows in the CSV and compares against an expected value.
- Parameters:
conn -- duckdb connection
file_path -- path to the CSV file
expected -- the expected row count
comparison_rule -- how to compare actual vs expected
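One plausible way to interpret a `comparison_rule` string is to map it to a comparison operator, as in the sketch below. Only the rule names `'greater_than'` and `'equals_to'` appear as defaults in this reference; `'less_than'` is a hypothetical addition, and the helper `compare_count` is not part of the library.

```python
import operator

# Rule names mapped to comparison functions.
COMPARISONS = {
    "greater_than": operator.gt,  # default shown for count_rows
    "equals_to": operator.eq,     # default shown for count_lpa_boundary
    "less_than": operator.lt,     # hypothetical rule name, assumed for illustration
}


def compare_count(actual, expected, comparison_rule="greater_than"):
    """Return True if `actual` satisfies `comparison_rule` relative to `expected`."""
    try:
        compare = COMPARISONS[comparison_rule]
    except KeyError:
        raise ValueError(f"Unknown comparison_rule: {comparison_rule!r}") from None
    return compare(actual, expected)


# With the default rule, 120 rows against an expected value of 100 passes.
passed = compare_count(120, 100)
```

Under this reading, the default `'greater_than'` rule is strict, so an actual count equal to the expected value would not pass.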
digital_land.expectations.operations.dataset module
- digital_land.expectations.operations.dataset.check_columns(conn, expected: dict)
- digital_land.expectations.operations.dataset.count_deleted_entities(conn, expected: int, organisation_entity: int | None = None)
- digital_land.expectations.operations.dataset.count_lpa_boundary(conn, lpa: str, expected: int, organisation_entity: int | None = None, comparison_rule: str = 'equals_to', geometric_relation: str = 'within')
A specific version of a count which, given a local planning authority (LPA) and a dataset, counts the entities that relate to the LPA boundary. geometric_relation defaults to 'within' but can be changed. This should only be used on geographic datasets.
- Parameters:
conn -- sqlite connection used to connect to the db; will be created by the checkpoint class
lpa -- the reference to the local planning authority (geography dataset) boundary to use
expected -- the expected count; must be a non-negative integer
organisation_entity -- optional additional filter, to filter by organisation_entity as well as by boundary
geometric_relation -- how to decide whether the data is related to the LPA boundary
- digital_land.expectations.operations.dataset.duplicate_geometry_check(conn, spatial_field: str)
Compares all the geometries or points of entities in a dataset to find duplicates. Geometries are classed as duplicates if they have > 95% intersection; points are classed as duplicates if they are an exact match.
- Parameters:
conn -- spatialite connection used to connect to the db; will be created by the checkpoint class
spatial_field -- the field to be used for comparison, either 'point' or 'geometry'
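To make the "> 95% intersection" rule concrete, here is a small sketch using axis-aligned rectangles in place of real geometries. It assumes the intersection must exceed 95% of the area of both shapes; the reference does not say which area the percentage is measured against, so that reading, along with the helper names and the rectangle representation `(min_x, min_y, max_x, max_y)`, is an assumption. The actual check runs spatial queries over a spatialite connection.

```python
def rect_area(rect):
    """Area of an axis-aligned rectangle (min_x, min_y, max_x, max_y); 0 if degenerate."""
    min_x, min_y, max_x, max_y = rect
    return max(0.0, max_x - min_x) * max(0.0, max_y - min_y)


def intersection_area(a, b):
    """Area shared by two axis-aligned rectangles (0 when they do not overlap)."""
    shared = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return rect_area(shared)


def is_duplicate_geometry(a, b, threshold=0.95):
    """Assumed rule: the overlap must exceed `threshold` of the area of BOTH shapes."""
    overlap = intersection_area(a, b)
    return overlap > threshold * rect_area(a) and overlap > threshold * rect_area(b)


def is_duplicate_point(p, q):
    """Points count as duplicates only when they match exactly."""
    return p == q


# Made-up shapes: `near_copy` covers 98% of `original`; `neighbour` covers 25%.
original = (0.0, 0.0, 10.0, 10.0)
near_copy = (0.0, 0.0, 10.0, 9.8)
neighbour = (5.0, 5.0, 15.0, 15.0)
near_copy_is_duplicate = is_duplicate_geometry(original, near_copy)
neighbour_is_duplicate = is_duplicate_geometry(original, neighbour)
```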