digital_land.pipeline package

Submodules

digital_land.pipeline.main module

class digital_land.pipeline.main.EntityNumGen(entity_num_state: dict | None = None)

Bases: object

next()
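
A minimal usage sketch; the keys expected by the optional entity_num_state dict are not documented here, so the generator is shown with its defaults:

    from digital_land.pipeline.main import EntityNumGen

    gen = EntityNumGen()  # an entity_num_state dict can seed the counter state
    first = gen.next()    # each call hands out the next entity number
    second = gen.next()
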
class digital_land.pipeline.main.Lookups(directory=None)

Bases: object

add_entry(entry, is_new_entry=True)

The is_new_entry flag was added for backward compatibility: older lookups may not be valid against the current minimal column requirements.

get_max_entity(prefix, specification) → int
load_csv(lookups_path=None)

Load lookups into a DataFrame. Not used when processing the pipeline, but useful for other analysis.

save_csv(lookups_path=None, entries=None, old_entity_path=None)
validate_entry(entry) → bool
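
A hedged sketch of the workflow these methods imply: load the lookup CSV, validate an entry, add it, and save. The directory, entry columns, and values below are illustrative assumptions, not the confirmed schema:

    from digital_land.pipeline.main import Lookups

    lookups = Lookups(directory="pipeline/")  # assumed location of the lookup CSV
    lookups.load_csv()

    # Hypothetical entry; validate_entry() enforces the actual
    # minimal column requirements.
    entry = {
        "prefix": "conservation-area",
        "reference": "CA01",
        "entity": "44000001",
    }
    if lookups.validate_entry(entry):
        lookups.add_entry(entry)  # is_new_entry defaults to True
    lookups.save_csv()
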
class digital_land.pipeline.main.Pipeline(path, dataset)

Bases: object

columns(resource='', endpoints=[])
combine_fields(endpoints=None)
static compose(phases)
concatenations(resource=None, endpoints=[])
default_fields(resource=None, endpoints=[])
default_values(endpoints=None)
file_reader(filename)
filters(resource='', endpoints=[])
get_pipeline_callback()
load_column()
load_combine_fields()
load_concat()
load_default_fields()
load_default_values()
load_filter()
load_lookup()
load_migrate()
load_patch()
load_redirect_lookup()
load_skip_patterns()
lookups(resource=None)
migrations()
patches(resource='', endpoints=[])
reader(filename)
redirect_lookups()
run(input_path, phases)
skip_patterns(resource='', endpoints=[])
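
A minimal sketch of querying a Pipeline for per-resource configuration, assuming path points at a directory of pipeline configuration CSVs; the dataset name and resource hash are illustrative:

    from digital_land.pipeline.main import Pipeline

    pipeline = Pipeline("pipeline/", "conservation-area")  # config dir, dataset name
    columns = pipeline.columns(resource="abc123")          # column-name mappings
    patches = pipeline.patches(resource="abc123")          # value patches
    skips = pipeline.skip_patterns(resource="abc123")      # row patterns to skip
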
digital_land.pipeline.main.chain_phases(phases)
digital_land.pipeline.main.run_pipeline(*args)
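
chain_phases() appears to compose a sequence of phases into a single stream, while run_pipeline(*args) runs the chained phases to completion. A sketch with stand-in phases, assuming the digital-land convention that each phase exposes process(stream) returning an iterable of blocks; both classes below are illustrative only:

    from digital_land.pipeline.main import run_pipeline

    class SourcePhase:
        # Illustrative first phase: emits blocks rather than consuming a stream.
        def process(self, stream=None):
            yield {"row": {"reference": "CA01"}}

    class PassThroughPhase:
        # Illustrative phase: forwards each block unchanged.
        def process(self, stream):
            for block in stream:
                yield block

    run_pipeline(SourcePhase(), PassThroughPhase())
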

digital_land.pipeline.process module

digital_land.pipeline.process.convert_tranformed_csv_to_pq(input_path, output_path)

Convert a transformed resource CSV to a Parquet file.
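
A usage sketch with illustrative paths (the function name's spelling is as it appears in the module):

    from digital_land.pipeline.process import convert_tranformed_csv_to_pq

    convert_tranformed_csv_to_pq(
        input_path="transformed/conservation-area/abc123.csv",
        output_path="transformed-parquet/conservation-area/abc123.parquet",
    )
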

Module contents

Sub-package containing code for processing resources into transformed resources.