digital_land.pipeline package
Submodules
digital_land.pipeline.main module
- class digital_land.pipeline.main.EntityNumGen(entity_num_state: dict | None = None)
Bases:
object
- next()
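As a rough illustration of the kind of state-backed generator these signatures suggest (the real `EntityNumGen`'s state keys and semantics may differ), a minimal sketch might look like:

```python
# Hypothetical sketch only; the actual EntityNumGen's entity_num_state
# shape is an assumption here.
class EntityNumGenSketch:
    def __init__(self, entity_num_state=None):
        # assumed state shape: {"current": last number issued}
        self.state = entity_num_state or {"current": 0}

    def next(self):
        # issue the next sequential entity number and record it in state
        self.state["current"] += 1
        return self.state["current"]
```

Passing in a saved state dict lets generation resume from where a previous run left off.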
- class digital_land.pipeline.main.Lookups(directory=None)
Bases:
object
- add_entry(entry, is_new_entry=True)
The is_new_entry flag was added for backward compatibility: older lookup entries may not meet the current minimal column requirements, so they can be added without being treated as new entries.
- get_max_entity(prefix, specification) int
- load_csv(lookups_path=None)
Load lookups into a DataFrame. Not used when processing the pipeline, but useful for other analysis.
- save_csv(lookups_path=None, entries=None, old_entity_path=None)
- validate_entry(entry) bool
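To illustrate the sort of minimal-column check `validate_entry` performs (a hedged sketch; the actual required columns are assumptions, not taken from the library):

```python
# Hypothetical minimal-column validation; the real required columns
# for a lookup entry are assumed here for illustration.
REQUIRED_FIELDS = {"prefix", "resource", "entry-number", "entity"}  # assumed

def validate_entry_sketch(entry: dict) -> bool:
    # an entry is valid only if every required column is present and non-empty
    return all(entry.get(field) for field in REQUIRED_FIELDS)
```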
- class digital_land.pipeline.main.Pipeline(path, dataset)
Bases:
object
- columns(resource='', endpoints=[])
- combine_fields(endpoints=None)
- static compose(phases)
- concatenations(resource=None, endpoints=[])
- default_fields(resource=None, endpoints=[])
- default_values(endpoints=None)
- file_reader(filename)
- filters(resource='', endpoints=[])
- get_pipeline_callback()
- load_column()
- load_combine_fields()
- load_concat()
- load_default_fields()
- load_default_values()
- load_filter()
- load_lookup()
- load_migrate()
- load_patch()
- load_redirect_lookup()
- load_skip_patterns()
- lookups(resource=None)
- migrations()
- patches(resource='', endpoints=[])
- reader(filename)
- redirect_lookups()
- run(input_path, phases)
- skip_patterns(resource='', endpoints=[])
- digital_land.pipeline.main.chain_phases(phases)
- digital_land.pipeline.main.run_pipeline(*args)
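The names compose, chain_phases, and run suggest a pipeline assembled from phases that each transform a stream of rows. A minimal sketch of such phase chaining (an assumption about the design, not the library's actual code):

```python
# Hypothetical phase chaining; each phase is assumed to take an iterable
# of row dicts and yield transformed rows.
def chain_phases_sketch(phases):
    def run(rows):
        # feed the output of each phase into the next
        for phase in phases:
            rows = phase(rows)
        return rows
    return run

# illustrative phases (not from the library)
def drop_empty(rows):
    for row in rows:
        if row["name"]:
            yield row

def uppercase_names(rows):
    for row in rows:
        yield {**row, "name": row["name"].upper()}

pipeline = chain_phases_sketch([drop_empty, uppercase_names])
result = list(pipeline([{"name": "a"}, {"name": ""}, {"name": "b"}]))
```

Because each phase is a generator, rows stream through the chain lazily rather than being materialised between phases.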
digital_land.pipeline.process module
- digital_land.pipeline.process.convert_tranformed_csv_to_pq(input_path, output_path)
Function to convert a transformed resource CSV to a Parquet file.
Module contents
Subpackage containing code for processing resources into transformed resources.