datapunt_processing.load package

Submodules

datapunt_processing.load.load_csv_to_postgres module

datapunt_processing.load.load_csv_to_postgres.load_csv_to_postgres(datadir, filename, table_name, schema, config_path, config_name, all_csv=None)

Load a CSV into Postgres, for single or multiple files.

Args:
  1. datadir: data directory where the file to be uploaded is stored, for example data/
  2. filename: name of the csv
  3. table_name: name of the table in PostgreSQL
  4. schema: the schema in PostgreSQL where the data file should land
  5. config_path: path to the config file, for example auth/config.ini
  6. config_name: name of the database config, for example 'postgresql'
  7. all_csv: default None. If True, uploads all the csv files in the datadir.

Example:

  load_csv_to_postgres(
      datadir=PATH,
      filename='afvalcontainers_munged.csv',
      table_name='afvalcontainers_munged',
      schema='service_afvalcontainers',
      config_path='../config.ini',
      config_name='postgresql')
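How config_path and config_name resolve to a database connection is not documented here; a minimal sketch, assuming an INI file with a [postgresql] section (the section keys below are an assumption, not the package's documented config format):

```python
# Sketch: resolve config_path/config_name to a Postgres connection URL.
# The [postgresql] section layout is an assumption for illustration.
from configparser import ConfigParser

SAMPLE_CONFIG = """
[postgresql]
host = localhost
port = 5432
user = dbuser
password = secret
dbname = various
"""

def pg_url(config: ConfigParser, config_name: str) -> str:
    """Build a SQLAlchemy-style Postgres URL from one config section."""
    section = config[config_name]
    return "postgresql://{user}:{password}@{host}:{port}/{dbname}".format(**section)

config = ConfigParser()
config.read_string(SAMPLE_CONFIG)  # the real function would read config_path
url = pg_url(config, "postgresql")
```

With such a URL, the actual load could hand the CSV to any Postgres client or to pandas/SQLAlchemy.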

datapunt_processing.load.load_file_to_ckan module

datapunt_processing.load.load_file_to_ckan.find_resource_id_if_exists(url, dataset_name, file_name)
datapunt_processing.load.load_file_to_ckan.main()
datapunt_processing.load.load_file_to_ckan.parser()

Parser function to run arguments from the command line and to add a description to the Sphinx docs. To see possible styling options: https://pythonhosted.org/an_example_pypi_project/sphinx.html

datapunt_processing.load.load_file_to_ckan.upload_file_to_ckan(url, dataset_name, file_path)

Upload a file to the CKAN datastore.

Args:
  1. url: url of the catalog:

    https://api.data.amsterdam.nl/catalogus
    
  2. dataset_name: name of the dataset, which can be found on the ckan page url:

    https://api.data.amsterdam.nl/catalogus/dataset/afvalcontainers
    
  3. api_key: your private user key, which can be found on the user profile page.

  4. file_path: location of the file including filename:

    /path/to/file/to/upload.csv
    
Returns:
An uploaded file to the CKAN datastore.
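An upload like this typically goes through CKAN's standard action API (resource_create). A minimal sketch of how the request parts might be assembled; only the construction is shown, the actual multipart POST (for example with requests) is omitted, and the helper name build_ckan_upload is hypothetical:

```python
# Sketch: assemble a CKAN resource_create request from the documented args.
# Field names follow the standard CKAN action API; build_ckan_upload itself
# is a hypothetical helper, not part of this package.
import os

def build_ckan_upload(url, dataset_name, file_path, api_key):
    action_url = url.rstrip('/') + '/api/3/action/resource_create'
    headers = {'Authorization': api_key}   # the private user key
    data = {'package_id': dataset_name,
            'name': os.path.basename(file_path)}
    # the file contents would be sent as a multipart field named 'upload'
    return action_url, headers, data

action_url, headers, data = build_ckan_upload(
    'https://api.data.amsterdam.nl/catalogus',
    'afvalcontainers',
    '/path/to/file/to/upload.csv',
    'MY-API-KEY')
```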

datapunt_processing.load.load_file_to_dcatd module

datapunt_processing.load.load_file_to_dcatd.find_resource_id_if_exists(url, dataset_name, file_name)
datapunt_processing.load.load_file_to_dcatd.main()
datapunt_processing.load.load_file_to_dcatd.parser()

Parser function to run arguments from the command line and to add a description to the Sphinx docs. To see possible styling options: https://pythonhosted.org/an_example_pypi_project/sphinx.html

datapunt_processing.load.load_file_to_dcatd.upload_file_to_ckan(url, dataset_name, file_path)

Upload a file to the CKAN datastore.

Args:
  1. url: url of the catalog:

    https://api.data.amsterdam.nl/catalogus
    
  2. dataset_name: name of the dataset, which can be found on the ckan page url:

    https://api.data.amsterdam.nl/catalogus/dataset/afvalcontainers
    
  3. api_key: your private user key, which can be found on the user profile page.

  4. file_path: location of the file including filename:

    /path/to/file/to/upload.csv
    
Returns:
An uploaded file to the CKAN datastore.

datapunt_processing.load.load_file_to_objectstore module

datapunt_processing.load.load_file_to_objectstore.check_existence_object(connection, container_path, filename)

Check if the file is present in the objectstore container_path.

Args:
  1. connection = Objectstore connection based on from helpers.connection import objectstore_connection
  2. container_path = Name of container/prefix/subfolder
  3. filename = Name of file, for example test.csv
Returns:
  • 'The object was successfully created'
  • 'The object was not found'
  • 'Error finding the object'
datapunt_processing.load.load_file_to_objectstore.get_object(connection, container_path, filename, output_folder)

Download file from objectstore container.

Args:
  1. connection: Objectstore connection based on from helpers.connection import objectstore_connection
  2. container_path: Name of container/prefix/subfolder
  3. filename: Name of file, for example test.csv
  4. output_folder: the path to write the file to, for example app/data when using Docker.
Returns:
A file from the objectstore into the specified output_folder.
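python-swiftclient's get_object returns a (headers, contents) pair; writing those contents into output_folder is the part sketched below, with dummy bytes standing in for a real objectstore connection (save_contents is a hypothetical helper):

```python
# Sketch of the write-out step only: the objectstore retrieval itself is
# omitted and replaced by dummy bytes. save_contents is hypothetical.
import os
import tempfile

def save_contents(contents: bytes, output_folder: str, filename: str) -> str:
    """Persist downloaded bytes into output_folder and return the path."""
    os.makedirs(output_folder, exist_ok=True)
    target = os.path.join(output_folder, filename)
    with open(target, 'wb') as f:
        f.write(contents)
    return target

output_folder = tempfile.mkdtemp()  # stands in for e.g. app/data
path = save_contents(b'a,b\n1,2\n', output_folder, 'test.csv')
```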
datapunt_processing.load.load_file_to_objectstore.main()
datapunt_processing.load.load_file_to_objectstore.parser()

Parser function to run arguments from the command line and to add a description to the Sphinx docs.

datapunt_processing.load.load_file_to_objectstore.put_object(connection, container: str, filename: str, contents, content_type: str) → None

Upload a file to objectstore.

Args:
  1. container: path/in/store
  2. filename: your_file_name.txt
  3. contents: contents of the file, obtained with open('your_file_name.txt', 'rb') as contents:
  4. content_type: 'text/csv', 'application/json', ... Is retrieved by using the mime package.
Returns:
A saved file in the container of the objectstore.
datapunt_processing.load.load_file_to_objectstore.upload_file(connection, container_path, filename_path)

Upload file to the objectstore.

Args:
  1. connection = Objectstore connection based on from helpers.connection import objectstore_connection
  2. container_path = Name of container/prefix/subfolder, for example Dataservices/aanvalsplan_schoon/crow
  3. filename_path = full path including the name of file, for example: data/test.csv

Uses mime for content_type: https://stackoverflow.com/questions/43580/how-to-find-the-mime-type-of-a-file-in-python

Result:
Uploads a file to the objectstore and checks if it exists in the defined container_path.
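The content_type passed along with the upload can be guessed from the filename with the standard-library mimetypes module, as the linked Stack Overflow answer suggests:

```python
# Guess the MIME content type from a filename using the stdlib.
import mimetypes

csv_type, _ = mimetypes.guess_type('data/test.csv')
json_type, _ = mimetypes.guess_type('data/test.json')
```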

datapunt_processing.load.load_wfs_to_postgres module

exception datapunt_processing.load.load_wfs_to_postgres.NonZeroReturnCode

Bases: Exception

Used for subprocess error messages.

datapunt_processing.load.load_wfs_to_postgres.load_wfs_layer_into_postgres(pg_str, url_wfs, layer_name, srs, retry_count=3)

Get a layer from a WFS service.

Args:

  1. url_wfs: full url of the WFS including https, excluding /?:

    https://map.data.amsterdam.nl/maps/gebieden
    
  2. layer_name: Title of the layer:

    stadsdeel
    
  3. srs: coordinate system number, excluding EPSG:

    28992
    
Returns:
The layer loaded into postgres
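The three arguments above map onto a standard WFS GetFeature request; a sketch of the URL such a call might build (the WFS version is an assumption, and wfs_getfeature_url is a hypothetical helper, not part of this module):

```python
# Sketch: build a WFS GetFeature URL from url_wfs, layer_name and srs.
# The version parameter is an assumption; the docstring does not state one.
from urllib.parse import urlencode

def wfs_getfeature_url(url_wfs: str, layer_name: str, srs: str) -> str:
    params = {'service': 'WFS',
              'version': '2.0.0',
              'request': 'GetFeature',
              'typename': layer_name,
              'srsName': 'EPSG:' + srs}
    return url_wfs + '?' + urlencode(params)

url = wfs_getfeature_url('https://map.data.amsterdam.nl/maps/gebieden',
                         'stadsdeel', '28992')
```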
datapunt_processing.load.load_wfs_to_postgres.load_wfs_layers_into_postgres(config_path, db_config, url_wfs, layer_names, srs_name)

Load layers into Postgres using a list of titles of each layer within the WFS service.

Args:
  1. pg_str: psycopg2 connection string:

    'PG:host= port= user= dbname= password='

Returns:
Loaded layers into postgres using ogr2ogr.
datapunt_processing.load.load_wfs_to_postgres.main()
datapunt_processing.load.load_wfs_to_postgres.parser()

Parser function to run arguments from the command line and to add a description to Sphinx.

datapunt_processing.load.load_wfs_to_postgres.run_command_sync(cmd, allow_fail=False)

Run a string in the command line.

Args:
  1. cmd: command line code formatted as a list:

    ['ogr2ogr', '-overwrite', '-t_srs', 'EPSG:28992','-nln',layer_name,'-F' ,'PostgreSQL' ,pg_str ,url]
    
  2. allow_fail (optional): True or False, whether to allow a non-zero return code.

Returns:
Executed program or error message.
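A minimal sketch of what run_command_sync might look like with subprocess, assuming the NonZeroReturnCode exception documented above is raised on failure unless allow_fail is set:

```python
# Sketch of run_command_sync using subprocess.run; the exact behaviour of
# the real function may differ, this mirrors only the documented contract.
import subprocess
import sys

class NonZeroReturnCode(Exception):
    """Used for subprocess error messages."""

def run_command_sync(cmd, allow_fail=False):
    """Run a command given as a list; raise unless allow_fail is True."""
    p = subprocess.run(cmd)
    if p.returncode != 0 and not allow_fail:
        raise NonZeroReturnCode(p.returncode)
    return p.returncode

rc_ok = run_command_sync([sys.executable, '-c', 'pass'])
rc_fail = run_command_sync([sys.executable, '-c', 'raise SystemExit(2)'],
                           allow_fail=True)
```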
datapunt_processing.load.load_wfs_to_postgres.scrub(line)

Hide the login credentials of Postgres in the console.
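A plausible sketch of such a scrub: mask the password in a Postgres connection string before it is echoed to the console. The exact pattern the real function uses is an assumption:

```python
# Sketch: mask the password= value in a PG connection string before logging.
# The regex is an assumption about how the real scrub() works.
import re

def scrub(line: str) -> str:
    return re.sub(r"password=\S+", "password=xxxx", line)

scrubbed = scrub("PG:host=localhost port=5432 user=me dbname=db password=secret")
```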

datapunt_processing.load.load_xls_to_postgres module

datapunt_processing.load.load_xls_to_postgres.load_xls(datadir, config_path, db_config_name)

Load xlsx files into Postgres, for multiple files.
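When loading multiple files, each xlsx typically needs a Postgres-friendly table name; a hypothetical helper sketching one way to derive it from the filename (not part of this module's documented API):

```python
# Hypothetical helper: derive a lowercase, underscore-only table name from
# an xlsx path, as a multi-file loader might do.
import os
import re

def table_name_from_xlsx(path: str) -> str:
    stem = os.path.splitext(os.path.basename(path))[0]
    return re.sub(r'[^a-z0-9_]+', '_', stem.lower())

name = table_name_from_xlsx('data/Afvalcontainers 2018.xlsx')
```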

datapunt_processing.load.load_xls_to_postgres.main()
datapunt_processing.load.load_xls_to_postgres.parser()

Module contents