datacatalog package

Modules

application

class Application(*args, middlewares=None, **kwargs)[source]

Bases: aiohttp.web_app.Application

The Application.

Todo

Inheritance from web.Application is discouraged by aiohttp. This class, with its initializer, must be replaced by an application factory method that builds an application object from web.Application.

_initialize_sync()[source]
_load_plugins()[source]
config
hooks
pm
pool
_on_cleanup(app)[source]
_on_startup(app)[source]
_resolve_plugin_path(fq_name)[source]

Resolve the path to a plugin (module, class, or instance).

Parameters:

fq_name (str) – The fully qualified name of a module, class or instance.
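
A minimal sketch of how a fully qualified name could be resolved to a module, class, or instance. It is not the package's actual implementation: the function name `resolve_plugin_path` and the "longest importable prefix" strategy are assumptions made for illustration.

```python
import importlib
from typing import Any


def resolve_plugin_path(fq_name: str) -> Any:
    """Resolve a dotted name to a module, class, or instance (hypothetical helper)."""
    parts = fq_name.split(".")
    # Try the longest importable module prefix first, then walk the
    # remaining parts as attributes of that module.
    for split_at in range(len(parts), 0, -1):
        module_name = ".".join(parts[:split_at])
        try:
            obj = importlib.import_module(module_name)
        except ModuleNotFoundError:
            continue
        for attr in parts[split_at:]:
            obj = getattr(obj, attr)  # AttributeError if the name is missing
        return obj
    raise ModuleNotFoundError(f"cannot resolve plugin {fq_name!r}")
```

With this sketch, `resolve_plugin_path("json.dumps")` returns the `dumps` function and `resolve_plugin_path("json")` returns the module itself.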

Raises:

authorization

_enforce_all_of(request, security_requirements)[source]
Return type:bool
_enforce_one_of(request, security_requirements)[source]
_extract_api_key_info(request, security_scheme)[source]
Return type:Any
_extract_authz_info(request, security_definitions)[source]
_extract_scopes(request)[source]
Return type:Set[~T]
_get_path_spec(paths, path, method=None)[source]

Adapted from swagger-parser library.

Return type:Optional[Tuple[str, str]]
middleware(app, handler)[source]

config

Module that loads the configuration settings for all our services.

CONFIG_PATH

If set, the configuration is loaded from this path.

Example usage:

import os
from . import config
CONFIG = config.load()
os.chdir(CONFIG['working_directory'])
DEFAULT_CONFIG_PATHS
Vartype:list[pathlib.Path]

By default, this variable is initialized with:

  • /etc/dcatd.yml
  • ./config.yml
class ConfigDict[source]

Bases: dict

validate(schema)[source]

Validate this config dict using the JSON schema given in schema.

Raises:ConfigError – if schema validation failed
exception ConfigError[source]

Bases: Exception

Configuration Error

Todo

Documentation: When is this error raised?

DEFAULT_CONFIG_PATHS = [PosixPath('/etc/dcatd.yml'), PosixPath('config.yml')]

List of locations to look for a configuration file.

class _TemplateWithDefaults(template)[source]

Bases: string.Template

String template that supports Bash-style default values for interpolation.

Copied from Docker Compose

idpattern = '[_a-z][_a-z0-9]*(?::?-[^}]+)?'
pattern = re.compile('\n \\$(?:\n (?P<escaped>\\$) | # Escape sequence of two delimiters\n (?P<named>[_a-z][_a-z0-9]*(?::?-[^}]+)?) | # delimiter and a Python identifier\n {(?P<braced>[_a-z][_a-, re.IGNORECASE|re.VERBOSE)
substitute(**kws)[source]
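
To illustrate what "Bash-style default values" means, here is a standalone sketch that does not subclass string.Template. The function name `substitute_with_defaults` and the regular expression are assumptions for this example, not the class's actual code, and only the braced forms are handled.

```python
import re
from typing import Mapping

# Matches "$$", "${NAME}", "${NAME-default}" and "${NAME:-default}".
_PATTERN = re.compile(
    r"\$(?:(?P<escaped>\$)"
    r"|\{(?P<name>[_a-zA-Z][_a-zA-Z0-9]*)(?:(?P<sep>:?-)(?P<default>[^}]*))?\})"
)


def substitute_with_defaults(template: str, env: Mapping[str, str]) -> str:
    """Expand ${NAME}, ${NAME-default} and ${NAME:-default} (hypothetical helper)."""

    def convert(match):
        if match.group("escaped") is not None:
            return "$"  # "$$" is an escaped dollar sign
        name, sep, default = match.group("name", "sep", "default")
        if sep == ":-":
            # Use the default when the variable is unset or empty.
            return env.get(name) or default
        if sep == "-":
            # Use the default only when the variable is unset.
            return env.get(name, default)
        return env[name]  # KeyError when a required variable is missing

    return _PATTERN.sub(convert, template)
```

For example, `substitute_with_defaults("${HOST:-localhost}:${PORT}", {"PORT": "8000"})` yields `"localhost:8000"`.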
_config_path()[source]

Determines which path to use for the configuration file.

Raises:FileNotFoundError – if no config file could be found at any location.
Return type:Path
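
The lookup order can be sketched as follows. This assumes CONFIG_PATH is read from the environment and takes precedence over the default locations; that reading, and the names `config_path`, `environ`, and `candidates`, are assumptions for illustration rather than the module's confirmed behaviour.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence

DEFAULT_CONFIG_PATHS = [Path("/etc/dcatd.yml"), Path("config.yml")]


def config_path(
    environ: Optional[Mapping[str, str]] = None,
    candidates: Sequence[Path] = tuple(DEFAULT_CONFIG_PATHS),
) -> Path:
    """Pick the configuration file to load (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    override = environ.get("CONFIG_PATH")
    # Assumption: an explicit CONFIG_PATH replaces the default search list.
    paths = [Path(override)] if override else list(candidates)
    for path in paths:
        if path.is_file():
            return path
    tried = ", ".join(str(p) for p in paths)
    raise FileNotFoundError(f"No configuration file found; tried: {tried}")
```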
_config_schema()[source]
Return type:Mapping[~KT, +VT_co]
_interpolate(config)[source]

Substitute environment variables.

Recursively find string-type values in the given config, and try to substitute them with values from os.environ.

Note

If a substituted value is a string containing only digits (i.e. str.isdigit() is True), then this function will cast it to an integer. It does not try to do any other type conversion.

Parameters:config (dict) – configuration mapping
Return type:dict
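
The recursive substitution and the digit-only cast described in the note can be sketched like this. The sketch uses the plain string.Template from the standard library instead of _TemplateWithDefaults, takes the environment as an explicit argument, and casts every all-digit string; all three choices are assumptions made to keep the example self-contained.

```python
import string
from typing import Any, Mapping


def interpolate(config: Any, environ: Mapping[str, str]) -> Any:
    """Recursively substitute $NAME / ${NAME} in string values (illustrative)."""
    if isinstance(config, str):
        # Raises KeyError when a referenced variable is missing.
        value = string.Template(config).substitute(environ)
        # Only all-digit strings are cast; "-1" or "1.5" stay strings.
        return int(value) if value.isdigit() else value
    if isinstance(config, dict):
        return {key: interpolate(val, environ) for key, val in config.items()}
    if isinstance(config, list):
        return [interpolate(item, environ) for item in config]
    return config  # numbers, booleans and None pass through unchanged
```

Because only str.isdigit() is consulted, a value such as "-1" or "8000.0" remains a string after substitution.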
_load_yaml(path)[source]

Read the config file from path.

Raises:
  • yaml.YAMLError – syntax error in YAML.
  • KeyError – Required environment value not found.
Return type:

dict

load()[source]

Load and validate the configuration.

Return type:ConfigDict

dcat

class Date(*args, format=None, pattern=None, **kwargs)[source]

Bases: datacatalog.dcat.String

canonicalize(data, **kwargs)[source]
Return type:Optional[str]
class Direction[source]

Bases: enum.Enum

An enumeration.

GET = 0
PUT = 1
class Enum(values, *args, allow_empty=None, **kwargs)[source]

Bases: datacatalog.dcat.String

full_text_search_representation(data)[source]
schema
class Integer(*args, multipleOf=None, maximum=None, exclusiveMaximum=None, minimum=None, exclusiveMinimum=None, **kwargs)[source]

Bases: datacatalog.dcat.Type

canonicalize(data, **kwargs)[source]
full_text_search_representation(data)[source]
schema
class Language(*args, format=None, pattern=None, **kwargs)[source]

Bases: datacatalog.dcat.String

class List(item_type, *args, required=False, default=None, allow_empty=True, unique_items=None, **kwargs)[source]

Bases: datacatalog.dcat.Type

canonicalize(data, **kwargs)[source]
Return type:Optional[list]
full_text_search_representation(data)[source]

We must check whether the given data is really a list, because JSON-LD may flatten lists.

schema
class Object(*args, **kwargs)[source]

Bases: datacatalog.dcat.Type

add(name, value, before=None)[source]
canonicalize(data, **kwargs)[source]
full_text_search_representation(data)[source]
property_names
schema
class OneOf(*types, **kwargs)[source]

Bases: datacatalog.dcat.Type

canonicalize(data, **kwargs)[source]
full_text_search_representation(data)[source]
schema
validate(data)[source]

Validate the data.

Returns:the Type for which validation succeeded. See also OneOf.validate()
Return type:Type
class PlainTextLine(*args, pattern=None, **kwargs)[source]

Bases: datacatalog.dcat.String

class String(*args, pattern=None, max_length=None, allow_empty=False, **kwargs)[source]

Bases: datacatalog.dcat.Type

canonicalize(data, **kwargs)[source]
full_text_search_representation(data)[source]
schema
class Type(*args, title=None, description=None, required=False, default=None, examples=None, format=None, read_only=None, write_only=None, sys_defined=None, **kwargs)[source]

Bases: object

canonicalize(data, direction=<Direction.GET: 0>)[source]
full_text_search_representation(data)[source]
Return type:Optional[str]
schema
validate(data)[source]

Validate the data.

Returns:the Type for which validation succeeded. See also OneOf.validate()
Return type:Type

jwks

Helper module for handling JSON Web Key Sets (JWKS).

This module provides only one public function: load(), which may raise a JWKError.

exception JWKError[source]

Bases: Exception

Error raised when parsing a JWKSet fails.

MappingProxyType

alias of builtins.mappingproxy

_Key

Immutable type for key storage

alias of datacatalog.jwks.Key

_KeySet

Immutable type for key sets

alias of datacatalog.jwks.KeySet

_load_ecdsa(key, is_verifier)[source]
default_backend()[source]
load(jwks)[source]

Parse a JWKSet and return a dictionary that maps key IDs to keys.

Parameters:jwks (str) –
Raises:JWKError – when parsing fails
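
A simplified sketch of the "key ID to key" mapping, assuming the input is a JSON document with a top-level "keys" array whose entries carry a "kid" member. It only indexes the raw key dictionaries and does not construct cryptographic key objects; the function name `load_key_ids` is hypothetical, and the local `JWKError` merely stands in for the module's exception.

```python
import json
from types import MappingProxyType
from typing import Any, Mapping


class JWKError(Exception):
    """Raised when a JWKSet cannot be parsed (stand-in for the module's error)."""


def load_key_ids(jwks: str) -> Mapping[str, Mapping[str, Any]]:
    """Map each key ID to a read-only view of its JWK (illustrative only)."""
    try:
        document = json.loads(jwks)
        keys = document["keys"]
        result = {key["kid"]: MappingProxyType(dict(key)) for key in keys}
    except (ValueError, KeyError, TypeError) as exc:
        # Malformed JSON, a missing "keys"/"kid" member, or a wrong type.
        raise JWKError(f"invalid JWKSet: {exc}") from exc
    return MappingProxyType(result)
```

Wrapping the result in MappingProxyType keeps the returned mapping read-only, in the spirit of the immutable key types listed above.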

main

main()[source]

openapi

plugin_interfaces

add_startup_action(app, name)[source]

Set action to done

check_startup_action(app, name)[source]

Check if action has been done

Return type:bool
deinitialize(app)[source]

Called when the application shuts down.

Parameters:app – the Application object.
Return type:None
get_old_identifiers(app)[source]

Get old identifiers

health_check(app)[source]

Health check.

Return type:Optional[str]
Returns:If unhealthy, a string describing the problem, otherwise None.
Raises:Exception – if that’s easier than returning a string.
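
The return contract (a problem description, None, or a raised exception) can be consumed as sketched below. The sketch assumes plain synchronous callables; the real hooks may well be coroutines, and `collect_health` and its `checks` argument are hypothetical names.

```python
from typing import Callable, Dict, Optional

HealthCheck = Callable[[object], Optional[str]]


def collect_health(app: object, checks: Dict[str, HealthCheck]) -> Dict[str, str]:
    """Run every health check and return the problems per plugin (illustrative)."""
    problems: Dict[str, str] = {}
    for plugin_name, check in checks.items():
        try:
            message = check(app)
        except Exception as exc:  # a raised exception also means "unhealthy"
            message = f"{type(exc).__name__}: {exc}"
        if message is not None:
            problems[plugin_name] = message
    return problems  # an empty dict means every plugin reported healthy
```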
initialize(app)[source]

Called by the plugin-manager when the event loop starts.

If your plugin needs to do some initialization even before the event loop starts, you’ll need to do this in initialize_sync().

Parameters:app – the Application object.
Return type:None
initialize_sync(app)[source]

The first method to be called by the plugin-manager.

If your plugin needs to do some asynchronous initialization, use initialize() instead.

Parameters:app – the Application object.
Return type:None
mds_canonicalize(data, id=None, direction=<Direction.GET: 0>)[source]

Canonicalize the given document according to this schema.

Return type:dict
Returns:dict with canonicalized entries
mds_context()[source]

Context of the metadata schema.

Return type:dict
mds_full_text_search_representation(data)[source]

Full text search representation of the given data.

Return type:str
mds_json_schema(app)[source]

The json schema.

Return type:dict
mds_name()[source]

The schema this plugin provides.

Return type:str
Returns:a string that is safe for use in a URL segment; i.e., every string that matches the regular expression ^(?:%[a-f0-9]{2}|[-\w:@!$&'()*+,;=.~])*$
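
A small sketch for checking a candidate schema name against the regular expression quoted above. The helper name `is_url_safe_segment` is hypothetical; `fullmatch` is used so that a trailing newline is rejected as well.

```python
import re

# The regular expression quoted in the mds_name() documentation.
SEGMENT_RE = re.compile(r"^(?:%[a-f0-9]{2}|[-\w:@!$&'()*+,;=.~])*$")


def is_url_safe_segment(name: str) -> bool:
    """Return True if name consists only of URL-segment-safe characters."""
    return SEGMENT_RE.fullmatch(name) is not None
```

Note that percent-escapes must use lowercase hexadecimal digits to match, so "a%2fb" is accepted while "a%2Fb" is not.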

Search.

Parameters:
  • app – the application
  • q (str) – the query
  • sort_field_get (Callable[[dict], str]) – function to get the property to sort by.
  • result_info (MutableMapping[~KT, ~VT]) – mapping in which all encountered facets in the result set are put
  • facets (Optional[Iterable[str]]) – a list of facets to return and count
  • limit (Optional[int]) – maximum hits to be returned
  • offset (Optional[int]) – offset in the result set
  • filters (Optional[Mapping[str, Mapping[str, Union[str, Set[str]]]]]) – mapping of JSON pointer -> value, used to filter on some value.
  • iso_639_1_code (Optional[str]) – the language of the query
Return type:

AsyncGenerator[Tuple[str, dict], None]

Returns:

A generator over the search results (id, doc, metadata)

Raises:

ValueError if filter syntax is invalid, if the ISO 639-1 code is not recognized, or if the offset is invalid.

set_new_identifier(app, old_id, new_id)[source]

Set new identifier

storage_all(app)[source]

Get all data

Return type:AsyncGenerator[Tuple[str, str, dict], None]
storage_create(app, docid, doc, searchable_text, iso_639_1_code)[source]

Store a new document.

Parameters:
  • app – the application
  • docid (str) – the ID under which to store this document. May or may not already exist in the data store.
  • doc (dict) – the document to store; a “JSON dictionary”.
  • searchable_text (str) – this will be indexed for free-text search.
  • iso_639_1_code (Optional[str]) – the language of the document.
Return type:

str

Returns:

new ETag

Raises:

KeyError if the docid already exists.

storage_delete(app, docid, etags)[source]

Delete the document only if it has one of the provided ETags.

Parameters:
  • app – the application
  • docid (str) – the ID of the document to delete.
  • etags (Set[str]) – the last known ETags of this document.
Raises:
  • ValueError – if none of the given etags match the stored etag.
  • KeyError – if a document with the given id doesn’t exist.

Return type:

None

storage_extract(app, ptr, distinct=False)[source]

Generator to extract values from the stored documents, optionally distinct.

Used, for example, to get a list of all tags or IDs in the system, or to get all documents stored in the system.

Parameters:
  • app – the application
  • ptr (str) – JSON pointer to the element.
  • distinct (bool) – Return only distinct values.
Raises:

ValueError if filter syntax is invalid.

Return type:

Generator[str, None, None]
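
A sketch of pointer-based extraction over a collection of already-decoded documents. It assumes RFC 6901 pointer semantics, flattens list values so that each tag is yielded separately, and expects hashable values when `distinct` is set; the real plugin may interpret `ptr` differently, and the name `extract` is hypothetical.

```python
from typing import Any, Iterable, Iterator, List


def _resolve(doc: Any, tokens: List[str]) -> Any:
    """Walk decoded JSON following already-unescaped pointer tokens."""
    for token in tokens:
        doc = doc[int(token)] if isinstance(doc, list) else doc[token]
    return doc


def extract(docs: Iterable[dict], ptr: str, distinct: bool = False) -> Iterator[Any]:
    """Yield the value(s) found at ptr in each document (illustrative)."""
    if ptr and not ptr.startswith("/"):
        raise ValueError(f"invalid JSON pointer: {ptr!r}")
    # RFC 6901 unescaping: "~1" is "/", "~0" is "~" (in that order).
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in ptr.split("/")[1:]]
    seen = set()
    for doc in docs:
        try:
            value = _resolve(doc, tokens)
        except (KeyError, IndexError, TypeError, ValueError):
            continue  # this document has no element at ptr
        # Assumption: list values are flattened into individual items.
        for item in value if isinstance(value, list) else [value]:
            if distinct:
                if item in seen:
                    continue
                seen.add(item)
            yield item
```

Because this is a generator, the pointer check only runs once iteration starts.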

storage_id()[source]

New unique identifier.

Return type:str
storage_retrieve(app, docid, etags)[source]

Get the document and the corresponding ETag by id.

Parameters:
  • app – the application
  • docid (str) – document id
  • etags (Optional[Set[str]]) – None, or a set of ETags
Return type:

Tuple[Optional[dict], str]

Returns:

A tuple. The first element is either the document or None if the document’s ETag corresponds to one of the given etags. The second element is the current ETag.

Raises:

KeyError – if not found

storage_update(app, docid, doc, searchable_text, etags, iso_639_1_code)[source]

Update the document with the given ID only if it has one of the provided ETags.

Parameters:
  • app – the application
  • docid (str) – the ID under which to store this document. May or may not already exist in the data store.
  • doc (dict) – the document to store; a “JSON dictionary”.
  • searchable_text (str) – this will be indexed for free-text search.
  • etags (Set[str]) – one or more ETags.
  • iso_639_1_code (Optional[str]) – the language of the document.
Return type:

str

Returns:

new ETag

Raises:
  • ValueError – if none of the given etags match the stored etag.
  • KeyError – if the docid doesn’t exist.
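
The ETag contract shared by storage_create, storage_retrieve, storage_update and storage_delete can be sketched with an in-memory stand-in. The class `MemoryStore`, its synchronous methods, and the content-hash ETag are all assumptions for illustration; the real hooks take `app`, `searchable_text` and `iso_639_1_code` as well, and a storage plugin may compute ETags differently.

```python
import hashlib
import json
from typing import Dict, Optional, Set, Tuple


class MemoryStore:
    """In-memory stand-in for the storage hooks (illustration only)."""

    def __init__(self) -> None:
        self._docs: Dict[str, Tuple[dict, str]] = {}

    @staticmethod
    def _etag(doc: dict) -> str:
        # Assumption: the ETag is a hash of the canonical JSON encoding.
        payload = json.dumps(doc, sort_keys=True).encode("utf-8")
        return '"' + hashlib.sha256(payload).hexdigest()[:16] + '"'

    def _current(self, docid: str, etags: Set[str]) -> str:
        _, current = self._docs[docid]  # KeyError if the docid doesn't exist
        if current not in etags:
            raise ValueError("none of the given ETags match the stored ETag")
        return current

    def create(self, docid: str, doc: dict) -> str:
        if docid in self._docs:
            raise KeyError(docid)  # the docid already exists
        etag = self._etag(doc)
        self._docs[docid] = (doc, etag)
        return etag

    def retrieve(self, docid: str, etags: Optional[Set[str]]) -> Tuple[Optional[dict], str]:
        doc, etag = self._docs[docid]  # KeyError if not found
        if etags is not None and etag in etags:
            return None, etag  # the caller already holds the current version
        return doc, etag

    def update(self, docid: str, doc: dict, etags: Set[str]) -> str:
        self._current(docid, etags)
        etag = self._etag(doc)
        self._docs[docid] = (doc, etag)
        return etag

    def delete(self, docid: str, etags: Set[str]) -> None:
        self._current(docid, etags)
        del self._docs[docid]
```

A caller would typically keep the ETag returned by create or retrieve and pass it back in `etags` on the next update or delete, so that a concurrent modification surfaces as a ValueError instead of a silent overwrite.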

startup_actions

add_resource_identifiers(app)[source]
replace_old_identifiers(app)[source]
run_startup_actions(app)[source]