API and Code

Below is the API documentation for the Stetl Python code.

Main Entry Points

There are several entry points through which Stetl can be called. The most common is the command-line script bin/stetl, which should be available after installation.

In some contexts, such as integrations, you may want to call Stetl via Python. The entry points are then:

stetl.main.main()[source]

The main function, to be called from commandline, like python src/main.py -c etl.cfg.

Args:
-c --config <config_file> the Stetl config file.
-s --section <section_name> the section in the Stetl config (ini) file to execute (default is [etl]).
-a --args <arglist> substitutable args for symbolic, {arg}, values in Stetl config file, in format "arg1=foo arg2=bar" etc.
-d --doc <class> get component documentation like its configuration parameters, e.g. stetl --doc stetl.inputs.fileinput.FileInput
-h --help get help info
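As a rough illustration of the -a option, the sketch below parses an arglist string and substitutes the values into {arg} placeholders. The helper parse_arglist and the config fragment are hypothetical; how Stetl performs this substitution internally is not shown here.

```python
def parse_arglist(arglist):
    """Turn 'arg1=foo arg2=bar' into {'arg1': 'foo', 'arg2': 'bar'}."""
    args = {}
    for pair in arglist.split():
        name, _, value = pair.partition("=")
        args[name] = value
    return args


# Hypothetical config fragment with symbolic {arg} values.
config_text = (
    "[input_file]\n"
    "file_path = {in_file}\n"
    "\n"
    "[output_file]\n"
    "file_path = {out_file}\n"
)

args_dict = parse_arglist("in_file=cities.xml out_file=cities.gml")
expanded = config_text.format(**args_dict)
```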
stetl.main.print_doc(class_name)[source]

Print documentation for a class, in particular its config options.

class stetl.etl.ETL(options_dict, args_dict=None)[source]

The main class: builds ETL Chains with connected Components from a config and lets them run.

Usually this class is called via main but it may be called directly for direct integration.

Core Framework

The core framework is directly under the directory src/stetl. Below are the seven main classes. Their interrelation is as follows:

One or more stetl.chain.Chain objects are built from a Stetl ETL configuration via the stetl.factory.Factory class. A stetl.chain.Chain consists of a set of connected stetl.component.Component objects. A stetl.component.Component is either an stetl.input.Input, an stetl.output.Output or a stetl.filter.Filter. Data and status flow as stetl.packet.Packet objects from an stetl.input.Input via zero or more stetl.filter.Filter objects to a final stetl.output.Output.

As a trivial example: an stetl.input.Input could be an XML file, a stetl.filter.Filter could represent an XSLT file and an stetl.output.Output a PostGIS database. This is effected by specialized classes in the subpackages inputs, filters, and outputs. New in 1.1.0: stetl.Splitter to split data to multiple Outputs.

class stetl.factory.Factory[source]

Object and class Factory (Pattern). Based on: http://stackoverflow.com/questions/2226330/instantiate-a-python-class-from-a-name

class_forname(class_string)[source]

Returns class instance specified by a string.

Args:
class_string: The string representing a class.
Raises:
ValueError if module part of the class is not specified.
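The lookup described above can be sketched with the standard importlib module. This is a minimal stand-in under that assumption, not the Stetl source; only the name class_forname and the ValueError behaviour are taken from the description.

```python
import importlib


def class_forname(class_string):
    """Return the class object named by a dotted string like 'pkg.module.Class'."""
    module_name, _, class_name = class_string.rpartition(".")
    if not module_name:
        raise ValueError("module part of the class is not specified: %r" % class_string)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# A class from the standard library serves as the example target.
ordered_dict_class = class_forname("collections.OrderedDict")

try:
    class_forname("OrderedDict")
    error_message = None
except ValueError as err:
    error_message = str(err)
```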
new_instance(class_obj, configdict, section)[source]

Returns object instance from class instance.

Args:
class_obj: object representing a class instance.
args: standard args.
kwargs: standard args.
class stetl.component.Component(configdict, section, consumes='none', produces='none')[source]

Abstract Base class for all Input, Filter and Output Components.

after_chain_invoke(packet)[source]

Called right after entire Component Chain invoke.

after_invoke(packet)[source]

Called right after Component invoke.

before_invoke(packet)[source]

Called just before Component invoke.

exit()[source]

Allows derived Components to perform a one-time exit/cleanup.

init()[source]

Allows derived Components to perform a one-time init.

input_format()[source]

CONFIG - The specific input format if the consumes parameter is a list or the format to be converted to the output_format.

Required: False

Default: None

invoke(packet)[source]

Components override for Component-specific behaviour, typically read, filter or write actions.

output_format()[source]

CONFIG - The specific output format if the produces parameter is a list or the format to which the input format is converted.

Required: False

Default: None

class stetl.component.Config(ptype=<type 'str'>, default=None, required=False)[source]

Decorator class to tie config values from the .ini file to object instance property values. Somewhat like the Python standard @property but with the possibility to define default values, typing and making properties required.

Each property is defined by @Config(type, default, required). Basic idea comes from: https://wiki.python.org/moin/PythonDecoratorLibrary#Cached_Properties
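To make the idea concrete, here is a simplified, hypothetical re-implementation of such a decorator as a descriptor. The class DemoComponent and its cfg dict are invented for illustration; the real stetl.component.Config may differ in its details.

```python
class Config(object):
    """Simplified stand-in: ties a config value to a typed property."""

    def __init__(self, ptype=str, default=None, required=False):
        self.ptype = ptype
        self.default = default
        self.required = required
        self.name = None

    def __call__(self, fget):
        # Used as @Config(...): remember the decorated function's name and doc.
        self.name = fget.__name__
        self.__doc__ = fget.__doc__
        return self

    def __get__(self, instance, owner):
        if instance is None:
            return self
        if self.name not in instance.cfg:
            if self.required:
                raise ValueError("missing required config value: %s" % self.name)
            return self.default
        # Values read from an .ini file are strings, so convert to ptype.
        return self.ptype(instance.cfg[self.name])


class DemoComponent(object):
    """Hypothetical component holding its .ini section as a dict."""

    def __init__(self, cfg):
        self.cfg = cfg

    @Config(ptype=int, default=5432, required=False)
    def port(self):
        """Port for host."""

    @Config(ptype=str, default=None, required=True)
    def host(self):
        """Host name."""


configured_port = DemoComponent({"host": "db", "port": "5433"}).port
default_port = DemoComponent({"host": "db"}).port

try:
    DemoComponent({}).host
    missing_error = None
except ValueError as err:
    missing_error = str(err)
```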

class stetl.chain.Chain(chain_str, config_dict)[source]

Holder for a single invokable pipeline of components. A Chain is basically a singly linked list of Components. Each Component executes a part of the total ETL. Data along the Chain is passed within a Packet object. The compatibility of input and output for linked Components is checked when adding a Component to the Chain.
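The linked-list idea and the compatibility check can be sketched as follows. All classes here are simplified stand-ins written for illustration (each component just wraps a function); they are not the Stetl implementations.

```python
class Packet(object):
    """Carries data along the chain."""

    def __init__(self, data=None):
        self.data = data


class Component(object):
    """Hypothetical component: wraps a function and links to the next one."""

    def __init__(self, consumes, produces, func):
        self.consumes = consumes
        self.produces = produces
        self.func = func
        self.next = None

    def process(self, packet):
        packet.data = self.func(packet.data)
        if self.next is not None:
            return self.next.process(packet)
        return packet


class Chain(object):
    def __init__(self):
        self.first = None
        self.last = None

    def add(self, comp):
        if self.first is None:
            self.first = comp
        elif self.last.produces != comp.consumes:
            # Compatibility is checked at the moment a component is added.
            raise ValueError("incompatible: %s -> %s" % (self.last.produces, comp.consumes))
        else:
            self.last.next = comp
        self.last = comp

    def run(self):
        return self.first.process(Packet())


written = []
chain = Chain()
chain.add(Component("none", "string", lambda data: "hello"))
chain.add(Component("string", "string", lambda data: data.upper()))
chain.add(Component("string", "none", lambda data: written.append(data) or data))
result = chain.run()

try:
    chain.add(Component("etree_doc", "none", lambda data: data))
    incompatible_rejected = False
except ValueError:
    incompatible_rejected = True
```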

add(etl_comp)[source]

Add component to end of Chain.

assemble()[source]

Builder method: build a Chain of linked Components.

assemble2()[source]

Builder method: build a Chain of linked Components.

get_by_class(clazz)[source]

Get Component instance from Chain by class, mainly for testing. Returns the Component.

get_by_id(id)[source]

Get Component instance from Chain by id, mainly for testing. Returns the Component.

get_by_index(index)[source]

Get Component instance from Chain by position/index in Chain, mainly for testing. Returns the Component.

run()[source]

Run the ETL Chain.

class stetl.packet.Packet(data=None)[source]

Represents units of (any) data and status passed along Chain of Components.

class stetl.input.Input(configdict, section, produces)[source]

Bases: stetl.component.Component

Abstract Base class for all Input Components.

class stetl.output.Output(configdict, section, consumes)[source]

Bases: stetl.component.Component

Abstract Base class for all Output Components.

class stetl.filter.Filter(configdict, section, consumes, produces)[source]

Bases: stetl.component.Component

Maps input to output. Abstract base class for specific Filters.

class stetl.splitter.Splitter(config_dict, child_list)[source]

Bases: stetl.component.Component

Component that splits a single input to multiple output Components. Use this for example to produce multiple output file formats (GML, GeoJSON etc) or to publish to multiple remote services (SOS, SensorThings API) or for simple debugging: target Output and StandardOutput.

after_chain_invoke(packet)[source]

Called right after entire Component Chain invoke.

after_invoke(packet)[source]

Called right after Component invoke.

before_invoke(packet)[source]

Called just before Component invoke.

Components: Inputs

class stetl.inputs.dbinput.DbInput(configdict, section, produces)[source]

Bases: stetl.input.Input

Input from any database (abstract base class).

class stetl.inputs.dbinput.PostgresDbInput(configdict, section)[source]

Bases: stetl.inputs.dbinput.SqlDbInput

Input by querying records from a Postgres database. Input is a query, like SELECT * from mytable. Output is zero or more records as record array (array of dict) or single record (dict).

produces=FORMAT.record_array (default) or FORMAT.record

host()[source]

CONFIG - host name or host IP-address, defaults to ‘localhost’

password()[source]

CONFIG - User password, defaults to ‘postgres’

port()[source]

CONFIG - port for host, defaults to ‘5432’

schema()[source]

CONFIG - The postgres schema name, defaults to ‘public’

user()[source]

CONFIG - User name, defaults to ‘postgres’

class stetl.inputs.dbinput.SqlDbInput(configdict, section)[source]

Bases: stetl.inputs.dbinput.DbInput

Input using a query from any SQL-based RDBMS (abstract base class).

column_names()[source]

CONFIG - Column names to populate records with. If empty taken from table metadata.

database_name()[source]

CONFIG - Database name

do_query(query_str)[source]

DB-neutral query returning Python record list.

query()[source]

CONFIG - The query (string) to fire.

raw_query(query_str)[source]

Performs DB-specific query and returns raw records iterator.

read_once()[source]

CONFIG - Read once? i.e. only do query once and stop

result_to_output(db_tuples)[source]

Convert DB-specific record tuples to single Python record (dict) or record array (list of dict).

table()[source]

CONFIG - Table name

tuples_to_records(db_tuples, columns=None)[source]

Convert tuple array (list of tuple) to list of records (list of dict’s) using list of column names.
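A minimal sketch of this conversion, using an in-memory SQLite database from the standard library as the data source. The table and its contents are made up, and taking column names from cursor.description is an assumption about how "table metadata" could be read, not a statement about the Stetl code.

```python
import sqlite3


def tuples_to_records(db_tuples, columns):
    """Convert a list of tuples into a list of dicts keyed by column name."""
    return [dict(zip(columns, db_tuple)) for db_tuple in db_tuples]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "Amsterdam"), (2, "Utrecht")],
)

cursor = conn.execute("SELECT * FROM mytable ORDER BY id")
# No explicit column names given: take them from the query metadata.
columns = [description[0] for description in cursor.description]
db_records = tuples_to_records(cursor.fetchall(), columns)
conn.close()
```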

class stetl.inputs.dbinput.SqliteDbInput(configdict, section)[source]

Bases: stetl.inputs.dbinput.SqlDbInput

Input by querying records from a SQLite database. Input is a query, like SELECT * from mytable. Output is zero or more records as record array (array of dict) or single record (dict).

produces=FORMAT.record_array (default) or FORMAT.record

class stetl.inputs.fileinput.ApacheLogFileInput(configdict, section)[source]

Bases: stetl.inputs.fileinput.FileInput

Parses Apache log files. Lines are converted into records based on the log format. Log format should follow Apache Log Format. See ApacheLogParser for details.

produces=FORMAT.record

key_map()[source]

CONFIG - Map of cryptic %-field names to readable keys in record.

Type: dictionary

Required: False

Default: {‘%l’: ‘logname’, ‘%>s’: ‘status’, ‘%D’: ‘deltat’, ‘%{User-agent}i’: ‘agent’, ‘%b’: ‘bytes’, ‘%{Referer}i’: ‘referer’, ‘%u’: ‘user’, ‘%t’: ‘time’, “’%h”: ‘host’, ‘%r’: ‘request’}

log_format()[source]

CONFIG - Log format according to Apache CLF

Required: False

Default: ‘%h %l %u %t “%r” %>s %b “%{Referer}i” “%{User-agent}i”’
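To show what a record for the default log format might look like, the sketch below parses one line with a hand-written regular expression. The regex is only a stand-in for the ApacheLogParser mentioned above, and the log line is an invented sample; the group names follow the readable keys of the default key_map.

```python
import re

# Matches: %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<logname>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

log_line = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)

# groupdict() yields one record (dict) per log line.
log_record = LOG_PATTERN.match(log_line).groupdict()
```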

class stetl.inputs.fileinput.CsvFileInput(configdict, section)[source]

Bases: stetl.inputs.fileinput.FileInput

Parse CSV file into stream of records (dict structures) or a one-time record array. NB raw version: CSV needs to have first line with fieldnames.

produces=FORMAT.record or FORMAT.record_array

delimiter()[source]

CONFIG - A one-character string used to separate fields. It defaults to ‘,’.

Required: False

Default: ‘,’ (comma)

quote_char()[source]

CONFIG - A one-character string used to quote fields containing special characters, such as the delimiter or quotechar, or which contain new-line characters. It defaults to ‘”’

Required: False

Default: “
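The effect of delimiter and quote_char can be tried with the standard csv module. This sketch assumes the component behaves like csv.DictReader, with the first line supplying the field names; the sample data is invented.

```python
import csv
import io

# Semicolon-delimited sample; the quoted field contains the delimiter itself.
csv_text = 'id;name\n1;"Amsterdam; capital"\n2;Utrecht\n'

reader = csv.DictReader(io.StringIO(csv_text), delimiter=";", quotechar='"')
csv_records = [dict(row) for row in reader]
```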

class stetl.inputs.fileinput.FileInput(configdict, section, produces)[source]

Bases: stetl.input.Input

Abstract base class for specific FileInputs, use derived classes.

CONFIG - Should we recurse into sub-directories to find files?

Required: False

Default: False

file_path()[source]

CONFIG - Path to file or files or URLs: can be a dir or files or URLs or even multiple, comma separated. For URLs only JSON is supported now.

Required: True

Default: None

filename_pattern()[source]

CONFIG - Filename pattern according to Python glob.glob for example: ‘*.[gxGX][mM][lL]’

Required: False

Default: ‘*.[gxGX][mM][lL]’
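The default pattern can be checked against a few file names with the standard fnmatch module, which implements the same shell-style matching that glob.glob uses. The file names are invented; fnmatchcase is used so the result does not depend on the operating system.

```python
import fnmatch

filename_pattern = "*.[gxGX][mM][lL]"
file_names = ["cities.gml", "roads.XML", "notes.txt", "Parcels.Gml", "data.json"]

# Keep only names ending in .gml or .xml, in any letter case.
matched = [name for name in file_names if fnmatch.fnmatchcase(name, filename_pattern)]
```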

read_file(file_path)[source]

Override in subclass.

class stetl.inputs.fileinput.GlobFileInput(configdict, section, produces=['string', 'line_stream'])[source]

Bases: stetl.inputs.fileinput.FileInput

Returns file names based on the glob.glob pattern given as filename_filter.

produces=FORMAT.string or FORMAT.line_stream

class stetl.inputs.fileinput.JsonFileInput(configdict, section)[source]

Bases: stetl.inputs.fileinput.FileInput

Parse JSON file from file system or URL into hierarchical data struct. The struct format may also be a GeoJSON structure. In that case the output_format needs to be explicitly set to geojson_collection in the component config.

produces=FORMAT.struct or FORMAT.geojson_collection

class stetl.inputs.fileinput.LineStreamerFileInput(configdict, section, produces='line_stream')[source]

Bases: stetl.inputs.fileinput.FileInput

Reads text files, producing a stream of lines, one line per Packet. NB: it is assumed that lines in the file end with newlines.

process_line(line)[source]

Override in subclass.

class stetl.inputs.fileinput.StringFileInput(configdict, section)[source]

Bases: stetl.inputs.fileinput.FileInput

Reads and produces file as String.

produces=FORMAT.string

format_args()[source]

CONFIG - Formatting of content according to Python String.format(). The input file should have substitutable values like {schema} {foo}. format_args should be of the form format_args = schema:test foo:bar

Required: False

Default: None
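A small sketch of how a format_args value in the form above could be applied with str.format. The helper parse_format_args and the file content are hypothetical.

```python
def parse_format_args(format_args):
    """Turn 'schema:test foo:bar' into {'schema': 'test', 'foo': 'bar'}."""
    return dict(pair.split(":", 1) for pair in format_args.split())


# Stand-in for the content read from the input file.
file_content = "CREATE TABLE {schema}.{foo} (id integer);"

formatted = file_content.format(**parse_format_args("schema:test foo:bar"))
```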

read_file(file_path)[source]

Overridden from base class.

class stetl.inputs.fileinput.XmlElementStreamerFileInput(configdict, section)[source]

Bases: stetl.inputs.fileinput.FileInput

Extracts XML elements from a file, outputs each feature element in Packet. Parsing is streaming (no internal DOM buildup) so any file size can be handled. Use this class for your big GML files!

produces=FORMAT.etree_element

element_tags()[source]

CONFIG - Comma-separated string of XML (feature) element tag names of the elements that should be extracted and added to the output element stream.

Required: True

Default: None

strip_namespaces()[source]

CONFIG - should namespaces be removed from the input document and thus not be present in the output element stream?

Required: False

Default: False
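Streaming extraction of this kind can be sketched with iterparse from the standard library. This is an illustration only: the function stream_elements, the sample XML and its namespace are invented, and the library actually used by Stetl may differ.

```python
import io
import xml.etree.ElementTree as ET


def stream_elements(source, element_tags, strip_namespaces=False):
    """Yield each element whose local tag name is in element_tags."""
    tags = [tag.strip() for tag in element_tags.split(",")]
    for _event, elem in ET.iterparse(source, events=("end",)):
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name in tags:
            if strip_namespaces:
                for node in elem.iter():
                    node.tag = node.tag.rsplit("}", 1)[-1]
            yield elem
            # Runs when the consumer asks for the next element: release
            # the content of the element that was just handled.
            elem.clear()


xml_bytes = (
    b'<root xmlns:app="http://example.org/app">'
    b"<app:City><app:name>Amsterdam</app:name></app:City>"
    b"<app:Road><app:name>A2</app:name></app:Road>"
    b"<app:City><app:name>Utrecht</app:name></app:City>"
    b"</root>"
)

# Each element must be used before the next one is requested.
city_names = [
    elem.findtext("name")
    for elem in stream_elements(io.BytesIO(xml_bytes), "City", strip_namespaces=True)
]
```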

class stetl.inputs.fileinput.XmlFileInput(configdict, section)[source]

Bases: stetl.inputs.fileinput.FileInput

Parses XML files into etree docs (do not use for large files!).

produces=FORMAT.etree_doc

class stetl.inputs.fileinput.XmlLineStreamerFileInput(configdict, section)[source]

Bases: stetl.inputs.fileinput.LineStreamerFileInput

DEPRECATED: streams lines from one or more XML files. NB: it is assumed that lines in the file end with newlines. It is better to use XmlElementStreamerFileInput for GML features.

produces=FORMAT.xml_line_stream

class stetl.inputs.fileinput.ZipFileInput(configdict, section)[source]

Bases: stetl.inputs.fileinput.FileInput

Parse ZIP file from file system or URL into a stream of records containing zipfile-path and file names.

produces=FORMAT.record

name_filter()[source]

CONFIG - Regular “glob.glob” expression for filtering out filenames from the ZIP archive.

Required: False

Default: * (all files in zip-archive)

class stetl.inputs.httpinput.ApacheDirInput(configdict, section, produces='record')[source]

Bases: stetl.inputs.httpinput.HttpInput

Read file data from an Apache directory “index” HTML page. Each record contains file_name and file_data (other metadata like date and time is too fragile across different Apache servers). Uses: http://stackoverflow.com/questions/686147/url-tree-walker-in-python

produces=FORMAT.record

filter_file(file_name)[source]

Filter the file_name, e.g. to suppress reading. By default returns file_name. Returns a string or None.

init()[source]

Read the list of files from the Apache index URL.

next_file()[source]

Return a tuple (name, date, size) with next file info.

no_more_files()[source]

More files left? Returns a Boolean.

read(packet)[source]

Read the data from the URL.

class stetl.inputs.httpinput.HttpInput(configdict, section, produces='any')[source]

Bases: stetl.input.Input

Fetch data from remote services like WFS via HTTP protocol. Base class: subclasses will do datatype-specific formatting of the returned data.

produces=FORMAT.any

format_data(data)[source]

Format response data. Override in subclasses; defaults to returning the original data.

parameters()[source]

CONFIG - Flat JSON-like struct of the parameters to be appended to the url.

Example: (parameters require quotes):

url = http://geodata.nationaalgeoregister.nl/natura2000/wfs
parameters = {
    service : WFS,
    version : 1.1.0,
    request : GetFeature,
    srsName : EPSG:28992,
    outputFormat : text/xml; subtype=gml/2.1.2,
    typename : natura2000
}

Required: False

Default: None
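Appending such parameters to the url amounts to URL-encoding a flat dict. The sketch below uses urllib.parse for this and makes no network request; whether Stetl builds the query string in exactly this way is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

url = "http://geodata.nationaalgeoregister.nl/natura2000/wfs"
parameters = {
    "service": "WFS",
    "version": "1.1.0",
    "request": "GetFeature",
    "srsName": "EPSG:28992",
    "outputFormat": "text/xml; subtype=gml/2.1.2",
    "typename": "natura2000",
}

# urlencode takes care of escaping characters such as ':' ';' '/' and spaces.
full_url = url + "?" + urlencode(parameters)

# Decoding the query string again gives back the original values.
decoded = parse_qs(urlsplit(full_url).query)
```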

read(packet)[source]

Read the data from the URL.

read_from_url(url, parameters=None)[source]

Read the data from the URL.

Parameters:
  • url: the url to fetch
  • parameters: optional dict of query parameters

url()[source]

CONFIG - The HTTP URL string.

Required: True

Default: None

class stetl.inputs.ogrinput.OgrInput(configdict, section)[source]

Bases: stetl.input.Input

Direct GDAL OGR input via Python OGR wrapper. Via the Python API http://gdal.org/python an OGR data source is accessed and from each layer the Features are read. Each Layer corresponds to a “doc”, so for multi-layer sources the ‘end-of-doc’ flag is set after a Layer has been read.

This input can read almost any geospatial dataformat. One can use the features directly in a Stetl Filter or use a converter to e.g. convert to GeoJSON structures.

produces=FORMAT.ogr_feature or FORMAT.ogr_feature_array (all features)

data_source()[source]

CONFIG - String denoting the OGR datasource. Usually a path to a file like “path/rivers.shp” or connection string to PostgreSQL like “PG: host=localhost dbname=’rivers’ user=’postgres’”.

Required: True

Default: None

source_format()[source]

CONFIG - Instructs GDAL to use driver by that name to open datasource. Not required for many standard formats that are self-describing like ESRI Shapefile.

Examples: ‘PostgreSQL’, ‘GeoJSON’ etc

Required: False

Default: None

source_options()[source]

CONFIG - Custom datasource-specific options. Used in gdal.SetConfigOption().

Type: dictionary

Required: False

Default: None

sql()[source]

CONFIG - String with SQL query. Mandatory for PostgreSQL OGR source.

Required: False (True for PostgreSQL OGR source)

Default: None

class stetl.inputs.ogrinput.OgrPostgisInput(configdict, section)[source]

Bases: stetl.input.Input

Input from PostGIS via ogr2ogr command. For now hardcoded to produce an ogr GML line stream. OgrInput may be a better alternative.

Alternatives: either stetl.input.PostgresqlInput or stetl.input.OgrInput.

produces=FORMAT.xml_line_stream

class stetl.inputs.deegreeinput.DeegreeBlobstoreInput(configdict, section)[source]

Bases: stetl.input.Input

Read features from deegree Blobstore DB into an etree doc.

produces=FORMAT.etree_doc

Components: Filters

class stetl.filters.xsltfilter.XsltFilter(configdict, section)[source]

Bases: stetl.filter.Filter

Invokes XSLT processor (via lxml) for given XSLT script on an etree doc.

consumes=FORMAT.etree_doc, produces=FORMAT.etree_doc

class stetl.filters.xmlassembler.XmlAssembler(configdict, section)[source]

Bases: stetl.filter.Filter

Split a stream of etree DOM XML elements (usually Features) into etree DOM docs. Consumes and buffers elements until max_elements is reached, then produces an etree doc.

consumes=FORMAT.etree_element, produces=FORMAT.etree_doc
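The buffering behaviour can be sketched as a generator. The function names, the root_tag default and the use of xml.etree.ElementTree are assumptions made for this illustration; flushing the remaining elements at the end of the stream is likewise an assumption.

```python
import xml.etree.ElementTree as ET


def build_doc(elements, root_tag):
    """Wrap the buffered elements in a new document root."""
    root = ET.Element(root_tag)
    root.extend(elements)
    return ET.ElementTree(root)


def assemble(elements, max_elements, root_tag="FeatureCollection"):
    """Buffer incoming elements and yield a doc per max_elements elements."""
    buffered = []
    for elem in elements:
        buffered.append(elem)
        if len(buffered) == max_elements:
            yield build_doc(buffered, root_tag)
            buffered = []
    if buffered:
        # End of stream: emit whatever is still in the buffer.
        yield build_doc(buffered, root_tag)


features = [ET.Element("feature", id=str(i)) for i in range(5)]
docs = list(assemble(features, max_elements=2))
doc_sizes = [len(doc.getroot()) for doc in docs]
```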

class stetl.filters.xmlvalidator.XmlSchemaValidator(configdict, section)[source]

Bases: stetl.filter.Filter

Validates an etree doc and prints result to log.

consumes=FORMAT.etree_doc, produces=FORMAT.etree_doc

class stetl.filters.stringfilter.StringFilter(configdict, section, consumes, produces)[source]

Bases: stetl.filter.Filter

Base class for any string filtering

class stetl.filters.stringfilter.StringSubstitutionFilter(configdict, section)[source]

Bases: stetl.filters.stringfilter.StringFilter

String filtering using Python advanced String formatting. The string should have substitutable values like {schema} {foo}. format_args should be of the form format_args = schema:test foo:bar ...

consumes=FORMAT.string, produces=FORMAT.string

class stetl.filters.templatingfilter.Jinja2TemplatingFilter(configdict, section)[source]

Bases: stetl.filters.templatingfilter.TemplatingFilter

Implements Templating using Jinja2. Jinja2 (http://jinja.pocoo.org) is a modern and designer-friendly templating language for Python modelled after Django’s templates. A ‘struct’ format as input provides a tree-like structure that could originate from a JSON file or REST service. This input struct provides all the variables to be inserted into the template. The template itself can be configured in this component as a Jinja2 string or file. An optional ‘template_search_paths’ provides a list of directories from which templates can be fetched. Default is the current working directory. Via the optional ‘globals_path’ a JSON structure can be inserted into the Template environment. The variables in this globals structure are typically “boilerplate” constants like id prefixes, points of contact etc.

consumes=FORMAT.struct, produces=FORMAT.string

add_env_filters(jinja2_env)[source]

Register additional Filters on the template environment by updating the filters dict: Somehow min and max of list are not present so add them as well.

static geojson2gml_filter(value, source_crs=4326, target_crs=None, gml_id=None, gml_format='GML2', gml_longsrs='NO')[source]

Jinja2 custom Filter: generates any GML geometry from a GeoJSON geometry. By specifying a target_crs we can even reproject from the source CRS. The gml_format=GML2|GML3 determines the general GML form: e.g. pos/posList or coordinates. gml_longsrs=YES|NO determines the srsName format like EPSG:4326 or urn:ogc:def:crs:EPSG::4326 (long).

template_globals_path()[source]

CONFIG - One or more JSON files or URLs with global variables that can be used anywhere in template. Multiple files will be merged into one globals dictionary.

Required: False

Default: None

template_search_paths()[source]

CONFIG - List of directories where to search for templates, default is current working directory only.

Required: False

Default: [os.getcwd()]

class stetl.filters.templatingfilter.StringTemplatingFilter(configdict, section)[source]

Bases: stetl.filters.templatingfilter.TemplatingFilter

Implements Templating using Python’s internal string.Template. A template string or file should be configured. The input record contains the actual values to be substituted in the template string as a record (key/value pairs). Output is a regular string.

consumes=FORMAT.record or FORMAT.record_array, produces=FORMAT.string
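The substitution itself is standard-library behaviour and can be sketched as follows. The template text and the records are invented sample data.

```python
from string import Template

# $name and $population are placeholders filled from each record.
template = Template("<city name='$name' population='$population'/>")

input_records = [
    {"name": "CityA", "population": 100},
    {"name": "CityB", "population": 200},
]

# One output string per input record.
output_strings = [template.substitute(record) for record in input_records]
```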

class stetl.filters.templatingfilter.TemplatingFilter(configdict, section, consumes='any', produces='string')[source]

Bases: stetl.filter.Filter

Abstract base class for specific template-based filters. Subclasses implement a specific template language like Python string.Template, Mako, Genshi or Jinja2. See also: https://wiki.python.org/moin/Templating

consumes=FORMAT.any, produces=FORMAT.string

create_template()[source]

To be overridden in subclasses.

template_file()[source]

CONFIG - Path to template file. One of template_file or template_string needs to be configured.

Required: False

Default: None

template_string()[source]

CONFIG - Template string. One of template_file or template_string needs to be configured.

Required: False

Default: None

class stetl.filters.gmlfeatureextractor.GmlFeatureExtractor(configdict, section='gml_feature_extractor')[source]

Bases: stetl.filter.Filter

Extract arrays of GML features etree elements from etree docs.

consumes=FORMAT.etree_doc, produces=FORMAT.etree_feature_array

class stetl.filters.gmlsplitter.GmlSplitter(configdict, section='gml_splitter')[source]

Bases: stetl.filter.Filter

Split a stream of text XML lines into documents. DEPRECATED: use the more robust XmlElementStreamerFileInput + XmlAssembler instead. TODO: phase out.

consumes=FORMAT.xml_line_stream, produces=FORMAT.etree_doc

class stetl.filters.formatconverter.FormatConverter(configdict, section)[source]

Bases: stetl.filter.Filter

Converts (almost) any packet format (if converter available).

consumes=FORMAT.any, produces=FORMAT.any but actual formats are changed at initialization based on the input to output format to be converted via the input_format and output_format config parameters.

converter_args()[source]

CONFIG - Custom converter-specific arguments.

Type: dictionary

Required: False

Default: None

static etree_doc2geojson_collection(packet, converter_args=None)[source]

Use converter_args to determine XML tag names for features and GeoJSON feature id. For example:

converter_args = {
    'root_tag': 'FeatureCollection',
    'feature_tag': 'featureMember',
    'feature_id_attr': 'fid'
}
Parameters:
  • packet
  • converter_args
Returns:

static etree_doc2struct(packet, strip_space=True, strip_ns=True, sub=False, attr_prefix='', gml2ogr=True, ogr2json=True)[source]
Parameters:
  • packet
  • strip_space
  • strip_ns
  • sub
  • attr_prefix
  • gml2ogr
  • ogr2json
Returns:

static etree_elem2geojson_feature(packet, converter_args=None)[source]
static etree_elem2struct(packet, strip_space=True, strip_ns=True, sub=False, attr_prefix='', gml2ogr=True, ogr2json=True)[source]
Parameters:
  • packet
  • strip_space
  • strip_ns
  • sub
  • attr_prefix
  • gml2ogr
  • ogr2json
Returns:

Components: Outputs

class stetl.outputs.fileoutput.FileOutput(configdict, section)[source]

Bases: stetl.output.Output

Pretty print input to file. Input may be an etree doc or any other stringify-able input.

consumes=FORMAT.any

file_path()[source]

CONFIG - Path to file, for MultiFileOutput can be of the form like: gmlcities-%03d.gml

Required: True

Default: None

class stetl.outputs.fileoutput.MultiFileOutput(configdict, section)[source]

Bases: stetl.outputs.fileoutput.FileOutput

Print to multiple files from subsequent packets like strings or etree docs, file_path must be of a form like: gmlcities-%03d.gml.

consumes=FORMAT.any
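The file_path form is a printf-style pattern that is filled with a running file number. The sketch below writes three invented packets to a temporary directory; starting the numbering at 1 is an assumption.

```python
import os
import tempfile

file_path = "gmlcities-%03d.gml"
packets = ["<doc>1</doc>", "<doc>2</doc>", "<doc>3</doc>"]

with tempfile.TemporaryDirectory() as out_dir:
    for file_num, data in enumerate(packets, start=1):
        # %03d expands to a zero-padded number: 001, 002, ...
        target = os.path.join(out_dir, file_path % file_num)
        with open(target, "w") as out_file:
            out_file.write(data)
    written_files = sorted(os.listdir(out_dir))
```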

class stetl.outputs.standardoutput.StandardOutput(configdict, section)[source]

Bases: stetl.output.Output

Print any input to standard output.

consumes=FORMAT.any

class stetl.outputs.standardoutput.StandardXmlOutput(configdict, section)[source]

Bases: stetl.output.Output

Pretty print XML from etree doc to standard output. OBSOLETE: can be done with StandardOutput.

consumes=FORMAT.etree_doc

class stetl.outputs.httpoutput.HttpOutput(configdict, section, consumes='any')[source]

Bases: stetl.output.Output

Output via HTTP protocol, usually via POST.

consumes=FORMAT.any

content_type()[source]

CONFIG - The HTTP ContentType request header for target request.

Required: False

Default: ‘text/xml’

create_payload(packet)[source]

Create a HTTP body payload like for POST of an XML or JSON message. Subclasses like WFS and SOS override. :param packet: :return payload as string:

host()[source]

CONFIG - The hostname/IP addr for target request.

Required: True

Default: None

list_fanout()[source]

CONFIG - If we consume a list(), should we create a HTTP req for each member?

Required: False

Default: True

method()[source]

CONFIG - The HTTP method for target request.

Required: False

Default: POST

password()[source]

CONFIG - The Password for HTTP basic auth for target request.

Required: False

Default: None

path()[source]

CONFIG - The path for target request.

Required: False

Default: ‘/’

port()[source]

CONFIG - The port number for target request.

Required: True

Default: 80

user()[source]

CONFIG - The Username for HTTP basic auth for target request.

Required: False

Default: None

class stetl.outputs.ogroutput.Ogr2OgrOutput(configdict, section)[source]

Bases: stetl.output.Output

Output from GML etree doc to any OGR2OGR output using the GDAL/OGR ogr2ogr command

consumes=FORMAT.etree_doc

class stetl.outputs.ogroutput.OgrOutput(configdict, section)[source]

Bases: stetl.output.Output

Direct GDAL OGR output via Python OGR wrapper. Via the Python API http://gdal.org/python OGR Features are written.

This output can write almost any geospatial, OGR-defined, dataformat.

consumes=FORMAT.ogr_feature or FORMAT.ogr_feature_array

append()[source]

CONFIG - Append to destination if it exists (ogr2ogr -append option).

Type: boolean

Required: False

Default: False

dest_create_options()[source]

CONFIG - Creation options.

Examples: ..

Required: False

Default: []

dest_data_source()[source]

CONFIG - String denoting the OGR data destination. Usually a path to a file like “path/rivers.shp” or connection string to PostgreSQL like “PG: host=localhost dbname=’rivers’ user=’postgres’”.

Required: True

Default: None

dest_format()[source]

CONFIG - Instructs GDAL to use driver by that name to open data destination. Not required for many standard formats that are self-describing like ESRI Shapefile.

Examples: ‘PostgreSQL’, ‘GeoJSON’ etc

Required: False

Default: None

dest_options()[source]

CONFIG - Custom data destination-specific options. Used in gdal.SetConfigOption().

Type: dictionary

Required: False

Default: None

layer_create_options()[source]

CONFIG - Options for newly created layer (-lco).

Type: list

Required: True

Default: []

new_layer_name()[source]

CONFIG - Layer name for layer created in the destination source.

Type: string

Required: True

overwrite()[source]

CONFIG - Overwrite destination if it exists (ogr2ogr -overwrite option).

Type: boolean

Required: False

Default: False

sql()[source]

CONFIG - String with SQL query. Mandatory for PostgreSQL OGR dest.

Required: False (True for PostgreSQL OGR dest)

Default: None

target_srs()[source]

CONFIG - SRS (projection) for the target.

Type: string

Required: False

Default: None (take from Input)

class stetl.outputs.dboutput.DbOutput(configdict, section, consumes)[source]

Bases: stetl.output.Output

Output to any database (abstract base class).

class stetl.outputs.dboutput.PostgresDbOutput(configdict, section)[source]

Bases: stetl.outputs.dboutput.DbOutput

Output to PostgreSQL database. Input is an SQL string. Output by executing input SQL string.

consumes=FORMAT.string

database()[source]

CONFIG - Database name.

host()[source]

CONFIG - Hostname for DB.

password()[source]

CONFIG - DB Password for user.

schema()[source]

CONFIG - Postgres schema name for DB.

user()[source]

CONFIG - DB User name.

class stetl.outputs.dboutput.PostgresInsertOutput(configdict, section, consumes='record')[source]

Bases: stetl.outputs.dboutput.PostgresDbOutput

Output by inserting a single record into a Postgres database. Input is a record (Python dict structure) or a Python list of dicts (records). Creates an INSERT for Postgres to insert each single record. When the “replace” parameter is True, an attempt is made to first delete any existing record keyed by “key”.

NB: a constraint is that each record needs to contain all values, as an INSERT query is built once for the columns in the first record.

consumes=FORMAT.record
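The constraint above follows from building the query once. A minimal sketch, assuming %s-style placeholders as used by Python Postgres drivers such as psycopg2; the function build_insert, the table and the records are hypothetical, and no database connection is made.

```python
def build_insert(table, first_record):
    """Build one parameterized INSERT from the columns of the first record."""
    columns = list(first_record)
    placeholders = ", ".join(["%s"] * len(columns))
    query = "INSERT INTO %s (%s) VALUES (%s)" % (table, ", ".join(columns), placeholders)
    return query, columns


insert_records = [
    {"id": 1, "name": "Amsterdam"},
    {"id": 2, "name": "Utrecht"},
]

query, columns = build_insert("cities", insert_records[0])

# Every record must supply all columns of the first record,
# otherwise the lookup below raises a KeyError.
rows = [tuple(record[column] for column in columns) for record in insert_records]
```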

key()[source]

CONFIG - The key column name of the table, required when replacing records.

replace()[source]

CONFIG - Replace record if exists?

table()[source]

CONFIG - Table for inserts.

class stetl.outputs.wfsoutput.WFSTOutput(configdict, section)[source]

Bases: stetl.output.Output

Insert features via WFS-T (WFS Transaction) OGC protocol from an etree doc.

consumes=FORMAT.etree_doc

class stetl.outputs.deegreeoutput.DeegreeBlobstoreOutput(configdict, section)[source]

Bases: stetl.output.Output

Insert features into deegree Blobstore from an etree doc.

consumes=FORMAT.etree_doc

class stetl.outputs.deegreeoutput.DeegreeFSLoaderOutput(configdict, section)[source]

Bases: stetl.output.Output

Insert features via deegree using deegree’s FSLoader tool from an etree doc.

consumes=FORMAT.etree_doc