GDAL-VECTOR-PIPELINE(1) | GDAL | GDAL-VECTOR-PIPELINE(1) |
NAME
gdal-vector-pipeline - Process a vector dataset
Added in version 3.11.
SYNOPSIS
Usage: gdal vector pipeline [OPTIONS] <PIPELINE>

Process a vector dataset.

Positional arguments:

Common Options:
  -h, --help
      Display help message and exit
  --json-usage
      Display usage as JSON document and exit
  --config <KEY>=<VALUE>
      Configuration option [may be repeated]
  --progress
      Display progress bar

<PIPELINE> is of the form: read|concat [READ-OPTIONS] ( ! <STEP-NAME> [STEP-OPTIONS] )* ! write [WRITE-OPTIONS]
A pipeline chains several steps, separated by the ! (exclamation mark) character. The first step must be read or concat, and the last one must be write. Each step has its own positional and non-positional arguments. Apart from read, concat and write, every step may be used several times in a pipeline.
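As a rough illustration of the chaining described above, the following shell sketch assembles a pipeline as a list of arguments, prints it, and runs it only if gdal and the input file are present. The file names in.gpkg and out.gpkg and the pop field are placeholders chosen for this sketch; the steps and options are those listed in this page.

```shell
#!/bin/sh
# Sketch only: in.gpkg, out.gpkg and the "pop" field are hypothetical placeholders.

# Each step is introduced by a standalone "!" argument. Quote any step
# argument that contains spaces so that it reaches gdal as a single word.
set -- \
    ! read in.gpkg \
    ! filter --where "pop > 1e6" \
    ! reproject --dst-crs=EPSG:32632 \
    ! write out.gpkg --overwrite

# Show the command that would be executed (quoting is lost in this display).
cmd="gdal vector pipeline $*"
echo "$cmd"

# Run it only when gdal is installed and the placeholder input exists.
if command -v gdal >/dev/null 2>&1 && [ -f in.gpkg ]; then
    gdal vector pipeline "$@"
fi
```

Keeping the steps in the positional parameters and expanding them with "$@" preserves the quoting of arguments such as the --where expression.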
Potential steps are:
- read

  * read [OPTIONS] <INPUT>
  ------------------------
  Read a vector dataset.

  Positional arguments:
    -i, --input <INPUT>
        Input vector datasets [required]

  Options:
    -l, --layer, --input-layer <INPUT-LAYER>
        Input layer name(s) [may be repeated]

  Advanced Options:
    --if, --input-format <INPUT-FORMAT>
        Input formats [may be repeated]
    --oo, --open-option <KEY>=<VALUE>
        Open options [may be repeated]
- concat

  * concat [OPTIONS] <INPUT>...
  -----------------------------
  Concatenate vector datasets.

  Positional arguments:
    -i, --input <INPUT>
        Input vector datasets [1.. values] [required]

  Options:
    -l, --layer, --input-layer <INPUT-LAYER>
        Input layer name(s) [may be repeated]
    --mode <MODE>
        Determine the strategy to create output layers from source layers.
        MODE=merge-per-layer-name|stack|single (default: merge-per-layer-name)
    --output-layer <OUTPUT-LAYER>
        Name of the output vector layer (single mode), or template to name the output vector layers (stack mode)
    --source-layer-field-name <SOURCE-LAYER-FIELD-NAME>
        Name of the new field to add to contain identification of the source layer, with value determined from 'source-layer-field-content'
    --source-layer-field-content <SOURCE-LAYER-FIELD-CONTENT>
        A string, possibly using {AUTO_NAME}, {DS_NAME}, {DS_BASENAME}, {DS_INDEX}, {LAYER_NAME}, {LAYER_INDEX}
    --field-strategy <FIELD-STRATEGY>
        How to determine target fields from source fields.
        FIELD-STRATEGY=union|intersection (default: union)
    -s, --src-crs <SRC-CRS>
        Source CRS
    -d, --dst-crs <DST-CRS>
        Destination CRS

  Advanced Options:
    --if, --input-format <INPUT-FORMAT>
        Input formats [may be repeated]
    --oo, --open-option <KEY>=<VALUE>
        Open options [may be repeated]
Details for options can be found in gdal vector concat.
- clip

  * clip [OPTIONS]
  ----------------
  Clip a vector dataset.

  Options:
    --active-layer <ACTIVE-LAYER>
        Set active layer (if not specified, all)
    --bbox <BBOX>
        Clipping bounding box as xmin,ymin,xmax,ymax
        Mutually exclusive with --geometry, --like
    --bbox-crs <BBOX-CRS>
        CRS of clipping bounding box
    --geometry <GEOMETRY>
        Clipping geometry (WKT or GeoJSON)
        Mutually exclusive with --bbox, --like
    --geometry-crs <GEOMETRY-CRS>
        CRS of clipping geometry
    --like <DATASET>
        Dataset to use as a template for bounds
        Mutually exclusive with --bbox, --geometry
    --like-sql <SELECT-STATEMENT>
        SELECT statement to run on the 'like' dataset
        Mutually exclusive with --like-where
    --like-layer <LAYER-NAME>
        Name of the layer of the 'like' dataset
    --like-where <WHERE-EXPRESSION>
        WHERE SQL clause to run on the 'like' dataset
        Mutually exclusive with --like-sql
Details for options can be found in gdal vector clip.
- edit

  * edit [OPTIONS]
  ----------------
  Edit metadata of a vector dataset.

  Options:
    --active-layer <ACTIVE-LAYER>
        Set active layer (if not specified, all)
    --geometry-type <GEOMETRY-TYPE>
        Layer geometry type
    --crs <CRS>
        Override CRS (without reprojection)
    --metadata <KEY>=<VALUE>
        Add/update dataset metadata item [may be repeated]
    --unset-metadata <KEY>
        Remove dataset metadata item [may be repeated]
    --layer-metadata <KEY>=<VALUE>
        Add/update layer metadata item [may be repeated]
    --unset-layer-metadata <KEY>
        Remove layer metadata item [may be repeated]
Details for options can be found in gdal vector edit.
- filter

  * filter [OPTIONS]
  ------------------
  Filter a vector dataset.

  Options:
    --active-layer <ACTIVE-LAYER>
        Set active layer (if not specified, all)
    --bbox <BBOX>
        Bounding box as xmin,ymin,xmax,ymax
    --where <WHERE>|@<filename>
        Attribute query in a restricted form of the queries used in the SQL WHERE statement
Details for options can be found in gdal vector filter.
- geom

  * geom <COMMAND> [OPTIONS]

  where <COMMAND> is one of:
    - buffer: Compute a buffer around geometries of a vector dataset.
    - explode-collections: Explode geometries of type collection of a vector dataset.
    - make-valid: Fix validity of geometries of a vector dataset.
    - segmentize: Segmentize geometries of a vector dataset.
    - set-type: Modify the geometry type of a vector dataset.
    - simplify: Simplify geometries of a vector dataset.
    - swap-xy: Swap X and Y coordinates of geometries of a vector dataset.
Details for options can be found in gdal vector geom.
- reproject

  * reproject [OPTIONS]
  ---------------------
  Reproject a vector dataset.

  Options:
    --active-layer <ACTIVE-LAYER>
        Set active layer (if not specified, all)
    -s, --src-crs <SRC-CRS>
        Source CRS
    -d, --dst-crs <DST-CRS>
        Destination CRS [required]
Details for options can be found in gdal vector reproject.
- select

  * select [OPTIONS] <FIELDS>
  ---------------------------
  Select a subset of fields from a vector dataset.

  Positional arguments:
    --fields <FIELDS>
        Fields to select (or exclude if --exclude) [may be repeated] [required]

  Options:
    --active-layer <ACTIVE-LAYER>
        Set active layer (if not specified, all)
    --exclude
        Exclude specified fields
        Mutually exclusive with --ignore-missing-fields
    --ignore-missing-fields
        Ignore missing fields
        Mutually exclusive with --exclude
Details for options can be found in gdal vector select.
- sql

  * sql [OPTIONS] <statement>|@<filename>
  ---------------------------------------
  Apply SQL statement(s) to a dataset.

  Positional arguments:
    --sql <statement>|@<filename>
        SQL statement(s) [may be repeated] [required]

  Options:
    -l, --output-layer <OUTPUT-LAYER>
        Output layer name(s) [may be repeated]
    --dialect <DIALECT>
        SQL dialect (e.g. OGRSQL, SQLITE)
Details for options can be found in gdal vector sql.
- write

  * write [OPTIONS] <OUTPUT>
  --------------------------
  Write a vector dataset.

  Positional arguments:
    -o, --output <OUTPUT>
        Output vector dataset [required]

  Options:
    -f, --of, --format, --output-format <OUTPUT-FORMAT>
        Output format ("GDALG" allowed)
    --co, --creation-option <KEY>=<VALUE>
        Creation option [may be repeated]
    --lco, --layer-creation-option <KEY>=<VALUE>
        Layer creation option [may be repeated]
    --overwrite
        Whether overwriting existing output is allowed
    --update
        Whether to open existing dataset in update mode
    --overwrite-layer
        Whether overwriting existing layer is allowed
    --append
        Whether appending to existing layer is allowed
    -l, --output-layer <OUTPUT-LAYER>
        Output layer name
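The filter step accepts its --where expression either inline or as @<filename>. The sketch below shows the second form, under the assumption that the content of the file is used as the expression; where.txt, in.gpkg, out.gpkg and the pop field are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch only: where.txt, in.gpkg, out.gpkg and the "pop" field are hypothetical.

# Store the attribute filter in a file; printf '%s' avoids a trailing newline.
printf '%s' 'pop > 1e6' > where.txt

# Reference the file with the @<filename> form of --where.
set -- ! read in.gpkg ! filter --where @where.txt ! write out.gpkg --overwrite
cmd="gdal vector pipeline $*"
echo "$cmd"

# Run it only when gdal is installed and the placeholder input exists.
if command -v gdal >/dev/null 2>&1 && [ -f in.gpkg ]; then
    gdal vector pipeline "$@"
fi
```

Keeping the expression in a file can sidestep shell quoting issues when the query itself contains quotes.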
DESCRIPTION
gdal vector pipeline processes a vector dataset by applying a sequence of processing steps to it.
GDALG OUTPUT (ON-THE-FLY / STREAMED DATASET)
A pipeline can be serialized as a JSON file using the GDALG output format. The resulting file can then be opened as a vector dataset using the GDALG: GDAL Streamed Algorithm driver, which applies the specified pipeline on the fly, in a streamed way.
The command_line member of the JSON file should nominally be the whole command line without the final write step, and is what is generated by gdal vector pipeline ! .... ! write out.gdalg.json.
{
  "type": "gdal_streamed_alg",
  "command_line": "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632"
}
The final write step can be added but if so it must explicitly specify the stream output format and a non-significant output dataset name.
{
  "type": "gdal_streamed_alg",
  "command_line": "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write --output-format=streamed streamed_dataset"
}
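A GDALG file does not have to be produced by the write step; since it is plain JSON, it can also be written by hand. The following sketch creates one with a here-document. The names reprojected.gdalg.json and in.gpkg are placeholders, and the command_line value is the one shown in the first JSON document of this section.

```shell
#!/bin/sh
# Sketch only: in.gpkg and reprojected.gdalg.json are hypothetical names.

# Write the pipeline description by hand. The quoted 'EOF' delimiter keeps
# the shell from expanding anything inside the JSON text.
cat > reprojected.gdalg.json <<'EOF'
{
  "type": "gdal_streamed_alg",
  "command_line": "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632"
}
EOF

# The file can then be inspected like any other vector dataset, for example:
#   gdal vector info reprojected.gdalg.json
```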
EXAMPLES
Example 1: Reproject a GeoPackage file to CRS EPSG:32632 ("WGS 84 / UTM zone 32N")
$ gdal vector pipeline --progress ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write out.gpkg --overwrite
Example 2: Serialize the command of a reprojection of a GeoPackage file in a GDALG file, and later read it
$ gdal vector pipeline --progress ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write in_epsg_32632.gdalg.json --overwrite
$ gdal vector info in_epsg_32632.gdalg.json
Example 3: Union 2 source shapefiles (with similar structure), reproject them to EPSG:32632, keep only cities larger than 1 million inhabitants and write to a GeoPackage
$ gdal vector pipeline --progress ! concat --mode=single --dst-crs=EPSG:32632 france.shp belgium.shp ! filter --where "pop > 1e6" ! write out.gpkg --overwrite
AUTHOR
Even Rouault <even.rouault@spatialys.com>
COPYRIGHT
1998-2025
May 6, 2025 |