NAME

gdal-vector_concat - Concatenate vector datasets

Added in version 3.11.

SYNOPSIS

Usage: gdal vector concat [OPTIONS] <INPUT>... <OUTPUT>
Concatenate vector datasets.
Positional arguments:
  -i, --input <INPUT>                                        Input vector datasets [1.. values] [required]
  -o, --output <OUTPUT>                                      Output vector dataset [required]
Common Options:
  -h, --help                                                 Display help message and exit
  --json-usage                                               Display usage as JSON document and exit
  --config <KEY>=<VALUE>                                     Configuration option [may be repeated]
  --progress                                                 Display progress bar
Options:
  -l, --layer, --input-layer <INPUT-LAYER>                   Input layer name(s) [may be repeated]
  -f, --of, --format, --output-format <OUTPUT-FORMAT>        Output format ("GDALG" allowed)
  --co, --creation-option <KEY>=<VALUE>                      Creation option [may be repeated]
  --lco, --layer-creation-option <KEY>=<VALUE>               Layer creation option [may be repeated]
  --overwrite                                                Whether overwriting existing output is allowed
  --update                                                   Whether to open existing dataset in update mode
  --overwrite-layer                                          Whether overwriting existing layer is allowed
  --append                                                   Whether appending to existing layer is allowed
  --mode <MODE>                                              Determine the strategy to create output layers from source layers. MODE=merge-per-layer-name|stack|single (default: merge-per-layer-name)
  --output-layer <OUTPUT-LAYER>                              Name of the output vector layer (single mode), or template to name the output vector layers (stack mode)
  --source-layer-field-name <SOURCE-LAYER-FIELD-NAME>        Name of the new field to add to contain identification of the source layer, with value determined from 'source-layer-field-content'
  --source-layer-field-content <SOURCE-LAYER-FIELD-CONTENT>  A string, possibly using {AUTO_NAME}, {DS_NAME}, {DS_BASENAME}, {DS_INDEX}, {LAYER_NAME}, {LAYER_INDEX}
  --field-strategy <FIELD-STRATEGY>                          How to determine target fields from source fields. FIELD-STRATEGY=union|intersection (default: union)
  -s, --src-crs <SRC-CRS>                                    Source CRS
  -d, --dst-crs <DST-CRS>                                    Destination CRS
Advanced Options:
  --if, --input-format <INPUT-FORMAT>                        Input formats [may be repeated]
  --oo, --open-option <KEY>=<VALUE>                          Open options [may be repeated]

DESCRIPTION

gdal vector concat concatenates several source datasets.

It has 3 main modes:

--mode = merge-per-layer-name (the default). The output dataset generated by the command will contain as many layers as there are different layer names in the source datasets. For example if there are 2 datasets, one with layers a and b, and the other one with layers b and c, 3 output layers will be created: a, b (merging the 2 source layers) and c.
--mode = stack. The output dataset generated by the command will contain as many layers as there are layers in the source datasets. For example if there are 2 datasets ds1 (with layers a and b) and ds2 (with layers b and c), 4 output layers will be created: ds1_a, ds1_b, ds2_b and ds2_c.
--mode = single. The output dataset generated by the command will contain one single layer, merging all layers in the source datasets.
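
The three modes can be pictured with a small pure-Python model. This is only an illustrative sketch of the grouping described above, not GDAL code: the function name output_layers and the dictionary-based input are hypothetical, and the stack naming follows the documented {AUTO_NAME} default while "merged" is the documented default layer name in single mode.

```python
def output_layers(datasets, mode="merge-per-layer-name"):
    """Map each output layer name to the (dataset, layer) pairs feeding it.

    `datasets` is a hypothetical stand-in for the inputs: an ordered
    mapping {dataset base name: [layer names]}.
    """
    result = {}
    for ds_name, layers in datasets.items():
        for layer in layers:
            if mode == "merge-per-layer-name":
                # One output layer per distinct source layer name.
                key = layer
            elif mode == "stack":
                # One output layer per source layer ({AUTO_NAME} default).
                key = layer if ds_name == layer else f"{ds_name}_{layer}"
            elif mode == "single":
                # Everything goes into one layer (default name "merged").
                key = "merged"
            else:
                raise ValueError(f"unknown mode: {mode}")
            result.setdefault(key, []).append((ds_name, layer))
    return result


sources = {"ds1": ["a", "b"], "ds2": ["b", "c"]}
for mode in ("merge-per-layer-name", "stack", "single"):
    print(mode, "->", list(output_layers(sources, mode)))
```

With the two datasets from the examples above, the model yields layers a, b and c in merge-per-layer-name mode, ds1_a, ds1_b, ds2_b and ds2_c in stack mode, and a single layer in single mode.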

When an output layer merges several source layers, by default the resulting schema will contain the union of all source fields. It is possible to select only the intersection by setting --field-strategy to intersection. Regarding the resulting CRS, by default the CRS of the source layer will be used as the target CRS, and features of other source layers that do not match this CRS will be reprojected to it. --dst-crs can be used to select a given destination CRS.

This command can also be used as the first step of gdal vector pipeline.

Standard options

-f, --of, --format, --output-format <OUTPUT-FORMAT>: Which output vector format to use. Allowed values may be given by gdal --formats | grep vector | grep rw | sort

--co <NAME>=<VALUE>

Many formats have one or more optional dataset creation options that can be used to control particulars about the file created. For instance, the GeoPackage driver supports creation options to control the version.

May be repeated.

The dataset creation options available vary by format driver, and some simple formats have no creation options at all. The options supported for a format can be listed with the --formats command line option, but the documentation for the format is the definitive source of information on driver creation options. See Vector drivers format specific documentation for legal creation options for each format.

Note that dataset creation options are different from layer creation options.

--overwrite: Allow program to overwrite existing target file or dataset. Otherwise, by default, gdal errors out if the target file or dataset already exists.

--update: Whether the output dataset must be opened in update mode. Implies that it already exists. This mode is useful when adding new layer(s) to an already existing dataset.

--overwrite-layer: Whether overwriting existing layer(s) is allowed.

--append: Whether appending features to existing layer(s) is allowed.

-l, --layer, --input-layer <LAYER>: Name of one or more input layers to process. If no layer names are passed, then all layers will be selected.

--output-layer <OUTPUT-LAYER>: Name of the output vector layer in single mode (the default is "merged"), or template to name the output vector layers in stack mode (the default is {AUTO_NAME}). Not allowed in merge-per-layer-name mode.
In stack mode, the template can be a string containing the following variables, each of which is substituted with a value computed from the input layer being processed:

{AUTO_NAME}: equivalent to {DS_BASENAME}_{LAYER_NAME} if both values are different, or {LAYER_NAME} when they are identical (as is the case for shapefiles).
{DS_NAME}: name of the source dataset
{DS_BASENAME}: base name of the source dataset
{DS_INDEX}: index of the source dataset
{LAYER_NAME}: name of the source layer
{LAYER_INDEX}: index of the source layer
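
The substitution of these variables can be sketched in plain Python. This is a hypothetical model and not the GDAL implementation: expand_template is an illustrative name, the base name is assumed to be the file name without directory and extension, and the indices are passed in by the caller because this page does not state how they are numbered.

```python
import os


def expand_template(template, ds_name, ds_index, layer_name, layer_index):
    """Expand the documented variables in `template` for one source layer."""
    # Assumption: the base name is the file name minus directory and extension.
    ds_basename = os.path.splitext(os.path.basename(ds_name))[0]
    # {AUTO_NAME} collapses to the layer name when both parts are identical.
    if ds_basename == layer_name:
        auto_name = layer_name
    else:
        auto_name = f"{ds_basename}_{layer_name}"
    values = {
        "AUTO_NAME": auto_name,
        "DS_NAME": ds_name,
        "DS_BASENAME": ds_basename,
        "DS_INDEX": ds_index,
        "LAYER_NAME": layer_name,
        "LAYER_INDEX": layer_index,
    }
    expanded = template
    for key, value in values.items():
        # Plain replacement leaves any other brace text untouched.
        expanded = expanded.replace("{" + key + "}", str(value))
    return expanded


print(expand_template("{AUTO_NAME}", "france.shp", 0, "france", 0))
print(expand_template("{AUTO_NAME}", "ds1.gpkg", 0, "a", 0))
```

The first call models the shapefile case, where the dataset base name and the layer name coincide; the second models a multi-layer dataset.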

--mode merge-per-layer-name|stack|single: Determine the strategy to create output layers from source layers. See introductory paragraph for more details.

--source-layer-field-name <SOURCE-LAYER-FIELD-NAME>: If specified, the schema of the target layer will be extended with a field whose name is the value of this option and whose content is determined by --source-layer-field-content.

--source-layer-field-content <SOURCE-LAYER-FIELD-CONTENT>: If specified, the schema of the target layer will be extended with a new field (whose name is given by --source-layer-field-name, or source_ds_lyr otherwise), whose content is determined by the specified template (see --output-layer for variables that can be used).

--field-strategy union|intersection: Determines how the schema of the target layer is built from the schemas of the input layers:

union (default) to use a super-set containing every field from all source layers.
intersection to use only the fields that are common to all source layers.
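
The difference between the two strategies can be illustrated with lists of field names. This is a simplified sketch under stated assumptions, not GDAL's algorithm: target_fields is a hypothetical name, fields are matched by name only (field types are ignored), and the output order follows first appearance.

```python
def target_fields(source_schemas, strategy="union"):
    """Compute target field names from a list of source field-name lists."""
    if strategy == "union":
        fields = []
        for schema in source_schemas:
            for name in schema:
                # Keep each field once, in order of first appearance.
                if name not in fields:
                    fields.append(name)
        return fields
    if strategy == "intersection":
        if not source_schemas:
            return []
        first, *rest = source_schemas
        # Keep only the fields present in every source layer.
        return [name for name in first if all(name in s for s in rest)]
    raise ValueError(f"unknown field strategy: {strategy}")


schemas = [["id", "name", "population"], ["id", "name", "area"]]
print(target_fields(schemas, "union"))
print(target_fields(schemas, "intersection"))
```

In this model, union keeps id, name, population and area, while intersection keeps only id and name.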

-s, --src-crs <SRC-CRS>

Set source spatial reference. If not specified the SRS found in the input dataset will be used.

The coordinate reference systems that can be passed are anything supported by the OGRSpatialReference.SetFromUserInput() call, which includes EPSG Projected, Geographic or Compound CRS (e.g. EPSG:4296), a well known text (WKT) CRS definition, PROJ.4 declarations, or the name of a .prj file containing a WKT CRS definition.

Starting with GDAL 2.2, if the SRS has an explicit vertical datum that points to a PROJ.4 geoidgrids, and the input dataset is a single band dataset, a vertical correction will be applied to the values of the dataset.

-d, --dst-crs <DST-CRS>

Set destination spatial reference.

GDALG OUTPUT (ON-THE-FLY / STREAMED DATASET)

This program supports serializing the command line as a JSON file using the GDALG output format. The resulting file can then be opened as a vector dataset using the GDALG: GDAL Streamed Algorithm driver, which applies the specified pipeline in an on-the-fly / streamed way.

EXAMPLES

Example 1: Creating a GeoPackage stacking all input shapefiles in separate layers.

gdal vector concat --mode=stack *.shp out.gpkg

Example 2: Adding a field to indicate the source layer, and reprojecting to a single CRS

Concatenate the content of france.shp and germany.shp into merged.shp, reprojecting them to ETRS89, and add a 'country' field to each feature whose value is 'france' or 'germany' depending on where it comes from:

gdal vector concat --mode=single --source-layer-field-name=country --dst-crs=EPSG:4258 france.shp germany.shp merged.shp
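
When the command is driven from a script, the argument list can be assembled programmatically. The helper below is a hypothetical sketch (build_concat_command is not part of GDAL) that only uses options listed in the synopsis, namely --mode, --source-layer-field-name and --dst-crs. It builds the command line and prints it without running gdal.

```python
import shlex


def build_concat_command(inputs, output, mode=None,
                         source_layer_field_name=None, dst_crs=None):
    """Return the argument list for a 'gdal vector concat' invocation."""
    args = ["gdal", "vector", "concat"]
    if mode is not None:
        args.append(f"--mode={mode}")
    if source_layer_field_name is not None:
        args.append(f"--source-layer-field-name={source_layer_field_name}")
    if dst_crs is not None:
        args.append(f"--dst-crs={dst_crs}")
    # Positional arguments: all inputs first, then the output dataset.
    args.extend(inputs)
    args.append(output)
    return args


cmd = build_concat_command(
    ["france.shp", "germany.shp"], "merged.shp",
    mode="single", source_layer_field_name="country", dst_crs="EPSG:4258",
)
# Print a shell-quoted form; the list could be passed to subprocess.run().
print(shlex.join(cmd))
```

Passing the list form to subprocess.run() avoids shell quoting issues with file names that contain spaces.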

AUTHOR

Even Rouault <even.rouault@spatialys.com>

COPYRIGHT

1998-2025

May 6, 2025