GDAL-VECTOR-COMBINE(1) GDAL GDAL-VECTOR-COMBINE(1)

gdal-vector-combine - Combine geometries into geometry collections

Added in version 3.13.

Usage: gdal vector combine [OPTIONS] <INPUT> <OUTPUT>
Combine features into collections
Positional arguments:
  -i, --input <INPUT>                                  Input vector datasets [required] [not available in pipelines]
  -o, --output <OUTPUT>                                Output vector dataset [required] [not available in pipelines]
Common Options:
  -h, --help                                           Display help message and exit
  --json-usage                                         Display usage as JSON document and exit
  --config <KEY>=<VALUE>                               Configuration option [may be repeated]
  -q, --quiet                                          Quiet mode (no progress bar or warning message) [not available in pipelines]
Options:
  -l, --layer, --input-layer <INPUT-LAYER>             Input layer name(s) [may be repeated] [not available in pipelines]
  -f, --of, --format, --output-format <OUTPUT-FORMAT>  Output format ("GDALG" allowed) [not available in pipelines]
  --co, --creation-option <KEY>=<VALUE>                Creation option [may be repeated] [not available in pipelines]
  --lco, --layer-creation-option <KEY>=<VALUE>         Layer creation option [may be repeated] [not available in pipelines]
  --overwrite                                          Whether overwriting existing output dataset is allowed [not available in pipelines]
  --update                                             Whether to open existing dataset in update mode [not available in pipelines]
  --overwrite-layer                                    Whether overwriting existing output layer is allowed [not available in pipelines]
  --append                                             Whether appending to existing layer is allowed [not available in pipelines]
                                                       Mutually exclusive with --upsert
  --output-layer <OUTPUT-LAYER>                        Output layer name [not available in pipelines]
  --skip-errors                                        Skip errors when writing features [not available in pipelines]
  --group-by <GROUP-BY>                                Names of field(s) by which inputs should be grouped [may be repeated]
  --keep-nested                                        Avoid combining the components of multipart geometries
  --add-extra-fields <ADD-EXTRA-FIELDS>                Whether to add extra fields, depending on if they have identical values within each group. ADD-EXTRA-FIELDS=no|sometimes-identical|always-identical (default: no)
Advanced Options:
  --if, --input-format <INPUT-FORMAT>                  Input formats [may be repeated] [not available in pipelines]
  --oo, --open-option <KEY>=<VALUE>                    Open options [may be repeated] [not available in pipelines]
  --output-oo, --output-open-option <KEY>=<VALUE>      Output open options [may be repeated] [not available in pipelines]
  --upsert                                             Upsert features (implies 'append') [not available in pipelines]
                                                       Mutually exclusive with --append

gdal vector combine combines geometries into geometry collections.

The --group-by argument can be used to determine which features are combined.

By default, the parts of multipart geometries will be combined into an un-nested collection of the most specific type possible (e.g., MultiPolygon rather than GeometryCollection). For example, a Polygon and a two-part MultiPolygon would be combined into a three-part MultiPolygon. This is done because many GIS file formats and software packages do not handle nested GeometryCollection types. If the nested representation in the manner of PostGIS' ST_Collect is preferred (a two-component GeometryCollection containing the Polygon and MultiPolygon), then --keep-nested can be used.
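
The difference between the two modes can be illustrated with a small pure-Python model. This is only a sketch of the behavior described above, not GDAL code: geometries are modelled as hypothetical (type, parts) tuples, and only the Multi* types are exploded.

```python
# Pure-Python model of the two collection modes; this is not GDAL code.
# A geometry is modelled as (type_name, parts): parts is a list of
# single-part geometries for Multi* types and is empty otherwise.
MULTI_OF = {
    "Point": "MultiPoint",
    "LineString": "MultiLineString",
    "Polygon": "MultiPolygon",
}
MULTI_TYPES = set(MULTI_OF.values())


def combine(geoms, keep_nested=False):
    """Combine geometries into one collection, un-nesting unless keep_nested."""
    components = []
    for geom in geoms:
        type_name, parts = geom
        if type_name in MULTI_TYPES and not keep_nested:
            components.extend(parts)  # explode the multipart geometry
        else:
            components.append(geom)   # keep the geometry as one component
    # Use the most specific collection type the components allow.
    types = {type_name for type_name, _ in components}
    if len(types) == 1 and next(iter(types)) in MULTI_OF:
        return (MULTI_OF[next(iter(types))], components)
    return ("GeometryCollection", components)


polygon = ("Polygon", [])
two_part = ("MultiPolygon", [("Polygon", []), ("Polygon", [])])

flat = combine([polygon, two_part])
nested = combine([polygon, two_part], keep_nested=True)
print(flat[0], len(flat[1]))      # MultiPolygon 3
print(nested[0], len(nested[1]))  # GeometryCollection 2
```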

combine can be used as a step of gdal vector pipeline.
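
As an illustration, the sketch below assembles the argument list for both invocation styles. The helper names and file names are hypothetical. The "! read ... ! write ..." step syntax is assumed from the gdal vector pipeline documentation and is not shown on this page, so it may need adjusting.

```python
import shlex


def combine_options(group_by=(), keep_nested=False, add_extra_fields=None):
    """Return the combine-specific options shared by both invocation styles."""
    args = []
    for field in group_by:
        args += ["--group-by", field]  # --group-by may be repeated
    if keep_nested:
        args.append("--keep-nested")
    if add_extra_fields is not None:
        args += ["--add-extra-fields", add_extra_fields]
    return args


def standalone_command(src, dst, **options):
    """Argument list for: gdal vector combine [OPTIONS] <INPUT> <OUTPUT>."""
    return ["gdal", "vector", "combine", *combine_options(**options), src, dst]


def pipeline_command(src, dst, **options):
    """Argument list using combine as a pipeline step (assumed step syntax)."""
    return ["gdal", "vector", "pipeline",
            "!", "read", src,
            "!", "combine", *combine_options(**options),
            "!", "write", dst]


# regions.gpkg and countries.gpkg are placeholder file names.
print(shlex.join(standalone_command("regions.gpkg", "countries.gpkg",
                                    group_by=["country"])))
print(shlex.join(pipeline_command("regions.gpkg", "countries.gpkg",
                                  group_by=["country"], keep_nested=True)))
```

The lists can be passed to subprocess.run() as they are. shlex.join() is used only to print a shell-safe command line.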

--group-by <GROUP-BY>
The names of fields whose unique values will be used to collect input geometries. Fields not listed in --group-by are not carried over to the output layer unless --add-extra-fields is used. If --group-by is omitted, the entire layer will be combined into a single feature.

--keep-nested
If input geometries have multiple parts, combine them into a GeometryCollection of the input geometries.

--add-extra-fields <ADD-EXTRA-FIELDS>
Whether fields from source features that have the same value among the features belonging to the same group are copied to the corresponding output feature.

With the default value, no, only fields listed in --group-by are copied to the output layer.

With sometimes-identical, a field is added to the output layer schema if, in at least one output group, all input features share the same value for it. For each group, the output value is that shared value. If the input features of a group do not all share the same value, the field is set to null for that group.

With always-identical, a field is copied to the output layer only if, in every output group, all input features share the same value for it.

For example, suppose we have four input features with the following content, and we group them by country:

  name             country  country_fr             type
  Mainland France  France   France                 continental
  Corsica          France   France                 island
  Mainland USA     USA      Etats-Unis d'Amérique  continental
  Alaska           USA      Etats-Unis d'Amérique  continental

With sometimes-identical, the output will be:

  country  country_fr             type
  France   France                 (null)
  USA      Etats-Unis d'Amérique  continental

The name field has been removed because it is distinct for each input feature. The type field appears as an output field because, at least for the USA, the two grouped input features have the same value. For France, it is set to null because its value differs between the two input features.

With always-identical, the output will be:

  country  country_fr
  France   France
  USA      Etats-Unis d'Amérique

The type field has been removed because its value is different in the two input features of France.
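
The rules above can be reproduced with a short pure-Python model applied to the same four features. This is a sketch of the documented behavior, not GDAL's implementation. The function name is hypothetical, and the model assumes no input value is itself null.

```python
FEATURES = [
    {"name": "Mainland France", "country": "France",
     "country_fr": "France", "type": "continental"},
    {"name": "Corsica", "country": "France",
     "country_fr": "France", "type": "island"},
    {"name": "Mainland USA", "country": "USA",
     "country_fr": "Etats-Unis d'Amérique", "type": "continental"},
    {"name": "Alaska", "country": "USA",
     "country_fr": "Etats-Unis d'Amérique", "type": "continental"},
]


def combine_attributes(features, group_by, add_extra_fields="no"):
    """Return one attribute dict per group, following the rules above."""
    groups = {}
    for feature in features:
        key = tuple(feature[field] for field in group_by)
        groups.setdefault(key, []).append(feature)

    def shared_value(members, field):
        # The value common to all members, or None when they differ.
        values = {member[field] for member in members}
        return values.pop() if len(values) == 1 else None

    extra = [field for field in features[0] if field not in group_by]
    per_field = {field: [shared_value(members, field)
                         for members in groups.values()]
                 for field in extra}

    if add_extra_fields == "sometimes-identical":
        kept = [f for f in extra if any(v is not None for v in per_field[f])]
    elif add_extra_fields == "always-identical":
        kept = [f for f in extra if all(v is not None for v in per_field[f])]
    else:  # "no": only the --group-by fields are kept
        kept = []

    rows = []
    for index, key in enumerate(groups):
        row = dict(zip(group_by, key))
        for field in kept:
            row[field] = per_field[field][index]
        rows.append(row)
    return rows


for mode in ("no", "sometimes-identical", "always-identical"):
    print(mode, combine_attributes(FEATURES, ["country"], mode))
```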

The program returns status code 0 in case of success, and non-zero in case of error (non-blocking errors emitted as warnings are considered a successful execution).
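
A calling script can rely on this status code. The sketch below shows one way to check it from Python. The helper name is hypothetical, and the Python interpreter is used as a stand-in command so that the example runs without GDAL installed.

```python
import subprocess
import sys


def succeeded(command):
    """Run `command` and report whether it exited with status code 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode == 0


# Stand-in commands that exit with a chosen status code.
print(succeeded([sys.executable, "-c", "raise SystemExit(0)"]))
print(succeeded([sys.executable, "-c", "raise SystemExit(1)"]))
```

With GDAL available, the same helper could be given an argument list such as ["gdal", "vector", "combine", "--group-by", "country", "in.gpkg", "out.gpkg"], where the file names are placeholders.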

--append
Whether appending features to existing layer(s) is allowed. This also creates the output dataset if it does not exist yet.

--co, --creation-option <KEY>=<VALUE>
Many formats have one or more optional dataset creation options that can be used to control particulars about the file created. For instance, the GeoPackage driver supports creation options to control the version.

May be repeated.

The dataset creation options available vary by format driver, and some simple formats have no creation options at all. The options supported by a format can be listed with the --formats command line option, but the documentation for the format is the definitive source of information on driver creation options. See the format-specific documentation of the vector drivers for the legal creation options of each format.

Note that dataset creation options are different from layer creation options.

--if, --input-format <INPUT-FORMAT>
Name of the format/driver to try when opening the input file(s). It is generally not necessary to specify it, but it can be used to skip automatic driver detection when that detection fails to select the appropriate driver. This option can be repeated to specify several candidate drivers. Note that it does not force those drivers to open the dataset. In particular, some drivers have requirements on file extensions.

May be repeated.

--lco, --layer-creation-option <KEY>=<VALUE>
Many formats have one or more optional layer creation options that can be used to control particulars about the layer created. For instance, the GeoPackage driver supports layer creation options to control the feature identifier or geometry column name, setting the identifier or description, etc.

May be repeated.

The layer creation options available vary by format driver, and some simple formats have no layer creation options at all. The options supported by a format can be listed with the --formats command line option, but the documentation for the format is the definitive source of information on driver creation options. See the format-specific documentation of the vector drivers for the legal creation options of each format.

Note that layer creation options are different from dataset creation options.

--oo, --open-option <KEY>=<VALUE>
Dataset open option (format specific).

May be repeated.

-f, --of, --format, --output-format <OUTPUT-FORMAT>
Which output vector format to use. Allowed values can be listed with: gdal --formats | grep vector | grep rw | sort
Added in version 3.12.

--output-oo, --output-open-option <KEY>=<VALUE>
Dataset open option for output dataset (format specific).

May be repeated.

--overwrite
Allow the program to overwrite an existing target file or dataset. By default, gdal errors out if the target file or dataset already exists.

--overwrite-layer
Whether overwriting the existing output vector layer is allowed.
Added in version 3.12.

--skip-errors
Whether failures to write feature(s) should be ignored. Note that this option sets the size of the transaction unit to one feature at a time, which may cause severe slowdown when inserting into databases.

--update
Whether to open an existing output dataset in update mode.
Added in version 3.12.

--upsert
Variant of --append where the OGRLayer::UpsertFeature() operation is used to insert or update features instead of appending with OGRLayer::CreateFeature().

This is currently implemented only in a few drivers: GPKG (GeoPackage vector), Elasticsearch and MongoDBv3. Drivers that implement upsert expose the GDAL_DCAP_UPSERT capability.

The upsert operation uses the FID of the input feature, when it is set (and the FID column name is not the empty string), as the key to update existing features. It is crucial to make sure that the FID in the source and target layers are consistent.

For the GPKG driver, it is also possible to upsert features whose FID is unset or non-significant (the --unset-fid option of gdal vector edit can be used to ignore the FID from the source feature), when there is a UNIQUE column that is not the integer primary key.
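
The role of the FID as the upsert key can be pictured with a minimal model in which a layer is a dictionary keyed by FID. This is only a sketch of the FID-keyed behavior described above. The function name and the data are hypothetical, and the GPKG UNIQUE-column case is not modelled.

```python
def upsert(target, feature):
    """Insert or update `feature` in `target`, keyed by the feature's FID."""
    fid = feature["fid"]
    if fid is None:
        raise ValueError("this sketch only models upsert keyed by FID")
    created = fid not in target
    target[fid] = dict(feature["fields"])
    return "inserted" if created else "updated"


# Target layer modelled as {FID: attributes}.
target = {1: {"country": "France"}, 2: {"country": "USA"}}

# FID 3 does not exist yet, so the feature is added.
print(upsert(target, {"fid": 3, "fields": {"country": "Canada"}}))
# FID 2 already exists, so the existing feature is replaced.
print(upsert(target, {"fid": 2, "fields": {"country": "Mexico"}}))
print(target)
```

The second call replaces the USA feature, which is why inconsistent FIDs between the source and target layers can silently overwrite unrelated features.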

Dan Baston <dbaston@gmail.com>

1998-2026

June 5, 2026