Search Index JSON Properties

Use a JSON payload to control the settings for a Search index.

You can choose to create a Search index with a JSON payload, and import it through the Capella UI. This JSON payload sets all the settings for your new Search index.

Your JSON payload must contain the properties described in Initial Settings, including the Params Object.

Search index aliases only need to include the properties described in Initial Settings.

Initial Settings

The start of the JSON payload for a Search index contains important settings for your index:

{
    "name": "gfx",
    "type": "fulltext-index",
    "uuid": "28b999e9e17dd4a7",
    "sourceType": "gocbcore",
    "sourceName": "travel-sample",
    "sourceUUID": "d3604c0ec4792b4c6c5f7f2cc8207b80",
    "sourceParams": {},
    "planParams": {
        "maxPartitionsPerPIndex": 1024,
        "indexPartitions": 1,
        "numReplicas": 0
    },
    "params": {

All Search index payloads have the following properties:

Property	Type	Required?	Description
name	String	Yes	The name of the Search index. A Search index name must be unique for each cluster.
type	String	Yes	The type of index you want to create: `fulltext-index`: Create a Search index. `fulltext-alias`: Create an alias for a Search index. For more information about Search index aliases, see Create Search Index Aliases.
uuid	String	No	The UUID for the Search index. The Search Service automatically generates a UUID for a Search index. If you use an existing UUID, the Search Service updates the existing Search index. Do not include the `uuid` property when you want to copy an index to a different cluster or create a new index. View the UUID for an existing index from the Capella UI by selecting an existing index and clicking Index Definition. The UUID displays in the Index Definition on the Update Index page.
sourceType	String	Yes	The `sourceType` is always `"gocbcore"` for a Search index. For a Search index alias, it’s `"nil"`.
sourceName	String	Yes	The name of the bucket where you want to create the Search index. Do not include a `sourceName` for a Search index alias.
sourceUUID	String	No	The UUID of the bucket where you want to create the Search index. The Search Service automatically finds the UUID for the bucket. Do not include the `sourceUUID` property when you want to copy an index to a different cluster, or create a new index.
sourceParams	Object	No	This object contains advanced settings for index behavior. Do not add content into this object unless instructed by Couchbase Support.
planParams	Object	Yes	An object that sets the Search index’s partitions and replications. For more information, see planParams Object. Do not include any of the properties in a `planParams` object for a Search index alias.
params	Object	Yes	An object that sets the Search index’s type identifier, type mappings, and analyzers. For more information, see Params Object. For a Search index alias, this object contains the targets object.
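
The guidance for `uuid` and `sourceUUID` can be applied with a small helper before you import a payload into a different cluster. The following is a minimal sketch, not an official tool: the function name `prepare_for_copy` is hypothetical, and the sample values come from the example payload in this section.

```python
import copy


def prepare_for_copy(index_definition: dict) -> dict:
    """Return a copy of the payload without the cluster-specific UUID properties."""
    cleaned = copy.deepcopy(index_definition)
    # The table above says to leave out both properties when copying an
    # index to a different cluster or creating a new index.
    for key in ("uuid", "sourceUUID"):
        cleaned.pop(key, None)
    return cleaned


exported = {
    "name": "gfx",
    "type": "fulltext-index",
    "uuid": "28b999e9e17dd4a7",
    "sourceType": "gocbcore",
    "sourceName": "travel-sample",
    "sourceUUID": "d3604c0ec4792b4c6c5f7f2cc8207b80",
    "sourceParams": {},
    "planParams": {"indexPartitions": 1, "numReplicas": 0},
    "params": {},
}

portable = prepare_for_copy(exported)
print(sorted(portable))
```

The helper works on a deep copy, so the exported definition keeps its UUIDs if you still need them to update the original index.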

planParams Object

Do not include any of the properties in a planParams object for a Search index alias.

The planParams object sets a Search index’s partition and replication settings:

    "planParams": {
        "maxPartitionsPerPIndex": 1024,
        "indexPartitions": 1,
        "numReplicas": 0
    },

The planParams object contains the following properties:

Property	Type	Required?	Description
maxPartitionsPerPIndex	n/a	No	This setting is deprecated. Use `indexPartitions` instead.
indexPartitions	Number	Yes	The number of partitions to split the Search index into, across the nodes you have available in your cluster with the Search Service enabled. Use index partitions to increase index and query performance on large datasets. The scoring calculation for regular Search queries can be affected by the number of partitions in your Search index, and how the Search Service distributes documents across partitions. This is a limitation of the tf-idf weighting scheme.
numReplicas	Number	Yes	For high availability, set the number of replicas the Search Service creates for the Search index. You can create up to three replicas for a Search index. Each replica is a full copy of the Search index. To turn off replication for the Search index, set `numReplicas` to `0`. The number of replicas you can create depends on the number of nodes you have available with the Search Service enabled.
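
A quick pre-import check can catch `planParams` values that fall outside the limits described above. This is a sketch under stated assumptions: the function name `check_plan_params` is hypothetical, and treating `indexPartitions` as an integer of at least 1 is an assumption, since this page only states the three-replica limit explicitly.

```python
def check_plan_params(plan_params: dict) -> list:
    """Return a list of human-readable problems found in a planParams object."""
    problems = []
    partitions = plan_params.get("indexPartitions")
    replicas = plan_params.get("numReplicas")

    # Assumption: a Search index needs at least one partition.
    if isinstance(partitions, bool) or not isinstance(partitions, int) or partitions < 1:
        problems.append("indexPartitions must be a positive integer")

    # Stated above: 0 turns replication off, and 3 is the maximum.
    if isinstance(replicas, bool) or not isinstance(replicas, int) or not 0 <= replicas <= 3:
        problems.append("numReplicas must be an integer from 0 to 3")

    if "maxPartitionsPerPIndex" in plan_params:
        problems.append("maxPartitionsPerPIndex is deprecated; use indexPartitions")

    return problems


print(check_plan_params({"indexPartitions": 1, "numReplicas": 0}))
print(check_plan_params({"indexPartitions": 1, "numReplicas": 4}))
```

The check does not know how many Search nodes the cluster has, so a value that passes here can still be too high for a small cluster.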

Params Object

The params object sets a Search index’s type identifier, type mappings, and analyzers.

For a Search index alias, it includes a JSON object for each Search index to include in the alias.

It contains the following properties:

Property	Type	Required?	Description
doc_config	Object	Yes	An object that sets how the Search index sets a document’s type. For more information, see Doc_config Object.
mapping	Object	Yes	An object that sets the analyzers and type mappings for a Search index. For more information, see Mapping Object.
targets	Object	Index Alias Only	An object that contains JSON objects for each Search index to add to the alias definition. The key for each JSON object must be the fully qualified name of each Search index. For example: "targets": { "vector-sample.color.color-index": {} }
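
Putting the alias rules from this page together, an alias payload could be assembled as follows. Treat this as an illustrative sketch: the helper name `build_alias_payload` and the alias name `color-alias` are hypothetical, and the empty `planParams` object is one reading of the instruction not to include any `planParams` properties for an alias.

```python
import json


def build_alias_payload(alias_name: str, target_indexes: list) -> dict:
    """Assemble a Search index alias payload from fully qualified index names."""
    return {
        "name": alias_name,
        "type": "fulltext-alias",
        # For an alias, sourceType is "nil" and sourceName is left out.
        "sourceType": "nil",
        # Assumption: an empty object satisfies "no planParams properties".
        "planParams": {},
        "params": {
            # One empty JSON object per target, keyed by fully qualified name.
            "targets": {name: {} for name in target_indexes}
        },
    }


payload = build_alias_payload("color-alias", ["vector-sample.color.color-index"])
print(json.dumps(payload, indent=4))
```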

Doc_config Object

The doc_config object sets how the Search index sets a document’s type:

        "doc_config": {
            "docid_prefix_delim": "",
            "docid_regexp": "",
            "mode": "scope.collection.type_field",
            "type_field": "type"
        },

The doc_config object is a child object of the Params Object. It contains the following properties:

Property	Type	Required?	Description
mode	String	Yes	Set a type identifier for the Search index to filter documents from search results: `type_field`: Use the value from a specific field in the documents. `docid_prefix_delim`: Use the leading characters in the documents' ID values, up to but not including a specified separator. `docid_regexp`: Use a regular expression on the documents' ID values. If you want your Search index to only include documents from a specific collection, the `mode` value must be `"scope.collection.{mode}"`.
docid_prefix_delim	String	Yes	If `mode` is `docid_prefix_delim`, set the separator character to use on a document’s ID value. For example, to filter documents based on the characters before a `_` in their ID values, set `docid_prefix_delim` to `_`.
docid_regexp	String	Yes	If `mode` is `docid_regexp`, set the regular expression to use on a document’s ID value to determine its type. For example, to filter documents that contain the characters `_30`, `_40`, or `_50` in their ID value, set `docid_regexp` to `_[3-5]0`.
type_field	String	Yes	If `mode` is `type_field`, set the name of the field to use to filter documents. For example, to filter documents based on the value of their `type` field, set `type_field` to `type`.
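
To make the three `mode` values concrete, the following sketch derives a type identifier from a document the way the descriptions above suggest. It is an approximation, not the Search Service's implementation: the function name `document_type` is hypothetical, Python's `re` module stands in for the service's regular expression engine, and using the first regular expression match as the type is an assumption.

```python
import re
from typing import Optional


def document_type(doc_id: str, body: dict, doc_config: dict) -> Optional[str]:
    """Approximate how a doc_config object assigns a type to one document."""
    mode = doc_config["mode"]
    # Collection-aware indexes prefix the mode with "scope.collection.".
    prefix = "scope.collection."
    if mode.startswith(prefix):
        mode = mode[len(prefix):]

    if mode == "type_field":
        value = body.get(doc_config["type_field"])
        return value if isinstance(value, str) else None
    if mode == "docid_prefix_delim":
        # Leading characters of the ID, up to but not including the separator.
        head, sep, _ = doc_id.partition(doc_config["docid_prefix_delim"])
        return head if sep else None
    if mode == "docid_regexp":
        # Assumption: the first match in the ID becomes the type.
        match = re.search(doc_config["docid_regexp"], doc_id)
        return match.group(0) if match else None
    raise ValueError(f"unknown mode: {mode}")


print(document_type("hotel_40", {"type": "hotel"},
                    {"mode": "scope.collection.type_field", "type_field": "type"}))
print(document_type("hotel_40", {},
                    {"mode": "docid_prefix_delim", "docid_prefix_delim": "_"}))
print(document_type("hotel_40", {},
                    {"mode": "docid_regexp", "docid_regexp": "_[3-5]0"}))
```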

Mapping Object

The mapping object contains a Search index’s analyzers and other global index settings. Some of these settings map to Global Index Settings in the UI:

        "mapping": {
            "analysis": {
                "analyzers": {
                    "My_Analyzer": {
                        "token_filters": [
                            "apostrophe",
                            "My_Token_Filter"
                        ],
                        "char_filters": [
                            "asciifolding",
                            "html",
                            "My_Char_Filter"
                        ],
                        "type": "custom",
                        "tokenizer": "My_Tokenizer_Excep"
                    }
                },
                "char_filters": {
                    "My_Char_Filter": {
                        "regexp": "[']",
                        "replace": " ",
                        "type": "regexp"
                    }
                },
                "tokenizers": {
                    "My_Tokenizer_Excep": {
                        "exceptions": [
                            "[*]"
                        ],
                        "tokenizer": "unicode",
                        "type": "exception"
                    },
                    "My_Tokenizer_RegExp": {
                        "regexp": "[*]",
                        "type": "regexp"
                    }
                },
                "token_filters": {

The mapping object is a child object of the Params Object. It contains the following properties:

Property	Type	Required?	Description
analysis	Object	Yes	An object that contains the following child objects: Analyzers Object Char_filters Object Tokenizers Object Token_filters Object Token_maps Object Date_time_parsers Object
default_analyzer	String	Yes	The name of the default analyzer to use for the Search index. For more information about analyzers, see Analyzers.
default_datetime_parser	String	Yes	The name of the default date/time parser to use for the Search index. For more information about date/time parsers, see Date/Time Parsers.
default_field	String	Yes	Set a name for the `all` field in the Search index. If you enable the include_in_all property for a document field, the contents of that document field can be searched without specifying a field name or by specifying the default field’s name in your Search query.
default_mapping	Object	No	An object that contains settings for the default type mapping on the Search index. The default type mapping contains all documents under the `_default` scope and `_default` collection in the bucket. This type mapping is included for compatibility only. For more information about the properties inside the `default_mapping` object, see Default_mapping Object.
default_type	String	No	This setting is included for compatibility with earlier indexes only.
docvalues_dynamic	Boolean	Yes	To include the value for each instance of an indexed field in the Search index to support facets and sorting search results, set `docvalues_dynamic` to `true`. To exclude the values for an indexed field in the index, set `docvalues_dynamic` to `false`.
index_dynamic	Boolean	Yes	To index any fields in the Search index where `dynamic` is `true`, set `index_dynamic` to `true`. To exclude dynamic fields from the index, set `index_dynamic` to `false`.
store_dynamic	Boolean	Yes	To return the content from an indexed field in the Search index, set `store_dynamic` to `true`. To exclude field content from the index, set `store_dynamic` to `false`.
type_field	String	No	Use the same value assigned to the `type_field` in `doc_config`, if applicable.
types	Object	No	An object that contains any user-defined type mappings for the Search index, as `{scope}.{collection}` objects inside a `types` object. For more information, see Types Object.

Analyzers Object

The analyzers object contains any custom analyzers defined for a Search index.

                "analyzers": {
                    "My_Analyzer": {
                        "token_filters": [
                            "apostrophe",
                            "My_Token_Filter"
                        ],
                        "char_filters": [
                            "asciifolding",
                            "html",
                            "My_Char_Filter"
                        ],
                        "type": "custom",
                        "tokenizer": "My_Tokenizer_Excep"
                    }
                },

The analyzers object is a child object of the analysis object. It contains any number of {analyzer_name} objects:

Property	Type	Required?	Description
{analyzer_name}	Object	Yes	Set the name of this object to the name you want for your custom analyzer. You can reference the `{analyzer_name}` object elsewhere in your Search index definition to use the analyzer. For more information about the properties in an `{analyzer_name}` object, see {Analyzer_name} Object.

{Analyzer_name} Object

The {analyzer_name} object defines a custom analyzer for a Search index:

                    "My_Analyzer": {
                        "token_filters": [
                            "apostrophe",
                            "My_Token_Filter"
                        ],
                        "char_filters": [
                            "asciifolding",
                            "html",
                            "My_Char_Filter"
                        ],
                        "type": "custom",
                        "tokenizer": "My_Tokenizer_Excep"
                    }

An {analyzer_name} object is a child object of the Analyzers Object. It contains the following properties:

Property	Type	Required?	Description
token_filters	Array	Yes	An array of strings that contains the token filters for the custom analyzer. For more information about the token filters you can define in a Search index JSON payload, see Token_filters Object. You can also use one of the default token filters.
char_filters	Array	Yes	An array of strings that contains the character filters for the custom analyzer. For more information about the character filters you can define in a Search index JSON payload, see Char_filters Object. You can also use one of the default character filters available.
type	String	Yes	The `type` is always `"custom"`.
tokenizer	String	Yes	The selected tokenizer for the custom analyzer.
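
Because a custom analyzer can mix custom components with default ones, it can help to list which names are not defined in the same `analysis` object and therefore have to be defaults. The sketch below assumes the structure shown in the examples on this page; the function name `non_custom_references` is hypothetical, and it does not check whether a name is actually a valid default.

```python
def non_custom_references(analysis: dict, analyzer_name: str) -> dict:
    """List the names an analyzer uses that the analysis object does not define."""
    analyzer = analysis["analyzers"][analyzer_name]
    result = {}
    for prop in ("token_filters", "char_filters"):
        defined = analysis.get(prop, {})
        result[prop] = [name for name in analyzer.get(prop, []) if name not in defined]
    tokenizer = analyzer.get("tokenizer")
    defined_tokenizers = analysis.get("tokenizers", {})
    result["tokenizer"] = (
        [tokenizer] if tokenizer is not None and tokenizer not in defined_tokenizers else []
    )
    return result


analysis = {
    "analyzers": {
        "My_Analyzer": {
            "token_filters": ["apostrophe", "My_Token_Filter"],
            "char_filters": ["asciifolding", "html", "My_Char_Filter"],
            "type": "custom",
            "tokenizer": "My_Tokenizer_Excep",
        }
    },
    "char_filters": {"My_Char_Filter": {"regexp": "[']", "replace": " ", "type": "regexp"}},
    "tokenizers": {
        "My_Tokenizer_Excep": {"exceptions": ["[*]"], "tokenizer": "unicode", "type": "exception"}
    },
    "token_filters": {"My_Token_Filter": {"min": 3, "max": 255, "type": "length"}},
}

print(non_custom_references(analysis, "My_Analyzer"))
```

Any name in the result that is not a default component is likely a typo or a missing definition.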

Char_filters Object

The char_filters object contains any custom character filters defined for a Search index:

                "char_filters": {
                    "My_Char_Filter": {
                        "regexp": "[']",
                        "replace": " ",
                        "type": "regexp"
                    }
                },

The char_filters object is a child object of the analysis object. It contains any number of {char_filter_name} objects:

Property	Type	Required?	Description
{char_filter_name}	Object	Yes	Set the name of this object to the name you want for your custom character filter. You can reference the `{char_filter_name}` object elsewhere in your Search index definition to use the character filter. For more information about the properties in a `{char_filter_name}` object, see {Char_filter_name} Object.

{Char_filter_name} Object

The {char_filter_name} object defines a specific custom character filter for a Search index:

                    "My_Char_Filter": {
                        "regexp": "[']",
                        "replace": " ",
                        "type": "regexp"
                    }

A {char_filter_name} object is a child object of the Char_filters Object. It contains the following properties:

Property	Type	Required?	Description
regexp	String	Yes	The regular expression to use to filter characters from search queries and documents.
replace	String	No	The content to insert instead of the content in the `regexp` property.
type	String	Yes	The `type` is always `regexp`.
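
The effect of a `regexp` character filter can be previewed locally with a substitution. This is only an approximation: the function name `apply_char_filter` is hypothetical, Python's `re` module stands in for the Search Service's regular expression engine (whose syntax may differ), and replacing with an empty string when `replace` is missing is an assumption.

```python
import re


def apply_char_filter(text: str, char_filter: dict) -> str:
    """Preview a regexp character filter by substituting every match."""
    if char_filter["type"] != "regexp":
        raise ValueError("only regexp character filters are supported in this sketch")
    # Assumption: an absent "replace" property removes the matched content.
    replacement = char_filter.get("replace", "")
    return re.sub(char_filter["regexp"], replacement, text)


my_char_filter = {"regexp": "[']", "replace": " ", "type": "regexp"}
print(apply_char_filter("O'Reilly's", my_char_filter))
```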

Tokenizers Object

The tokenizers object contains any custom tokenizers defined for a Search index:

                "tokenizers": {
                    "My_Tokenizer_Excep": {
                        "exceptions": [
                            "[*]"
                        ],
                        "tokenizer": "unicode",
                        "type": "exception"
                    },
                    "My_Tokenizer_RegExp": {
                        "regexp": "[*]",
                        "type": "regexp"
                    }
                },

The tokenizers object is a child object of the analysis object. It contains any number of {tokenizer_name} objects:

Property	Type	Required?	Description
{tokenizer_name}	Object	Yes	Set the name of this object to the name you want for your custom tokenizer. You can reference the `{tokenizer_name}` object elsewhere in your Search index definition to use the tokenizer. For more information about the properties in a `{tokenizer_name}` object, see {Tokenizer_name} Object.

{Tokenizer_name} Object

The {tokenizer_name} object defines a specific custom tokenizer for a Search index. For example, the following My_Tokenizer_Excep object defines an exception tokenizer:

                    "My_Tokenizer_Excep": {
                        "exceptions": [
                            "[*]"
                        ],
                        "tokenizer": "unicode",
                        "type": "exception"
                    },

A {tokenizer_name} object is a child object of the Tokenizers Object. It contains the following properties:

Property	Type	Required?	Description
exceptions	Array	Yes	If the tokenizer’s `type` value is `exception`, define an array of regular expressions to remove from text input to create tokens. For example, if you add the characters `sh` as a string to the `exceptions` array, an input string of `shTimeshToshGo` has the tokens `Time`, `To`, and `Go`.
regexp	String	Yes	If the tokenizer’s `type` value is `regexp`, set the regular expression that the tokenizer uses to divide input into tokens. The tokenizer takes any matches for the regular expression from the input text stream and uses them as tokens. For example, if you use the regular expression `\w*\w`, an input string of `Full Text Search` has the tokens `Full`, `Text`, and `Search`.
tokenizer	String	Yes	If the tokenizer’s `type` value is `exception`, give a default tokenizer to apply to the tokens created with the `exceptions` array. You can choose a default tokenizer or use a tokenizer defined in the `tokenizers` object.
type	String	Yes	The tokenizer’s type. Can be one of: `regexp`: The tokenizer uses a regular expression to create tokens. The tokenizer uses any matches to the regular expression as individual tokens. `exception`: The tokenizer uses an array of regular expressions to remove content and create tokens. The tokenizer uses any matches to the regular expressions and creates tokens from the surrounding text.
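
The two tokenizer types can be approximated in a few lines to check the examples in the table. This sketch follows the descriptions on this page rather than the Search Service's actual implementation: both function names are hypothetical, Python's `re` module stands in for the service's regular expression engine, the pattern `\w+` is a rough stand-in for the default tokenizer applied after the exceptions, and the exception patterns are assumed to contain no capturing groups.

```python
import re


def regexp_tokenize(text: str, pattern: str) -> list:
    """Use every match of the pattern as a token."""
    return [match.group(0) for match in re.finditer(pattern, text)]


def exception_tokenize(text: str, exceptions: list, base_pattern: str = r"\w+") -> list:
    """Remove matches of the exception patterns, then tokenize the remaining text."""
    splitter = "|".join(f"(?:{pattern})" for pattern in exceptions)
    pieces = [piece for piece in re.split(splitter, text) if piece]
    tokens = []
    for piece in pieces:
        # Assumption: base_pattern approximates the tokenizer named in "tokenizer".
        tokens.extend(regexp_tokenize(piece, base_pattern))
    return tokens


print(regexp_tokenize("Full Text Search", r"\w*\w"))
print(exception_tokenize("shTimeshToshGo", ["sh"]))
```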

Token_filters Object

The token_filters object contains any custom token filters defined for a Search index.

                "token_filters": {
                    "My_Token_Filter": {
                        "min": 3,
                        "max": 255,
                        "type": "length"
                    }
                },

The token_filters object is a child object of the analysis object. It contains any number of {token_filter_name} objects:

Property	Type	Required?	Description
{token_filter_name}	Object	Yes	Set the name of this object to the name you want for your custom token filter. You can reference the `{token_filter_name}` object elsewhere in your Search index definition to use the token filter. For more information about the properties in a `{token_filter_name}` object, see {Token_filter_name} Object.

{Token_filter_name} Object

The {token_filter_name} object defines a custom token filter for a Search index. For example, the following My_Token_Filter object defines a custom length token filter:

                    "My_Token_Filter": {
                        "min": 3,
                        "max": 255,
                        "type": "length"
                    }

A {token_filter_name} object is a child object of the Token_filters Object. It contains the following properties:

Property	Type	Required?	Description
type	String	Yes	The token filter’s type. Can be one of: `dict_compound`: Use a wordlist to find and create tokens from compound words in existing tokens. See Dict_compound Token Filters. `edge_ngram`: Use a set character length to create tokens from the start or end of existing tokens. See Edge_ngram Token Filters. `elision`: Use a wordlist to remove elisions from input tokens. See Elision Token Filters. `keyword_marker`: Use a wordlist of keywords to find and create new tokens. See Keyword_marker Token Filters. `length`: Use a set character length to filter tokens that are too long or too short. See Length Token Filters. `ngram`: Use a set character length to create new tokens. See Ngram Token Filters. `normalize_unicode`: Use Unicode Normalization to convert tokens. See Normalize_unicode Token Filters. `shingle`: Use a set character length and separator to concatenate and create new tokens. See Shingle Token Filters. `stop_tokens`: Use a wordlist to find and remove words from tokens. See Stop_tokens Token Filters. `truncate_token`: Use a set character length to truncate existing tokens. See Truncate_token Token Filters.
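
As a concrete case, the `length` type used by the `My_Token_Filter` example keeps only tokens whose length falls within a range. The sketch below shows the idea; the function name `length_filter` is hypothetical, and treating both bounds as inclusive character counts is an assumption.

```python
def length_filter(tokens: list, minimum: int, maximum: int) -> list:
    """Drop tokens that are shorter than minimum or longer than maximum."""
    # Assumption: both bounds are inclusive and count characters.
    return [token for token in tokens if minimum <= len(token) <= maximum]


# Same bounds as the My_Token_Filter example: "min": 3, "max": 255.
print(length_filter(["a", "ab", "abc", "beer"], 3, 255))
```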

Dict_compound Token Filters

A dict_compound token filter uses a wordlist to find subwords inside an input token. If the token filter finds a subword inside a compound word, it turns it into a separate token.

      "My_Dict_Compound_Filter": {
        "dict_token_map": "articles_ca",
        "type": "dict_compound"
      },

For example, if you had a wordlist that contained play and jump, the token filter converts playful jumping into two tokens: play and jump.

Property	Type	Required?	Description
dict_token_map	String	Yes	The wordlist to use to find subwords in existing tokens. You can use a default wordlist or one defined in the Token_maps Object.

Edge_ngram Token Filters

An edge_ngram token filter uses a specified range to create new tokens. You can also choose whether to create the new token from the start or backward from the end of the input token.

      "My_Edge_ngram_Filter": {
        "back": false,
        "min": 4,
        "max": 5,
        "type": "edge_ngram"
      },

For example, if you had a minimum of four and a maximum of five with an input token of breweries, the token filter creates the tokens brew and brewe.

Property	Type	Required?	Description
back	Boolean	Yes	To create new tokens starting from the end and moving backward in an input token, set `back` to `true`. To create new tokens starting from the beginning and moving forward in an input token, set `back` to `false`.
min	Integer	Yes	Set the minimum character length for a new token.
max	Integer	Yes	Set the maximum character length for a new token.

Elision Token Filters

An elision token filter removes elisions from input tokens.

      "My_Elision_Filter": {
        "articles_token_map": "stop_fr",
        "type": "elision"
      },

For example, if you had the stop_fr wordlist in an elision token filter, the token je m’appelle John becomes the tokens je, appelle, and John.
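
As a rough illustration of elision removal, the following Python sketch strips a leading article and its apostrophe from each token. It is a simplified approximation, not the Search Service's implementation: the function name `strip_elision` and the small `articles` set are hypothetical stand-ins for the wordlist named in `articles_token_map`.

```python
# Both the ASCII apostrophe and the typographic apostrophe can mark an elision.
APOSTROPHES = ("'", "\u2019")


def strip_elision(token, articles):
    """Remove a leading elided article (for example m') from a token."""
    for mark in APOSTROPHES:
        prefix, sep, rest = token.partition(mark)
        # Only strip when the text before the apostrophe is a known article.
        if sep and rest and prefix.lower() in articles:
            return rest
    return token


articles = {"j", "l", "m"}  # hypothetical wordlist contents
tokens = ["je", "m\u2019appelle", "John"]
print([strip_elision(t, articles) for t in tokens])  # ['je', 'appelle', 'John']
```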

Property	Type	Required?	Description
articles_token_map	String	Yes	The wordlist to use to find and remove elisions in existing tokens. You can use a default wordlist or one defined in the Token_maps Object.

Keyword_marker Token Filters

A keyword_marker token filter finds keywords in an input token and turns them into tokens.

      "My_Keyword_Marker_Filter": {
        "keywords_token_map": "articles_ca",
        "type": "keyword_marker"
      },

For example, if you had a wordlist that contained the keyword beer, the token beer and breweries becomes the token beer.

Property	Type	Required?	Description
keywords_token_map	String	Yes	The wordlist to use to find keywords in existing tokens. You can use a default wordlist or one defined in the Token_maps Object.

Length Token Filters

A length token filter removes tokens that are shorter or longer than a set character length.

      "My_Length_Filter": {
       "min": 2,
       "max": 4,
       "type": "length"
      },

For example, if you had a range with a minimum of two and a maximum of four, the input beer and breweries produces the tokens beer and and. The token filter removes breweries because it is longer than four characters.
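
A minimal Python sketch of the same idea, assuming the input has already been split into tokens. The function name `length_filter` is hypothetical and the code only illustrates the filtering rule; it is not the Search Service's implementation.

```python
def length_filter(tokens, min_len, max_len):
    """Keep only the tokens whose character length is within the range."""
    return [t for t in tokens if min_len <= len(t) <= max_len]


print(length_filter(["beer", "and", "breweries"], 2, 4))  # ['beer', 'and']
```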

Property	Type	Required?	Description
min	Integer	Yes	The minimum character length for a new token from the token filter.
max	Integer	Yes	The maximum character length for a new token from the token filter.

Ngram Token Filters

An ngram token filter uses a specified character length to split an input token into new tokens.

      "My_Ngram_Filter": {
        "min": 4,
        "max": 5,
        "type": "ngram"
      },

For example, if you had a range with a minimum of four and a maximum of five, the token beers becomes the tokens beer, beers, and eers.
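
The following Python sketch reproduces the example above by sliding a window across the token. It is an illustration under the assumption that n-grams are emitted for every start position and every length in the range; the function name `ngrams` is hypothetical.

```python
def ngrams(token, min_len, max_len):
    """Return every substring of token with a length from min_len to max_len."""
    grams = []
    for start in range(len(token)):
        for size in range(min_len, max_len + 1):
            # Skip windows that would run past the end of the token.
            if start + size <= len(token):
                grams.append(token[start:start + size])
    return grams


print(ngrams("beers", 4, 5))  # ['beer', 'beers', 'eers']
```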

Property	Type	Required?	Description
min	Integer	Yes	The minimum character length for a new token from the token filter.
max	Integer	Yes	The maximum character length for a new token from the token filter.

Normalize_unicode Token Filters

A normalize_unicode token filter uses a specified Unicode Normalization form to create new tokens.

      "My_Normalize_Unicode_Filter": {
        "form": "nfkd",
        "type": "normalize_unicode"
      },

Property	Type	Required?	Description
form	String	Yes	Select the form of Unicode Normalization to use on input tokens: `nfc`: Use canonical decomposition and canonical composition to normalize characters. The token filter separates combined Unicode characters, then merges them into a single character. `nfd`: Use canonical decomposition to normalize characters. The token filter separates combined Unicode characters. `nfkc`: Use compatibility decomposition and canonical composition to normalize characters. The token filter removes variants and separates combined Unicode characters, then merges them into a single character. `nfkd`: Use compatibility decomposition to normalize characters. The token filter converts Unicode characters to remove variants and separates combined Unicode characters. For more information about Unicode Normalization, see the Unicode Consortium’s Unicode Normalization Forms report.
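
To see how the four normalization forms differ, you can experiment with Python's standard `unicodedata` module, which implements the same Unicode Normalization Forms. This sketch only demonstrates the forms themselves; the helper name `normalize_token` is hypothetical and the code is not part of the Search Service.

```python
import unicodedata


def normalize_token(token, form):
    """Apply a Unicode Normalization form such as nfc, nfd, nfkc, or nfkd."""
    return unicodedata.normalize(form.upper(), token)


composed = "\u00e9"  # e with acute accent as a single code point
ligature = "\ufb01"  # the fi ligature, a compatibility variant of "fi"

# Canonical forms: nfd splits the accent off, nfc merges it back.
print(len(normalize_token(composed, "nfd")))  # 2
print(len(normalize_token("e\u0301", "nfc")))  # 1

# Canonical forms leave the ligature alone; compatibility forms replace it.
print(normalize_token(ligature, "nfc") == ligature)  # True
print(normalize_token(ligature, "nfkc"))             # fi
```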

Shingle Token Filters

A shingle token filter uses a specified character length and separator to create new tokens.

      "My_Shingle_Filter":{
        "min": 2,
        "max": 3,
        "output_original": true,
        "separator": " ",
        "filler": "x",
        "type": "shingle"
      },

For example, if you use a whitespace tokenizer, a range with a minimum of two and a maximum of three, a space as a separator, and output_original set to true, the input abc def becomes the tokens abc, def, and abc def.
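
The following Python sketch reproduces the example above. It assumes that `min` and `max` bound how many neighboring tokens are joined into each new token, which is how shingle filters commonly work and which matches the example output; treat that reading as an assumption and verify it against your own index. The function name `shingles` is hypothetical, and the sketch ignores the `filler` property.

```python
def shingles(tokens, min_size, max_size, output_original=True, separator=" "):
    """Join runs of neighboring tokens into new tokens (shingles)."""
    # Optionally keep the original tokens in the output.
    out = list(tokens) if output_original else []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            # Skip runs that would extend past the last token.
            if start + size <= len(tokens):
                out.append(separator.join(tokens[start:start + size]))
    return out


print(shingles(["abc", "def"], 2, 3))  # ['abc', 'def', 'abc def']
```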

Property	Type	Required?	Description
min	Integer	Yes	The minimum character length for a new token before concatenation.
max	Integer	Yes	The maximum character length for a new token before concatenation.
output_original	Boolean	Yes	To add the original token to the token filter’s output, set `output_original` to `true`. To exclude the original token from the token filter’s output, set `output_original` to `false`.
separator	String	No	Set a `separator` to include a character or characters in between concatenated tokens.
filler	String	No	If another token filter removes a token from the input for this token filter, set a `filler` to replace the removed token.

Stop_tokens Token Filters

A stop_tokens token filter uses a wordlist to remove specific tokens from input.

      "My_Stop_Tokens_Filter":{
        "stop_token_map": "articles_ca",
        "type": "stop_tokens"
      },

For example, if you have a wordlist that contains the word and, the input beers and breweries becomes the tokens beers and breweries. The token filter removes the token and.
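
A minimal Python sketch of stop token removal, assuming the input has already been split into tokens. The function name `remove_stop_tokens` is hypothetical; the set passed as `stop_words` stands in for the wordlist named in `stop_token_map`.

```python
def remove_stop_tokens(tokens, stop_words):
    """Drop every token that exactly matches an entry in the wordlist."""
    return [t for t in tokens if t not in stop_words]


print(remove_stop_tokens(["beers", "and", "breweries"], {"and"}))
# ['beers', 'breweries']
```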

Property	Type	Required?	Description
stop_token_map	String	Yes	The wordlist to use to filter tokens. The token filter removes any tokens from input that match an entry in the wordlist. You can use a default wordlist or one defined in the Token_maps Object.

Truncate_token Token Filters

A truncate_token token filter uses a specified character length to shorten any input tokens that are too long.

      "My_Truncate_Token_Filter":{
        "length": 4,
        "type": "truncate_token"
      }

For example, if you had a length of four, the input beer and breweries becomes the tokens beer, and, and brew.
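
As a quick illustration, truncation keeps at most `length` characters from the start of each token and leaves shorter tokens unchanged. The function name `truncate_tokens` in this Python sketch is hypothetical, and the code is not the Search Service's implementation.

```python
def truncate_tokens(tokens, length):
    """Shorten each token to at most length characters."""
    return [t[:length] for t in tokens]


print(truncate_tokens(["beer", "and", "breweries"], 4))  # ['beer', 'and', 'brew']
```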

Property	Type	Required?	Description
length	Integer	Yes	The maximum character length for an output token.

Token_maps Object

The token_maps object contains any custom wordlists defined for a Search index:

                "token_maps": {
                    "My_Wordlist": {
                        "type": "custom",
                        "tokens": [
                            "the",
                            "is",
                            "and"
                        ]
                    }
                },

To view the entire JSON payload, click View.

The token_maps object is a child object of the analysis object. It contains any number of {wordlist_name} objects:

Property	Type	Required?	Description
{wordlist_name}	Object	Yes	Set the name of this object to the name you want for your custom wordlist. You can reference the `{wordlist_name}` object elsewhere in your Search index definition to use the wordlist. For more information about the properties in a `{wordlist_name}` object, see {Wordlist_name} Object.

{Wordlist_name} Object

The {wordlist_name} object defines a custom wordlist for a Search index:

                    "My_Wordlist": {
                        "type": "custom",
                        "tokens": [
                            "the",
                            "is",
                            "and"
                        ]
                    }

A {wordlist_name} object is a child object of the Token_maps Object. It contains the following properties:

Property	Type	Required?	Description
type	String	Yes	The `type` is always `"custom"`.
tokens	Array	Yes	An array of strings that contains each word added to the wordlist.
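
If you generate index definitions from a script, a small check can catch a token filter that points at a wordlist you forgot to define. The following Python sketch builds a custom wordlist and a stop_tokens filter that references it. It assumes the token filters sit in a `token_filters` object next to `token_maps` inside the analysis object; the helper names `custom_wordlist` and `missing_wordlists` are hypothetical.

```python
import json


def custom_wordlist(words):
    """Build a {wordlist_name} object body from a list of words."""
    return {"type": "custom", "tokens": list(words)}


def missing_wordlists(analysis, defaults=()):
    """List (filter name, wordlist name) pairs that reference an unknown wordlist."""
    known = set(analysis.get("token_maps", {})) | set(defaults)
    missing = []
    for name, token_filter in analysis.get("token_filters", {}).items():
        for key, value in token_filter.items():
            # Wordlist references use keys such as stop_token_map or dict_token_map.
            if key.endswith("_token_map") and value not in known:
                missing.append((name, value))
    return missing


analysis = {
    "token_maps": {"My_Wordlist": custom_wordlist(["the", "is", "and"])},
    # Assumption: token filters are defined in a sibling "token_filters" object.
    "token_filters": {
        "My_Stop_Tokens_Filter": {
            "stop_token_map": "My_Wordlist",
            "type": "stop_tokens",
        }
    },
}

print(missing_wordlists(analysis))  # []
print(json.dumps(analysis["token_maps"], indent=4))
```

Pass the names of any default wordlists you rely on, such as `articles_ca`, through the `defaults` argument so the check does not report them.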

Date_time_parsers Object

The date_time_parsers object contains any custom date/time parsers defined for a Search index:

                "date_time_parsers": {
                    "My_Date_Time_Parser": {
                        "type": "flexiblego",
                        "layouts": [
                            "RFC850"
                        ]
                    }
                }

To view the entire JSON payload, click View.

The date_time_parsers object is a child object of the analysis object. It contains any number of {date_time_parser_name} objects:

Property	Type	Required?	Description
{date_time_parser_name}	Object	Yes	Set the name of this object to the name you want for your custom date/time parser. You can reference the `{date_time_parser_name}` object elsewhere in your Search index definition to use the date/time parser. For more information about the properties in a `{date_time_parser_name}` object, see {date_time_parser_name} Object.

{date_time_parser_name} Object

The {date_time_parser_name} object defines a custom date/time parser for a Search index:

                    "My_Date_Time_Parser": {
                        "type": "flexiblego",
                        "layouts": [
                            "RFC850"
                        ]
                    }

A {date_time_parser_name} object is a child object of the Date_time_parsers Object. It contains the following properties:

Property	Type	Required?	Description
type	String	Yes	The `type` is always `"flexiblego"`.
layouts	Array	Yes	An array of strings that contains layouts for date and time fields. Use a layout from the Go Programming Language Time Package’s Layout Constant.
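
To get a feel for what a layout such as `RFC850` matches, the following Python sketch shows the value of the Go `time.RFC850` constant and parses a sample date with an approximately equivalent `strptime` format. The Python format string and the function name `parse_rfc850` are illustrative assumptions and are not used by the Search Service, which interprets layouts with Go's rules.

```python
from datetime import datetime

# Value of the RFC850 layout constant in Go's time package.
GO_RFC850 = "Monday, 02-Jan-06 15:04:05 MST"

# Approximate Python equivalent, used here only to illustrate the shape.
PY_RFC850 = "%A, %d-%b-%y %H:%M:%S %Z"


def parse_rfc850(value):
    """Parse a date string shaped like the RFC850 layout."""
    return datetime.strptime(value, PY_RFC850)


print(parse_rfc850("Sunday, 06-Nov-94 08:49:37 GMT"))  # 1994-11-06 08:49:37
```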

Default_mapping Object

The default_mapping object contains settings for the default type mapping on the Search index. The default type mapping is a legacy feature and only included for compatibility.

            "default_mapping": {
                "dynamic": false,
                "enabled": false
            },

To view the entire JSON payload, click View.

The default_mapping object is a child object of the Mapping Object. It contains the following properties:

Property	Type	Required?	Description
dynamic	Boolean	Yes	To index all available fields in a document with the default type mapping, set `dynamic` to `true`. To only index the fields you specify in the type mapping, set `dynamic` to `false`.
enabled	Boolean	Yes	To enable the Search Service’s default type mapping, set `enabled` to `true`. The default type mapping includes all documents in the bucket in the Search index, even if they do not match another configured type mapping. This can increase index size and indexing time. To disable the default type mapping, set `enabled` to `false`.

Types Object

The types object contains any additional user-defined type mappings for a Search index.

            "types": {
                "inventory.hotel": {
                    "dynamic": false,
                    "enabled": true,
                    "properties": {
                        "_$xattrs": {
                            "dynamic": true,
                            "enabled": true
                        },
                        "reviews": {
                            "dynamic": false,
                            "enabled": true,
                            "properties": {
                                "content": {
                                    "enabled": true,
                                    "dynamic": false,
                                    "fields": [
                                        {
                                            "docvalues": true,
                                            "include_in_all": true,
                                            "include_term_vectors": true,
                                            "index": true,
                                            "name": "content",
                                            "store": true,
                                            "type": "text",
                                            "analyzer": "My_Analyzer"
                                        }
                                    ]
                                }
                            }
                        },
                        "city": {
                            "enabled": true,
                            "dynamic": false,
                            "fields": [
                                {
                                    "docvalues": true,
                                    "include_in_all": true,
                                    "include_term_vectors": true,
                                    "index": true,
                                    "name": "city",
                                    "store": true,
                                    "type": "text"
                                }
                            ]
                        }
                    }
                }
            }

To view the entire JSON payload, click View.

The types object is a child object of the Mapping Object. It contains any number of {scope}.{collection} objects:

Property	Type	Required?	Description
{scope}.{collection}	Object	Yes	The name of the type mapping. Corresponds to the selected scope and collection where the type mapping applies. For example, `inventory.airline`. For more information about the properties in a `{scope}.{collection}` object, see {Scope}.{collection} Objects, JSON Object Field Objects, and XATTRs Objects. To add a type identifier as an additional filter to your type mapping, add the filter to the end of your `{scope}.{collection}` object name. For example, to use a `type_field` filter that uses the `type` field, and add only documents with a `type` value of `hotel`, the object name would be `{scope}.{collection}.hotel`.
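
The naming rule for type mappings can be sketched as a small Python helper. The function name `type_mapping_name` is hypothetical; it only assembles the object name and does not validate that the scope or collection exists.

```python
def type_mapping_name(scope, collection, type_value=None):
    """Build the name of a {scope}.{collection} object, with an optional type filter."""
    name = f"{scope}.{collection}"
    # Append the type identifier value only when a filter is wanted.
    return f"{name}.{type_value}" if type_value else name


print(type_mapping_name("inventory", "airline"))         # inventory.airline
print(type_mapping_name("inventory", "hotel", "hotel"))  # inventory.hotel.hotel
```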

{Scope}.{collection} Objects, JSON Object Field Objects, and XATTRs Objects

The {scope}.{collection} object defines a custom type mapping for a Search index, on a specific scope and collection in the cluster:

                "inventory.hotel": {
                    "dynamic": false,
                    "enabled": true,
                    "properties": {
                        "_$xattrs": {
                            "dynamic": true,
                            "enabled": true
                        },
                        "reviews": {
                            "dynamic": false,
                            "enabled": true,
                            "properties": {
                                "content": {
                                    "enabled": true,
                                    "dynamic": false,
                                    "fields": [
                                        {
                                            "docvalues": true,
                                            "include_in_all": true,
                                            "include_term_vectors": true,
                                            "index": true,
                                            "name": "content",
                                            "store": true,
                                            "type": "text",
                                            "analyzer": "My_Analyzer"
                                        }
                                    ]
                                }
                            }
                        },
                        "city": {
                            "enabled": true,
                            "dynamic": false,
                            "fields": [
                                {
                                    "docvalues": true,
                                    "include_in_all": true,
                                    "include_term_vectors": true,
                                    "index": true,
                                    "name": "city",
                                    "store": true,
                                    "type": "text"
                                }
                            ]
                        }
                    }
                }

A {scope}.{collection} object is a child object of the Types Object.

A JSON object field object is a child object of the {scope}.{collection} object. It defines a mapping for a field that contains a JSON object in your document schema. It can contain additional mappings as {field_name} Object under its properties object.

A JSON object field object must:

Be a child object of a {scope}.{collection} object.
Have a name that matches a JSON object field in your documents.
Use the same property structure as a {scope}.{collection} object.

If your cluster is running Couchbase Server version 7.6.2 or later and you’re adding Extended Attributes (XATTRs) from your document metadata to your Search index, your XATTRs mapping definition must:

Be a child object of a {scope}.{collection} object.
Have the name _$xattrs.
Use the same property structure as a {scope}.{collection} object.

For example, the following JSON index definition snippet defines a {scope}.{collection} object for the inventory.hotel scope and collection. It adds a dynamic mapping for any XATTRs metadata present on documents in the collection. It adds two nested JSON Object Field objects, reviews and ratings, that contain a single document field object for the Cleanliness field:

{
    "inventory.hotel": {
        "dynamic": false,
        "enabled": true,
        "properties": {
            "_$xattrs": {
                "dynamic": true,
                "enabled": true
            },
            "reviews": {
                "dynamic": false,
                "enabled": true,
                "properties": {
                    "ratings": {
                        "dynamic": false,
                        "enabled": true,
                        "properties": {
                            "Cleanliness": {
                                "enabled": true,
                                "dynamic": false,
                                "fields": [
                                    {
                                        "docvalues": true,
                                        "include_in_all": true,
                                        "index": true,
                                        "name": "Cleanliness",
                                        "store": true,
                                        "type": "number"
                                    }
                                ]
                            }
                        }
                    }
                }
            }
        }
    }
}

You can view the JSON document schema for this example by looking at any document in the hotel collection from the travel-sample dataset. For more information, see Manage Documents with the Capella UI.
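
Writing deeply nested mappings by hand is error-prone, so you might prefer to generate them. The following Python sketch rebuilds the example above from a dotted field path. The helper names `field_entry` and `nested_mapping` are hypothetical, and the field settings are copied from the example rather than being recommended defaults.

```python
import json


def field_entry(name, field_type):
    """Build one object for a fields array, using the settings from the example."""
    return {
        "docvalues": True,
        "include_in_all": True,
        "index": True,
        "name": name,
        "store": True,
        "type": field_type,
    }


def nested_mapping(path, field_type):
    """Wrap a document field in one JSON object field object per parent in path."""
    *parents, leaf = path.split(".")
    node = {"enabled": True, "dynamic": False, "fields": [field_entry(leaf, field_type)]}
    key = leaf
    # Work outward from the leaf, nesting each level under a properties object.
    for parent in reversed(parents):
        node = {"dynamic": False, "enabled": True, "properties": {key: node}}
        key = parent
    return {key: node}


type_mapping = {
    "inventory.hotel": {
        "dynamic": False,
        "enabled": True,
        "properties": {
            "_$xattrs": {"dynamic": True, "enabled": True},
            **nested_mapping("reviews.ratings.Cleanliness", "number"),
        },
    }
}

print(json.dumps(type_mapping, indent=4))
```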

{scope}.{collection} objects, JSON Object Field objects, and XATTRs objects can contain the following properties:

Property	Type	Required?	Description
dynamic	Boolean	Yes	To index all fields under the specified scope and collection, JSON object, or all fields inside XATTRs, set `dynamic` to `true`. To only index the fields you specify and enable the `properties` block, set `dynamic` to `false`.
enabled	Boolean	Yes	To enable the mapping and include any documents that match it in the Search index, set `enabled` to `true`. To remove any documents that match this mapping from the Search index, set `enabled` to `false`.
properties	Object	No	The `properties` object is only enabled if `dynamic` is set to `false`. Specifies properties for the fields to index in the mapping. Contains any number of `{field_name}` objects. For more information, see {field_name} Object.

{field_name} Object

The {field_name} object contains properties and an array for a document field in a type mapping. You can have multiple {field_name} objects in a properties object.

                        "reviews": {
                            "dynamic": false,
                            "enabled": true,
                            "properties": {
                                "content": {
                                    "enabled": true,
                                    "dynamic": false,
                                    "fields": [
                                        {
                                            "docvalues": true,
                                            "include_in_all": true,
                                            "include_term_vectors": true,
                                            "index": true,
                                            "name": "content",
                                            "store": true,
                                            "type": "text",
                                            "analyzer": "My_Analyzer"
                                        }
                                    ]
                                }
                            }
                        },

To view the entire JSON payload, click View.

The name of the object corresponds to the name of the field you want to include or exclude from your Search index.

A {field_name} object contains the following properties:

Property	Type	Required?	Description
enabled	Boolean	Yes	To add this document field to the Search index, set `enabled` to `true`. To remove this document field from the index, set `enabled` to `false`.
dynamic	Boolean	No	This field is included for legacy compatibility only.
fields	Array	Yes	An array that contains objects with settings for each document field to index in the type mapping. For more information, see Fields Array.

Fields Array

The fields array contains objects with settings for each document field to index in the type mapping:

                                    "fields": [
                                        {
                                            "docvalues": true,
                                            "include_in_all": true,
                                            "include_term_vectors": true,
                                            "index": true,
                                            "name": "content",
                                            "store": true,
                                            "type": "text",
                                            "analyzer": "My_Analyzer"
                                        }
                                    ]

To view the entire JSON payload, click View.

The fields array is located inside a {field_name} object. It contains the following properties:

Property	Type	Required?	Description
analyzer	String	Text Only	If the document field’s type is `text`, set the analyzer to use for the document field. If you want to use the default analyzer for the content of this document field, you do not need to include an `analyzer` property.
dims	Number	Vector Only	For a vector child field, enter the total number of elements in the vector embedding array. In Couchbase Server version 7.6.2 and later, Vector Search indexes can support arrays with up to 4096 elements. Arrays can be an array of arrays. For more information about Vector Search indexes, see Use Vector Search for AI Applications or Create a Vector Search Index in Quick Mode.
docvalues	Boolean	Yes	To include the value for each instance of the field in the Search index to support facets and sorting search results, set `docvalues` to `true`. To exclude the values for each instance of this field from the index, set `docvalues` to `false`.
include_in_all	Boolean	Yes	To allow this field to be searched without specifying the specific field’s name in the search, set `include_in_all` to `true`. When enabled, you can search this field through the specified `default_field` set in the type mapping. To only search this field by specifying the field name, set `include_in_all` to `false`.
include_term_vectors	Boolean	Yes	To use term vectors, `store` must be set to `true`. To allow the Search Service to highlight matching search terms in search results for this field, set `include_term_vectors` to `true`. You must also enable term vectors to use `includeLocations` in a Search query. For more information, see includeLocations. To disable term highlighting and reduce index size, set `include_term_vectors` to `false`.
index	Boolean	Yes	To include the document field in the Search index, set `index` to `true`. To exclude the document field from the index, set `index` to `false`.
name	String	Yes	The document field’s name.
similarity	String	Vector Only	For a vector child field, choose the method to calculate the similarity between the vector embedding in a Vector Search index and the vector embedding in a Vector Search query. It’s recommended to choose the same similarity metric for your Search index as the one used in your embedding model. `dot_product`: Calculated by summing the products of the corresponding components of two vectors, which equals the product of the magnitudes of the vectors and the cosine of the angle between them. The dot product of two vectors is affected by the length and direction of each vector, rather than only the straight-line distance between them. Dot product similarity is commonly used by Large Language Models (LLMs). Use `dot_product` to get the best results with an embedding model that uses dot product similarity. `l2_norm`: Also known as Euclidean distance. Uses the straight-line distance between two vectors to calculate similarity. Smaller Euclidean distances mean that the values of each coordinate in the vectors are closer together. It’s best to use `l2_norm` similarity when your embeddings contain information about the count or measure of specific things, and your embedding model uses the same similarity metric. For more information about Vector Search indexes, see Use Vector Search for AI Applications or Create a Vector Search Index in Quick Mode.
store	Boolean	Yes	To include the content of the document field in the Search index and allow its content to be viewed in search results, set `store` to `true`. To exclude the content of the document field from the index, set `store` to `false`.
type	String	Yes	The document field’s type. Can be one of: `text`, `number`, `datetime`, `boolean`, `geopoint`, `geoshape`, `disabled`, `ip`, `vector`, or (Server version 7.6.2 and later) `vector_base64`. For more information about the available field data types, see Field Data Types.
vector_index_optimized_for	String	Vector Only	For a vector child field, choose whether the Search Service should prioritize recall or latency when returning similar vectors in search results: `recall`: The Search Service prioritizes returning the most accurate result. This may increase resource usage for Search queries. The Search Service uses an nprobe value to calculate the number of centroids to search when using recall priority. This value is calculated by taking the square root of the number of centroids in the index. `latency`: The Search Service prioritizes returning results with lower latency. This may reduce the accuracy of results. The Search Service uses half the nprobe value calculated for recall priority. For more information about Vector Search indexes, see Use Vector Search for AI Applications or Create a Vector Search Index in Quick Mode.
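
The vector settings above rest on a few simple calculations. The following Python sketch works through them: the two similarity metrics, and the nprobe value for each `vector_index_optimized_for` setting. The function names are hypothetical, and the sketch returns the unrounded nprobe value because the rounding the Search Service applies is not described here.

```python
import math


def dot_product(a, b):
    """Sum the products of the corresponding components of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """Straight-line (Euclidean) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nprobe(num_centroids, optimized_for="recall"):
    """Square root of the centroid count for recall, half of that for latency."""
    base = math.sqrt(num_centroids)
    return base if optimized_for == "recall" else base / 2


print(dot_product([1, 2, 3], [4, 5, 6]))  # 32
print(l2_distance([0, 0], [3, 4]))        # 5.0
print(nprobe(100, "recall"))              # 10.0
print(nprobe(100, "latency"))             # 5.0
```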