Index Lifecycle
An overview of the lifecycle of a Global Secondary Index, from creation and building to updates and scans.
In Couchbase Server 7.0 and later, a Global Secondary Index is created on a single collection — not on an entire bucket. (A collection may, of course, have multiple indexes.) By carefully planning the structure of your data, and storing documents of different types in separate collections, you can improve performance at every stage in the lifecycle of an index.
Index Creation
Index creation happens in two phases: the creation phase and the build phase. During the creation phase, the Index Service validates the user input, decides the host node for the index, and creates the index metadata on the host node. The build phase cannot start until the creation phase is complete.
Index Building
During the build phase, the Index Service reads the documents from the Data Service and builds the index.
- The projector requests a DCP stream on the specified collection.
- DCP streams only the documents in the collection.
- The projector evaluates each document and extracts the indexed field(s).
- The indexed fields are forwarded to the indexer.
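The build steps above can be pictured with a small toy model. This is only an illustrative sketch, not how the Index Service is implemented: the function name `build_index_entries`, the in-memory dictionary standing in for the DCP stream, and the sorted list standing in for the indexer's storage are all assumptions made for this example.

```python
from typing import Any, Dict, List, Tuple


def build_index_entries(
    collection_docs: Dict[str, Dict[str, Any]],
    indexed_field: str,
) -> List[Tuple[Any, str]]:
    """Toy build phase for one index on one collection.

    collection_docs is a hypothetical stand-in for the DCP stream of a
    single collection (document ID -> document body).
    """
    entries = []
    for doc_id, doc in collection_docs.items():
        # "Projector" step: evaluate the document and extract the indexed field.
        if indexed_field in doc:
            # "Forward to indexer" step: keep (indexed value, document ID).
            entries.append((doc[indexed_field], doc_id))
    # The toy indexer keeps entries ordered by indexed value, then document ID.
    return sorted(entries)


# Hypothetical documents from a single collection.
airlines = {
    "airline_10": {"name": "40-Mile Air", "country": "United States"},
    "airline_137": {"name": "Air France", "country": "France"},
    "airline_999": {"country": "France"},  # no "name" field: skipped in this toy model
}
index = build_index_entries(airlines, "name")
```

Because only one collection is streamed, the toy projector never sees documents from unrelated collections.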
Creating and building indexes can take a long time on keyspaces with lots of existing documents. When you create an index, you can choose to defer the build phase, and then build the deferred index later. This allows multiple indexes to be built at once rather than having to re-scan the entire keyspace for each index.
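As a rough illustration of the deferred-build pattern, the following sketch generates the SQL++ statements for creating several deferred indexes and then building them in one step. The helper name `deferred_index_statements`, the index names, and the `travel-sample` keyspace are assumptions chosen for the example; the helper only produces statement text and does not connect to a cluster.

```python
from typing import Dict, List


def deferred_index_statements(keyspace: str, indexes: Dict[str, List[str]]) -> List[str]:
    """Return SQL++ text: one deferred CREATE INDEX per index, then a single BUILD INDEX."""
    statements = []
    for name, fields in indexes.items():
        keys = ", ".join(fields)
        # defer_build postpones the build phase until BUILD INDEX is issued.
        statements.append(
            f'CREATE INDEX {name} ON {keyspace}({keys}) WITH {{"defer_build": true}};'
        )
    # A single BUILD INDEX names all of the deferred indexes so they are built together.
    names = ", ".join(indexes)
    statements.append(f"BUILD INDEX ON {keyspace}({names});")
    return statements


# Hypothetical index names on an assumed sample keyspace.
stmts = deferred_index_statements(
    "`travel-sample`.inventory.airline",
    {"idx_airline_name": ["name"], "idx_airline_country": ["country"]},
)
for statement in stmts:
    print(statement)
```

The generated statements could then be run through any query client, such as the Query Workbench or an SDK.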
For more information and examples, refer to CREATE PRIMARY INDEX, CREATE INDEX, and BUILD INDEX.
Index Updates
When a Global Secondary Index has been created on a collection, the index is updated when documents within the collection are updated.
- The projector keeps a bucket-level DCP stream open for updates, which limits the number of DCP connections.
- When a document within the collection is updated, the projector uses the collection ID available with each mutation, and only evaluates the indexes defined for that collection.
- The projector only sends updated indexed fields for the qualified indexes within the collection.
There is no additional processing or disk overhead for indexes on unrelated collections.
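The filtering described above can be sketched as follows. This is a simplified model rather than the projector's actual code: the collection IDs, the index names, the `INDEXES_BY_COLLECTION` table, and the `project_mutation` function are all hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical index definitions: collection ID -> {index name: indexed field}.
INDEXES_BY_COLLECTION: Dict[int, Dict[str, str]] = {
    8: {"idx_airline_name": "name"},
    9: {"idx_route_airline": "airline"},
}


def project_mutation(
    collection_id: int, doc_id: str, doc: Dict[str, Any]
) -> List[Tuple[str, Any, str]]:
    """Evaluate only the indexes defined on the mutation's own collection."""
    updates = []
    # Indexes on other collections are never looked at for this mutation.
    for index_name, field in INDEXES_BY_COLLECTION.get(collection_id, {}).items():
        if field in doc:
            updates.append((index_name, doc[field], doc_id))
    return updates


# One bucket-level stream carrying mutations from several collections.
stream = [
    (8, "airline_10", {"name": "40-Mile Air"}),
    (9, "route_1", {"airline": "AF", "name": "not evaluated by idx_airline_name"}),
    (10, "hotel_5", {"name": "collection 10 has no indexes"}),
]
updates = [u for mutation in stream for u in project_mutation(*mutation)]
```

In this model, the mutation in collection 9 contains a `name` field, but the index on `name` belongs to collection 8, so it is never evaluated for that document.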
Refer to Database Change Protocol (DCP) for further details.
Index Scans
When a query needs to make use of a Global Secondary Index, the index is scanned.
- If the scan consistency is not_bounded, the scan proceeds without waiting for the index to be updated.
- If the scan consistency is at_plus or request_plus, the scan coordinator waits for the index to be updated for that collection only, and then performs the scan.
The scan coordinator does not need to wait for indexes on other collections to be updated, even if documents in other collections are being mutated.
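The waiting rule can be pictured with a small toy model. This is a deliberately simplified sketch under stated assumptions: real sequence numbers are tracked in finer detail than one counter per collection, and the function `scan_can_proceed` and both dictionaries are hypothetical names used only for illustration.

```python
from typing import Dict


def scan_can_proceed(
    scan_consistency: str,
    collection_id: int,
    indexed_seqno: Dict[int, int],
    required_seqno: Dict[int, int],
) -> bool:
    """Toy check: may a scan on an index in `collection_id` start now?

    indexed_seqno:  collection ID -> last mutation the index has processed (assumed model)
    required_seqno: collection ID -> mutation the scan must be able to observe
    """
    if scan_consistency == "not_bounded":
        # No waiting: scan whatever the index currently holds.
        return True
    if scan_consistency in ("at_plus", "request_plus"):
        # Only the scanned index's own collection is compared.
        return indexed_seqno.get(collection_id, 0) >= required_seqno.get(collection_id, 0)
    raise ValueError(f"unknown scan consistency: {scan_consistency}")


# Collection 8 is caught up; collection 9 is still behind.
indexed = {8: 120, 9: 40}
required = {8: 120, 9: 75}

ready_8 = scan_can_proceed("request_plus", 8, indexed, required)
ready_9 = scan_can_proceed("request_plus", 9, indexed, required)
ready_9_unbounded = scan_can_proceed("not_bounded", 9, indexed, required)
```

Here the scan on collection 8 can start even though collection 9 has unprocessed mutations, while a request_plus scan on collection 9 must wait.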
Refer to Index Scans and Index Consistency for further details.