Index Availability and Performance
The Index Service ensures availability and performance through replication and partitioning. The consistency of query-results can be controlled per query.
Index Replication
Secondary indexes can be replicated across cluster-nodes. This ensures:
- Availability: If one Index-Service node is lost, the others continue to provide access to replicated indexes.
- High Performance: If original and replica copies are available, incoming queries are load-balanced across them.
Index-replicas can be created with the SQL++ CREATE INDEX statement.
Note that whenever a given number of index-replicas is specified for creation, the number must be less than the number of cluster-nodes currently running the Index Service.
If it is not, the index creation fails.
Note also that if, following creation of the maximum number of copies, the number of nodes running the Index Service decreases, Couchbase Server progressively assigns replacement index-replicas to any Index-Service nodes subsequently added to the cluster, until the required number of index-replicas again exists for each replicated index.
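The replica-count rule described above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of any Couchbase API; the sketch only restates the rule that the requested number of replicas must be less than the number of nodes currently running the Index Service.

```python
def max_replicas(index_nodes: int) -> int:
    """Largest replica count the rule allows for this many Index-Service nodes."""
    # The rule from the text: replicas must be strictly less than the node count.
    return max(index_nodes - 1, 0)


def check_num_replica(num_replica: int, index_nodes: int) -> None:
    """Raise ValueError if the requested replica count breaks the rule.

    Hypothetical helper: it mimics the failure described in the text,
    it does not call Couchbase Server.
    """
    if num_replica < 0:
        raise ValueError("num_replica must not be negative")
    if num_replica >= index_nodes:
        raise ValueError(
            f"num_replica={num_replica} needs at least {num_replica + 1} "
            f"Index-Service nodes, but only {index_nodes} are running"
        )


check_num_replica(2, 3)   # accepted, since 2 < 3
print(max_replicas(3))    # 2
```

With three Index-Service nodes, a request for three replicas would be rejected by this check, which mirrors the index-creation failure described above.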
Index-replicas can be created as follows:
- Specifying, by means of the WITH clause, the destination nodes. In the following example, an index with two replicas is created. The active index is on node1, and the replicas are on node2 and node3:
  CREATE INDEX country_idx ON airport(country, city) WITH {"nodes": ["node1:8091", "node2:8091", "node3:8091"]};
- Specifying no destination nodes; but specifying instead, by means of the WITH clause and the num_replica attribute, only the number of replicas required. The replicas are automatically distributed across those nodes of the cluster that are running the Index Service: the distribution-pattern is based on a projection of optimal index-availability, given the number and disposition of Index-Service nodes across defined server-groups.
  In the following example, an index is created with two replicas, with no destination-nodes specified:
  CREATE INDEX country_idx ON airport(country, city) WITH {"num_replica": 2};
  Note that if nodes and num_replica are both specified in the WITH clause, the specified number of nodes must be one greater than num_replica.
- Specifying a number of index-replicas to be created by the Index Service whenever CREATE INDEX is invoked. The default is 0. If the default is changed to, say, 2, creation of a single index is henceforth accompanied by the creation of two replicas, which are automatically distributed across the nodes of the cluster running the Index Service. No explicit specification within the CREATE INDEX statement is required.
  With credentials that provide appropriate authorization, this default can be changed by means of the curl command, as follows:
  curl -X POST -u 'Administrator:password' \
    'http://localhost:8091/settings/indexes' \
    -d 'numReplica=2'
  Here, numReplica is an integer that establishes the default number of replicas that must be created whenever CREATE INDEX is invoked. Note that this call only succeeds if the cluster contains enough Index Service nodes to host each new index and its replicas: for example, if 2 is specified as the default number of replicas, the Index Service must have been established on at least 3 nodes.
  Note also that whenever explicit specification of replica-numbers is made within the CREATE INDEX statement, this explicit specification takes precedence over any established default.
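Two of the rules above lend themselves to a short illustration: an explicit num_replica overrides the cluster-wide default, and a nodes list given together with num_replica must contain exactly one more entry than num_replica. The following Python sketch uses hypothetical helper names and does not contact a cluster; it only models those two statements.

```python
from typing import Optional, Sequence


def effective_num_replica(explicit: Optional[int], default: int = 0) -> int:
    """Replica count that applies to one CREATE INDEX statement.

    `explicit` stands for num_replica in the WITH clause (None if absent);
    `default` stands for the numReplica setting. Both names are illustrative.
    """
    return default if explicit is None else explicit


def check_nodes_match(nodes: Sequence[str], num_replica: int) -> None:
    """Raise ValueError unless len(nodes) is one greater than num_replica."""
    if len(nodes) != num_replica + 1:
        raise ValueError(
            f"{len(nodes)} nodes given, but num_replica={num_replica} "
            f"requires exactly {num_replica + 1}"
        )


print(effective_num_replica(None, default=2))  # 2: the default applies
print(effective_num_replica(1, default=2))     # 1: the explicit value wins
check_nodes_match(["node1:8091", "node2:8091", "node3:8091"], 2)  # accepted
```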
You can change index replication settings via the UI or the REST API. For further information on using SQL++, refer to Query Fundamentals.
Index Partitioning
Index Partitioning increases query performance, by dividing and spreading a large index of documents across multiple nodes. This feature is available only in Couchbase Server Enterprise Edition.
The benefits include:
- The ability to scale out horizontally, as index size increases.
- Transparency to queries, requiring no change to existing queries.
- Reduction of query latency for large, aggregated queries, since partitions can be scanned in parallel.
- Provision of a low-latency range query, while allowing indexes to be scaled out as needed.
For detailed information, refer to Index Partitioning.
Index Consistency
Whereas Couchbase Server handles data-mutations with full consistency — all mutations to a given key are applied to the same vBucket, and become immediately available — it maintains indexes with degrees of eventual consistency. This means that indexes may at times not contain the most up-to-date information, especially when deployed in a write-heavy environment: changes may take some time to propagate over to the index nodes.
The asynchronous updating nature of global secondary indexes means that they can be very quick to query and do not require the additional overhead of index recalculations at the time documents are modified. SQL++ queries are forwarded to the relevant indexes and the queries are done based on indexed information, rather than the documents as they exist in the data service.
With default query options, the query service will rely on the current index state: the most up-to-date document versions are not retrieved, and only the indexed versions are queried. This provides the best performance. Only updates that occurred within a small, recent time frame may not yet have been indexed.
The query service can use the latest versions of documents by modifying the scan_consistency parameter, specified per query:
- not_bounded: Executes the query immediately, without requiring any consistency for the query. If index-maintenance is running behind, out-of-date results may be returned.
- at_plus: Executes the query, requiring indexes first to be updated to the timestamp of the last update. If index-maintenance is running behind, the query waits for it to catch up.
- request_plus: Executes the query, requiring the indexes first to be updated to the timestamp of the current query-request. If index-maintenance is running behind, the query waits for it to catch up.
For SQL++, the default consistency is not_bounded.
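The difference between the three modes can be pictured with a toy model. In the following Python sketch, plain integer counters stand in for the timestamps mentioned above; the function and parameter names are hypothetical, and the real Index Service tracks progress in its own way.

```python
def index_must_catch_up(
    mode: str,
    indexed_upto: int,
    last_update: int,
    request_time: int,
) -> bool:
    """Return True if the query has to wait for index-maintenance.

    indexed_upto : point up to which the index has been updated
    last_update  : point of the last update the query must observe (at_plus)
    request_time : point of the data at the moment the query is issued
    All three are illustrative counters, not Couchbase values.
    """
    if mode == "not_bounded":
        return False                        # run immediately, may be stale
    if mode == "at_plus":
        return indexed_upto < last_update   # wait only for the last update
    if mode == "request_plus":
        return indexed_upto < request_time  # wait for everything so far
    raise ValueError(f"unknown scan_consistency: {mode!r}")


# Index has reached 97; the last update was at 95; the request arrives at 100.
for mode in ("not_bounded", "at_plus", "request_plus"):
    print(mode, index_must_catch_up(mode, 97, 95, 100))
```

In this example only request_plus has to wait, because the index already covers the last update but not everything that existed when the request arrived.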
When using the request_plus consistency mode, the query service will ensure that the indexes are synchronized with the data service before querying.
You can specify the scan consistency via the run-time preferences in the Query Workbench, or by setting the scan_consistency request-level parameter.
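As a rough illustration of setting scan_consistency at request level, the following Python sketch builds a JSON request body that carries a statement together with the chosen mode. It deliberately stops short of sending anything: the endpoint, credentials, and transport are left out, and the helper name, the "statement" field, and the sample query are assumptions made for the example. For at_plus, a real request would typically also need to identify the last update, which this sketch does not model.

```python
import json

# The three modes named in this section.
SCAN_CONSISTENCY_MODES = ("not_bounded", "at_plus", "request_plus")


def build_query_request(statement: str, scan_consistency: str = "not_bounded") -> str:
    """Return a JSON body pairing a SQL++ statement with a scan_consistency.

    Hypothetical helper: it validates the mode and serializes the request,
    but does not send it to the Query Service.
    """
    if scan_consistency not in SCAN_CONSISTENCY_MODES:
        raise ValueError(f"unknown scan_consistency: {scan_consistency!r}")
    return json.dumps(
        {"statement": statement, "scan_consistency": scan_consistency}
    )


body = build_query_request(
    "SELECT city FROM airport WHERE country = 'France'",
    scan_consistency="request_plus",
)
print(body)
```

Leaving out the second argument keeps the default of not_bounded, matching the default described above.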
Index Snapshots
One or more index snapshots are maintained on disk, to permit rapid recovery if node-failures are experienced. In cases where recovery requires an Index-Service node to be restarted, the node’s indexes are rebuilt from the snapshots retained on disk.
Index Rollback
The index service also maintains a DCP failover log. If necessary, the data service can request the index service to return to a specified rollback point and update its history.
Index Rollback After Failover
When a data node fails over, a replica data node is promoted to active. If the index service has more recent data than the new active data node, the data node issues a rollback request to the index service.
In Couchbase Server 6.5 and later, when the index service receives the rollback request, it first attempts to revert to a stored index snapshot. If successful, the index service does not need to rebuild its indexes from scratch when the data node fails over. The index service can continue servicing query clients without interruption.
If the index service cannot revert to a current index snapshot, it rebuilds all indexes from scratch.
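The revert-or-rebuild decision described above can be sketched as follows. This is a toy model, not the Index Service's implementation: snapshots and the rollback point are represented by hypothetical integer positions, and the sketch assumes that a snapshot is usable when it was taken at or before the rollback point.

```python
from typing import Optional, Sequence, Tuple


def handle_rollback(
    snapshot_points: Sequence[int],
    rollback_point: int,
) -> Tuple[str, Optional[int]]:
    """Decide between reverting to a stored snapshot and a full rebuild.

    snapshot_points : positions at which index snapshots were stored
    rollback_point  : position the data service asks the index to return to
    Assumption (not from the source): a snapshot is usable only if it is
    not newer than the rollback point.
    """
    usable = [point for point in snapshot_points if point <= rollback_point]
    if usable:
        # Revert to the most recent usable snapshot; no full rebuild needed.
        return ("revert", max(usable))
    # No usable snapshot: all indexes are rebuilt from scratch.
    return ("rebuild", None)


print(handle_rollback([100, 200, 300], rollback_point=250))  # ('revert', 200)
print(handle_rollback([300, 400], rollback_point=250))       # ('rebuild', None)
```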