Pre-filtering Vector Searches
You can add filters to a vector search request to restrict which documents the query searches.
About Pre-filtering
Using pre-filtering as part of your vector search offers two key advantages:
- Enhanced precision and relevance: Narrow your search results based on specific criteria, such as organization, date/time ranges, or geospatial locations.
- Performance optimization: Reduce the search space before executing queries to improve query execution time and reduce computational overhead.
Prerequisites
- You have the Search Service enabled on a node in your database. For more information about how to deploy a new node and Services on your database, see Manage Nodes and Clusters.
- You have a bucket with scopes and collections in your database. For more information about how to create a bucket, see Create a Bucket.
- Your user account has the Search Admin or Search Reader role.
- You installed the Couchbase command-line tool (CLI).
- You have the hostname or IP address for the node in your database where you’re running the Search Service. For more information about where to find the IP address for your node, see List Cluster Nodes.
- You have created a Vector Search index. For more information about how to create a Vector Search index, see Create a Vector Search Index with the Server Web Console or Create a Vector Search Index with the REST API and curl/HTTP.
You can download a sample dataset to use with the procedure or examples on this page:
To get the best results when using the sample data with the examples in this documentation, import the sample files from the dataset into your database with the following settings:
- Use a bucket called vector-sample.
- Use a scope called color.
- Use a collection called rgb for rgb.json.
- To set your document keys, use the value of the id field from each JSON document.
For the best results, consider using the sample Vector Search index from Create a Vector Search Index with the Server Web Console or Create a Vector Search Index with the REST API and curl/HTTP.
Procedure
To run a pre-filtered vector search with the REST API:
- In your command-line tool, enter a curl command with the POST verb (-XPOST).
- Set your header content to include "Content-Type: application/json".
- Add your username, password, and the Search Service endpoint on port 8094.
- Add the index name you want to query to the endpoint.
curl -XPOST -H "Content-Type: application/json" \
-u ${CB_USERNAME}:${CB_PASSWORD} http://${CB_HOSTNAME}:8094/api/bucket/vector-sample/scope/color/index/{INDEX_NAME}/query \
-d \
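The steps above can be sketched as a small shell helper that assembles the endpoint. The function name search_endpoint and its argument order are hypothetical, chosen only for illustration; the URL layout and port 8094 are taken from the command above. This is a minimal sketch, assuming plain HTTP and bucket, scope, and index names that need no URL encoding:

```shell
# Hypothetical helper (not part of any Couchbase tool): prints the scoped
# Search query endpoint shown in the procedure above.
# Arguments: hostname, bucket, scope, index name.
search_endpoint() {
  printf 'http://%s:8094/api/bucket/%s/scope/%s/index/%s/query\n' \
    "$1" "$2" "$3" "$4"
}

# Print the endpoint for the sample bucket, scope, and index used on this page.
search_endpoint "${CB_HOSTNAME:-localhost}" vector-sample color color-index
```

You could then pass the result to curl in place of the literal URL, for example with -d @query.json, where query.json is a hypothetical file that holds your request body.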
Example
In the following example, you extend a search query to find matches in color-index.
A pre-filter on the query narrows the documents searched in the index to those with a color field value that closely matches navy.
curl -XPOST -H "Content-Type: application/json" \
-u ${CB_USERNAME}:${CB_PASSWORD} http://${CB_HOSTNAME}:8094/api/bucket/vector-sample/scope/color/index/color-index/query \
-d '{
  "fields": ["*"],
  "query": {
    "min": 70,
    "max": 80,
    "inclusive_min": false,
    "inclusive_max": true,
    "field": "brightness"
  },
  "knn": [
    {
      "k": 10,
      "field": "colorvect_l2",
      "vector": [ 176, 0, 176 ],
      "filter": {
        "field": "color",
        "match": "navy"
      }
    }
  ]
}'
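To try the example above with a different pre-filter term or a different k, one option is to generate the request body from a shell function. The function name prefilter_body and its arguments are hypothetical; the JSON structure mirrors the example request. This is a sketch that assumes the term contains no double quotes or backslashes, because it performs no JSON escaping:

```shell
# Hypothetical helper: prints the request body from the example above with
# the pre-filter term ($1) and k ($2, default 10) substituted in.
# Assumption: the term needs no JSON escaping (no quotes or backslashes).
prefilter_body() {
  term=$1
  k=${2:-10}
  cat <<EOF
{
  "fields": ["*"],
  "query": {
    "min": 70,
    "max": 80,
    "inclusive_min": false,
    "inclusive_max": true,
    "field": "brightness"
  },
  "knn": [
    {
      "k": $k,
      "field": "colorvect_l2",
      "vector": [ 176, 0, 176 ],
      "filter": {
        "field": "color",
        "match": "$term"
      }
    }
  ]
}
EOF
}

# Print the body for the navy pre-filter with the default k of 10.
prefilter_body navy
```

Because curl reads the request body from standard input when you pass -d @-, you could pipe the output of prefilter_body into the curl command from the example instead of supplying the inline JSON.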