Comma-Separated Join
- Capella Operational
- reference
A comma-separated join enables you to produce new input objects by creating a Cartesian product of all the source objects.
Purpose
A comma-separated join is used within the FROM clause. Like the JOIN clause, it creates an input object by combining two or more source objects. A comma-separated join can combine arbitrary fields from the source documents, and you can chain several comma-separated joins together.
The comma-separated join, by itself, does not specify a join predicate. This means that, in its basic form, the comma-separated join would produce all the possible combinations of the combined source objects — this is known as the Cartesian product.
In practice, it is common to use the query’s WHERE clause to specify a condition for the comma-separated join. Refer to the examples below for further details.
Prerequisites
For you to select data from keyspace or expression, you must have the query_select
privilege on that keyspace.
For more details about user roles, see Authorization.
Syntax
comma-separated-join ::= ',' 'LATERAL'? ( rhs-keyspace | rhs-subquery | rhs-generic )
rhs-keyspace | |
rhs-subquery | |
rhs-generic |
Left-Hand Side
The comma-separated join cannot be the first term within the FROM
clause; it must be preceded by another FROM term.
The term immediately preceding the comma-separated join represents the left-hand side of the comma-separated join.
You can chain the comma-separated join with any of the other permitted FROM terms, including another comma-separated join. For more information, see the page on the FROM clause.
There are restrictions on what types of FROM terms may be chained and in what order — see the descriptions on this page for more details.
The types of FROM term that may be used as the left-hand side of the comma-separated join are summarized in the following table.
To try the examples in this section, set the query context to the inventory
scope in the travel sample dataset.
For more information, see Query Context.
Type | Example |
---|---|
|
|
|
|
|
|
|
|
previous comma-separated join |
|
The comma-separated join is a type of inner join. For each joined object produced, both the left-hand side and right-hand side source objects must be non-MISSING and non-NULL.
The right-hand side of a comma-separated join may be a keyspace reference, a subquery, or a generic expression term.
LATERAL Join
When an expression on the right-hand side of a comma-separated join references a keyspace that is already specified in the same FROM clause, the expression is said to be correlated. In relational databases, a join which contains correlated expressions is referred to as a lateral join. In SQL++, lateral correlations are detected automatically, and there is no need to specify that a join is lateral.
In clusters using Couchbase Server 7.6 and later, you can use the LATERAL keyword as a visual reminder that a join contains correlated expressions. The LATERAL keyword is not required — the keyword is included solely for compatibility with queries from relational databases.
If you use the LATERAL keyword in a join that has no lateral correlation, the keyword is ignored.
You can use the optional LATERAL keyword in front of the right-hand side keyspace of a comma-separated join.
Using the LATERAL keyword in a comma-separated join implies that the right-hand side of the join must appear after the left-hand side of the join. This may prevent the cost-based optimizer from reordering joins in the query to give the optimal join order. For details, see Join Enumeration. |
Right-Hand Side Keyspace
rhs-keyspace ::= keyspace-ref ( 'AS'? alias )? ansi-join-hints?
keyspace-ref | |
alias | |
ansi-join-hints |
Keyspace Reference
Keyspace reference for the right-hand side of the comma-separated join. For details, see Keyspace Reference.
AS Alias
Assigns another name to the keyspace reference. For details, see AS Clause.
Assigning an alias to the keyspace reference is optional.
If you assign an alias to the keyspace reference, the AS
keyword may be omitted.
USE Clause
Enables you to specify that the join should use particular keys, a particular index, or a particular join method. For details, see ANSI JOIN Hints.
You can also supply a join hint within a specially-formatted hint comment.
Note that you cannot specify a join hint for the same keyspace using both the USE clause and a hint comment.
If you do this, the USE clause and the hint comment are both marked as erroneous and ignored by the optimizer.
|
Right-Hand Side Subquery
rhs-subquery ::= subquery-expr 'AS'? alias
subquery-expr | |
alias |
Subquery Expression
Use parentheses to specify a subquery for the right-hand side of the comma-separated join. For details, see Subquery Expression.
A subquery on the right-hand side of the comma-separated join cannot be correlated, i.e. it cannot refer to a keyspace in the outer query block. This will lead to an error. |
AS Alias
Assigns another name to the subquery. For details, see AS Clause.
You must assign an alias to a subquery on the right-hand side of the join.
However, when you assign an alias to the subquery, the AS
keyword may be omitted.
Right-Hand Side Generic Expression
rhs-generic ::= expr ( 'AS'? alias )?
expr | |
alias |
Expression Term
A SQL++ expression generating JSON documents or objects for the right-hand side of the comma-separated join.
An expression on the right-hand side of the comma-separated join may be correlated, i.e. it may refer to a keyspace on the left-hand side of the join. In this case, only a nested-loop join may be used. |
AS Alias
Assigns another name to the generic expression. For details, see AS Clause.
You must assign an alias to the generic expression if it is not an identifier; otherwise, assigning an alias is optional.
However, when you assign an alias to the generic expression, the AS
keyword may be omitted.
Limitations
-
You can chain comma-separated joins with ANSI
JOIN
clauses, ANSINEST
clauses, andUNNEST
clauses. However, you cannot chain comma-separated joins with lookupJOIN
andNEST
clauses, or indexJOIN
andNEST
clauses. -
The right-hand side of a comma-separated join can only be a keyspace identifier, a subquery, or a generic expression. This means that comma-separated joins must come after any
JOIN
,NEST
, orUNNEST
clauses.
Examples
To try the examples in this section, set the query context to the inventory
scope in the travel sample dataset.
For more information, see Query Context.
The following query lists every possible combination of the two input objects.
SELECT * FROM [{"abc": 1}, {"abc": 2}, {"abc": 3}] AS a,
[{"xyz": 1}, {"xyz": 2}] AS b;
Compare the query above with the following query using an ANSI join.
SELECT * FROM [{"abc": 1}, {"abc": 2}, {"abc": 3}] AS a
JOIN [{"xyz": 1}, {"xyz": 2}] AS b ON true;
The results of the two queries are the same.
[
{
"a": {
"abc": 1
},
"b": {
"xyz": 1
}
},
{
"a": {
"abc": 1
},
"b": {
"xyz": 2
}
},
{
"a": {
"abc": 2
},
"b": {
"xyz": 1
}
},
{
"a": {
"abc": 2
},
"b": {
"xyz": 2
}
},
{
"a": {
"abc": 3
},
"b": {
"xyz": 1
}
},
{
"a": {
"abc": 3
},
"b": {
"xyz": 2
}
}
]
The following query uses the WHERE clause to define the condition for a comma-separated join.
SELECT a.airportname AS airport, r.id AS route
FROM route AS r,
airport AS a
WHERE a.faa = r.sourceairport
LIMIT 4;
Compare the query above with the following query using an ANSI join.
SELECT a.airportname AS airport, r.id AS route
FROM route AS r
JOIN airport AS a
ON a.faa = r.sourceairport
LIMIT 4;
The results of the two queries are the same.
[
{
"airport": "Lehigh Valley Intl",
"route": 20010
},
{
"airport": "Lehigh Valley Intl",
"route": 20011
},
{
"airport": "Lehigh Valley Intl",
"route": 28856
},
{
"airport": "Lehigh Valley Intl",
"route": 28857
}
]
The following query uses the WHERE clause to define a condition for a comma-separated join and to filter the query.
SELECT a.airportname AS airport, r.id AS route
FROM route AS r,
airport AS a
WHERE a.faa = r.sourceairport
AND r.sourceairport = "SFO"
LIMIT 4;
Compare the query above with the following query using an ANSI join.
SELECT a.airportname AS airport, r.id AS route
FROM route AS r
JOIN airport AS a
ON a.faa = r.sourceairport
WHERE r.sourceairport = "SFO"
LIMIT 4;
The results of the two queries are the same.
[
{
"airport": "San Francisco Intl",
"route": 10624
},
{
"airport": "San Francisco Intl",
"route": 10625
},
{
"airport": "San Francisco Intl",
"route": 11212
},
{
"airport": "San Francisco Intl",
"route": 11213
}
]
The following query uses the USE clause to specify hints for a comma-separated join.
EXPLAIN SELECT a.airportname AS airport, r.id AS route
FROM route AS r,
airport AS a
USE INDEX(def_inventory_airport_faa) NL
WHERE a.faa = r.sourceairport
AND r.sourceairport = "SFO"
LIMIT 4;
Compare the query above with the following query using an ANSI join.
EXPLAIN SELECT a.airportname AS airport, r.id AS route
FROM route AS r
JOIN airport AS a
USE INDEX(def_inventory_airport_faa) NL
ON a.faa = r.sourceairport
WHERE r.sourceairport = "SFO"
LIMIT 4;
The results of the two queries are the same.
[
{
"optimizer_hints": {
"hints_followed": [
"USE_NL(a)",
"INDEX(a def_inventory_airport_faa)"
]
},
// ...
}
]
The following query chains an ANSI join with a comma-separated join.
SELECT l.name AS airline, a.airportname AS airport, r.id AS route
FROM airline AS l
JOIN route AS r
ON META(l).id = r.airlineid,
airport AS a
WHERE a.faa = r.sourceairport
AND r.sourceairport = "SFO"
LIMIT 4;
[
{
"airline": "AirTran Airways",
"airport": "San Francisco Intl",
"route": 25480
},
{
"airline": "AirTran Airways",
"airport": "San Francisco Intl",
"route": 25481
},
{
"airline": "AirTran Airways",
"airport": "San Francisco Intl",
"route": 25482
},
{
"airline": "AirTran Airways",
"airport": "San Francisco Intl",
"route": 25483
}
]
The following query has a lateral correlation between the subquery and the airport
keyspace.
SELECT airport.airportname, t2.name
FROM airport,
(SELECT name FROM hotel WHERE hotel.city = airport.city) AS t2
LIMIT 5;
Compare the query above with the following query using the LATERAL keyword.
SELECT airport.airportname, t2.name
FROM airport,
LATERAL (SELECT name FROM hotel WHERE hotel.city = airport.city) AS t2
LIMIT 5;
The results of the two queries are the same.
[
{
"airportname": "Mandelieu",
"name": "Hotel Cybelle"
},
{
"airportname": "Cote D\\'Azur",
"name": "Best Western Hotel Riviera Nice"
},
{
"airportname": "Cote D\\'Azur",
"name": "Hotel Anis"
},
{
"airportname": "Cote D\\'Azur",
"name": "NH Nice"
},
{
"airportname": "Cote D\\'Azur",
"name": "Hotel Suisse"
}
]