Rule Definition
Cartesian joins which lead to cartesian products is an absolute performance killer, especially when large data sets are involved which is the case when extremely large tables are involved.
A cartesian join is a multiple-table query that does not explicitly state a join condition among the tables, it creates a Cartesian product. A Cartesian product consists of every possible combination of rows from the tables. This result set is usually extremely large and unwieldy, killing application performance.
In addition, most of the time, this is also a functional bug.
Remediation
Check the statement and if so, add a join's clause between the tables.
Violation Code Sample
Select * from table1, table2
other exemple:
hypothsis
Select count(*) from publishers
->3 rows
Select count(*) from roysched
->87 rows
Select count(*) from Titles
-> 18 rows
example of cartesian product
Select p.pub_name, p.city, t.title, r.royalty
From
publishers p,
titles t,
roysched r
--> 4698 rows( 87 x 18 x 3)
Select p.pub_name, p.city, t.title, r.royalty
From
publishers p,
titles t,
roysched r
Where
p.pub_id = t.pub_id
--> 1566 rows ( 87 x 18)
Select p.pub_name, p.city, t.title, r.royalty
From
publishers p,
titles t,
roysched r
Where
p.pub_id = t.pub_id AND
r.royalty = 14
--> 252 rows
Fixed Code Sample
Select * from table1 T1 inner join table2 T2 on (T1.C = T2.C)
example of non cartesian product:
Select p.pub_name, p.city, t.title, r.royalty
From
publishers p,
titles t,
roysched r
Where
p.pub_id = t.pub_id AND
t.title_id = r.title_id
--> 87 rows
Related Technologies
Technical Criterion
Efficiency - SQL and Data Handling Performance
About CAST Appmarq
CAST Appmarq is by far the biggest repository of data about real IT systems. It's built on thousands of analyzed applications, made of 35 different technologies, by over 300 business organizations across major verticals. It provides IT Leaders with factual key analytics to let them know if their applications are on track.