PieCloudDB Database’s New Generation Optimizer 「Dutch」：Specifically Designed for Cloud-Native and Distributed Environments

MARCH 22ND, 2023

The new optimizer developed by PieCloudDB Database called "Dutch." The name "Dutch" is derived from a popular game among young people, "Red Dead Redemption." In the game, there is an NPC character named "Dutch" whose catchphrase is "I have a plan," which aligns perfectly with the primary function of an optimizer—to plan and optimize database operations.

The mascot of OpenPie: "Pai Pai"

The optimizer is an essential component of a database system. It is responsible for analyzing, optimizing, and generating execution plans for user queries, ensuring that query results are returned in the fastest and most efficient manner possible. By generating the optimal query execution plan, the optimizer aims to optimize query performance. The quality of the execution plan often leads to significant differences in performance. The optimizer "Dutch" built by PieCloudDB, incorporates numerous optimization features. As the "think tank" of the database system, it helps enhance the performance of PieCloudDB.

The Optimization Features of the Four Major Stages

Similar to PostgreSQL, the query optimization process in PieCloudDB is generally divided into four stages: pre-processing stage, scan/join optimization stage, optimization stage beyond scan/join, and post-processing stage. "Dutch" has made significant optimizations in all four processing stages.

Stage 1:Pre-processing Stage

During the pre-processing stage, the optimizer "Dutch" applies logical equivalence transformations to simplify and optimize the query tree. Since statistical information to assist in cost estimation is not yet available, this stage typically relies on proven rules to distribute constraints, simplify expressions, and eliminate unnecessary joins in the expression and join trees.

Converting IN and EXISTS Subqueries into Semi-joins

PieCloudDB categorizes subqueries into SubLink and SubQuery based on their location and purpose. Sublinks are often accompanied by predicate verbs such as ANY/ALL/EXISTS and appear in constraints like WHERE/ON. If sublinks are processed as is, it can impact query efficiency, and the optimization space is limited due to the generation of SubPlan. Therefore, during the preprocessing stage of query optimization, PieCloudDB strives to convert sublinks into semi-joins or anti-joins, allowing for greater optimization potential.

Consider the following SQL query as an example:

SELECT … FROM foo WHERE EXISTS (SELECT 1 FROM bar WHERE foo.a = bar.c);

Whereas the EXISTS clause contains a subquery, PieCloudDB will transform it into a Semi-Join during the preprocessing stage:

SELECT ... FROM foo *SEMI JOIN* bar ON foo.a = bar.c;

Optimizing Subqueries

Statement after the FROM is a subquery. Without optimization, when executing queries with such subqueries, a separate plan will be created for the subquery, generating a Sub-Query scan, which will then be joined with the Parent Query. This approach often fails to find the optimal solution and incurs significant query costs.

Consider the following example: if no optimization is applied, the subqueries "bar" and "baz" will be joined first. Since there is no join condition, a Cartesian product will be formed between "bar" and "baz," and then joined with the outer "foo" query:

SELECT * FROM foo JOIN (SELECT bar.c FROM bar JOIN baz ON TRUE) AS sub ON foo.a = sub.c;

After the optimization, "bar" and "baz" are placed on the same level, allowing the join between "foo" and "bar" to be executed first, followed by the join with "baz." This reordering of joins results in a lower cost:

SELECT * FROM foo JOIN (bar JOIN baz ON TRUE) ON foo.a = bar.c;

Transforming Outer Joins into Inner Joins or Anti-joins

Outer joins have limitations in terms of predicate pushdown and join order selection. Therefore, in the pre-processing stage, "Dutch" strives to convert outer joins into inner joins or anti joins.

In the following SQL statement, the LEFT JOIN generates tuples with NULL values, and the equality condition in the WHERE clause acts as a strict constraint, indicating that if the input is NULL, the output must also be NULL or FALSE. If the column on "bar" contains a NULL value, the filtering result of "bar.d = 42" will be FALSE. This means that the tuples filled with NULL values, generated by the LEFT JOIN, are filtered out by the WHERE condition. In this case, the LEFT JOIN effectively becomes an INNER JOIN in terms of semantics.

SELECT ... FROM foo LEFT JOIN bar ON (...) WHERE bar.d = 42;

In such situations, the PieCloudDB optimizer "Dutch" automatically identifies these types of queries and utilizes the optimization opportunity to transform outer joins into inner joins.

SELECT ... FROM foo INNER JOIN bar ON (...) WHERE bar.d = 42;

For certain cases of outer joins, the PieCloudDB optimizer converts them into anti-joins during the preprocessing stage. Let's consider the following SQL statement as an example:

SELECT * FROM foo LEFT JOIN bar ON foo.a = bar.c WHERE bar.c IS NULL;

Similar to the previous example, the LEFT JOIN also generates a result set with many tuples filled with NULL values. In this case, the WHERE condition restricts the result to only include rows where bar.c is NULL. This transforms the LEFT JOIN semantically into an anti-join. PieCloudDB automatically detects this optimization opportunity during the preprocessing stage and converts the outer join into an anti join:

SELECT * FROM foo ANTI JOIN bar on foo.a = bar.c;

In addition to these optimizations, during the preprocessing stage, the "Dutch" optimizer implements several other optimizations, including:

Distribution of constraint conditions
Construction of equivalence classes
Collection of outer join information
Elimination of redundant joins
Simplification of expressions
...

Stage 2:Scan/Join Optimization Stage

The scan/join optimization stage can be considered the most complex stage in the optimizer's processing. In this stage, the "Dutch" optimizer, driven by cost, handles the FROM and WHERE parts of the query, while also taking into account the information from the ORDER BY clause.

In this stage, the "Dutch" optimizer's processing can be divided into two steps. Firstly, it generates scan paths for the base tables and calculates the cost and result set size of these scan paths, which provides the cost for join operations. In the second step, "Dutch" searches the entire space of join orders to generate the optimal join path for the join operations. This step has a high complexity (n!), and PieCloudDB employs dynamic programming and genetic algorithms to handle it, selecting the algorithm based on GUC values. If the query involves outer joins, considering the constraints imposed by outer joins on the join order, the complexity of this step is further increased as the join order cannot be freely switched like in the case of inner joins.

Stage 3:Optimization Stage beyond Scan/Join

Compared to the second stage, although this stage involves handling multiple tasks, its complexity is relatively lower. In this stage, "Dutch" first processes the Group By, aggregation, window functions, and DISTINCT operations. It then handles set operations and, finally, processes the ORDER BY clause. Each of these steps generates one or more paths, and "Dutch" filters these paths based on their cost. Additionally, it adds LockRows, Limit, and ModifyTable operations to the selected paths.

Stage 4:Post-processing Stage

After the previous three stages, "Dutch" has generated an approximate query plan. In the post-processing stage, "Dutch" converts the selected optimal path into a query plan and makes some adjustments to the optimal plan.

Distributed and Cloud-Native Optimization Features

In addition to the mentioned optimization features, to meet the performance requirements for querying cloud-based data, the PieCloudDB optimizer "Dutch" has implemented numerous optimizations and improvements for complex query scenarios, incorporating advanced distributed and cloud-native capabilities.

The Distributed Features of "Dutch"

Building on the foundation of the aforementioned optimization features, the "Dutch" optimizer extends its capabilities with a wide range of optimizations specifically designed for distributed environments. Firstly, "Dutch" introduces the concept of Motion, which enables the movement of data across different Executors. By leveraging Motion, "Dutch" can generate distributed query plans. These query plans are divided into smaller units and distributed to different Executors for parallel execution. With parallel execution, many complex queries can be further optimized. For example, in the case of aggregate operations, the distributed advantages can be harnessed to improve performance through multi-stage aggregation across Executors.

To illustrate the concept of multi-stage aggregation, let's consider the following SQL query. The "Dutch" optimizer would generate a query plan as follows:

# explain (costs off) select sum(distinct b) from t group by a;  
 
                                   QUERY PLAN  
------------------------------------------------------------------  
 Gather Motion 3:1  (slice1; segments: 3)  
   ->  HashAggregate  
         Group Key: a  
         ->  HashAggregate  
               Group Key: a, b  
               ->  Redistribute Motion 3:3  (slice2; segments: 3)  
                     Hash Key: a  
                     ->  Streaming HashAggregate  
                           Group Key: a, b  
                           ->  Seq Scan on t

Due to the presence of a distinct operation, the "Dutch" optimizer will perform three stages of aggregation. In the first stage, PieCloudDB will perform a local aggregation on each execution node, using columns a and b as the group key. Next, utilizing Motion, a Reshuffle operation is performed to redistribute the data across the nodes and another aggregation is carried out on the reshuffled data to achieve global distinctness. Finally, the final aggregation is performed based on the group key to obtain the desired query result.

The Cloud-Native Features of "Dutch"

Due to the adoption of object storage in PieCloudDB's storage engine, "JANM," the optimizer "Dutch" has implemented several advanced optimizations, including Aggregate Pushdown, Block Skipping, and precomputation. In this section, we will focus on aggregate pushdown. Stay tuned for upcoming content on other cloud-native features offered by "Dutch".

PieCloudDB implements Aggregate Pushdown, which effectively reduces data transmission and processing in the query execution plan, thereby improving query efficiency. In analytical scenarios, aggregation operations such as SUM, AVG, MAX, and MIN are common and used to perform aggregate calculations on database tables. In most data warehouses, when processing aggregation operations, it is typically required to first scan and join the tables, and then calculate the aggregate functions. In cases where the data volume is extremely large, this approach can lead to lower query performance.

The aggregate pushdown optimization strategy implemented by the PieCloudDB optimizer "Dutch" allows for pushing down aggregate operations before the join operations, significantly reducing the amount of data processed during the join. Through testing, it has been observed that this optimization can result in performance improvements of several hundred to even thousands of times in certain scenarios.

Taking the following SQL query as an example, this query involves joining the tables t1 and t2 and then grouping the result based on t1.a to obtain the average value of t2.c. Without the aggregate pushdown optimization, the query would perform the join operation between t1 and t2 first and then perform the aggregation based on t1.a. In this case, if both t1 and t2 tables are large, the cost of the join operation would be high and could impact performance. However, with the aggregate pushdown optimization, the aggregation operation is performed before the join operation. If t2.b exhibits significant aggregation characteristics, the amount of data to be processed is significantly reduced. As a result, the performance is greatly improved when performing the join operation.

# EXPLAIN (COSTS OFF) SELECT t1.a, avg(t2.c) FROM t1 JOIN t2 ON t1.b = t2.b GROUP BY t1.a;  
                                  QUERY PLAN  
-------------------------------------------------------------------  
 Gather Motion 3:1  (slice1; segments: 3)  
   ->  Finalize GroupAggregate  
         Group Key: t1.a  
         ->  Sort  
               Sort Key: t1.a  
               ->  Redistribute Motion 3:3  (slice2; segments: 3)  
                     Hash Key: t1.a  
                     ->  Hash Join  
                           Hash Cond: (t1.b = t2.b)  
                           ->  Seq Scan on t1  
                           ->  Hash  
                               >  Broadcast Motion 3:3  (slice3; segments: 3)  
                                  ->  Partial HashAggregate  
                                      Group Key: t2.b  
                                       ->  Seq Scan on t2

The query optimizer is one of the most important and complex components of a database system. As a cloud-native virtual data warehouse, PieCloudDB continuously refines its optimizer, known as "Dutch" to drive performance improvements while ensuring efficient and stable operation of the database system.

Related Blogs:

no related blog