Do not call gporca for simple queries #900

Open
wants to merge 1 commit into
base: main

@leborchuk leborchuk commented Feb 2, 2025

I ran a simple test:

  1. Create a demo cluster.
  2. Execute a simple insert values query 1000 times with gporca disabled.
  3. Repeat step 2 with gporca enabled.
  4. Compare the results: the run with gporca enabled is more than 13x slower (about 10751 ms vs. 801 ms).

Here are the results:

postgres=# set optimizer=off;
SET
Time: 1.657 ms
postgres=# do $$
begin
for i in 1..1000 loop
insert into test values(i);
end loop;
end;
$$;
DO
Time: 801.485 ms
postgres=# set optimizer=on;
SET
Time: 1.540 ms
postgres=# do $$
begin
for i in 1..1000 loop
insert into test values(i);
end loop;
end;
$$;
DO
Time: 10751.109 ms (00:10.751)

Honestly, this is the expected result: integration with gporca involves a large number of copies and transformations.

In this PR, I propose disabling gporca for simple queries such as insert values. Of course, users could do the same manually, but I have not heard of anyone actually doing so. Therefore, it would be great if the database switched to the postgres optimizer whenever a query is too simple to benefit from gporca. For such queries, we know that gporca certainly won't produce a better execution plan.

I formalized this in the enabled_for_optimizer function. We use the postgres optimizer if the query uses none of the following: aggregation, a WITH clause, a recursive clause, or window functions, and the number of relations in the query is less than or equal to optimizer_relations_threshold. Otherwise, we use gporca.
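To make the rule above concrete, here is a minimal sketch of the decision in C. It is not the code from this PR: the QuerySummary struct and the should_use_orca function are hypothetical names made up for illustration, and the actual patch presumably inspects the parsed query tree rather than a flat struct like this one.

```c
#include <stdbool.h>

/*
 * Hypothetical summary of a parsed query (an assumption for this sketch;
 * the real patch would read these properties from the query tree).
 */
typedef struct QuerySummary
{
    bool has_aggregates;    /* query uses aggregation */
    bool has_with_clause;   /* query has a WITH clause */
    bool has_recursion;     /* query has a recursive clause */
    bool has_window_funcs;  /* query uses window functions */
    int  num_relations;     /* number of relations referenced */
} QuerySummary;

/*
 * Returns true when the query should go to gporca, false when the
 * postgres optimizer is enough. relations_threshold plays the role of
 * optimizer_relations_threshold from the description above.
 */
bool
should_use_orca(const QuerySummary *q, int relations_threshold)
{
    /* Any of these features makes the query "not simple". */
    if (q->has_aggregates || q->has_with_clause ||
        q->has_recursion || q->has_window_funcs)
        return true;

    /* Otherwise only the relation count decides. */
    return q->num_relations > relations_threshold;
}
```

With a threshold of 1 in this sketch, a query that touches a single relation and uses none of the listed features stays on the postgres optimizer, while a query that aggregates or touches more relations than the threshold goes to gporca.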

P.S. This was inspired by the conclusions of the article "Integrating the Orca Optimizer into MySQL". One of them was that it is not advisable to use gporca for simple queries. Let's implement this :)

@leborchuk
Contributor Author

Fixed the tests for when gporca is disabled; the test workflow needs to be run once again.

@leborchuk
Contributor Author

Sorry, I see that only the ic-resgroup-v2/resgroup/resgroup_cpu_max_percent test failed here, with the following diff:

diff -I HINT: -I CONTEXT: -I GP_IGNORE: -U3 /__w/cloudberry/cloudberry/src/test/isolation2/expected/resgroup/resgroup_cpu_max_percent.out /__w/cloudberry/cloudberry/src/test/isolation2/results/resgroup/resgroup_cpu_max_percent.out
--- /__w/cloudberry/cloudberry/src/test/isolation2/expected/resgroup/resgroup_cpu_max_percent.out	2025-02-04 06:22:21.746467393 -0800
+++ /__w/cloudberry/cloudberry/src/test/isolation2/results/resgroup/resgroup_cpu_max_percent.out	2025-02-04 06:22:21.754467504 -0800
@@ -224,7 +224,7 @@
 SELECT verify_cpu_usage('rg1_cpu_test', 90, 10);
  verify_cpu_usage 
 ------------------
- t                
+ f                
 (1 row)

But I have no idea why the CPU usage of the resource group rg1_cpu_test differs from 90 by more than 10%. Maybe it has nothing to do with my fixes at all (the test passed the first time).
