Well, actually, this note is a continuation of a model series on reports.

So the structure of an article is the same: the problem description -> non-model solution -> model clause solution.

But also this note is very similar to summation of tree values, I've written a little bit earlier.

Input data:

So let's get started.

Our input for this post would be a table t from here and an additional table t_hrchy with a hierarchical structure of departments:

SQL> drop table t_hrchy;So this table tells us that DEP1 is a parent department for DEP2, DEP2 is a parent department for DEP3 and DEP3 is a parent department for DEP4.

Table dropped

SQL>

SQL> create table t_hrchy as (select 'DEP2' dep, 'DEP1' dep_par from dual union all

2 select 'DEP3' dep, 'DEP2' dep_par from dual union all

3 select 'DEP4' dep, 'DEP3' dep_par from dual)

4 /

Table created

SQL> select * from t_hrchy;

DEP DEP_PAR

---- -------

DEP2 DEP1

DEP3 DEP2

DEP4 DEP3

SQL>

Problem:

Now our task would be to create a report where in one column would be sales of a particular department output and in another column we need to get the sales of a current department and for all the departments that are under the current one in the organization structure.

The data should be consolidated only inside the same year and inside the same city.

Non-Model solution:

When the matter concerns the hierarchical structure the first thing that pops up in our head is hierarchical queries with CONNECT BY clause.

Elaborating this a little bit we can use the following query proposed by Rob van Wijk here or here:

SQL> select to_char(year,'yyyy') year,Lets see what this query actually does.

2 city,

3 t.dep,

4 th.dep_par,

5 sales,

6 sum(connect_by_root sales) tot_sales

7 from t, t_hrchy th

8 where t.dep = th.dep(+)

9 connect by t.dep = prior dep_par

10 and prior year = year

11 and prior city = city

12 group by city, year, t.dep, dep_par, sales

13 order by 1,2,3;

YEAR CITY DEP DEP_PAR SALES TOT_SALES

---- ------ ---- ------- ---------- ----------

2006 Moscow DEP1 562 2254

2006 Moscow DEP2 DEP1 457 1692

2006 Moscow DEP3 DEP2 890 1235

2006 Moscow DEP4 DEP3 345 345

2006 Omsk DEP1 23 264

2006 Omsk DEP2 DEP1 154 241

2006 Omsk DEP3 DEP2 87 87

2007 Moscow DEP1 762 2926

2007 Moscow DEP2 DEP1 657 2164

2007 Moscow DEP3 DEP2 1020 1507

2007 Moscow DEP4 DEP3 487 487

2007 Omsk DEP1 34 403

2007 Omsk DEP2 DEP1 213 369

2007 Omsk DEP3 DEP2 156 156

14 rows selected

SQL>

First we use an outer join of two tables - so for each row we get a parent department name.

Then we build a hierarchy by using

connect by t.dep = prior dep_parMention that hierarchy is built in the backward direction starting from the children and going up to their parents.

Also we add in the connect by clause the following:

and prior year = yearSo we build hierarchy separartely for different years and cities.

and prior city = city

By using the operator

connect_by_root salesin every row we find a value for sales of the root elements for this or that thread in a hierarchy. As we didn't specify any START WITH condition - the hierarchy is built starting from all the departments present in the data.

For example, let's look what would happen inside the partition for 2007 year Moscow city:

SQL> select to_char(year,'yyyy') year,As you see DEP1 is met four times:

2 city,

3 t.dep,

4 th.dep_par,

5 sales,

6 connect_by_root sales root_sales,

7 level,

8 sys_connect_by_path(t.dep,'/') hrchy_path

9 from t, t_hrchy th

10 where t.dep = th.dep(+)

11 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')

12 and t.city='Moscow'

13 connect by t.dep = prior dep_par

14 and prior year = year

15 and prior city = city

16 /

YEAR CITY DEP DEP_PAR SALES ROOT_SALES LEVEL HRCHY_PATH

---- ------ ---- ------- ---------- ---------- ---------- ----------------------------

2007 Moscow DEP1 762 762 1 /DEP1

2007 Moscow DEP2 DEP1 657 657 1 /DEP2

2007 Moscow DEP1 762 657 2 /DEP2/DEP1

2007 Moscow DEP3 DEP2 1020 1020 1 /DEP3

2007 Moscow DEP2 DEP1 657 1020 2 /DEP3/DEP2

2007 Moscow DEP1 762 1020 3 /DEP3/DEP2/DEP1

2007 Moscow DEP4 DEP3 487 487 1 /DEP4

2007 Moscow DEP3 DEP2 1020 487 2 /DEP4/DEP3

2007 Moscow DEP2 DEP1 657 487 3 /DEP4/DEP3/DEP2

2007 Moscow DEP1 762 487 4 /DEP4/DEP3/DEP2/DEP1

10 rows selected

SQL>

- Firstly it stands for itself (level=1).

- Then we begin to build hierarchy starting from DEP2 and DEP1 is met as a "child" (remember we build a hierarchy in the opposite direction) for DEP2 (level=2)

- The next time we meet it as a "grandchild" for DEP3 (level=3)

- And finally as a "great-grandchild" for DEP4 (level=4).

In each row we have a value of sales for the root department. So if we group data by department and put a sum() aggregate function on connect_by_root(sales) column we get the sales of DEP1 and all the sales of all departments that are under DEP1 in one row.

The same stuff will happen to all the other departments:

SQL> select to_char(year,'yyyy') year,

2 city,

3 t.dep,

4 th.dep_par,

5 sales,

6 sum(connect_by_root sales) tot_salesh

7 from t, t_hrchy th

8 where t.dep = th.dep(+)

9 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')

10 and t.city='Moscow'

11 connect by t.dep = prior dep_par

12 and prior year = year

13 and prior city = city

14 group by year,city,t.dep,th.dep_par,sales

15 order by 1,2,3

16 /

YEAR CITY DEP DEP_PAR SALES TOT_SALESH

---- ------ ---- ------- ---------- ----------

2007 Moscow DEP1 762 2926

2007 Moscow DEP2 DEP1 657 2164

2007 Moscow DEP3 DEP2 1020 1507

2007 Moscow DEP4 DEP3 487 487

SQL>

Model Solution

Now we are proceeding and will solve the same problem using Model clause without any connect by's.

To get less data to be outputed at every step - let's restrict our data to 2007 year and Moscow city only as we did in Non-Model solution:

SQL> select * from t, t_hrchy thLet's think of what woud be placed in PARTITION BY part of our model. Remeber that partitions should be absolutely independent one of another.

2 where t.dep = th.dep(+)

3 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')

4 and t.city='Moscow'

5 order by 3,1,2

6 /

CITY DEP YEAR SALES DEP DEP_PAR

------ ---- ----------- ---------- ---- -------

Moscow DEP1 31.12.2007 762

Moscow DEP2 31.12.2007 657 DEP2 DEP1

Moscow DEP3 31.12.2007 1020 DEP3 DEP2

Moscow DEP4 31.12.2007 487 DEP4 DEP3

SQL>

Yes - we would put YEAR and CITY there. As a task instructs us to treat sales of different years and cities separately.

In dimension we would place DEP and DEP_PART columns.

And in measures - we'll put SALES, of course. And create one more measure called TOT_SALES with initial value of zero.

SQL> select * from t, t_hrchy th

2 where t.dep = th.dep(+)

3 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')

4 and t.city='Moscow'

5 model

6 partition by (year,city)

7 dimension by (t.dep dep,dep_par)

8 measures(sales, 0 tot_sales)

9 ()

10 order by 1,2,3

11 /

YEAR CITY DEP DEP_PAR SALES TOT_SALES

----------- ------ ---- ------- ---------- ----------

31.12.2007 Moscow DEP1 762 0

31.12.2007 Moscow DEP2 DEP1 657 0

31.12.2007 Moscow DEP3 DEP2 1020 0

31.12.2007 Moscow DEP4 DEP3 487 0

SQL>

Now we are going to write a rule for calculating the tot_sales.

For any cell in our model the tot_sales should be equal to a sum of sales of the current department and all the tot_sales of all departments under the current.

We can write it as:

tot_sales[any,any] = sales[CV(),CV()]+nvl(sum(tot_sales)[any,CV(dep)],0)

BTW it would be correct in case when the tot_sales of the departments under the current one already included all the tot_sales of the departments under them.

So what is important here is the ORDER in what the rules are executed.

If we don't specify anything about the order of execution to our model it uses the default SEQUENTIAL ORDER and can easily throw out an error for us:

SQL> select * from t, t_hrchy th

2 where t.dep = th.dep(+)

3 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')

4 and t.city='Moscow'

5 model

6 partition by (year,city)

7 dimension by (t.dep dep,dep_par)

8 measures(sales, 0 tot_sales)

9 rules

10 (tot_sales[any,any] = sales[CV(),CV()]+nvl(sum(tot_sales)[any,CV(dep)],0))

11 order by 1,2,3

12 /

...

ORA-32637: Self cyclic rule in sequential order MODEL

SQL>

But putting AUTOMATIC ORDER causes all rules to be evaluated in an order based on their logical dependencies.

Let's check:

SQL> select * from t, t_hrchy th

2 where t.dep = th.dep(+)

3 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')

4 and t.city='Moscow'

5 model

6 partition by (year,city)

7 dimension by (t.dep dep,dep_par)

8 measures(sales, 0 tot_sales)

9 rules automatic order

10 (tot_sales[any,any] = sales[CV(),CV()]+nvl(sum(tot_sales)[any,CV(dep)],0))

11 order by 1,2,3

12 /

YEAR CITY DEP DEP_PAR SALES TOT_SALES

----------- ------ ---- ------- ---------- ----------

31.12.2007 Moscow DEP1 762 2926

31.12.2007 Moscow DEP2 DEP1 657 2164

31.12.2007 Moscow DEP3 DEP2 1020 1507

31.12.2007 Moscow DEP4 DEP3 487 487

SQL>

VoilĂ ! This is exactly what we were looking for.

Let's find a detailed description of what's happening when we don't use AUTOMATIC ORDER and what's changing when we begin to use it.

First of all, the equivalent of our single rule would be actually a list of four rules for every row (unique set of dimension values) in our data:

tot_sales['DEP1', null ]=sales['DEP1', null ]+nvl(sum(tot_sales)[any,'DEP1'],0),

tot_sales['DEP2','DEP1']=sales['DEP2','DEP1']+nvl(sum(tot_sales)[any,'DEP2'],0),

tot_sales['DEP3','DEP2']=sales['DEP3','DEP2']+nvl(sum(tot_sales)[any,'DEP3'],0),

tot_sales['DEP4','DEP3']=sales['DEP4','DEP3']+nvl(sum(tot_sales)[any,'DEP4'],0)

But using SEQUENTIAL ORDER doesn't guarantee us any order of these rules execution. Because as it stated in the documetation: SEQUENTIAL ORDER: This causes rules to be evaluated in the order they are written. This is the default.

But we have only one rule in our model. So the detailed rules can be run in any order:

for example first we find tot_sales['DEP1', null ] and then tot_sales['DEP2','DEP1'], or vice versa. And that is the problem, because this two possibilities will give us different results in the output.

What actually ORA-32637 stays for is:

Cause: A self-cyclic rule was detected in the sequential order MODEL. Sequential order MODELs cannot have self cyclic rules to guarantee that the results do not depend on the order of evaluation of the cells that are updated or upserted.

Action: Use ordered rule evaluation for this rule.

So if use order by DEP we shouldn't have such an error, let's try:

SQL> select * from t, t_hrchy thYes, the query returns no error - because we explicitly defined the order of rules execution. But as you see we have wrong results. Because actually in our case we should order by dep DESC! You can try it yourself.

2 where t.dep = th.dep(+)

3 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')

4 and t.city='Moscow'

5 model

6 partition by (year,city)

7 dimension by (t.dep dep,dep_par)

8 measures(sales, 0 tot_sales)

9 rules

10 (tot_sales[any,any] order by dep = sales[CV(),CV()]+nvl(sum(tot_sales)[any,CV(dep)],0))

11 order by 1,2,3

12 /

YEAR CITY DEP DEP_PAR SALES TOT_SALES

----------- ------ ---- ------- ---------- ----------

31.12.2007 Moscow DEP1 762 762

31.12.2007 Moscow DEP2 DEP1 657 657

31.12.2007 Moscow DEP3 DEP2 1020 1020

31.12.2007 Moscow DEP4 DEP3 487 487

SQL>

While we don't know what are the dependences in the hierarchy data beforehand - let's give Oracle to decide - what are the dependencies between cells and simply use AUTOMATIC ORDER:

SQL> select * from t, t_hrchy th

2 where t.dep = th.dep(+)

3 model

4 partition by (to_char(year,'yyyy') year,city)

5 dimension by (t.dep dep,dep_par)

6 measures(sales, 0 tot_sales)

7 rules automatic order

8 (tot_sales[any,any] = sales[CV(),CV()]+nvl(sum(tot_sales)[any,CV(dep)],0))

9 order by 1,2,3

10 /

YEAR CITY DEP DEP_PAR SALES TOT_SALES

---- ------ ---- ------- ---------- ----------

2006 Moscow DEP1 562 2254

2006 Moscow DEP2 DEP1 457 1692

2006 Moscow DEP3 DEP2 890 1235

2006 Moscow DEP4 DEP3 345 345

2006 Omsk DEP1 23 264

2006 Omsk DEP2 DEP1 154 241

2006 Omsk DEP3 DEP2 87 87

2007 Moscow DEP1 762 2926

2007 Moscow DEP2 DEP1 657 2164

2007 Moscow DEP3 DEP2 1020 1507

2007 Moscow DEP4 DEP3 487 487

2007 Omsk DEP1 34 403

2007 Omsk DEP2 DEP1 213 369

2007 Omsk DEP3 DEP2 156 156

14 rows selected

SQL>

to be continued...

## 3 comments:

Voder,

This method works, but it takes som time to process, I wasn't able to get your connect by example to work above. Can you give me some tips on how to sum multiple columns in this type of query with out a seperate select for each column? In my mind it should be able to do it in one pass...

select rpad('*', 2*level, '*') || g3e_fid || ' (' || g3e_fno || ')',

active_customer_count_a,

--Active Customer Count ABC

(select sum(active_customer_count_a)

from pgn_lbs_temp t2

start with t2.g3e_fid = t1.g3e_fid

connect by prior t2.g3e_fid = t2.g3e_sourcefid) sum_active_customer_count_a,

(select sum(active_customer_count_b)

from pgn_lbs_temp t3

start with t3.g3e_fid = t1.g3e_fid

connect by prior t3.g3e_fid = t3.g3e_sourcefid) sum_active_customer_count_b,

(select sum(active_customer_count_c)

from pgn_lbs_temp t4

start with t4.g3e_fid = t1.g3e_fid

connect by prior t4.g3e_fid = t4.g3e_sourcefid) sum_active_customer_count_c,

--Summer1 ABC

(select sum(load_hist_summer1_a)

from pgn_lbs_temp t5

start with t5.g3e_fid = t1.g3e_fid

connect by prior t5.g3e_fid = t5.g3e_sourcefid) load_hist_summer1_a,

(select sum(load_hist_summer1_b)

from pgn_lbs_temp t6

start with t6.g3e_fid = t1.g3e_fid

connect by prior t6.g3e_fid = t6.g3e_sourcefid) load_hist_summer1_b,

(select sum(load_hist_summer1_c)

from pgn_lbs_temp t7

start with t7.g3e_fid = t1.g3e_fid

connect by prior t7.g3e_fid = t7.g3e_sourcefid) load_hist_summer1_c,

--Winter1 ABC

(select sum(load_hist_winter1_a)

from pgn_lbs_temp t8

start with t8.g3e_fid = t1.g3e_fid

connect by prior t8.g3e_fid = t8.g3e_sourcefid) load_hist_winter1_a,

(select sum(load_hist_winter1_b)

from pgn_lbs_temp t9

start with t9.g3e_fid = t1.g3e_fid

connect by prior t9.g3e_fid = t9.g3e_sourcefid) load_hist_winter1_b,

(select sum(load_hist_winter1_c)

from pgn_lbs_temp t10

start with t10.g3e_fid = t1.g3e_fid

connect by prior t10.g3e_fid = t10.g3e_sourcefid) load_hist_winter1_c

from pgn_lbs_temp t1

start with t1.g3e_fid = 32352823

connect by prior g3e_fid = g3e_sourcefid;

I think the following example will help you to understand how to achieve the needed behavior:

SQL> with t as (select 1 id, null parent_id, 100 value1, 20 value2 from dual union all

2 select 2 id, 1 parent_id, 50 value1, 10 value2 from dual union all

3 select 3 id, 2 parent_id, 5 value1, 30 value2 from dual)

4 --

5 select id_new id,

6 max(decode(id, id_new, value1)) value1,

7 max(decode(id, id_new, value2)) value2,

8 sum(value1),

9 sum(value2)

10 from (select t.*, connect_by_root(id) id_new, level

11 from t

12 connect by prior id = parent_id)

13 group by id_new

14 /

ID VALUE1 VALUE2 SUM(VALUE1) SUM(VALUE2)

---------- ---------- ---------- ----------- -----------

1 100 20 155 60

2 50 10 55 40

3 5 30 5 30

SQL>

Volder,

Thank you for your example that helped me out immensly.

Here is my question, it appears that when I use the with statement that queries two views the explain plain cost goes to 168 million. When I extract the data into a temporary table the explain plan cost goes to 4. Any ideas on why I can't do it with my views in the with statement?

WITH temp AS (

SELECT * FROM PGN_LBS_TEMP

/*

SELECT

tr.g3e_tno,

tr.g3e_fid,

tr.g3e_fno,

tr.g3e_sourcefid,

nvl(l.active_customer_count_a, 0) active_customer_count_a,

nvl(l.active_customer_count_b, 0) active_customer_count_b,

nvl(l.active_customer_count_c, 0) active_customer_count_c,

nvl(l.load_hist_summer1_a, 0) load_hist_summer1_a,

nvl(l.load_hist_summer1_b, 0) load_hist_summer1_b,

nvl(l.load_hist_summer1_c, 0) load_hist_summer1_c,

nvl(l.load_hist_winter1_a, 0) load_hist_winter1_a,

nvl(l.load_hist_winter1_b, 0) load_hist_winter1_b,

nvl(l.load_hist_winter1_c, 0) load_hist_winter1_c,

nvl(l.load_hist_summer2_a, 0) load_hist_summer2_a,

nvl(l.load_hist_summer2_b, 0) load_hist_summer2_b,

nvl(l.load_hist_summer2_c, 0) load_hist_summer2_c,

nvl(l.load_hist_winter2_a, 0) load_hist_winter2_a,

nvl(l.load_hist_winter2_b, 0) load_hist_winter2_b,

nvl(l.load_hist_winter2_c, 0) load_hist_winter2_c

FROM gisadmin.traceresult tr

LEFT OUTER JOIN gisadmin.lbs_custload_history_3_phase l ON (tr.g3e_fid = l.g3e_fid AND tr.g3e_fno IN (314, 244, 307, 237))

WHERE tr.g3e_tno = (SELECT g3e_id FROM traceid WHERE g3e_name = 'X134 Trace')

*/

)

SELECT

g3e_fid_new,

sum(active_customer_count_a),

sum(active_customer_count_b),

sum(active_customer_count_c),

sum(load_hist_summer1_a),

sum(load_hist_summer1_b),

sum(load_hist_summer1_c)

from (

select temp.*, connect_by_root(g3e_fid) g3e_fid_new, level

from temp

start with temp.g3e_fid = 32352823

connect by prior g3e_fid = g3e_sourcefid

)

where g3e_fid_new = 32352823

group by g3e_fid_new

;

Post a Comment