Well, actually, this note is a continuation of a model series on reports.
So the structure of an article is the same: the problem description -> non-model solution -> model clause solution.
But also this note is very similar to summation of tree values, I've written a little bit earlier.
Input data:
So let's get started.
Our input for this post would be a table t from here and an additional table t_hrchy with a hierarchical structure of departments:
SQL> drop table t_hrchy;So this table tells us that DEP1 is a parent department for DEP2, DEP2 is a parent department for DEP3 and DEP3 is a parent department for DEP4.
Table dropped
SQL>
SQL> create table t_hrchy as (select 'DEP2' dep, 'DEP1' dep_par from dual union all
2 select 'DEP3' dep, 'DEP2' dep_par from dual union all
3 select 'DEP4' dep, 'DEP3' dep_par from dual)
4 /
Table created
SQL> select * from t_hrchy;
DEP DEP_PAR
---- -------
DEP2 DEP1
DEP3 DEP2
DEP4 DEP3
SQL>
Problem:
Now our task would be to create a report where in one column would be sales of a particular department output and in another column we need to get the sales of a current department and for all the departments that are under the current one in the organization structure.
The data should be consolidated only inside the same year and inside the same city.
Non-Model solution:
When the matter concerns the hierarchical structure the first thing that pops up in our head is hierarchical queries with CONNECT BY clause.
Elaborating this a little bit we can use the following query proposed by Rob van Wijk here or here:
SQL> select to_char(year,'yyyy') year,Lets see what this query actually does.
2 city,
3 t.dep,
4 th.dep_par,
5 sales,
6 sum(connect_by_root sales) tot_sales
7 from t, t_hrchy th
8 where t.dep = th.dep(+)
9 connect by t.dep = prior dep_par
10 and prior year = year
11 and prior city = city
12 group by city, year, t.dep, dep_par, sales
13 order by 1,2,3;
YEAR CITY DEP DEP_PAR SALES TOT_SALES
---- ------ ---- ------- ---------- ----------
2006 Moscow DEP1 562 2254
2006 Moscow DEP2 DEP1 457 1692
2006 Moscow DEP3 DEP2 890 1235
2006 Moscow DEP4 DEP3 345 345
2006 Omsk DEP1 23 264
2006 Omsk DEP2 DEP1 154 241
2006 Omsk DEP3 DEP2 87 87
2007 Moscow DEP1 762 2926
2007 Moscow DEP2 DEP1 657 2164
2007 Moscow DEP3 DEP2 1020 1507
2007 Moscow DEP4 DEP3 487 487
2007 Omsk DEP1 34 403
2007 Omsk DEP2 DEP1 213 369
2007 Omsk DEP3 DEP2 156 156
14 rows selected
SQL>
First we use an outer join of two tables - so for each row we get a parent department name.
Then we build a hierarchy by using
connect by t.dep = prior dep_parMention that hierarchy is built in the backward direction starting from the children and going up to their parents.
Also we add in the connect by clause the following:
and prior year = yearSo we build hierarchy separartely for different years and cities.
and prior city = city
By using the operator
connect_by_root salesin every row we find a value for sales of the root elements for this or that thread in a hierarchy. As we didn't specify any START WITH condition - the hierarchy is built starting from all the departments present in the data.
For example, let's look what would happen inside the partition for 2007 year Moscow city:
SQL> select to_char(year,'yyyy') year,As you see DEP1 is met four times:
2 city,
3 t.dep,
4 th.dep_par,
5 sales,
6 connect_by_root sales root_sales,
7 level,
8 sys_connect_by_path(t.dep,'/') hrchy_path
9 from t, t_hrchy th
10 where t.dep = th.dep(+)
11 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')
12 and t.city='Moscow'
13 connect by t.dep = prior dep_par
14 and prior year = year
15 and prior city = city
16 /
YEAR CITY DEP DEP_PAR SALES ROOT_SALES LEVEL HRCHY_PATH
---- ------ ---- ------- ---------- ---------- ---------- ----------------------------
2007 Moscow DEP1 762 762 1 /DEP1
2007 Moscow DEP2 DEP1 657 657 1 /DEP2
2007 Moscow DEP1 762 657 2 /DEP2/DEP1
2007 Moscow DEP3 DEP2 1020 1020 1 /DEP3
2007 Moscow DEP2 DEP1 657 1020 2 /DEP3/DEP2
2007 Moscow DEP1 762 1020 3 /DEP3/DEP2/DEP1
2007 Moscow DEP4 DEP3 487 487 1 /DEP4
2007 Moscow DEP3 DEP2 1020 487 2 /DEP4/DEP3
2007 Moscow DEP2 DEP1 657 487 3 /DEP4/DEP3/DEP2
2007 Moscow DEP1 762 487 4 /DEP4/DEP3/DEP2/DEP1
10 rows selected
SQL>
- Firstly it stands for itself (level=1).
- Then we begin to build hierarchy starting from DEP2 and DEP1 is met as a "child" (remember we build a hierarchy in the opposite direction) for DEP2 (level=2)
- The next time we meet it as a "grandchild" for DEP3 (level=3)
- And finally as a "great-grandchild" for DEP4 (level=4).
In each row we have a value of sales for the root department. So if we group data by department and put a sum() aggregate function on connect_by_root(sales) column we get the sales of DEP1 and all the sales of all departments that are under DEP1 in one row.
The same stuff will happen to all the other departments:
SQL> select to_char(year,'yyyy') year,
2 city,
3 t.dep,
4 th.dep_par,
5 sales,
6 sum(connect_by_root sales) tot_salesh
7 from t, t_hrchy th
8 where t.dep = th.dep(+)
9 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')
10 and t.city='Moscow'
11 connect by t.dep = prior dep_par
12 and prior year = year
13 and prior city = city
14 group by year,city,t.dep,th.dep_par,sales
15 order by 1,2,3
16 /
YEAR CITY DEP DEP_PAR SALES TOT_SALESH
---- ------ ---- ------- ---------- ----------
2007 Moscow DEP1 762 2926
2007 Moscow DEP2 DEP1 657 2164
2007 Moscow DEP3 DEP2 1020 1507
2007 Moscow DEP4 DEP3 487 487
SQL>
Model Solution
Now we are proceeding and will solve the same problem using Model clause without any connect by's.
To get less data to be outputed at every step - let's restrict our data to 2007 year and Moscow city only as we did in Non-Model solution:
SQL> select * from t, t_hrchy thLet's think of what woud be placed in PARTITION BY part of our model. Remeber that partitions should be absolutely independent one of another.
2 where t.dep = th.dep(+)
3 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')
4 and t.city='Moscow'
5 order by 3,1,2
6 /
CITY DEP YEAR SALES DEP DEP_PAR
------ ---- ----------- ---------- ---- -------
Moscow DEP1 31.12.2007 762
Moscow DEP2 31.12.2007 657 DEP2 DEP1
Moscow DEP3 31.12.2007 1020 DEP3 DEP2
Moscow DEP4 31.12.2007 487 DEP4 DEP3
SQL>
Yes - we would put YEAR and CITY there. As a task instructs us to treat sales of different years and cities separately.
In dimension we would place DEP and DEP_PART columns.
And in measures - we'll put SALES, of course. And create one more measure called TOT_SALES with initial value of zero.
SQL> select * from t, t_hrchy th
2 where t.dep = th.dep(+)
3 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')
4 and t.city='Moscow'
5 model
6 partition by (year,city)
7 dimension by (t.dep dep,dep_par)
8 measures(sales, 0 tot_sales)
9 ()
10 order by 1,2,3
11 /
YEAR CITY DEP DEP_PAR SALES TOT_SALES
----------- ------ ---- ------- ---------- ----------
31.12.2007 Moscow DEP1 762 0
31.12.2007 Moscow DEP2 DEP1 657 0
31.12.2007 Moscow DEP3 DEP2 1020 0
31.12.2007 Moscow DEP4 DEP3 487 0
SQL>
Now we are going to write a rule for calculating the tot_sales.
For any cell in our model the tot_sales should be equal to a sum of sales of the current department and all the tot_sales of all departments under the current.
We can write it as:
tot_sales[any,any] = sales[CV(),CV()]+nvl(sum(tot_sales)[any,CV(dep)],0)
BTW it would be correct in case when the tot_sales of the departments under the current one already included all the tot_sales of the departments under them.
So what is important here is the ORDER in what the rules are executed.
If we don't specify anything about the order of execution to our model it uses the default SEQUENTIAL ORDER and can easily throw out an error for us:
SQL> select * from t, t_hrchy th
2 where t.dep = th.dep(+)
3 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')
4 and t.city='Moscow'
5 model
6 partition by (year,city)
7 dimension by (t.dep dep,dep_par)
8 measures(sales, 0 tot_sales)
9 rules
10 (tot_sales[any,any] = sales[CV(),CV()]+nvl(sum(tot_sales)[any,CV(dep)],0))
11 order by 1,2,3
12 /
...
ORA-32637: Self cyclic rule in sequential order MODEL
SQL>
But putting AUTOMATIC ORDER causes all rules to be evaluated in an order based on their logical dependencies.
Let's check:
SQL> select * from t, t_hrchy th
2 where t.dep = th.dep(+)
3 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')
4 and t.city='Moscow'
5 model
6 partition by (year,city)
7 dimension by (t.dep dep,dep_par)
8 measures(sales, 0 tot_sales)
9 rules automatic order
10 (tot_sales[any,any] = sales[CV(),CV()]+nvl(sum(tot_sales)[any,CV(dep)],0))
11 order by 1,2,3
12 /
YEAR CITY DEP DEP_PAR SALES TOT_SALES
----------- ------ ---- ------- ---------- ----------
31.12.2007 Moscow DEP1 762 2926
31.12.2007 Moscow DEP2 DEP1 657 2164
31.12.2007 Moscow DEP3 DEP2 1020 1507
31.12.2007 Moscow DEP4 DEP3 487 487
SQL>
VoilĂ ! This is exactly what we were looking for.
Let's find a detailed description of what's happening when we don't use AUTOMATIC ORDER and what's changing when we begin to use it.
First of all, the equivalent of our single rule would be actually a list of four rules for every row (unique set of dimension values) in our data:
tot_sales['DEP1', null ]=sales['DEP1', null ]+nvl(sum(tot_sales)[any,'DEP1'],0),
tot_sales['DEP2','DEP1']=sales['DEP2','DEP1']+nvl(sum(tot_sales)[any,'DEP2'],0),
tot_sales['DEP3','DEP2']=sales['DEP3','DEP2']+nvl(sum(tot_sales)[any,'DEP3'],0),
tot_sales['DEP4','DEP3']=sales['DEP4','DEP3']+nvl(sum(tot_sales)[any,'DEP4'],0)
But using SEQUENTIAL ORDER doesn't guarantee us any order of these rules execution. Because as it stated in the documetation: SEQUENTIAL ORDER: This causes rules to be evaluated in the order they are written. This is the default.
But we have only one rule in our model. So the detailed rules can be run in any order:
for example first we find tot_sales['DEP1', null ] and then tot_sales['DEP2','DEP1'], or vice versa. And that is the problem, because this two possibilities will give us different results in the output.
What actually ORA-32637 stays for is:
Cause: A self-cyclic rule was detected in the sequential order MODEL. Sequential order MODELs cannot have self cyclic rules to guarantee that the results do not depend on the order of evaluation of the cells that are updated or upserted.
Action: Use ordered rule evaluation for this rule.
So if use order by DEP we shouldn't have such an error, let's try:
SQL> select * from t, t_hrchy thYes, the query returns no error - because we explicitly defined the order of rules execution. But as you see we have wrong results. Because actually in our case we should order by dep DESC! You can try it yourself.
2 where t.dep = th.dep(+)
3 and trunc(t.year,'y')=to_date('01.01.2007','dd.mm.yyyy')
4 and t.city='Moscow'
5 model
6 partition by (year,city)
7 dimension by (t.dep dep,dep_par)
8 measures(sales, 0 tot_sales)
9 rules
10 (tot_sales[any,any] order by dep = sales[CV(),CV()]+nvl(sum(tot_sales)[any,CV(dep)],0))
11 order by 1,2,3
12 /
YEAR CITY DEP DEP_PAR SALES TOT_SALES
----------- ------ ---- ------- ---------- ----------
31.12.2007 Moscow DEP1 762 762
31.12.2007 Moscow DEP2 DEP1 657 657
31.12.2007 Moscow DEP3 DEP2 1020 1020
31.12.2007 Moscow DEP4 DEP3 487 487
SQL>
While we don't know what are the dependences in the hierarchy data beforehand - let's give Oracle to decide - what are the dependencies between cells and simply use AUTOMATIC ORDER:
SQL> select * from t, t_hrchy th
2 where t.dep = th.dep(+)
3 model
4 partition by (to_char(year,'yyyy') year,city)
5 dimension by (t.dep dep,dep_par)
6 measures(sales, 0 tot_sales)
7 rules automatic order
8 (tot_sales[any,any] = sales[CV(),CV()]+nvl(sum(tot_sales)[any,CV(dep)],0))
9 order by 1,2,3
10 /
YEAR CITY DEP DEP_PAR SALES TOT_SALES
---- ------ ---- ------- ---------- ----------
2006 Moscow DEP1 562 2254
2006 Moscow DEP2 DEP1 457 1692
2006 Moscow DEP3 DEP2 890 1235
2006 Moscow DEP4 DEP3 345 345
2006 Omsk DEP1 23 264
2006 Omsk DEP2 DEP1 154 241
2006 Omsk DEP3 DEP2 87 87
2007 Moscow DEP1 762 2926
2007 Moscow DEP2 DEP1 657 2164
2007 Moscow DEP3 DEP2 1020 1507
2007 Moscow DEP4 DEP3 487 487
2007 Omsk DEP1 34 403
2007 Omsk DEP2 DEP1 213 369
2007 Omsk DEP3 DEP2 156 156
14 rows selected
SQL>
to be continued...