In each case I'll give an opportunity how it can be done without using the model clause. And then - how the model clause could be used. So let's begin.
Input data:
SQL> drop table t;
Table dropped
SQL>
SQL> create table t as (select 'Omsk' city, 'DEP1' dep, to_date('31.12.2006','dd.mm.yyyy') year, 23 sales from dual union all
2 select 'Omsk' city, 'DEP2' dep, to_date('31.12.2006','dd.mm.yyyy') year, 154 sales from dual union all
3 select 'Omsk' city, 'DEP3' dep, to_date('31.12.2006','dd.mm.yyyy') year, 87 sales from dual union all
4 select 'Moscow' city, 'DEP1' dep, to_date('31.12.2006','dd.mm.yyyy') year, 562 sales from dual union all
5 select 'Moscow' city, 'DEP2' dep, to_date('31.12.2006','dd.mm.yyyy') year, 457 sales from dual union all
6 select 'Moscow' city, 'DEP3' dep, to_date('31.12.2006','dd.mm.yyyy') year, 890 sales from dual union all
7 select 'Moscow' city, 'DEP4' dep, to_date('31.12.2006','dd.mm.yyyy') year, 345 sales from dual union all
8 select 'Omsk' city, 'DEP1' dep, to_date('31.12.2007','dd.mm.yyyy') year, 34 sales from dual union all
9 select 'Omsk' city, 'DEP2' dep, to_date('31.12.2007','dd.mm.yyyy') year, 213 sales from dual union all
10 select 'Omsk' city, 'DEP3' dep, to_date('31.12.2007','dd.mm.yyyy') year, 156 sales from dual union all
11 select 'Moscow' city, 'DEP1' dep, to_date('31.12.2007','dd.mm.yyyy') year, 762 sales from dual union all
12 select 'Moscow' city, 'DEP2' dep, to_date('31.12.2007','dd.mm.yyyy') year, 657 sales from dual union all
13 select 'Moscow' city, 'DEP3' dep, to_date('31.12.2007','dd.mm.yyyy') year, 1020 sales from dual union all
14 select 'Moscow' city, 'DEP4' dep, to_date('31.12.2007','dd.mm.yyyy') year, 487 sales from dual)
15 /
Table created
SQL> commit;
Commit complete
SQL>
SQL>select * from tThis is a table of sales for a company in Russia, which has it's departments in two cities: Omsk (Dep 1,2,3) and in Moscow (Dep 1,2,3,4). The data about sales is given for two years: 2006 and 2007.
2 order by 3,1,2
3 /
CITY DEP YEAR SALES
------ ---- ----------- ----------
Moscow DEP1 31.12.2006 562
Moscow DEP2 31.12.2006 457
Moscow DEP3 31.12.2006 890
Moscow DEP4 31.12.2006 345
Omsk DEP1 31.12.2006 23
Omsk DEP2 31.12.2006 154
Omsk DEP3 31.12.2006 87
Moscow DEP1 31.12.2007 762
Moscow DEP2 31.12.2007 657
Moscow DEP3 31.12.2007 1020
Moscow DEP4 31.12.2007 487
Omsk DEP1 31.12.2007 34
Omsk DEP2 31.12.2007 213
Omsk DEP3 31.12.2007 156
14 rows selected
SQL>
Now we want to get sales for the year 2007. But we need not only detailed sales (as in a table), but also subtotals for each city. And a total for the whole company.
Non-Model Solution:
Usually such reports are created by using extensions to Group By: Cube, Rollup & Grouping Sets expression.
So our query would look like:
SQL> select city, dep, sum(sales) sales
2 from t
3 where trunc(year, 'y') = to_date('01.01.2007', 'dd.mm.yyyy')
4 group by rollup(city, dep)
5 order by 1, 2
6 /
CITY DEP SALES
------ ---- ----------
Moscow DEP1 762
Moscow DEP2 657
Moscow DEP3 1020
Moscow DEP4 487
Moscow 2926
Omsk DEP1 34
Omsk DEP2 213
Omsk DEP3 156
Omsk 403
3329
10 rows selected
SQL>
Model solution:
To create the same report using model clause, we have to execute the following query:
SQL> select city,dep,salesLet's elaborate how it works.
2 from t
3 where trunc(year,'y') = to_date('01.01.2007','dd.mm.yyyy')
4 model
5 dimension by (0 total,city, dep)
6 measures(sales)
7 rules upsert all
8 (sales[1,any,null]=sum(sales)[0,CV(),any],
9 sales[1,null,null]=sum(sales)[0,any,any])
10 order by 1,2
11 /
CITY DEP SALES
------ ---- ----------
Moscow DEP1 762
Moscow DEP2 657
Moscow DEP3 1020
Moscow DEP4 487
Moscow 2926
Omsk DEP1 34
Omsk DEP2 213
Omsk DEP3 156
Omsk 403
3329
10 rows selected
SQL>
Using the condition in the WHERE clause - we get only data for the year 2007 in the result.
Then we create dimensions: City and Dep - are the dimensions from the table. But we add one more dimension called Total and by default assign value 0 to all the cells for that dimension.
As a measure we take the field from our table Sales.
So at this stage we get as a result:
SQL> select *Now we need to create extra lines for the subtotals and total.
2 from t
3 where trunc(year,'y') = to_date('01.01.2007','dd.mm.yyyy')
4 model
5 dimension by (0 total,city, dep)
6 measures(sales)
7 ()
8 order by 2,3
9 /
TOTAL CITY DEP SALES
---------- ------ ---- ----------
0 Moscow DEP1 762
0 Moscow DEP2 657
0 Moscow DEP3 1020
0 Moscow DEP4 487
0 Omsk DEP1 34
0 Omsk DEP2 213
0 Omsk DEP3 156
7 rows selected
SQL>
We use rules for this.
You should remember that there are two ways to refer to a cell in a rule of a model clause: positional or symbolic.
Example of positional: sales['Omsk','DEP3'] or sales['Omsk','DEP4'].
When a cell which is positionally referenced exists - it's value is updated. When it doesn't exist - the model clause will create a cell with such dimensions.
Example of symbolic reference:
sales[city='Omsk',dep='DEP3'], sales[city is any,dep='DEP3'] ,sales[any,dep='DEP3'].
Actually the last one [any] is a positional reference - but I put it in the symbolic group because there's a remark in the documentation:
Note that ANY is treated as a symbolic reference even if it is specified positionally, because it really means that (dimension IS NOT NULL OR dimension IS NULL).
A typical behaviour for symbolically referenced cells: when it exists - the value gets updated. When it doesn't exist - no cells are created.
Don't forget about the cells which are referenced symbolically and positionally at the same time. By default they behave like all the references are symbolic.
Well, actually we could use 3 types of behaviour: UPDATE, UPSERT and UPSERT ALL.
For explicit determining we should place the name of behaviour directly after the RULES keyword. In this case it would be adjusted to all the rules.
If we want to specify a particular rule - we can put the name of behaviour right before the rule (we'll see how it works in the next post).
If we don't put any - the default behaviour UPSERT is used. How it works - I mentioned earlier.
When we use UPDATE - all cells are got updated, and if they don't exist - no cells are created, whichever reference (positional or symbolic) is used.
UPSERT ALL is a scpecial case - it can create cells even for symbolic reference.
But not in all cases, for example - it wouldn't create any cells if we put a rule:
sales[city='Voronezh','DEP5']=100 (because there's no city called 'Voronezh' in our table), but at the same time it would create a cell (which doesn't exist) for the rule sales[city='Omsk','DEP5']=100.
Why?
Because UPSERT ALL works in the following way (from the doc):
Step 1 Find Cells
Find the existing cells that satisfy all the symbolic predicates of the cell reference.
Step 2 Find Distinct Dimension Values
Using just the dimensions that have symbolic references, find the distinct dimension value combinations of these cells.
Step 3 Perform a Cross Product
Perform a cross product of these value combinations with the dimension values specified through positional references.
Step 4 Upsert New Cells
The results of Step 3 are then used to upsert new cells into the array
Just look:
SQL> select *By putting RETURN UPDATED ROWS we get as a result only updated or inserted cells.
2 from t
3 where trunc(year,'y') = to_date('01.01.2007','dd.mm.yyyy')
4 model
5 return updated rows
6 dimension by (0 total,city, dep)
7 measures(sales)
8 rules upsert all
9 (sales[0,city='Omsk','DEP5']=100,
10 sales[0,city='Voronezh','DEP5']=200)
11 order by 2,3
12 /
TOTAL CITY DEP SALES
---------- ------ ---- ----------
0 Omsk DEP5 100
SQL>
So as you can see despite we had a rule with Voronezh - no cells were inserted.
Let's get back to our subtotals and total.
We had two rules:
sales[1,any,null]=sum(sales)[0,CV(),any],
sales[1,null,null]=sum(sales)[0,any,any]
But as you remember [any] is treated as a symbolic reference - and won't create any cells if we don't put UPSERT ALL semantics.
SQL> select city,dep,salesIf we put UPSERT ALL semantics, our rules would do the following:
2 from t
3 where trunc(year,'y') = to_date('01.01.2007','dd.mm.yyyy')
4 model return updated rows
5 dimension by (0 total,city, dep)
6 measures(sales)
7 rules
8 (sales[1,any,null]=sum(sales)[0,CV(),any])
9 order by 1,2
10 /
CITY DEP SALES
------ ---- ----------
SQL>
First rule: sales[1,any,null] will look for all dimension values that are referenced symbolically - city dimension: Omsk and Moscow.
Then create a cross product with positionally referenced dimensions: we'll get two cells - sales[1,'Omsk',null] and sales[1,'Moscow',null]
And these cells would be inserted.
BTW the dimension total=1 would mean that these are totals or subtotals.
Next rule sales[1,null,null] - will create a company total cell.
On the right side of the rules we have grouping formulas which will sum up all the sales values needed. By using CV() we reference to the dimension value of the cell on the left side of the rule.
PS
We could get the same result without Total dimension:
SQL> select city,dep,sales
2 from t
3 where trunc(year,'y') = to_date('01.01.2007','dd.mm.yyyy')
4 model
5 dimension by (city, dep)
6 measures(sales)
7 rules upsert all
8 (sales[any,null]=sum(sales)[CV(),any],
9 sales[null,null]=sum(sales)[any,null])
10 order by 1,2
11 /
CITY DEP SALES
------ ---- ----------
Moscow DEP1 762
Moscow DEP2 657
Moscow DEP3 1020
Moscow DEP4 487
Moscow 2926
Omsk DEP1 34
Omsk DEP2 213
Omsk DEP3 156
Omsk 403
3329
10 rows selected
SQL>
Here the rules are a little bit different.
The company Total is calculated from the cities subtotals, that has already been calculated, while in the main part (with Total dimension) we calculated all totals and subtotals on detailed sales data.
to be continued ...
0 comments:
Post a Comment