Tuesday, December 11, 2007

Reports: ratios & percentages

... previous

Let's continue with our spreadsheet report (input data you can find here).

Imagine that we want to add a column that would represent sales percentage of each department in the total of it's city sales. So we need to know how much every department contributes to it's region sales.

Non-model solution:

Developing the group by solution with rollup - we could use RATIO_TO_REPORT function for that purpose.

Our query would look like:
SQL> select city, dep, sum(sales),
2 decode(grouping_id(city, dep),
3 0,
4 round(2 * 100 * ratio_to_report(sum(sales))
5 over(partition by city),
6 2) || '%') perc
7 from t
8 where trunc(year, 'y') = to_date('01.01.2007', 'dd.mm.yyyy')
9 group by rollup(city, dep)
10 /

CITY DEP SUM(SALES) PERC
------ ---- ---------- ---------
Moscow DEP1 762 26.04%
Moscow DEP2 657 22.45%
Moscow DEP3 1020 34.86%
Moscow DEP4 487 16.64%
Moscow 2926
Omsk DEP1 34 8.44%
Omsk DEP2 213 52.85%
Omsk DEP3 156 38.71%
Omsk 403
3329

10 rows selected

SQL>
Let's take a closer look.
First of all we use GROUPING_ID function - to understand when the data is detailed (comparing it to 0). When it is detailed - we return the ratio of the value in the current row to the total sales of the region. Otherwise we return NULL.
But we need to remember that analytic functions are executed after the grouping has been done. So when our ratio_to_report would be applied - we will have doubled sales for each city: on the one hand - the sales of each department, and on the other hand, the sum of all departments sales (as a city subtotal) will also influence our ratio_to_report.
That's why we multiply the function by 2, so we get correct results.
With multiplying by 100 we get percentage values.

Model solution
For getting the same result with Model clause - first of all we need to add a new measure:
cast(null as varchar2(7)) PERC
So it would be our percentage column in the result set:
SQL> select * from t where trunc(year,'y') = to_date('01.01.2007','dd.mm.yyyy')
2 model
3 dimension by (0 total,city, dep)
4 measures(sales, cast(null as varchar2(7)) perc)
5 rules upsert all
6 (sales[1,any,null]=sum(sales)[0,CV(),any],
7 sales[1,null,null]=sum(sales)[0,any,any])
8 order by 2,3
9 /

TOTAL CITY DEP SALES PERC
---------- ------ ---- ---------- -------
0 Moscow DEP1 762
0 Moscow DEP2 657
0 Moscow DEP3 1020
0 Moscow DEP4 487
1 Moscow 2926
0 Omsk DEP1 34
0 Omsk DEP2 213
0 Omsk DEP3 156
1 Omsk 403
1 3329

10 rows selected

SQL>
Now we need to create a rule for our column.
So first of all we need to calculate values for all the cells with detail data. On the language of our model it would be perc[0,any,any]. By using a "Total" dimension with value=0 we will get only cells with detailed data. City and Department could be any. That is what we put on the left side of our formula.

On the right side we need to find for each detailed row a ratio of the current row sales - to it's region sales total.
So we would write sales[0,CV(),CV()]/sales[1,CV(),null]
In the denominator we just put a reference to the cell with a region total. While it has been already calculated - we don't need to sum up all the sales of a particular city again.
SQL> select * from t where trunc(year,'y') = to_date('01.01.2007','dd.mm.yyyy')
2 model
3 dimension by (0 total,city, dep)
4 measures(sales, cast(null as varchar2(7)) perc)
5 rules upsert all
6 (sales[1,any,null]=sum(sales)[0,CV(),any],
7 sales[1,null,null]=sum(sales)[0,any,any],
8 perc[0,any,any]=round(sales[0,CV(),CV()]/sales[1,CV(),null]*100,2)||'%')
9 order by 2,3
10 /

TOTAL CITY DEP SALES PERC
---------- ------ ---- ---------- -------
0 Moscow DEP1 762 26.04%
0 Moscow DEP2 657 22.45%
0 Moscow DEP3 1020 34.86%
0 Moscow DEP4 487 16.64%
0 Moscow %
1 Moscow 2926
0 Omsk DEP1 34 8.44%
0 Omsk DEP2 213 52.85%
0 Omsk DEP3 156 38.71%
0 Omsk %
1 Omsk 403
1 3329
0 %

13 rows selected

SQL>
Ooops. As you can see there are three redundant rows with only a '%' in our new column.
Why did they appear? Because we placed UPSERT ALL.
If we don't want a particular rule to be under a general semantic, which is mentioned after the RULES word, we can put a needed behaviour name directly before the rule. In our case it is UPDATE (no new rows generation -> only update, for more information look Reports: totals & subtotals):
SQL> select * from t where trunc(year,'y') = to_date('01.01.2007','dd.mm.yyyy')
2 model
3 dimension by (0 total,city, dep)
4 measures(sales, cast(null as varchar2(7)) perc)
5 rules upsert all
6 (sales[1,any,null]=sum(sales)[0,CV(),any],
7 sales[1,null,null]=sum(sales)[0,any,any],
8 update perc[0,any,any]=round(sales[0,CV(),CV()]/sales[1,CV(),null]*100,2)||'%')
9 order by 2,3
10 /

TOTAL CITY DEP SALES PERC
---------- ------ ---- ---------- -------
0 Moscow DEP1 762 26.04%
0 Moscow DEP2 657 22.45%
0 Moscow DEP3 1020 34.86%
0 Moscow DEP4 487 16.64%
1 Moscow 2926
0 Omsk DEP1 34 8.44%
0 Omsk DEP2 213 52.85%
0 Omsk DEP3 156 38.71%
1 Omsk 403
1 3329

10 rows selected

SQL>

This is actually what we were looking for!

to be continued ...

0 comments: