Friday, January 11, 2008

Combinatorial problem

The resource is from Russian Forum.

Input:
We've got an alphabet consisting of N unique symbols.
E.g. alphabet='AB'.

Problem:
We need to find all possible variations with length M (so there would be power(N,M) number of combinations).
For our query let it be 4 as in the original source.

Solution:
ALthough there were several other solutions, e.g. using hierarchical queries, I post here my solution with model clause.
SQL> with t as (select 'AB' str from dual),
2 t1 as (select str, level-1 lvl from t connect by level<=power(length(str),4))
3 --
4 select lpad(num,4,substr(str,1,1)) path from t1
5 model
6 partition by (lvl part)
7 dimension by (0 dim)
8 measures (lvl, cast(null as varchar2(4)) num, str)
9 rules iterate (999) until (lvl[0] = 0)
10 (num[0] = substr(str[0],mod(lvl[0],length(str[0]))+1,1)||num[0],
11 lvl[0] = trunc(lvl[0]/length(str[0]))
12 )
13 order by part
14 /

PATH
----
AAAA
AAAB
AABA
AABB
ABAA
ABAB
ABBA
ABBB
BAAA
BAAB
BABA
BABB
BBAA
BBAB
BBBA
BBBB

16 rows selected

SQL>

As Chen Shapira asked me for some comments on the logic.
I can elucidate a little bit.
The first thing we need to understand that when we input an alphabet of N unique symbols and want to create all unique combinations of such symbols with length M - finally we get power(N,M) combinations.
It is kind of combinatorial stuff which is learnt at school, so the description you can find in Wiki Permutations with repetitions.

So the first what we do in our query - is generating the needed number of values we will have in the final result ('with clause' query connect by level<=...).
To distinguish these values we assign ordinal numbers starting from 0 up to the max number of value. Why we start from zero you understand later.

Now what we are going to do with these numbers? We are going to transform them from denary numeral system to the numeral system of the needed base which is length(str) in our case.

So probably if my english is not so good to understand I'll explain it with examples.
So here we're transforming to the binary system:
SQL> with t as (select '01' str from dual),
2 t1 as (select str, level-1 lvl from t connect by level<=power(length(str),4))
3 --
4 select initial_value, num, lpad(num,4,substr(str,1,1)) path from t1
5 model
6 partition by (lvl part)
7 dimension by (0 dim)
8 measures (lvl initial_value, lvl, cast(null as varchar2(4)) num, str)
9 rules iterate (999) until (lvl[0] = 0)
10 (num[0] = substr(str[0],mod(lvl[0],length(str[0]))+1,1)||num[0],
11 lvl[0] = trunc(lvl[0]/length(str[0]))
12 )
13 order by part
14 /

INITIAL_VALUE NUM PATH
------------- ---- ----
0 0 0000
1 1 0001
2 10 0010
3 11 0011
4 100 0100
5 101 0101
6 110 0110
7 111 0111
8 1000 1000
9 1001 1001
10 1010 1010
11 1011 1011
12 1100 1100
13 1101 1101
14 1110 1110
15 1111 1111

16 rows selected

SQL>

The next example is transformation to ternary numeral system:
SQL> with t as (select '012' str from dual),
2 t1 as (select str, level-1 lvl from t connect by level<=power(length(str),4))
3 --
4 select initial_value, num, lpad(num,4,substr(str,1,1)) path from t1
5 model
6 partition by (lvl part)
7 dimension by (0 dim)
8 measures (lvl initial_value, lvl, cast(null as varchar2(4)) num, str)
9 rules iterate (999) until (lvl[0] = 0)
10 (num[0] = substr(str[0],mod(lvl[0],length(str[0]))+1,1)||num[0],
11 lvl[0] = trunc(lvl[0]/length(str[0]))
12 )
13 order by part
14 /

INITIAL_VALUE NUM PATH
------------- ---- ----
0 0 0000
1 1 0001
2 2 0002
3 10 0010
4 11 0011
5 12 0012
6 20 0020
7 21 0021
8 22 0022
9 100 0100
10 101 0101
11 102 0102
12 110 0110
13 111 0111
14 112 0112
15 120 0120
16 121 0121
17 122 0122
18 200 0200
19 201 0201
20 202 0202
21 210 0210
22 211 0211
23 212 0212
24 220 0220
25 221 0221
26 222 0222
27 1000 1000
28 1001 1001
29 1002 1002
30 1010 1010
31 1011 1011
32 1012 1012
33 1020 1020
34 1021 1021
35 1022 1022
36 1100 1100
37 1101 1101
38 1102 1102
39 1110 1110
40 1111 1111
41 1112 1112
42 1120 1120
43 1121 1121
44 1122 1122
45 1200 1200
46 1201 1201
47 1202 1202
48 1210 1210
49 1211 1211
50 1212 1212
51 1220 1220
52 1221 1221
53 1222 1222
54 2000 2000
55 2001 2001
56 2002 2002
57 2010 2010
58 2011 2011
59 2012 2012
60 2020 2020
61 2021 2021
62 2022 2022
63 2100 2100
64 2101 2101
65 2102 2102
66 2110 2110
67 2111 2111
68 2112 2112
69 2120 2120
70 2121 2121
71 2122 2122
72 2200 2200
73 2201 2201
74 2202 2202
75 2210 2210
76 2211 2211
77 2212 2212
78 2220 2220
79 2221 2221
80 2222 2222

81 rows selected

SQL>

So at every iteration in our model clause we cut out the value at the appropriate position in our number. It is general algorithm of Radix Change.
As we generate the amount of numbers that would be unique in the corresponding numeral system we get unique values of the needed length (M) by padding the first letter/digit from the input string.
And instead of '01', '012' etc we can use any other string consisting of unique symbols and we get the needed result.

By the way, if we input '0123456789' string we get transformation to the denary system. So the output would be the same to the inputed numbers:
SQL> with t as (select '0123456789' str from dual),
2 t1 as (select str, level-1 lvl from t connect by level<=20/*power(length(str),4)*/)
3 --
4 select initial_value, num, lpad(num,4,substr(str,1,1)) path from t1
5 model
6 partition by (lvl part)
7 dimension by (0 dim)
8 measures (lvl initial_value, lvl, cast(null as varchar2(4)) num, str)
9 rules iterate (999) until (lvl[0] = 0)
10 (num[0] = substr(str[0],mod(lvl[0],length(str[0]))+1,1)||num[0],
11 lvl[0] = trunc(lvl[0]/length(str[0]))
12 )
13 order by part
14 /

INITIAL_VALUE NUM PATH
------------- ---- ----
0 0 0000
1 1 0001
2 2 0002
3 3 0003
4 4 0004
5 5 0005
6 6 0006
7 7 0007
8 8 0008
9 9 0009
10 10 0010
11 11 0011
12 12 0012
13 13 0013
14 14 0014
15 15 0015
16 16 0016
17 17 0017
18 18 0018
19 19 0019

20 rows selected

SQL>

Hope, now it is more clear. If still not - don't hesitate to ask.

3 comments:

Anonymous said...

Can you explain the logic a bit?

Anonymous said...

Got it!

Thank you for the explanation.
I think the technique of counting in any base could be useful in other cases (but I can't come up with a good example...)

Anonymous said...

Wonderful !!!