via Planet Smalltalk:
Andres
implemented (product of) powerset functionality in Smalltalk:
allProductsTimes: aNumber
do: aBlock
self
allProductsTimes: aNumber
startingAt: 1
do: aBlock
allProductsTimes: aNumber
startingAt: anIndex
do: aBlock
anIndex > self size
ifTrue: [aBlock value: aNumber]
ifFalse:
[self
allProductsTimes: aNumber
startingAt: anIndex + 1
do: aBlock.
self
allProductsTimes: aNumber * (self at: anIndex)
startingAt: anIndex + 1
do: aBlock]
Isn't that absolutely beautiful?
I would not know how to judge beauty of the above code, but on
technical grounds I would have to remark that powerset and
arithmetic set product functionality is intermingled.
On closer inspection, the code consists of a fold (or reduce)
operation over the result of a powerset operation.
Some better factored Haskell code might look like this:
powerset :: [a] -> [[a]]
-- returns powerset of given set
powerset [] = [[]]
powerset (x:xs) = concat [ [s, x:s] | s <- powerset xs ]
With lists as set representation, a powerset is constructed
recursively just like in the Smalltalk code: the powerset of the empty
set []
is the set containing the
emptyset [[]]
, otherwise the powerset is built by picking
an element x
, then for each set s
in the
powerset of the remaining elements xs
, include it once
with and once without picked element x
.
powerset
works for any type of sets.
Trying it out for numbers:
Main> powerset [2 .. 4]
[[],[2],[3],[2,3],[4],[2,4],[3,4],[2,3,4]]
Now, that was only half the work, we still need to calculate the
product. I mentioned already that it can be done with a folding
operation (we will choose a left fold foldl
):
Main> :t foldl
foldl :: (a -> b -> a) -> a -> [b] -> a
We fold the set with multiplication and initial
argument 1
(the same definition as builtin
function product
):
Main> :t foldl (*) 1
foldl (*) 1 :: Num a => [a] -> a
Main> :t product
product :: Num a => [a] -> a
Note that so far, I have written three lines of code, and otherwise
presented you a bunch of type signatures. So, putting it all together:
powerset :: [a] -> [[a]]
-- returns powerset of given set
powerset [] = [[]]
powerset (x:xs) = concat [ [s, x:s] | s <- powerset xs ]
allProducts :: Num a => [a] -> [a]
-- returns arithmetic product of each subset of given set
allProducts = map product . powerset
Main> allProducts [2 .. 4]
[1,2,3,6,4,8,12,24]
Now, isn't that absolutely beautiful close
to the math underneath it?
For what it's worth, I believe the Smalltalk code could be factored
similarly.
Notes:
Andres
looked at my Haskell
powerset code and asked some questions which I try to address here.
I had a hard time understanding the meaning behind the names a, b, x,
s, and what seemed to me as operators of some sort: ->, =>, [a].
I am afraid that's how simple functions are written in idiomatic Haskell.
It's not as bad as it might look on first glance, I can try to explain
some of it. A more detailed
overview
of Haskell syntax can be found elsewhere.
powerset :: [a] -> [[a]]
This is a type signature (denoted by ::
), declaring
function powerset
to take list of a
([a]
) and returning some list of lists of a
(with a
being an arbitrary type).
That is, the function is polymorphic.
Type variables in Haskell are commonly
named a
, b
, c
,... because they
denote any type, it does not matter which.
powerset (x:xs) = concat [ [s, x:s] | s <- powerset xs ]
Here x
, xs
, s
are old-fashioned
variables. x:xs
is a pattern which the first parameter
of powerset is matched against.
It must be a list, with at least one element (which x
will be bound to), and a rest list (which xs
will be bound to).
Patterns like x:xs
or y:ys
can be found all
over the place in Haskell, it's a naming convention akin
to aBlock
in Smalltalk.
More complicated functions usually have more expressive variable
names, as you would expect.
The following is list
comprehension syntax:
[ x | x <- gen ]
It roughly means:
collect all x
where each x
is generated
from gen
.
Again, Haskell syntax tries to be close to math where you would write
something like { x | x in gen }.
Now to Andres' other questions:
-
Does the Haskell version need to be modified to accept the value of the empty product as an argument?
Not sure what you mean by empty product
exactly, but
powerset [] == [[]]
i.e. powerset applied to the empty set yields the set containing the empty set.
-
What would you need to write to iterate through the products with an
arbitrary piece of code (the aBlocks in Smalltalk)?
Haskell supports anonymous functions (lambda's) just like blocks in Smalltalk.
The moral equivalent of Smalltalk(ish)'s
1 to: 42 allProducts do: [ x | frob x ]
would perhaps be
mapM (\ x -> frob x) (allProducts [1 .. 42])
-
Would it work if the collection had stuff other than integers (such
as modular integers, polynomials, functions or matrices)?
allProducts
' type (Num a => [a] -> [a]
)
suggests that it works for any type a
which is in
typeclass Num
(class not quite in OO sense, though), i.e. the complete
numeric tower.
If you define multiplication for matrices or polynomials, the code
would work unchanged for them, too.
-
With a collection of size 20, it would create (I am no Haskell
expert!) 2^20 collections. Is that so?
Let's try it out:
Main> take 10 (powerset [1 .. 1000])
[[],[1],[2],[2,1],[3],[3,1],[3,2],[3,2,1],[4],[4,1]]
Main>
The output is instantaneous.
Clearly, Haskell does not create the full powerset (of size 21000).
Instead, it relies on lazy
evalutation to only evaluate as much as is needed to produce the
result. In the above case I requested only the first 10 elements of
the powerset, so that's what is generated.
Actually, we can do even better (trading some of the presentational
points). With a modified, even lazier version
of powerset
, we could generate the powerset of e.g. the
natural numbers, i.e. an infinite set:
powerset :: [a] -> [[a]]
powerset xs = []:(powerset' xs [[]])
where
powerset' :: [a] -> [[a]] -> [[a]]
powerset' [] _ = []
powerset' (x:xs) ps = let ps' = [ x:s | s <- ps ]
in ps' ++ powerset' xs (ps ++ ps')
The idea behind this approach is to generate the
powerset ps
of all elements before x
, then
add element x
to all subsets s
of ps
, emit those (ps'
) and continue for
the remaining elements xs
in the same way.
Main> take 10 (powerset [1 ..])
[[],[1],[2],[2,1],[3],[3,1],[3,2],[3,2,1],[4],[4,1]]
I believe the same can be done in Smalltalk with the standard OO
trick to emulate infinite lists: a generator/iterator object (which
can be queried for the next element).
Actually, I would love to see such a solution to compare to.
-
Regarding the lines of code, I would suggest that overall we're
pretty close.
Yes, I did not mean to imply that the Haskell version is shorter in
terms of LOC (which is a poor
metric anyway), apologies if it came across like this. My main point
is how close Haskell code is to math: if we
exchange concat
and (++)
for set union and
curlies for brackets we are almost there.
A minor point (and this is really with my code police hat on) which I
made: the original Smalltalk code could have been factored
some more, and the question I have is now would that increase
code readability/reuse/beauty even more?
.
I think the following well-known quote from the Wizard Book says it all:
Programs should be written for people to read, and only
incidentally for machines to execute.