Programming Puzzle: Unbiased Selection Algorithms

Programming Challenge: Unbiased Results

There is a set of sets of characters. Your task is to produce an algorithm which outputs a set of characters by taking exactly one random character from each of the sets of characters with probability 1/(total possible outputs). There is one rule. Once you select a character from a set, that character can’t be selected all remaining sets. For example with
{{A, B, C} {A,B}} if you choose A from the first set, the legal choices from the second set becomes {B} – choosing A would result in a collision and this is not legal. It is assumed that there is always a solution – you never have to worry about there being no possible solutions (so you will not encounter sets such as {{A},{A}}).
The potential combinations are {A,B},{B,A},{C,A},{C,B}. (note that {B,C} and similar is not possible since C is not in the 2nd set (output is in the same order as the sets are in))
You must make an algorithm which can pick these potential outputs with equal probability.
A naive and incorrect solution works as follows:

0. create an empty set for the output
1. start at the first set of characters
2. generate a random number from 0 inclusive to the cardinality of the set of characters exclusive.
3. Select that character and append it to the output set.
4. remove that character from the remaining sets which have not yet had a character picked from them
5. if the cardinality of the output set is the same as the cardinality of the set of sets, terminate. Else, iterate to the next set of characters and go to step 2

The problem of this is obvious. The order of the sets affects the probability of producing certain outputs and makes it wrong. If the order is {A,B}{A,B,C}, the probabilities of each permutation are as follows

A B 1/2 * 1/2
A C 1/2 * 1/2
B A 1/2 * 1/2
B C 1/2 * 1/2

But if the order is {A,B,C},{A,B} the probabilities of output is

A B 1/3 * 1
B A 1/3 * 1
C A 1/3 * 1/2
C B 1/3 * 1/2

as you can see, they don’t all have the same probability of output. this is an error. What is the solution to this problem?


Haskell: Lightning in a bottle.

If I could go back in time to tell myself something 10 years ago, I’d tell myself to learn haskell. This language is beautiful. It’s a purely functional language. It’s not perfect but its pretty damn close. It’s exciting learning it, it’s very different from languages like C and Java. It’s incredibly rigorous and sturdy. The only problem with it? It’s hard to program in. And that’s a pretty big problem.

But it’s not a problem for good programmers. My view is that functional programming will increase in popularity as people realize that having solid code is more important than being able to whip it together quickly. Let me be clear, I don’t think that an experienced Haskell programmer would take longer than a C programmer to make a program. I do think that a beginner haskell programmer would take longer to make a program than a beginner C programmer. So the difference decreases over time.

It’s thrilling learning a well designed language. When I learn a language I try to find flaws in it as I go along. In Haskell I’ve found no flaws in the language, only flaws in my understanding of it. The Wikipedia page on Haskell says the following under the “Criticism” section:

“Jan-Willem Maessen, in 2002, and Simon Peyton Jones, in 2003, discussed problems associated with lazy evaluation while also acknowledging the theoretical motivation for it,[43][44] in addition to purely practical considerations such as improved performance”

The best criticism of Haskell is that some people once discussed problems with it (problems which are not listen) and that lazy evaluation is theoretically sound and very practical as it improves performance. That’s a sign of how strong the language is – the strongest criticism is a compliment.