Programming Puzzle: Unbiased Selection Algorithms

Programming Challenge: Unbiased Results

There is a set of sets of characters. Your task is to produce an algorithm which outputs a set of characters by taking exactly one random character from each of the sets of characters with probability 1/(total possible outputs). There is one rule. Once you select a character from a set, that character can’t be selected all remaining sets. For example with
{{A, B, C} {A,B}} if you choose A from the first set, the legal choices from the second set becomes {B} – choosing A would result in a collision and this is not legal. It is assumed that there is always a solution – you never have to worry about there being no possible solutions (so you will not encounter sets such as {{A},{A}}).
The potential combinations are {A,B},{B,A},{C,A},{C,B}. (note that {B,C} and similar is not possible since C is not in the 2nd set (output is in the same order as the sets are in))
You must make an algorithm which can pick these potential outputs with equal probability.
A naive and incorrect solution works as follows:

0. create an empty set for the output
1. start at the first set of characters
2. generate a random number from 0 inclusive to the cardinality of the set of characters exclusive.
3. Select that character and append it to the output set.
4. remove that character from the remaining sets which have not yet had a character picked from them
5. if the cardinality of the output set is the same as the cardinality of the set of sets, terminate. Else, iterate to the next set of characters and go to step 2

The problem of this is obvious. The order of the sets affects the probability of producing certain outputs and makes it wrong. If the order is {A,B}{A,B,C}, the probabilities of each permutation are as follows

A B 1/2 * 1/2
A C 1/2 * 1/2
B A 1/2 * 1/2
B C 1/2 * 1/2

But if the order is {A,B,C},{A,B} the probabilities of output is

A B 1/3 * 1
B A 1/3 * 1
C A 1/3 * 1/2
C B 1/3 * 1/2

as you can see, they don’t all have the same probability of output. this is an error. What is the solution to this problem?

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s