Programming Challenge: Unbiased Results

There is a set of sets of characters. Your task is to produce an algorithm which outputs a set of characters by taking exactly one random character from each of the sets of characters with probability 1/(total possible outputs). There is one rule. Once you select a character from a set, that character can’t be selected all remaining sets. For example with

{{A, B, C} {A,B}} if you choose A from the first set, the legal choices from the second set becomes {B} – choosing A would result in a collision and this is not legal. It is assumed that there is always a solution – you never have to worry about there being no possible solutions (so you will not encounter sets such as {{A},{A}}).

The potential combinations are {A,B},{B,A},{C,A},{C,B}. (note that {B,C} and similar is not possible since C is not in the 2nd set (output is in the same order as the sets are in))

You must make an algorithm which can pick these potential outputs with equal probability.

A naive and incorrect solution works as follows:

0. create an empty set for the output

1. start at the first set of characters

2. generate a random number from 0 inclusive to the cardinality of the set of characters exclusive.

3. Select that character and append it to the output set.

4. remove that character from the remaining sets which have not yet had a character picked from them

5. if the cardinality of the output set is the same as the cardinality of the set of sets, terminate. Else, iterate to the next set of characters and go to step 2

The problem of this is obvious. The order of the sets affects the probability of producing certain outputs and makes it wrong. If the order is {A,B}{A,B,C}, the probabilities of each permutation are as follows

A B 1/2 * 1/2

A C 1/2 * 1/2

B A 1/2 * 1/2

B C 1/2 * 1/2

But if the order is {A,B,C},{A,B} the probabilities of output is

A B 1/3 * 1

B A 1/3 * 1

C A 1/3 * 1/2

C B 1/3 * 1/2

as you can see, they don’t all have the same probability of output. this is an error. What is the solution to this problem?

### Like this:

Like Loading...

*Related*