# Optimizing Top Finishers in Grouped Races

# Problem

Consider a race between at least runners from at most countries.

Each country that wishes to participate agrees to submit to the race committee a roster containing the number of runners that will participate along with an amount $1,000,000 * (where ) the country will donate to charity for each of its runners that finish in the top runners.

Assuming that all possible race standings are equally likely, find the countries that should participate in the race so that the most money is donated to charity.

# Outline

Ultimately, we’ll use linear-fractional 0-1 programming to optimize this problem, but we’ll start with a description of the dynamics of the race. Then we’ll show the brute force approach to solving the problem in order to supply a baseline for both speed correctness. Along the way, we’ll develop an intuition why linear-fractional 0-1 programming is applicable in solving the problem.

# Prerequisites

For a given race, we can determine the outcomes by creating *runner* / *country* pairs,
and looking at the country breakdowns of the –combinations
of the *runner* / *country* pairs. Because payments to charity do not depend on
which specific runners from a country finish in the top, we can disregard the complexity of
looking at –permutations.

For each –combination, determining the country breakdown tells the amount of money that would be donated to charity. The expected payment to charity is computed by averaging over the possible outcomes.

## Expectation of for a given race

`exp_q_finishers`

returns the expected sum or average values of the top finishers
(depending on the value of `avg_q`

), given the countries participating in the race.
In addition to the value, the function also returns a map of the different
finishes from a country perspective (counts rather than individual runners). The astute
reader will note that the values in the returned `country_finishes`

map follow an unnormalized
multivariate hypergeometric distribution,
which can be verified by dividing each value (count) by the sum of values and comparing the probabilities
to the multivariate hypergeometric distribution probability mass function. We can also run the keys in
`country_finishes`

through the multivariate hypergeometric distribution PMF to verify the probabilities
sum to one, which is an indication the returned race finishes represent the comprehensive set of
possible finishes. To see this in action, consider the following example.

### Example 1

Consider the race between the following three countries where the top = **11** finishers
of the = **13** participants will have a charitable donation in their name with the amount
dictated by the value for a runner’s country of origin.

country | # runners | |
---|---|---|

Algeria | 5 | 0.5 |

Morocco | 5 | 0.5 |

United Kingdom | 3 | 0.31 |

There are **3,113,510,400** unique
race finishes for the top **11** finishers of this race. Again, individual race ordering does
not matter since the problem is only concerned with the *number of finishers* from each country
in the top **11**. Therefore, combinations rather than permutations are appropriate. There are then
unique finishes when order within
the top **11** runners does not matter.

If we let and
, and
inspect the results of `exp_q_finishers(r, q, t)`

, we
get the unique finishes, , and the number of times each finish occurs in .
Note that the column vectors in represent a unique race outcome and therefore each
column in sums to since there are top finishers.

Notice:

and

These are important results because , and are parts of the race definition, which tells us that we don’t actually need to simulate the possible races to determine the expected number of runners from each country that will finish in the top . Determining the expected charitable donation for the race is just $1,000,000 times:

or **$5,017,692.31** or about **$456,153.85** per the 11 finishers.

## Brute force optimization of the problem

`exp_q_finishers`

determines the expected sum of values for the top finishers, given
set of countries entered into a race. `brute_force_best`

, defined below, can be used optimize the
objective in a brute force way by applying `exp_q_finishers`

to each valid country combination
and finding the one with the highest value for `exp_q_finishers`

. This is rather straightforward
and can be done as follows:

### Example 2

Find at most = **3** countries (from the list below) to participate in a race with at least
= **11** runners, where the top = **11** runners’ countries will make a charitable
donation according to the country’s value.

country | # runners | |
---|---|---|

Algeria | 5 | 0.5 |

Morocco | 5 | 0.5 |

Tanzania | 4 | 0.32 |

United Kingdom | 3 | 0.31 |

By looking at the “*# runners*” column, it is apparent that 3 countries are required to fill out the
race. So, the possible races (*i.e.*, country combinations), along with their expected donation
following the race and average donation per finisher are listed below.

rank | countries | exp donation ($1M) = sum | sum / |
---|---|---|---|

1 | Algeria, Morocco, United Kingdom | 5.017692 | 0.45615384 |

2 | Algeria, Morocco, Tanzania | 4.9342856 | 0.4485714 |

3 | Algeria, Tanzania, United Kingdom | 4.3175 | 0.3925 |

3 | Morocco, Tanzania, United Kingdom | 4.3175 | 0.3925 |

There are a few things to call out in these results:

*Greedily selecting countries based on the highest*; otherwise,*values is suboptimal**Algeria*,*Morocco*and*Tanzania*would have been selected to race.*Optimizing the sum of**values for the top**finishers is equivalent to optimizing the mean**values for the top*since is unrelated to the countries participating in the race.*finishers**Since**is a constant in each objective value,**can be removed from the objective.*. (*The objective value is thus:**See example 1 for the derivation*)

# Optimization via linear-fractional 0-1 programming

Given the insight from the two examples, we know that we can write the problem definition as:

**maximize:**

**subject to:**

This is a linear-fractional 0-1 programming problem because the objective is the ratio of two linear functions: and and the decision variables are binary. While this is not currently a pure linear program, it can be transformed into one by applying the Charnes-Cooper transformation.

**maximize:**

**subject to:**

Notice that with the change of variables from to , we lose the binary constraints on . To reintroduce the binary constraint, we can use a trick from the lp_solve documentation, which introduces additional binary variables and three constraints per :

where and are constants and must be greater than and must be greater than in linear programming problem , defined above.

## Example 3: Using PuLP to define LP problem

## Example 4: linear-fractional 0-1 programming formulation

Given the result of the linear program above with no binary constraints, we need to use the values of and and create the additional variables and constraints. This looks like the following:

Once we solve the problem and determine the active variables, we can get the list of countries
in the race via `countries[active_variables(y)].tolist()`

. Sure enough, the countries selected are:
*Algeria*, *Morocco* and *United Kingdom*. The objective value for those countries is given by
`pulp.value(model.objective)`

which gives a value of **0.45615384661**. Notice that this is same as
the *avg_q* of **0.45615384** in *Example 2*. So, it should be no surprise that
(**0.45615384661** × 11 × $1,000,000) = **$5,017,692.31** matches the result in
*Example 1*.

# Conclusion

Linear-fractional 0-1 programming can be used to optimize grouped races where the objective is to maximize some definition of quality for the quickest finishers. In the examples above, the definition of quality is the size of charitable donations, but this same problem has applications in advertising. For instance, if an advertising budget allows for impressions over at most distinct placements, then placement quality over all potential placements could be optimized using the above formulation.

Thinking about the brute force search over the –permutations representing all of the possible race finishes, the linear-fractional 0-1 programming formulation gives a good speed up. Further refinements are likely possible and some of those possible improvements are likely to be found in some of the references that follow.

# References

- Barnhart, Cynthia, et al. “Branch-and-price: Column generation for solving huge integer programs.” Operations research 46.3 (1998): 316-329.
- Charles, V., and D. Dutta. “Non-linear stochastic fractional programming models of financial derivatives.” The ICFAI Journal of Applied Finance 11.6 (2005): 5-13.
- Charnes, Abraham, and William W. Cooper. “Programming with linear fractional functionals.” Naval Research logistics quarterly 9.3‐4 (1962): 181-186.
- Chinneck, John W. “Practical Optimization: A Gentle Introduction.”, Ch 13. Systems and Computer Engineering, Carleton University, Ottawa. http://www.sce.carleton.ca/faculty/chinneck/po.html (2006). (Accessed Aug 3, 2019)
- Glover, Fred. “Improved linear integer programming formulations of nonlinear integer problems.” Management Science 22.4 (1975): 455-460.
- Güngör, Murat. “A fractional 0–1 program for task assignment with respect to preferences.” Computers & Industrial Engineering 131 (2019): 263-268.
- Swarup, Kanti. “Letter to the editor—Linear fractional functionals programming.” Operations Research 13.6 (1965): 1029-1036.
- https://pythonhosted.org/PuLP/
- https://en.wikipedia.org/wiki/Linear-fractional_programming#Transformation_to_a_linear_program
- http://lpsolve.sourceforge.net/5.5/ratio.htm
- https://en.wikipedia.org/wiki/Hypergeometric_distribution#Multivariate_hypergeometric_distribution
- https://en.wikipedia.org/wiki/Mile_run_world_record_progression