Breeding Values or Rankings

The Back Country Runner posted a few weeks ago putting forward the idea of a series of races that could be used as a pseudo championship. There’s nothing new in that idea and there have been many successful ‘regional’ series over the years. This idea was looking to expand on that and opened a poll asking which races people thought should be included in such a series. The poll itself provided me with much amusement. More on that later.

The North Island had a very successful Triple Crown for a couple of years that comprised The Goat, Toi’s Challenge and The Kauri Run. Points were handed out 10, 7, 5, 3 and 1 for the top five overall, and also broken down by category. No surprises that Sjors Corporaal cleaned up most of the time. He was easily the top trail runner in the area at the time, as were Annika Smail and Ruby Muir when they ran.

Total Sport put on a series based in and around Auckland, also working on a points system that I think gives 100 points for a category win and 1 for coming last. It looks pretty good, and they get a lot of people to every event, but with 6 races, 2 genders and 5 categories per event it looks more like a lucky dip. Female, Under 20, Mid Course looks like a good option to enter if you’re after a series placing. But if you’re Female, 20-39, you may want to avoid Mid Course as Emma McCosh wins every time.

Otago had a three race series when I first began running with cumulative time being the decider, so you had to run all three to be included in the result. The actual races varied a little over the years as they came and went but always included the Three Peaks, a very nuggety 26km that is won in around 2:00 each year. I think I placed in the series a few hours behind the winner!!! Mostly because there were only 3 or 4 finishers.

Each of those methods has pluses and minuses. A points system is very clear and easy for everyone to understand. But I think it falls down if the aim is to find the best overall athlete in a long series of varied races. E.g. someone who never wins a race can potentially win a series by virtue of turning up to more races. The longer and more spread out the series, the greater the possibility that you’re just finding the good athlete with the most disposable income. There’s also the problem that 5th in a race with 6 people looks the same as 5th in a race with 450. That may look like exaggerating, but 3 of the races I looked at (see tables below) had fewer than 20 women running.
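To put some numbers on the "turning up wins" problem, here is a minimal sketch using the 10, 7, 5, 3, 1 scale from the Triple Crown. The runners, races and placings are made up purely for illustration; no real results are used.

```python
from collections import defaultdict

# Points for 1st to 5th place, as in the Triple Crown scale mentioned above.
POINTS = [10, 7, 5, 3, 1]


def series_table(results):
    """Sum points per runner over a list of races.

    Each race is a list of runner names in finishing order.
    Returns (runner, total) pairs, highest total first.
    """
    totals = defaultdict(int)
    for finishing_order in results:
        for place, runner in enumerate(finishing_order):
            if place < len(POINTS):
                totals[runner] += POINTS[place]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical four-race series: A and B each win twice but skip two races,
# C never wins but turns up and finishes second every time.
races = [
    ["A", "C", "D"],
    ["A", "C", "D"],
    ["B", "C", "D"],
    ["B", "C", "D"],
]

table = series_table(races)
for runner, total in table:
    print(runner, total)
```

In this made-up series C takes the title on 28 points without winning a race, while the two unbeaten runners sit on 20 each.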

Cumulative time wouldn’t work for a long series at all, plus it’s inherently biased towards the longest race where the time gaps are greatest. I really should be pushing for that to be the method and Northburn to be included. Show me the money.
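A quick sketch of that bias, with invented times in minutes for two short races and one long one. Runner X wins both short races by a bigger relative margin, yet loses the series on cumulative time because of one long day out.

```python
def cumulative_gap(times_a, times_b):
    """Total time of runner a minus total time of runner b (minutes)."""
    return sum(times_a) - sum(times_b)


# Hypothetical times in minutes: two races of about 2 hours, one of about 10.
x = [120, 125, 630]  # X wins both short races, loses the long one
y = [130, 135, 600]  # Y loses both short races, wins the long one

# Positive gap means X is behind Y on cumulative time.
gap = cumulative_gap(x, y)

# X's winning (positive) or losing (negative) margin in each race,
# as a percentage of that race's winning time.
margins = [round(100 * (b - a) / min(a, b), 1) for a, b in zip(x, y)]

print(gap)      # minutes X trails Y over the series
print(margins)  # per-race margins from X's point of view
```

X is about 8% clear in each short race and only 5% down in the long one, but still finishes the series 10 minutes behind.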

Total Sport’s method of normalizing the data: fantastic. 60 categories: not so fantastic. I can’t tell if someone is winning because there are 2 or 50 people in the category. It would also be possible, and plausible, to run 4 races, win every single one and get nowhere because you decided to try all the distances.

So I’ve been thinking. In fact I’ve been thinking about it for a very long time. An independent system that takes all races (or any selection) and uses all the available information to rank athletes. Essentially it is just like the breeding values used in a variety of agricultural fields, which in New Zealand are most commonly used in the dairy industry. Performance data for a variety of traits, collected from a bull’s progeny, are used to rank the sires. It’s a lot more complicated than that, but for simplicity: the best bulls are those with the best daughters.

So how to rank a runner. I’m not going to tell,

a) I’d do a poor job of explaining it

b) it’s a work in progress

c) maybe I can actually turn it into something useful for people.

When I was in Cairns for the ISAG conference there were two PhD students who presented work from the racing and show jumping industries that was similar to what I’ve been playing with. The show jumping was nearest to what I’ve been doing: taking the race results, normalizing the data, fitting some variables, and out pops an order. But some interesting summary data also came out.
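For anyone wondering what "normalizing" race results might look like, one simple and common option is to express every time as a percentage of the winner's time, so a 1-hour race and a 5-hour race end up on the same scale. The sketch below does that and averages the result over the races each runner finished. This is only an illustration with invented times and hypothetical function names; it is not the method used for the rankings in this post, and it fits no extra variables.

```python
from statistics import mean


def percent_of_winner(times):
    """Map each runner's time to a percentage of the fastest time in that race."""
    best = min(times.values())
    return {runner: 100 * t / best for runner, t in times.items()}


def rank_runners(races):
    """Rank runners by mean normalized score over the races they finished.

    Lower is better; 100 means the runner won every race they ran.
    Returns a sorted list of (mean_score, runner) pairs.
    """
    scores = {}
    for times in races:
        for runner, pct in percent_of_winner(times).items():
            scores.setdefault(runner, []).append(pct)
    return sorted((mean(v), runner) for runner, v in scores.items())


# Hypothetical finish times in minutes.
races = [
    {"A": 300, "B": 315, "C": 360},  # long race
    {"A": 66, "B": 60},              # short race
]

ranking = rank_runners(races)
for score, runner in ranking:
    print(runner, score)
```

Note that a plain average like this says nothing about how strong each field was, which is exactly the gap a breeding-value style model is meant to fill.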

What I found amusing about BCR’s poll was the rabid passion for particular races to be included in such a series. It went as far as people asking others to vote for a particular race. The poll audience was biased enough without it becoming a popularity contest. I think for any series to gain credibility there’s a huge number of factors that would go into choosing an event.

I looked at 10 races with the most recent data from 2011 and 2012. The list shouldn’t in any way be looked at and frowned upon. It was just a proof of concept, and the reality is that 10, 20 or 50 races could be used. In fact I first tried it with 4 races and data from 2008. There is a variety of long, short, medium, technical and easy races, popular and not so much, with a spread from Rotorua down to Te Anau. I also grouped 4 races from the lower South Island as I knew there would be a number of people running several of them. Or at least that’s what I thought. Ease of data manipulation cost another 3 races being added, and possibly an altered position 1 and 2.

So, 10 races, 2030 people, but only 161 who ran more than one race: 143 ran 2, and only 18 ran 3. There could be more, but I’m not going through the data line by line to check that names are spelt correctly. I have done it for those at the very top though.

Table 1. The races, finishers and percentage that run in another event.

| Race | Finishers | F | M | Ratio (% F) | % of Repeaters |
|---|---|---|---|---|---|
| 3Peaks | 94 | 22 | 72 | 23% | 22% |
| AvalanchePeak | 104 | 27 | 77 | 26% | 22% |
| CaptainCook | 151 | 79 | 72 | 52% | 19% |
| Goat2011 | 571 | 166 | 405 | 29% | 5% |
| Kepler2011 | 456 | 127 | 329 | 28% | 17% |
| LoopTheLake | 307 | 155 | 152 | 50% | 12% |
| Moonlight | 64 | 16 | 48 | 25% | 25% |
| Routeburn | 306 | 131 | 175 | 43% | 23% |
| Tararua2011 | 54 | 12 | 42 | 22% | 28% |
| Tarawera | 102 | 19 | 83 | 19% | 21% |

Interesting that only 5% of those that run The Goat run in other events on the list. That could be heavily biased by not having chosen a race from the Auckland region. The Total Sport series is just too complicated to deal with. I really should have looked at Toi’s or Kauri, but I’m also not a fan of events that have different distances, which tends to dilute the quality of each field. Tarawera is the exception to that, with the best field being in the 100k. Tussock Traverse would eventually make my list of races to include as well.

“Shut up and get to the ranking”

| Rank | Name (men) | Races | Name (women) | Races |
|---|---|---|---|---|
| 1 | VAJIN ARMSTRONG | 3 | WHITNEY DAGG | 3 |
| 2 | MARTIN LUKES | 3 | JACQUI GEE | 2 |
| 3 | GRANT GUISE | 3 | LOUISA ANDREW | 2 |
| 4 | MARTIN COX | 2 | EMMA CRICHTON | 2 |
| 5 | DANIEL CLENDON | 2 | REBECCA DRYLAND | 2 |
| 6 | MITCH MUNRO | 3 | MEGAN KENNEDY | 2 |
| 7 | NATHAN BELL | 3 | SUZANNE DUDDING | 3 |
| 8 | GLENN HUGHES | 2 | HELEN GILLESPIE | 2 |
| 9 | JULIAN DAVIDSON | 3 | CHARLOTTE BURTT | 2 |
| 10 | DALLAS WICHMAN | 2 | VICKY PLAISTOWE | 2 |

There you have it. Run well in a big race with a strong field (Kepler) and you’ll rank higher than running well in a small race. Consistency counts: Grant didn’t beat either Martin Cox or Dan Clendon but was solid in all three of his races. I beat (just) Mitch Munro and Dallas Wichman at the Kepler, had an appalling Moonlight and plummeted down the list to 34, thankfully still in front of Louisa Andrew. I’d never hear the end of that.

Intuitively the rankings look to be OK. What’s missing is someone who finished, say, 5th-7th in a lot of races. Maybe it’s time to mess with the data and see what that looks like.

Edit#1 I did that, I improved my ranking to 7th :) from 6 races.

Edit#2 On a points system Marty goes to #1 and Mitch Munro drops out of the top 10 altogether, because the winners of 7 races each get 10 points and jump up to 6th=.

Convicts – no doubt you will want to steal this.

7 responses

  1. You have too much time

  2. The Scottish ultra marathon series –

    http://www.zen31010.zen.co.uk/sums/index.htm

    scores its races based on the percentage of the winner’s time for each race. Reduces vagaries of distance and low numbers. Doesn’t help if a particular race has an abundance or dearth of elite runners though.

    1. Working on how to deal with Little races H and that is more or less how I’m looking at it. 2nd, 3rd and 4th could all have almost identical times to 1st and 5th could be way off.

  3. What about percentage from the race record time?

    1. Something like that will be worked on for sure. Could be a good way of measuring quality of the field.

  4. thanks Matt and continuing the data theme: 40 % of the men Top 10 run for Sumner… if any South Island adventure runners reading this get in touch as you could join up then scrape into one of our National Road Relay teams at Nelson then head up Mt Arthur day after…

    1. Rachel hated that list Marty, so there’s a new one just published. Sumner only have 2 in the top 10 now.
