Ranking lists of best performances 2015/2016

Discussion in 'Lange baan' started by EenBrabander, 9 Oct 2015.

  1. I have spent some time this last week doing the final (!) test of how the ELO model actually worked last season. This last test covers 246 head-to-head observations of pairs of skaters, where the better-ranked skater was ranked between 95 and 105 points higher than the other skater before the race. (These are all the constellations of that kind I could find for the months from January to March 2016.)

    As mentioned before, a difference of 100 points means the better ranked skater should win 64 % of those races - according to the ELO-model.

    The 246 observations are split into three different "frequency" groups, based on how often the top-level competitions are held:

    (1) 500 meters. Actual winning probability: 0,625 (n=56)
    (2) 1000-1500-5000/3000m. Actual winning probability: 0,593 (n=123)*
    (3) 10 000 / 5000 m - sprint - allround. Actual winning probability: 0,657 (n=67)

    In total: Actual winning probability: 0,618 (n=246)

    The theoretical winning probability of 0,64 lies within the confidence interval of all three groups, and it also lies within the overall confidence interval.

    I guess this test confirms that the system works, at least well enough for my use!
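The check described above can be sketched roughly as follows. The post does not say how the confidence intervals were computed, so the 95 % level and the normal-approximation interval used here are assumptions; the function names are only illustrative.

```python
import math

def elo_expected(diff, scale=400.0):
    """Expected score of the higher-rated skater for a given rating gap."""
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))

def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation interval for a proportion (assumed method, 95 %)."""
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half, p_hat + half

# Observed win rates and sample sizes as quoted in the post.
groups = {
    "500 m": (0.625, 56),
    "1000-1500-5000/3000 m": (0.593, 123),
    "10 000/5000 m - sprint - allround": (0.657, 67),
    "total": (0.618, 246),
}

expected = elo_expected(100)  # theoretical value for a 100-point gap
results = {}
for name, (p_hat, n) in groups.items():
    low, high = wald_interval(p_hat, n)
    results[name] = low <= expected <= high
    print(f"{name}: [{low:.3f}, {high:.3f}] contains {expected:.3f}: {results[name]}")
```

With these assumptions the theoretical value falls inside all four intervals, which matches the conclusion in the post.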
  2. I will try to do this. Maybe I will answer some of the rest of your comments later.

    I have used the ELO-model to make my ranking lists only for the last season (2015-16)*. So my model for making season best and all time best lists is based on those results.

    One of the extras you get from using the ELO-model is the so-called performance rating. One example:

    Three skaters have these ELO points before a race:

    A 2300
    B 2200
    C 2100

    The result of this race is:
    1. B
    2. C
    3. A

    This gives us new ELO points:

    A 2300-21 = 2279
    B 2200+15 = 2215
    C 2100+6 = 2106
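For anyone who wants to reproduce the changes of -21, +15 and +6, here is a minimal sketch. It assumes the race is scored as a set of pairwise duels and that the K-factor is 15. Neither is stated in the post; K = 15 is simply the value that reproduces the three numbers.

```python
def elo_expected(own, other, scale=400.0):
    """Expected score of a skater rated `own` against one rated `other`."""
    return 1.0 / (1.0 + 10.0 ** ((other - own) / scale))

def update_ratings(ratings, finish_order, k=15.0):
    """Treat the race as pairwise duels; k=15 is inferred, not from the post."""
    new = {}
    for name, own in ratings.items():
        rivals = [r for r in ratings if r != name]
        expected = sum(elo_expected(own, ratings[r]) for r in rivals)
        # One point for every rival who finished behind this skater.
        actual = sum(finish_order.index(name) < finish_order.index(r) for r in rivals)
        new[name] = own + k * (actual - expected)
    return new

before = {"A": 2300, "B": 2200, "C": 2100}
after = update_ratings(before, ["B", "C", "A"])
changes = {name: round(after[name] - before[name]) for name in before}
print(changes)
```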

    But it also gives us those performance ratings:

    A 2150-400 = 1750
    B 2200 + 400 = 2600
    C 2250 + 0 = 2250

    The performance rating formula is explained here:

    https://en.wikipedia.org/wiki/Elo_rating_system
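The three performance ratings above follow from the "algorithm of 400" described in that article: the average rating of the opponents, plus 400 points times (wins minus losses), divided by the number of games. A small sketch, again assuming that each race is treated as a set of pairwise duels:

```python
def performance_rating(opponent_ratings, wins, losses):
    """Algorithm of 400: (sum of opponent ratings + 400 * (wins - losses)) / games."""
    games = len(opponent_ratings)
    return (sum(opponent_ratings) + 400 * (wins - losses)) / games

ratings = {"A": 2300, "B": 2200, "C": 2100}
finish_order = ["B", "C", "A"]

performance = {}
for name in ratings:
    rivals = [r for r in ratings if r != name]
    place = finish_order.index(name)
    wins = sum(place < finish_order.index(r) for r in rivals)
    losses = len(rivals) - wins
    performance[name] = performance_rating([ratings[r] for r in rivals], wins, losses)
print(performance)
```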


    My theory was to use those performance ratings to describe the relation between performance, race time and conditions (= air pressure).

    I used this past season to collect data for these three elements. The first two of course came directly from the results, and the last one was found by using the air pressure calculator (http://www.mide.com/pages/air-pressure-at-altitude-calculator). I use the default value of 15ºC for temperature.
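The post does not say which formula the linked calculator uses. As a rough illustration only, the standard barometric formula with a constant lapse rate gives an altitude-corrected pressure like this; the constants are the usual standard-atmosphere values and the function name is hypothetical.

```python
def pressure_at_altitude(altitude_m, sea_level_hpa=1013.25, temp_c=15.0):
    """Barometric formula with a constant lapse rate (an assumed model)."""
    lapse = 0.0065        # temperature lapse rate in K per metre
    exponent = 5.25588    # g*M / (R*lapse) for dry air
    temp_k = temp_c + 273.15
    return sea_level_hpa * (1.0 - lapse * altitude_m / temp_k) ** exponent

# Illustrative altitudes, not the altitudes of specific rinks.
for altitude in (0, 500, 1000):
    print(altitude, round(pressure_at_altitude(altitude), 1))
```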

    This is a small part of my data collected for the 1000 m:

    [Image: upload_2016-6-14_18-4-1.png]

    For all distances apart from the 500 m for men**, I was happy to find that there was a linear connection between these 3 elements, and so it was easy to make up the formulas for each distance.

    For example, the formula found for the 1000 m:

    time = 63,93 + 0,01221*OLT - 0,00365*RP

    - or reversed:

    RP = (63,93 + 0,01221*OLT - time) / 0,00365
       = 17515 + 3,3452*OLT - 273,97*time

    OLT = altitude corrected air pressure
    RP = performance rating
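As a small sketch, the 1000 m formula and its reversed form can be written as two functions. The coefficients are the ones quoted above; the input values in the example are hypothetical and only there to show that the two forms agree.

```python
# Coefficients of the 1000 m formula as quoted in the post.
INTERCEPT = 63.93
OLT_COEF = 0.01221
RP_COEF = 0.00365

def time_1000m(olt, rp):
    """Race time predicted from air pressure (OLT) and performance rating (RP)."""
    return INTERCEPT + OLT_COEF * olt - RP_COEF * rp

def rp_1000m(olt, race_time):
    """Reversed form: performance rating from air pressure and race time."""
    return (INTERCEPT + OLT_COEF * olt - race_time) / RP_COEF

olt, race_time = 1013.0, 68.00   # hypothetical inputs, not from the data set
rp = rp_1000m(olt, race_time)
print(round(rp))
print(round(time_1000m(olt, rp), 2))   # recovers the race time
```

The rounded coefficients in the second line of the reversed formula (17515, 3,3452 and 273,97) give the same rating to within a fraction of a point.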

    So this is the formula used to make the season's best and, eventually, all-time best lists! All the formulas for the men's distances can be found at the bottom of this page:
    http://www.skoyteranking.net/45500112



    *) In order to find "start values" for the different skaters and distances I had to use results from previous seasons as well, but that's a different story...

    **) For the 500 m, I had to do some extra work with the performance rating, before I could find the appropriate linear formula.

  3. With "underrated", I guess you imply that 2000 performance points (RP) on one distance should be about as good a race as 2000 RP on another distance.


    That would be excellent, but is not really necessary to fulfill the purpose of the formula, which was to compare performances under different conditions, but on the same distance.



    But anyway, thanks to your recalculations - I actually found a typing error on my list of formulas. The formula for the 10 000 m should be:

    10 000 m: RP = 12779 + 2,0113*OLT - 15,66*time

    (not 12 279). So your calculated points for the 10 000 m are actually 500 too low... Sorry about that.
  4. There seem to be advantages for skaters who start in quartets, but not on the 5000 m for ladies:

    5000 men: 2,48 sec.
    10 000 men: 3,71 sec.
    3000 ladies: 0,60 sec.
    5000 ladies: -0,32 sec!

    On the 10 000 m for men, the advantage is not statistically significant in itself, and it also doesn't make the formula stronger (considering the so-called adjusted R²).

    So I decided to keep it out for those other distances. The problem is that there are not too many important races on the longer distances, and it is also a bit of a problem that not all skaters give 100 % in every race. Conclusion: The formulas for short distances are probably more reliable than those for long distances.
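For readers not familiar with adjusted R²: it penalises every extra predictor, so a variable that barely improves the fit can lower it. A minimal sketch, assuming the quartet start would enter the formula as one extra predictor next to OLT and RP; the R² values and the number of observations are hypothetical, not taken from the data.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Hypothetical numbers: adding a quartet variable raises R^2 only marginally.
without_quartet = adjusted_r2(0.900, n_obs=30, n_predictors=2)
with_quartet = adjusted_r2(0.901, n_obs=30, n_predictors=3)
print(round(without_quartet, 4), round(with_quartet, 4))
```

In this made-up case the adjusted value drops when the extra variable is added, which is the situation described for the 10 000 m.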
  5. EenBrabander

    Changed my model a little bit again. I removed the skater-specific correction and changed the way of calculating track correction factors. With these changes, I can just copy competition results, put in the needed values (air pressure, track factor, track height), and I've got the corrected times. This way, I can correct large amounts of results extremely quickly! :D

    With the skater-specific thingy, I had to put in different values for each skater. That's doable for 5 skaters, but not for 50.
  6. I never really understood what good the skater-specific correction would do, anyway...

    When it comes to the track correction, it makes sense to have correction factors for outdoor, and semi-covered tracks. And even for covered tracks with no heating, I guess.

    But covered tracks with potential to bring the temperature up to 15ºC should be "much the same", shouldn't they?
  7. EenBrabander

    Kolomna has blowers, Heerenveen has quicker ice than other tracks most of the time, and Sochi is more aerodynamic than old Berlin. Some Asian tracks don't have slow track records on the 500m, but do have slow records on other distances... Well, there's a lot of difference between rinks.
    So my new system is based on results, just on results. What I need is the top-10 season times of a track on the 5 main distances over the last few seasons. Luckily, that's easy to get from some database sites. Then I do some calculations with those numbers and correct for the kind of competition: sometimes there's only a Norges Cup in Hamar, but in another season there are World Championships there. I put competitions in categories, and after that competition correction every season should give around the same number most of the time. A track can't get much faster in a few years, except when it's renovated or so. With the competition-corrected values, I calculate track factors per distance, multiply them with the track factors for the other distances on the same track, and I have the actual track factor. Sounds REALLY complicated, I know, but all you have to do is put in some results and a track factor rolls out. Other things in my system work like that too. The formulas are huge, but it's quite easy to handle them. With this system, I can correct large numbers of competitions quite quickly, and more specifically, whole rankings. Correcting the winner's time takes as much time as correcting the results of all the participants.
    So when I put a whole bunch of competitions in, I can get a very extensive list for this season, 15/16, and for the coming season, 16/17. When I compare skaters and times on those lists, it should eventually be easy to see which skaters skate better or worse than in the previous season. Only problem: it's getting slow quickly. Having done only a few 500m competitions (WCh Dist., Youth Ol, World Cup Final, WCh Allround), I already have 150 entries. One distance, one gender. When I correct all the world cup races, WCh sprint and some national competitions, I'll be going over 1000 quickly... Dear laptop, brace yourself! ;)
  8. All this is about having enough spare time, isn't it...?

    Sure, handling the formulas is easy - but creating them, evaluating them, etc is a lot of work if you want to do it properly.

    For me, it is crucial to make the job possible to handle, not only today, or this week, but for years from now. So I probably have to stick to a simpler model, I guess.
  9. EenBrabander

    Well, you haven't heard yet about the fact that I've also been conlanging for four years now. That means: I've created a fully functional language...
    Vdoš, il queg yppatika ät leiiän skävyto, thett mi tnío jäcina 4 årtåre fta fonnus. Miitiršoušin mynno äfío pljætten chiillíolli aidai! (yep, I've translated that sentence above to my language)