Posts Tagged ‘site’

Site update 090723

Saturday, July 25th, 2009

It’s been a while since my last rant, so I now present to you the new super-ultra-mega-rant-3000. Feel free to skim for headings of interest, or just read the whole thing to earn 10 extra evil points.

I’m going to talk about recent events: not the dumb black/white US shit kind of recent events, but actual recent MMA events. I’ll also discuss a big UFC myth, delve into the UFC business model a bit, and talk about how the mma-elo.com system reacts to new fighters. If this interests you at all then read on. If not, make a suggestion for next week! 8P

BTN: System Accuracy

Wednesday, December 24th, 2008

Unlike my last few relationships, I’m going to start this rant by being painfully honest. This rant contains math. There, I said it, but now that we know, let’s try to work through this together.

I’ve tried to provide examples and keep the math simple. If you have any questions please feel free to ask and I’ll attempt to explain further. Even for those that hate math I’d suggest trying to read through it a time or two.

With the disclaimer out of the way, let’s move on. The point of this rant is to examine how best to measure and represent system accuracy. I’m going to go over three methods that have been suggested and show how the current system model on the site measures up. There will also be a brief(ish) discussion of what I perceive the pros and cons to be with each approach. Feedback here is much appreciated as I’d like this metric to be something people could quickly use to weigh one system against another.

Method 1: Heads I win, Tails you lose

The first approach is the most basic. It looks at every fight and measures how many fights the higher rated fighter won. It then divides that number by the total number of fights.

Example

Assume there were only 10 fights and the higher rated fighter won 4 of them. The result would then be 4 / 10 = 40%. That is to say that the system was correct 40% of the time.
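For those who prefer code to words, here is a minimal sketch of the same calculation. The function name and the (winner rating, loser rating) layout are made up for this example and are not how the site stores anything.

```python
def favorite_win_rate(fights):
    """Fraction of fights won by the higher rated fighter.

    Each fight is a (winner_rating, loser_rating) pair. This layout is an
    assumption for the example. Fights between equally rated fighters are
    counted as misses here, which is also just a choice made for the sketch.
    """
    if not fights:
        raise ValueError("need at least one fight")
    correct = sum(1 for winner, loser in fights if winner > loser)
    return correct / len(fights)


# 10 invented fights: the higher rated fighter wins 4 and loses 6.
fights = [(1700, 1600)] * 4 + [(1600, 1700)] * 6
rate = favorite_win_rate(fights)
print(f"{rate:.0%}")
```

With the invented data above this reproduces the 40% from the example.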

The good, the bad, and the ugly

PROS: One of the upsides of this approach is that the number is very easy to understand. It’s very easy for someone to look at past picks and understand they got 2 out of 5 right. Another benefit is that this number is easily compared to other systems.

CONS: The biggest downside to this metric is that it deviates from the entire basis of the site. When talking about sports almost nothing is ever 100%. One of the major benefits of the site is that it is able to recognize that even though one fighter is rated higher, that doesn’t guarantee victory or even proclaim vast superiority (such proclamations should be left to the fans and the fighter’s mom).

Another potential drawback is that every single fight is counted, including fights where a fighter has too little history for the rating to mean much.

Thoughts: Although I like the simplicity of the number and how portable it is, I don’t like how it fails to utilize the site’s expected win percentage model. Additionally, I think if an approach like this were to be used it might make sense to specify additional parameters (ex. minimum number of fights).

Site result: 65%

Method 2: Fighter history

This approach goes fighter by fighter, looking at every fight they had past their sixth fight where their opponent also had at least six previous fights. Across those fights I total each fighter’s expected number of wins and their actual number of wins.

The absolute value of the difference between expected wins and actual wins is then accumulated across all fighters and this number is ultimately divided by the total number of fights.

Example

Let’s look at a two fighter example to get a better idea of how this works:

Fighter A
Total fights = 2
Expected wins = 1
Actual wins = 1
Absolute difference = 0

Fighter B
Total fights = 4
Expected wins = 3
Actual wins = 2
Absolute difference = 1

In this case our total absolute difference is 1 and our total number of fights is 6. From here we can divide the total absolute difference by the total number of fights to get a percentage of incorrect outcomes across all fights.

1 / 6 ~ 0.167

We can then subtract that number from 1 to get a percentage correct for the system:

1 – 0.167 = 0.833 ~ 83%
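The same two fighter example can be sketched in a few lines of code. The function name and the tuple layout are assumptions for illustration only, and the six fight minimum is taken as already applied to the numbers going in.

```python
def fighter_history_accuracy(fighters):
    """1 minus (summed |expected wins - actual wins|) / (total fights).

    Each entry is (total_fights, expected_wins, actual_wins) for one
    fighter. The layout is hypothetical and only serves this example.
    """
    total_fights = sum(total for total, _, _ in fighters)
    if total_fights == 0:
        raise ValueError("need at least one fight")
    total_abs_diff = sum(abs(expected - actual)
                         for _, expected, actual in fighters)
    return 1 - total_abs_diff / total_fights


fighters = [
    (2, 1, 1),  # Fighter A: absolute difference 0
    (4, 3, 2),  # Fighter B: absolute difference 1
]
accuracy = fighter_history_accuracy(fighters)
print(f"{accuracy:.0%}")
```

For the two fighters above this gives 1 - 1/6, which is the 83% from the example.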

The Good, The Bad, and The Ugly: The Quickening

PROS: Unlike the previous approach, this method factors in the expected win percentage. That is good because there is a lot of value in the expected win percentage, and not just for looking at “pot odds” when placing a bet on a given fight.

CONS: The approach used in this method is definitely more complicated than the first. There is also the possibility that this number doesn’t represent a useful metric for people.

Thoughts: Overall I like this approach better than the first method. It factors in the expected win percentage and is an overall deeper number. There’s no doubt that it would be misinterpreted at a glance by some, but I’m willing to put in the time so that anyone truly interested would be able to understand it.

Basically this number helps give us an idea of the average fighter’s actual performance compared to their expected performance. I’m not sure how useful a measure that is though, so feedback is definitely welcomed on this method.

Site result: 86%

Method 3: Slice of Life

The final suggested method is an extension of the rant I did a little while ago. It works by slicing all fights up into smaller pools of fights based upon the difference in rating between the two fighters. From there it looks at the actual percentage of fights won by the favorite versus the expected percentage of fights that would be won.

In order to get an overall picture of the system on a per fight basis, we multiply the number of fights in a given slice by that slice’s absolute difference, accumulate that value across all slices, and then divide by the total number of fights from all slices.

I know that might sound complicated, but if I wrote it as a formula with sigmas and stuff you’d hate me even more. Even if the math sounds like gibberish please try to picture what it represents in real world terms. What this method boils down to is showing how close to the expected win percentage various rating slices actually come.

Example

Consider two very broad slices:

1-300 Rating difference:
Total fights = 10
Expected win percentage = 65%
Actual win percentage = 70%
Absolute difference = 5%

301-600 Rating difference:
Total fights = 5
Expected win percentage = 85%
Actual win percentage = 70%
Absolute difference = 15%

We then take ((5 * 10) + (15 * 5)) / (5 + 10) ~ 8.33%

That means that on average the above system would be off by about 8.33%, or put another way, it’s about 91.67% accurate in terms of expected win percentage based upon rating versus the actual win percentage.
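Since I promised no sigmas, here is the two slice example as a short code sketch instead. The function name and the (fights, expected %, actual %) layout are invented for this example and say nothing about how the site actually stores its slices.

```python
def slice_accuracy(slices):
    """Fight-weighted average of |expected % - actual %| across slices.

    Each slice is (num_fights, expected_pct, actual_pct), with percentages
    on a 0-100 scale. Returns (average_error_pct, accuracy_pct).
    """
    total_fights = sum(n for n, _, _ in slices)
    if total_fights == 0:
        raise ValueError("need at least one fight")
    weighted_error = sum(n * abs(expected - actual)
                         for n, expected, actual in slices)
    average_error = weighted_error / total_fights
    return average_error, 100 - average_error


slices = [
    (10, 65, 70),  # 1-300 rating difference: off by 5 points
    (5, 85, 70),   # 301-600 rating difference: off by 15 points
]
average_error, accuracy = slice_accuracy(slices)
print(f"off by {average_error:.2f}%, accuracy {accuracy:.2f}%")
```

Note that the bigger slice pulls the average toward its own error, which is why the result lands closer to 5% than to 15%.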

The Good, The Bad, The Ugly, and The Crystal Skull

PROS: The stat focuses in on the expected win percentage approach. It allows one to quickly see how real life results are measuring up against predicted outcomes. This number helps give an idea of the overall accuracy of the system when it says Fighter A is expected to win 65% of the time versus Fighter B.

CONS + Thoughts: At first I was going to say that it’s more of a system stat than a user stat, but that’s pretty much what we are looking for here. What we want is a way for a user to gauge how accurate the other numbers they are seeing are. It’s great to claim that Fighter A will win 45% of the time, but if in actuality the expected win percentages are off by 40% then the value of the original expected win percentage number is greatly deflated.

Site result: 97%

some words Just random

Whenever you deal with stats it’s important to know exactly what the number you are looking at represents and whether that number is remotely relevant to what you are doing.

For a lot of people all they will care about is method 1. Sadly, that wastes a lot of the system’s value. There is a world of difference between a 1601 rated fighter taking on a 1600 rated fighter and a 1950 rated fighter taking on a 1700 rated fighter. To simply say A > B (even if by the slimmest of margins) portends guaranteed victory is a mistake.
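To put rough numbers on those two matchups, here is a sketch using the textbook Elo expected score formula. It assumes the standard 400 point logistic curve; the site’s actual formula and parameters aren’t spelled out in this rant, so treat the output as illustrative only.

```python
def elo_expected_score(rating_a, rating_b, scale=400):
    """Textbook Elo expected score for fighter A against fighter B.

    The 400 point scale is the conventional Elo default and is an
    assumption here, not a confirmed site parameter.
    """
    return 1 / (1 + 10 ** ((rating_b - rating_a) / scale))


close_fight = elo_expected_score(1601, 1600)
wide_fight = elo_expected_score(1950, 1700)
print(f"1601 vs 1600: {close_fight:.1%}")
print(f"1950 vs 1700: {wide_fight:.1%}")
```

Under that assumption the 1601 vs. 1600 fight is basically a coin flip, while the 1950 rated fighter would be expected to win roughly four times out of five. Method 1 scores both picks exactly the same.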

Being the numbers geek that I am, method 3 is the most interesting to me. It’s a number that helps clarify the validity of other numbers. It’s interesting to see how close (on average) to expected results the system is actually coming. This also provides a degree of cushion when weighing expected win percentage between two fighters of various ratings.

If there are any methods I missed or any additional parameters you would like to see applied to any of the methods, please let me know. I’d really like to reach a bit of a consensus on this in the near future as it will prove useful with a couple of future features/rants.

It’s been awhile

Wednesday, December 3rd, 2008

It’s been awhile

22 days to be exact(ish). During that time I’ve actually written a few other rants and had dozens more floating around in my head. Sadly, most of them were either too *mathy* or they were too offensive to some (primarily stupid people). One thing I’ve learned over the years is that for many it’s more about how you say something than what you actually say. That could actually be its own rant, but I’ll save that for another time. Instead, I’d like to take a few minutes to update everyone on various site related things. I can’t promise that I won’t slip up and insult stupid people, but I’ll try reasonably hard not to.

In a week or two

There are a lot of things I’m currently working on. There are the previously mentioned rants that I hope to revise into a more acceptable format. I still feel as though a lot of people don’t fully understand how the site works or why an arbitrary ranking system is of much use. My hope is that this rant will be the first in a series of more timely rants that help explain what is going on and why it is being done in such a way.

In addition to the rants, progress continues to be made on the next big site feature. The fighter compare feature seems to be working well and most seem happy with it. There are several tweaks I’m going to make to it and I welcome additional suggestions. At this point, though, my development time has shifted to this other feature. Once I can come up with a super clever name for it like “fighter compare” I’ll speak more about it.

All I need

I am thankful for those that have helped me improve the site thus far and I welcome additional help. Here are a couple of things you can do to help me improve the site that are guaranteed to be better than watching “Twilight” or putting a sharp poker in your eye. OK, not sure it’ll be better than the poker, but definitely better than Twilight.

One of the biggest challenges faced remains accurate information. I go to great lengths to ensure the information on the site is both as plentiful and as accurate as possible. If you see something that is incorrect, whether it’s a missing fight/fighter or the incorrect order of fights from a grand prix, please let me know. It’s important to remove any and all misinformation on the site as fast as possible. To the fighters/promoters reading, please feel free to contact me directly with any fighter/event information that needs to be added or updated.

Another thing I’m working on now that I could use some input on is some sort of system metric. I recently (22ish days ago) talked about expected win percentage and site accuracy. What I would like to do is devise a stat that represents just how accurate the system and parameters currently used on the site are. I’ve got a few ideas, but nothing good enough that I wouldn’t trade it for a hot redhead.

Ideally the stat would allow someone at a glance to see how changing one parameter impacted the overall results of the system. I often make special results for people based upon different criteria than the main site uses and it’d be great to have a standardized way of telling how the changed version varied from the currently used version. It’d also obviously prove useful in helping me decide which system changes to push to the main site.

Finally, I’m in the process of adding an FAQ section to the site. If there are questions you’ve had (or still have) that you think others might also have, please let me know.

Fighter Compare

Wednesday, October 15th, 2008

A new feature available on the site is fighter compare (found *here*). This feature will let you compare fighters’ careers head to head.

The main parts of fighter compare are:

Rating Compare
This shows the current and max ratings for both fighters in addition to their one year and three year mods and their strength of schedule.

Stats Compare
This shows the wins, losses, draws, NCs, KOs, subs, decisions, and unanimous decisions for both fighters. It also shows the percentage of these numbers in terms of overall fights/wins.

Fight Comparison
This shows head to head fights between both fighters. It also shows all common opponents for the two fighters.

Possible Problems
I’ve tested the page a lot and have had others look it over (thanks!), but it’s possible we missed something. If you run into any issues please let me know.

One known issue is that the auto suggestions can be a bit slow. That is due to the web server the site is currently hosted on and an attempt on my part to make the best of it. In most cases simply typing one extra character or typing slightly slower will make everything work properly.

Examples
Here are a few random examples to get you started…

Anderson Silva And GSP
Here we see GSP with a slight lead in current and max rating, plus a better one year mod but a lower three year mod (Serra loss…uhh FTL lol). Additionally, GSP has a higher strength of schedule, but I think most knew that.

Looking at the stats we see more wins for Anderson, but a higher win percentage for GSP. We see more (t)KOs for Anderson and a higher (t)KO%, while GSP has the edge in both subs and sub percentage.

Both have five decisions, with GSP having a single split (Penn).

They have no head to head fights and share no common opponents.

I think it’s safe to say that both are amazing athletes who have had fantastic careers, but I’m sure some will try to use GSP’s current 3 rating point edge as definitive proof that he rules and Anderson drools. *shrugs*

Fedor And Nog
Fedor has the higher current and max rating. However, Nog has the edge in strength of schedule, one year mod and three year mod.

Fedor has fewer wins, but a significantly higher win percentage.

Nog has the edge in sub victories (19 to 15), but the overall sub percentage is little more than a point different (52.78% to 51.72%).

Head to Head we see the two wins for Fedor and the one NC.

We see 8 common opponents including Cro Cop, Zulu, Schilt, Herring and Sylvia. We also see TK on the list, whom Fedor is 1 and 1 against and whom Nog has a draw against.

Matt Hughes And Fedor
I know, I know, WTF…The reason for this one is that they actually share a common opponent. This is one of those “trivia”/“amusing” type of things I enjoy finding. Five bucks to the first person who knows the answer before looking and sends me a self addressed stamped envelope. 8)

And then???
No and then~!!!

Go play around with it. Look at Rashad and Forrest or Penn and Gomi or Herring and Sapp. *shrugs* If you have issues please let me know. If there’s additional information you’d like to see let me know. If you find other interesting results, definitely let me know. 8)

How close is second place?

Wednesday, July 23rd, 2008

One of the major problems with most MMA ranking lists is not knowing where fighters stand in the grand scheme of things. Imagine looking at a golf ranking list and seeing Tiger #1 and then whoever in spots 2 through 9. Is second place as close to Tiger as third place is to second place? Is 6th place closer to second place than 8th place is to 7th place? If all we have is a simple list numbered one to ten we really have no way of telling.

Imagine if we were to look at a list for the top 5 HR hitters in baseball so far this season and saw:

1) Batter A
2) Batter B
3) Batter C
4) Batter D
5) Batter E

Uhhh, super, there they are. Now what? I mean, even if we accept that this list is 100% accurate right now this second, what else can we really discern from it? Do we know how far ahead Batter A is? Do we have any way of telling if Batter B is closer to Batter A than Batter E is to Batter B?

On the other hand, imagine a list that looks like:

1) Batter A – 49 HRs
2) Batter B – 48 HRs
3) Batter C – 42 HRs
4) Batter D – 42 HRs
5) Batter E – 41 HRs

Now here is a list that we can do some things with. Looking at this list we can see that Batter A and Batter B are very close, and have a very large lead on the rest of the batters. We can also see that Batter E is closer to third place on the list than Batter C is to second place.

The gaps between these positions are what is missing from most MMA ranking lists. It is also these gaps that mma-elo attempts to address by showing the actual fighter ratings used to place fighters at their spot on the list. Taking a look at a list such as:

Current Middleweights

We can see the following breakdown:

1) First *shrugs*
2) -72
3) -129
4) -139
5) -174
6) -186
7) -191
8) -193
9) -200
10) -205

Additionally, we can see that there are about eight other fighters within 50 points of 11th place currently.

Looking at this information we can see that first place is *WAY* out in front currently. We also see a decent gap between second and third, but then things get closer. The gap between third and fourth is only ten points and places five through nine are separated by only 26 points!
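If you want to check the gap math yourself, here is a small sketch that works from the point deficits listed above. The variable names are made up for the example; the numbers are just the top ten from the list.

```python
# Points behind first place for ranks 1 through 10, copied from the list above.
deficits = [0, 72, 129, 139, 174, 186, 191, 193, 200, 205]

# Gap between each rank and the rank directly below it.
gaps = [lower - higher for higher, lower in zip(deficits, deficits[1:])]

for rank, gap in enumerate(gaps, start=1):
    print(f"#{rank} -> #{rank + 1}: {gap} points")

# Spread from fifth place down to ninth place (list indexes 4 and 8).
fifth_to_ninth = deficits[8] - deficits[4]
print(f"#5 -> #9: {fifth_to_ninth} points")
```

The first two gaps come out at 72 and 57 points, the third to fourth gap at 10, and the spread from fifth to ninth at 26, matching the breakdown above.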

This information is much more functional because we are able to get a better feel for where two fighters stand and how fight outcomes will affect them. Imagine looking at a normal list such as:

1) Fighter A
2) Fighter B
3) Fighter C
4) Fighter D
5) Fighter E

and I tell you that Fighter B beat #9 Fighter I and Fighter C beat #8 Fighter H. What does that do to the above list? Since we don’t know how far apart the Top 5 were to begin with, and we have no idea how close the fighters ranked at #8 and #9 were, we can’t really say what will happen.

If we were talking about a list from this site, though, we would know how big the gap is between not only Fighter B and Fighter I, but also Fighter C and Fighter H, *AND* the gap between Fighter B and Fighter C. That gives us a much clearer picture of where things will stand following the new fights. It is this predictability and this granularity that make a rating based system so much more powerful and useful than a simple #1 through #10 list.

As you browse the site please make sure to pay attention to not only the *RANKING* of the fighters, but also the *RATING* that put them into that position. Sometimes I will get asked why a fighter is down at #8 and the person asking is missing that the fighter’s rating is only 20 points or so lower than #4. By simply looking at the ranking numbers you are missing out on one of the biggest benefits of the site.