Tuesday, December 04, 2007

Stat Relevance

I have discussed the relative relevance of different golf statistics before, but I decided to take a more analytical look at the subject. Bear with me – this post isn’t going to be a bunch of charts and graphs. For one thing, I haven’t figured out how to display those properly on this site (lucky you…).

Assume for this exercise that the 30 players in my final 2007 list are ACTUALLY the 30 best golfers in the LPGA. If a large percentage of them rank in the Top 30 of a certain statistic, I would conclude that stat is more likely to be important in the makeup of a good player than one with a smaller percentage of my Top 30 players in its Top 30. This can be reinforced by re-testing the stat with a narrower field: counting how many of my Top 20 players rank in the Top 20 of that stat. I’ve excluded Hee-Won Han from this study as she didn’t play enough events to qualify for the LPGA stat lists, so only 29 players are being surveyed. After tallying up the numbers for each stat, here’s what I discovered. The most relevant stat is at #1.

1. Money List
2. Top 10 Percentage
3. Scoring Average
4. Total Rounds Under Par
5. Greens In Regulation (GIR)
6. Birdie Percentage
7. Total Birdies
8. Putts Per Green In Regulation (PPGIR)
9. Victories
10. Driving Distance
11. Putting Average (total putts per round)
12. Driving Accuracy
13. Sand Saves

Those first three are a little biased because they directly influenced who got into my Top 30 to start with. I’m going to ignore that bias because frankly, any subjective list of Top 30 players you might give me ought to come out with similar results to this. You may have noticed #6 Birdie Percentage is not one of the listed LPGA stats. I came up with that on my own by dropping the posted numbers into a spreadsheet, calculating the percentages and sorting by the results. To give you an idea of how I ordered this list: 27 of the 29 players were in the Top 30 of the Money List, while 19 of my Top 20 players were in the Top 20 of the Money List. Those totals were the best (slightly) amongst these 13 statistics. For comparison, Driving Accuracy totaled only seven of 29 and five of 20.
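For anyone who wants to repeat the tally, here is a minimal Python sketch of the counting described above. The function names and the six made-up players (P1 to P6) are hypothetical, purely for illustration. The exact way the Top 30 and Top 20 counts combine into one ordering is not spelled out above, so sorting by the first count and breaking ties with the second is an assumption.

```python
def overlap_count(ranked_players, stat_ranking, cutoff, excluded=frozenset()):
    """Count how many of my top `cutoff` players are also in the top `cutoff` of a stat.

    `excluded` holds players left out of the survey (for example, someone who
    did not qualify for the stat lists).
    """
    my_top = set(ranked_players[:cutoff]) - set(excluded)
    stat_top = set(stat_ranking[:cutoff])
    return len(my_top & stat_top)


def order_stats(ranked_players, stat_rankings, cutoffs=(30, 20), excluded=frozenset()):
    """Order stats from most to least relevant.

    ASSUMPTION: sort by the count at the first cutoff, then by the count at
    the second cutoff. The original tally may have combined them differently.
    """
    scores = {
        name: tuple(overlap_count(ranked_players, ranking, c, excluded) for c in cutoffs)
        for name, ranking in stat_rankings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up data: six placeholder players, best first, and two placeholder stats.
players = ["P1", "P2", "P3", "P4", "P5", "P6"]
stat_rankings = {
    "stat_x": ["P1", "P3", "P2", "P6", "P5", "P4"],
    "stat_y": ["P6", "P1", "P5", "P2", "P3", "P4"],
}

# Small cutoffs (3 and 2) stand in for the Top 30 and Top 20 of the real study.
for name, counts in order_stats(players, stat_rankings, cutoffs=(3, 2)):
    print(name, counts)
```

With real data, `ranked_players` would be my list of 30 and each entry in `stat_rankings` would be the full LPGA ranking for that stat, best first.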

The finding that most surprised me? Victories ranks only at #9, just behind PPGIR and Total Birdies. After I thought about it, I wasn’t that surprised any more. Let me remind you of what this exercise was for – to find out which statistics are the best indicators of a good (or at least a Top 30) player. Silvia Cavalleri, Young Kim and Meaghan Francella all won a tournament in 2007 while Angela Park, Jeong Jang and Jee Young Lee didn’t – does that make that first group of players better than the second? Of course not. I’m not trying to say “victories aren’t important”; I’m saying that a zero in a player’s Victory column for one season doesn’t mean she isn’t one of the top players. Nobody thought Paula Creamer wasn’t a great player in 2006 when she didn’t win – nobody should think Cavalleri is a great player just because she won this year.

I was a little surprised that GIR only came in fifth and PPGIR eighth, but they are in a group of numbers which came out very close together. The gaps in relevance vary widely: #1-3 are very close, #4-9 are a little way back, then there’s a big gap back to #10 and #11, with #12 and #13 close together at the bottom. What I hope to do with these results is develop a more detailed formula to rate the players, one that could be more accurate and allow me to rank them even further down the scale than 30th place. Each stat that I chose to use would be weighted according to these findings to give me a rating system that could work for even the lesser players.
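To make the weighting idea concrete, here is one possible shape such a formula could take, sketched in Python. This is only an illustration, not the finished rating system: using each stat's Top 30 count directly as its weight, and averaging a player's rank across stats, are both assumptions, and the player ranks in the example are invented. Only the two weights (27 for Money List, 7 for Driving Accuracy) come from the counts quoted earlier.

```python
def weighted_rating(player_ranks, weights):
    """Weighted average of a player's rank across stats (lower is better).

    ASSUMPTION: a stat's weight is its relevance count from the tally, and
    stats the player has no rank for are simply skipped.
    """
    used = [stat for stat in player_ranks if stat in weights]
    total_weight = sum(weights[stat] for stat in used)
    if total_weight == 0:
        raise ValueError("no weighted stats available for this player")
    return sum(weights[stat] * player_ranks[stat] for stat in used) / total_weight


# Weights taken from the Top 30 counts quoted in the post.
weights = {"Money List": 27, "Driving Accuracy": 7}

# Invented ranks for a hypothetical player: 10th in money, 50th in accuracy.
example_ranks = {"Money List": 10, "Driving Accuracy": 50}

print(round(weighted_rating(example_ranks, weights), 2))
```

Because Money List carries almost four times the weight of Driving Accuracy here, the result lands much closer to 10 than to 50, which is the behaviour the findings above argue for.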
