Wednesday, December 2, 2009
Let the forcible re-rankings begin ... and end
Warning: The following post is long, esoteric and inscrutable. Read at your own risk.
As you know from about four previous posts -- check my labels if you're interested in reading them -- I am involved in ranking all the movies I've ever seen using Flickchart (www.flickchart.com). At least, all the movies I've seen that exist in their database.
I've been stopping every 10,000 rankings to take a snapshot of where the list stands, noting the changes in the rankings since the last time. I do this by recording the rankings in an Excel spreadsheet, a painstaking process that consumed numerous free moments over the last eight days. My wife thinks I'm crazy. Maybe I am.
The first time I did it, just after 20,000 rankings, I recorded only the ranking itself, along with the director and year. But at 30,000 and at 40,000, the milestone I just passed, I also recorded the number of duels for each film, as well as the percent won. This made the recording process significantly more tedious -- to get the additional stats, I had to drill down one further level into each movie.
The number of rankings interested me only as an academic exercise. I wanted to take a random selection of 10,000 rankings and see how many times a film would come up randomly for duel in that selection. I guess in my mind this would help test how successfully the site randomized the duels -- and if any of the organizers of Flickchart are reading this, I don't doubt you, I just like this kind of anal number-crunching exercise. (As for the percent won, I recorded it mainly because I was already on the page, so might as well.)
The results validated the success of Flickchart's algorithm. The most-ranked film in those 10,000 was Saw with 24 duels, while only Ang Lee's The Hulk and Joel Schumacher's Bad Company didn't come up once. Based on the 2,565 films I had ranked, every film should come up an average of 7.79 times per 10,000 duels (which feature 20,000 films). Of course, my own ability to transcribe proved plenty fallible. For example, according to my records, Eyes Wide Shut had two fewer rankings at the end of that 10,000 than at the start, which obviously means I recorded its initial total incorrectly. And when I summed the total, I came up with 20,011 films ranked, when it should have been 20,000. That could explain the high total for Saw, which may have had 69 duels at the start of that period, rather than 59.
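For the number-crunchers, here is a minimal Python sketch of that sanity check. It assumes every duel picks two different films uniformly at random and that the pool stays fixed at 2,565 films for the whole window; neither is confirmed by Flickchart, and the function names are made up for illustration.

```python
import math

FILMS = 2565            # films ranked, per the post
DUELS = 10_000          # duels between two snapshots
P_IN_DUEL = 2 / FILMS   # assumed chance a given film fills one of a duel's two slots


def expected_appearances():
    """Average number of duels per film over the window."""
    return DUELS * P_IN_DUEL


def prob_exactly(k):
    """Binomial chance that one film appears in exactly k of the duels."""
    log_p = (
        math.lgamma(DUELS + 1)
        - math.lgamma(k + 1)
        - math.lgamma(DUELS - k + 1)
        + k * math.log(P_IN_DUEL)
        + (DUELS - k) * math.log1p(-P_IN_DUEL)
    )
    return math.exp(log_p)


def expected_films_with(at_least, at_most=DUELS):
    """Expected number of films whose duel count falls in [at_least, at_most]."""
    return FILMS * sum(prob_exactly(k) for k in range(at_least, at_most + 1))


if __name__ == "__main__":
    print("average duels per film:", round(expected_appearances(), 2))
    print("films expected with zero duels:", round(expected_films_with(0, 0), 2))
    print("films expected with 24+ duels:", round(expected_films_with(24), 4))
```

Under these assumptions, roughly one film should go the whole window without a duel, so seeing two is unremarkable. A film reaching 24 duels would be very unusual, which fits the suspicion that Saw's starting total was written down wrong.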
But as I said, this is all academic. I'm more interested in the project of getting the films ranked in the correct order, and thought I had a new method of doing this, which I planned to implement starting at duel #40,001.
You see, my great disappointment from the previous 10,000 rankings was that there wasn't a single change in my top 20. Several weeks ago I bemoaned that Cocoon had made it to #20, and taken root. Because none of the 100-200 movies that would beat it have randomly come up against it, it's still there. And because Cocoon should only come up 7.79 times out of every 10,000 -- it actually came up 7 times this period -- the chances of it dueling against one of those 100-200 are fairly remote. Not to mention that those 7.79 times are only good for this period of 10,000. Next time I'll have more movies, which means even fewer ranking opportunities.
By the same logic, films that deserve to be in the top 20 -- Raising Arizona, for example -- also fight only 7.79 other titles per 10,000. For the sake of simplicity, let's round that up to eight. Eight divided by 2,564 -- you have to subtract Raising Arizona because it can't duel itself -- is .003, or less than 1/3 of 1%. So on average, Raising Arizona will duel only .3% of all my movies over the course of 10,000 rankings. Since the top 20 movies represent two-and-a-half times those eight titles, Raising Arizona has about a .75% chance of even coming up against any movie in the top 20. Then it actually has to beat that movie in order to move up, and perhaps it would only beat half the movies in my top 20. Which gives Raising Arizona roughly a .38% chance of actually moving into my top 20 during the course of 10,000 rankings. This math may be a bit fuzzy, and the films presented for Raising Arizona to duel are probably not entirely random, but you get the idea.
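Since the math above is admittedly fuzzy, here is a short Python sketch of one way to tighten it. It assumes each opponent is drawn uniformly at random from the other 2,564 films and that the duels are independent, which the post itself doubts; the function names and the 50% win probability are illustrative assumptions.

```python
FILMS = 2565          # films ranked, per the post
DUELS_PER_FILM = 8    # the post's rounded figure per 10,000 rankings
TOP = 20              # size of the top tier being targeted


def chance_of_meeting_top(films=FILMS, top=TOP, duels=DUELS_PER_FILM):
    """Chance that at least one of the film's duels is against a top-tier film."""
    per_duel = top / (films - 1)          # the film cannot duel itself
    return 1 - (1 - per_duel) ** duels


def chance_of_beating_top(win_prob, films=FILMS, top=TOP, duels=DUELS_PER_FILM):
    """Chance that at least one duel is against a top-tier film AND is won."""
    per_duel = win_prob * top / (films - 1)
    return 1 - (1 - per_duel) ** duels


if __name__ == "__main__":
    print("meets a top-20 film at least once:", chance_of_meeting_top())
    print("beats a top-20 film at least once:", chance_of_beating_top(0.5))
```

Under that uniform assumption, the chance of drawing a top-20 opponent is about .78% for each single duel, so across eight duels the odds come out several times higher than the back-of-envelope figure. They are still low enough that the point stands: waiting for the right duel to come up at random takes a long time.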
The solution? Make the duels less random.
That's right, Flickchart gives you the ability to forcibly re-rank any film at any time. You simply go to the movie's home page, click the link that says Re-Rank This Movie, and boom -- you are presented with three new duels, each of which features the movie in question. Once you've selected the winner in the first duel, you get the second duel, and so forth. In fact, I'm sure this feature exists precisely to address the Raising Arizona dilemma described above.
This is probably as good a time as any to tell you about a second concern I've had. Namely, when my friend Don and I originally discovered the site, we didn't know how to get more titles eligible for ranking than the original 350 or so presented to us. It took us about 8,000 rankings before we figured it out. That means that about 350 films have significantly more duels than the rest of them, which means they have had significantly more opportunity to entrench themselves in the rankings. Only three films in my top 20 -- Glengarry Glen Ross, Poltergeist and Cocoon -- are from outside that original list of 350, having won the lottery that Raising Arizona could not. Five of my bottom six films are also from that original 350. The more films there are -- and there are more and more all the time -- the fewer opportunities any of them has to move. The main difference is, those 350 films had an advantage at one point, and they used it to build a home in the top 20.
The solution, part II? Forcibly re-rank all the films that were not part of the original 350, until they have reached approximately the same number of duels as those 350. To figure out what that benchmark should be, I took the average number of duels of those 350 films, which came out to 67, and decided that any film that did not have at least 67 duels would be re-ranked until it did.
Here was my logic: Although the films with more than 67 duels would still have more, and would inevitably increase their own totals as part of the re-ranking process, they would ultimately gain little ground relative to the films that were being re-ranked dozens of times. And once all the films had reached that threshold, tens of thousands of rankings from now, the advantage of the original 350 would be nullified. Only then would the playing field be leveled, and only then would I truly be able to tell which films belonged in the hallowed top 20. If I went alphabetically, it would take quite some time for Raising Arizona to come up, but eventually, it too would have its day in the sun. And by doing all the films rather than just select ones, I'd be preserving the element of randomness/fairness that I consider to be a defining and indispensable part of the Flickchart process.
So yesterday after work, I sped through the last 600 titles of my snapshot in a flurry of blurry vision, alphabetized my titles, and got started. The numbered titles came up first.
Robert Luketic's 21 had plenty of rankings, so the first one up for re-ranking was Mark Christopher's 54. And here was where Don, a veteran of the forcible re-ranking system, had warned me I would run into trouble. He described a fluke in Flickchart whereby you'd get the same titles over and over if you tried multiple re-rankings in a row. As each session of re-ranking involves three duels, and as 54 needed 43 more duels to reach 67, that meant I'd need either 14 or 15 consecutive sessions of re-ranking to reach the threshold. (I apologize if all these numbers are killing your brain.)
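The bookkeeping behind this plan (take the average of the original films, find everything under it, count the sessions) can be sketched in a few lines of Python. The titles "Original A" through "Original C" and their counts are hypothetical stand-ins for the real spreadsheet, and 21's count is invented; only 54's count of 24 duels (67 minus the 43 it needed) and the three-duels-per-session rule come from the post.

```python
import math
from statistics import mean

DUELS_PER_SESSION = 3  # each re-rank session serves three duels, per the post


def benchmark(duel_counts, original_titles):
    """Average duel count of the original films, rounded to a whole number."""
    return round(mean(duel_counts[title] for title in original_titles))


def rerank_plan(duel_counts, threshold):
    """Map each under-threshold title to (missing duels, full sessions needed)."""
    plan = {}
    for title, duels in sorted(duel_counts.items()):
        missing = threshold - duels
        if missing > 0:
            plan[title] = (missing, math.ceil(missing / DUELS_PER_SESSION))
    return plan


# Hypothetical sample data, apart from 54's count.
counts = {"Original A": 70, "Original B": 64, "Original C": 67, "21": 80, "54": 24}
originals = {"Original A", "Original B", "Original C"}

if __name__ == "__main__":
    target = benchmark(counts, originals)
    for title, (missing, sessions) in rerank_plan(counts, target).items():
        print(f"{title}: {missing} more duels, {sessions} sessions")
```

Rounding up is what separates 14 sessions from 15: fourteen full sessions supply 42 duels, one short of the 43 needed, so the threshold is only reached partway through a fifteenth.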
At first, it seemed I was having the same experience Don had. 54 came up against Basquiat, lost to it, and then won two lesser duels. When I refreshed the page and submitted it for re-ranking again, Basquiat was again its first opponent. The cracks in my seemingly perfect system became clear. But on the third time through, Basquiat was not the first opponent, nor any opponent at all. In the dozen or so rounds that followed, it came up only once more. I decided I could live with that.
300 and 1408 each had enough duels, so next up for ranking was 2012. And here a different kind of disturbing thing happened. As I drove it upwards from four times ranked to 67, it jumped in the rankings from #1495 to #389. Now, I already feel a bit of shame for liking this movie more than most people whose opinions of film I trust. Having it land in my top 400 of all time was just too much. Nonetheless, I felt satisfied that the process had worked correctly. Flickchart had spoken, with me as its medium, and who was I to contest the results?
So I moved onward: (500) Days of Summer, 10,000 B.C., 10 Items or Less, 101 Dalmatians. It wasn't until after I'd finished 13 Conversations About One Thing this morning that I found a problem that did stop me in my tracks. And it related to something I'd only done originally as a curiosity.
Namely, the winning percentage of these films was getting artificially inflated.
You see, the way the re-ranking works, the film does not get put up against totally random opponents. Flickchart assumes that if you want to re-rank a film, it's because you are unsatisfied with its position in the standings. Therefore, it uses a goal-oriented approach to the three duels it presents. From what I can gather, the first is a film that is higher than it in the standings -- not significantly higher, but higher. If your film beats that film, you get another film that is again higher in the standings. If it wins there, you get a third higher film. Either it beats that film and lands one ahead of it, or it loses and stays one ahead of the film in the second duel.
But if the film loses its first duel, there is no reason to make the second duel against a film that is even higher. It's the transitive property: If Film A is better than Film B, and Film B is better than Film C, then there is no way that Film C is better than Film A. So when 54 lost to Basquiat, it got to fight something much easier. And it probably beat that film, and may have beaten the next one as well. (I may be anal, but I can't remember every little detail.) Every time this happens, 54 loses to one good film and then either beats or loses to two films that are not as good. End result: 54 is "randomly" dueling against a majority of not very good films.
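A toy simulation in Python shows how this kind of opponent selection can skew a winning percentage. It is only a guess at the mechanics described above, not Flickchart's actual algorithm: the rule for picking opponents, the `max_gap` parameter, and both function names are assumptions made for illustration. Opponents are indexed from 0 (worst) upward, and the film is assumed to beat exactly the opponents with an index below `beats`.

```python
import random


def random_win_rate(beats, n_others, n_duels, rng):
    """Win rate when every opponent is drawn uniformly at random."""
    wins = sum(rng.randrange(n_others) < beats for _ in range(n_duels))
    return wins / n_duels


def rerank_win_rate(beats, n_others, n_sessions, rng, start_pos=None, max_gap=200):
    """Win rate under a guessed re-rank rule: three duels per session, the first
    against a film somewhat above the current chart position, then a harder
    opponent after each win and an easier one after each loss."""
    pos = beats if start_pos is None else start_pos  # films currently ranked below
    wins = duels = 0
    for _ in range(n_sessions):
        opponent = min(n_others - 1, pos + rng.randint(0, max_gap))
        for _ in range(3):
            won = opponent < beats
            wins += won
            duels += 1
            if won:
                pos = max(pos, opponent + 1)   # land just ahead of the beaten film
                opponent = min(n_others - 1, pos + rng.randint(0, max_gap))
            else:
                pos = min(pos, opponent)       # stay at or below the film it lost to
                opponent = max(0, opponent - rng.randint(1, max_gap))
    return wins / duels


if __name__ == "__main__":
    rng = random.Random(2009)
    n_others, beats = 2564, 436   # a film that beats roughly the bottom 17%
    print("random duels: ", random_win_rate(beats, n_others, 30_000, rng))
    print("re-rank duels:", rerank_win_rate(beats, n_others, 10_000, rng))
```

In this toy model a weak film wins a noticeably larger share of its re-rank duels than of its random duels, because the opponents cluster around its own level instead of spanning the whole chart. The same clustering would drag a strong film's percentage down, so a win rate built from forced duels says more about how the opponents were chosen than about the film.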
And the new stats didn't lie. After 30,000 rankings, 54 had won 18.75% of its duels. After 40,000, it was down to 16.67%. But after 40,043 rankings -- the last 43 of which were duels in which it competed -- 54 had gone all the way up to a 39.39% winning percentage, not only reversing the trend, but more than doubling its peak winning percentage. Reason? Duels against crappy films.
But why should the winning percentage matter, right? It's "all academic," right?
I may have thought so, but now I don't. And it goes back to the randomness/fairness that I think is essential to Flickchart. There was something pure about the 16.67% victory rate of 54, a movie that is probably in the bottom 20% of movies I've seen. It's a victory rate that could only have seemed so perfectly accurate because it was up against a random assortment of movies, rather than a purpose-driven selection chosen to help re-rank it.
And suddenly I felt that my forcible re-rankings of about ten films had somehow diluted the purity of this wonderful process. For this same reason, I'm especially reluctant to just cherry-pick the films I'm rooting for. Forcibly re-ranking only select films would blacken Flickchart's eye even more.
As much as I would like to steer this resource in a particular direction, and as much as I would like to see Raising Arizona carve out its rightful place in the top 20 sooner rather than later, I now think that I may just have to be patient. It may not be a perfect ranking system, but it's the best one I've got, and even describing it in this way is vastly underselling it. More correctly, the very existence of Flickchart is a revelation for a list-obsessed film lover like me. It's just that patience is a hard virtue for someone who's obsessed.
I do still have the nagging desire to level the playing field. I do still regret the inequity between those original 350 films and all the ones that have followed. But the only way I can think of to truly make things equal would be to start over again. To wipe all my Flickchart rankings, and to enter all the titles at the same time. To make them duel with equal footing from the start. But even this would only be a temporary solution. With each new film I see, and each new film I add, a gulf is created between the films that have been ranked more, and those that have been ranked less.
Besides, to consider starting over, after all this work ... well, that might just put an obsessive like me in his grave.
All hail Flickchart. Thy will be done.