Analyzing EPH

By Bruce Schneier: Jameson Quinn and I analyzed the E Pluribus Hugo (EPH) voting system, proposed as a replacement for the current Approval Voting system for the Hugo nominations ballot. (This is an academic paper; the Hugo administrators will be publishing their own analysis, more targeted to the WSFS Business Meeting, in the coming weeks.) We analyzed EPH with both actual and simulated voting data, and this is what we found.

If EPH had been used last year in the 2015 Hugo nominations process, then…

The number of slate nominees would have been reduced by 1 in 6 categories, and by 2 in 2 categories, leaving no category without at least one non-slate nominee.

That doesn’t seem like very much. A reasonable question is why it doesn’t reduce the number more. The answer is simply that the slate was powerful last year.

The data demonstrates the power of the Puppies. The category Best Novelette provides a good example. This category had 1044 voters, distributed over 149 different works with 3 or more votes. Of these voters, around 300 (29%) voted for more Puppy-slate works than non-Puppy ones, and about half of those (14%) voted for only Puppy-slate works. These numbers are roughly typical of the other categories. Among the other 71% of the ballots, under 3% included votes for any Puppy work (this is relatively low, but not anomalously so, compared to other categories).

Despite being a majority, the non-Puppy voters spread their votes more thinly; only 24% of them voted for any of the top 5 non-Puppy works. This meant that 4 of the 5 nominees would have been from the Puppy slate under SDV-LPE or SDV.

(SDV-LPE stands for “Single Divisible Vote – Least Popular Elimination,” the academic name for this voting system. SDV is “Single Divisible Vote,” a long-standing and well-understood voting system.)

To further explore this, we took the actual 2014 Hugo nominations data from Loncon 3 and created a fake slate, then analyzed how it affected the outcome at different percentages of the vote totals:

In Figure 1, we assume perfectly correlated bloc voters. They vote in lockstep (with minimal exceptions to prevent ties), and their five nominations are completely disjoint from the other nominations. As you can see, both SDV-LPE and SDV reduce the power of the bloc voters considerably. Under AV, the voting bloc reliably nominates 3 candidates when they make up 10.5% of the voters, 4 candidates when they make up 12.5%, and 5 when they make up 19%. Under SDV-LPE, they need to be 26% of voters to reliably nominate 3 candidates, 36.5% to reliably nominate 4, and 54% to reliably nominate 5….
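To make the mechanics concrete, here is a toy sketch in Python comparing a plain approval count with an SDV-LPE-style tally on an invented ten-ballot election. The post does not spell out the elimination rule, so the one used here (compare the two works with the lowest point totals and drop the one that appears on fewer ballots) is an assumption, as are the alphabetical tie-breaks, the function names, and every ballot. It is not the code used for the paper.

```python
from collections import Counter
from fractions import Fraction


def approval_top(ballots, seats):
    """Plain approval count: the works named on the most ballots win."""
    counts = Counter(w for b in ballots for w in set(b))
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return set(ranked[:seats])


def sdv_lpe(ballots, seats):
    """Simplified SDV-LPE sketch (elimination rule is an assumption).

    Each ballot carries one point, split equally among its works that
    are still in the running. Each round, the two works with the lowest
    point totals are compared and the one on fewer ballots is dropped.
    Ties are broken alphabetically, which is a simplification.
    """
    remaining = {w for b in ballots for w in b}
    while len(remaining) > seats:
        points = {w: Fraction(0) for w in remaining}
        noms = {w: 0 for w in remaining}
        for b in ballots:
            live = [w for w in set(b) if w in remaining]
            for w in live:
                points[w] += Fraction(1, len(live))
                noms[w] += 1
        lowest_two = sorted(remaining, key=lambda w: (points[w], w))[:2]
        loser = min(lowest_two, key=lambda w: (noms[w], points[w], w))
        remaining.remove(loser)
    return remaining


# Invented data: four lockstep bloc ballots, six dispersed ballots.
ballots = [("S1", "S2", "S3")] * 4 + [
    ("A", "B"), ("A", "C"), ("B", "D"), ("C", "E"), ("A",), ("D",),
]

print(sorted(approval_top(ballots, 3)))
print(sorted(sdv_lpe(ballots, 3)))
```

In this made-up election the four lockstep ballots take all three slots under the approval count, but only two under the divided-vote tally.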

Figure 2 simulates a more realistic voting bloc. We sample the actual behavior of the bloc voters in the 2015 Hugo nominations election, and add them to the actual 2014 nominations data. For the purposes of this simulation, we define bloc voters as people who voted for more Puppy candidates than non-Puppy candidates. In this case, the actual bloc voters did not vote in lockstep: some voted for a few members of the slate, and some combined slate nominations with non-slate nominations. For the purposes of the simulation, when they voted for the nth most popular non-Puppy candidate in 2015, we imputed that into a vote for the nth most popular non-Puppy candidate in 2014. In this case, SDV-LPE and SDV reduce the power of those voting blocs even further. Under AV, the voting bloc reliably nominates 3 candidates with 14% of the voters, 4 candidates with 17% of the voters, and 5 with 39%. Under SDV-LPE, they need to make up 27.5% to nominate 3 candidates, 38% to nominate 4, and 69.5% to nominate 5….
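The rank-matching step described above can be sketched in a few lines of Python. Everything here is hypothetical: the function name, the placeholder titles, and the assumption that slate picks are carried over unchanged while only non-slate picks are remapped by popularity rank. The paper's actual procedure may differ in detail.

```python
def impute_ballot(ballot, slate, ranked_2015, ranked_2014):
    """Map one 2015 bloc ballot onto the 2014 field.

    ranked_2015 / ranked_2014 list the non-slate works of each year
    from most to least popular. A vote for the nth most popular
    non-slate work of 2015 becomes a vote for the nth most popular
    non-slate work of 2014. Slate picks are kept as they are
    (an assumption made for this sketch).
    """
    rank_of = {work: i for i, work in enumerate(ranked_2015)}
    imputed = []
    for work in ballot:
        if work in slate:
            imputed.append(work)
        else:
            i = rank_of[work]
            if i < len(ranked_2014):  # skip ranks with no 2014 counterpart
                imputed.append(ranked_2014[i])
    return imputed


# Invented placeholder titles, not real nominees.
slate = {"P1", "P2"}
ranked_2015 = ["X", "Y", "Z"]
ranked_2014 = ["a", "b", "c"]
print(impute_ballot(["P1", "X", "Z"], slate, ranked_2015, ranked_2014))
```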

The upshot of all this is that EPH cannot save the Hugos from slate voting. It reduces the power of slates by about one candidate. To reduce the power of slates further, it needs to be augmented with increased voting by non-slate voters.

There is one further change in the voting system that we could make, and we discuss it in the paper. This is a modification of EPH, but would — for the slate percentages we’ve been seeing — reduce their power by about one additional candidate. So if a slate would get 5 candidates under the current system and 4 under SDV-LPE (aka EPH), it would get 3 under what we’ve called SDV-LPE-SL. Yes, we know it’s another change that would require another vote and another year to ratify. Yes, we know we should have proposed this last year. But we had to work with the actual data before optimizing that particular parameter.

Basically, we use a system of weighting divisible votes named after the French mathematician André Sainte-Laguë, who introduced it in France in 1910. In EPH, your single vote is divided among the surviving nominees. So if you have two nominees who have not yet been eliminated, each gets half of your vote. If three of your nominees have not yet been eliminated, each gets 1/3 of your vote. And so on. The Sainte-Laguë system has larger divisors. If you have two nominees who have not yet been eliminated, each gets 1/3 of your vote. If three of your nominees have not yet been eliminated, each gets 1/5 of your vote. Each of four gets 1/7; each of five gets 1/9. This may sound arbitrary, but there’s well over a hundred years of voting theory supporting these weights and the results are still proportional.
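The two weighting rules can be written as a one-line divisor each. This is only an illustrative sketch: the function name and the rule labels are made up, and the one-survivor case under Sainte-Laguë (divisor 1) is extrapolated from the 1, 3, 5, 7, 9 sequence rather than stated in the post.

```python
from fractions import Fraction


def share(surviving, rule="eph"):
    """Fraction of one ballot that each surviving nominee receives.

    rule="eph": divisor is the number of surviving nominees (1, 2, 3, ...).
    rule="sl":  Sainte-Lague divisors (1, 3, 5, 7, 9, ...).
    """
    if surviving < 1:
        raise ValueError("a ballot needs at least one surviving nominee")
    if rule == "eph":
        divisor = surviving
    elif rule == "sl":
        divisor = 2 * surviving - 1
    else:
        raise ValueError("rule must be 'eph' or 'sl'")
    return Fraction(1, divisor)


for n in range(1, 6):
    print(n, share(n, "eph"), share(n, "sl"))
```

One way to read the difference: a ballot with all five works still alive contributes 5 × 1/5 = 1 point in total under EPH, but only 5 × 1/9 = 5/9 under the Sainte-Laguë weights, so ballots that keep many works alive together count for less per work.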

Implementing SDV-LPE-SL using the actual 2015 Hugo data:

SDV-LPE-SL comes even closer to giving slate voters a proportional share, with 7 fewer slate nominees overall, and only 1 category without a choice between at least 2 non-slate nominees.

For the perfectly correlated voting bloc simulation:

Under SDV-LPE, they need to be 26% of voters to reliably nominate 3 candidates, 36.5% to reliably nominate 4, and 54% to reliably nominate 5. Under SDV-LPE-SL, they need to be 35% for 3, 49% for 4, and 66% for 5.

And for the more realistic voting bloc simulation:

Under SDV-LPE-SL, they need 36% for 3, 49% for 4, and over 70% for 5.

That’s a big difference.

Here’s our paper. It’s academic, so it refers to the voting system by its academic name. It spends a lot of time discussing the motivation behind the new voting system, and puts it in context with other voting systems. Then it describes and analyzes both SDV-LPE and SDV-LPE-SL.

MAC II Statement on Data Release for EPH Testing

The Sasquan and MidAmeriCon II committees responded to File 770’s query about the transfer of anonymized raw 2015 Hugo nominating ballot data for use in testing the proposed E Pluribus Hugo vote tallying method.

Linda Deneroff, Sasquan’s WSFS (World Science Fiction Society) Division Head, wrote:

Sasquan passed its nominating data to MidAmeriCon II for analysis in the EPH process. Neither Glenn [Glazer], John [Lorentz], Ruth [Lorentz] nor I were involved in the analysis.

Tammy Coxen, MidAmeriCon II WSFS Division Head, explained what was done with the data:

After EPH passed at Sasquan, the MidAmeriCon II Hugo Administration team publicly committed to testing the system so that real data about its efficacy could be made available to WSFS members before the business meeting where ratification would take place. As part of that testing, MidAmeriCon II was collaborating with two researchers (Bruce Schneier and Jameson Quinn) in evaluating the system. As previously announced, it was determined that the data was unable to be sufficiently anonymized for a general release, so the researchers were provided data under a non-disclosure agreement.

There was to have been a coordinated release of the research findings between MidAmeriCon II and the researchers, which would have made clear the circumstances under which the data had been shared. Planning was already underway regarding that release, but as noted, analysis is still occurring. Our intention is to jointly share the research findings when they are complete, which will be well in advance of the business meeting at MidAmeriCon II.

The previously announced concerns that Coxen refers to were discussed here in a September 2015 post, “Hitch in Sasquan Nominating Data Turnover”.

E Pluribus Hugo Tested With Anonymized 2015 Data

By Jameson Quinn: [Originally left as a comment.] So, Bruce Schneier and I are working on an academic paper about the E Pluribus Hugo (EPH) proposed voting system. We’ve been given a data set of anonymized votes from 2015. I don’t want to give all the results away but here are a few, now that people are actually voting for this year’s Hugos:

  • A typical category had around 300 ballots which voted for more puppies than non-puppies, and about half of those ballots were for puppies exclusively. There were few ballots which voted for half or fewer puppies (typically only a few dozen). The average number of works per ballot per category was around 3.
  • There were some weak correlations among non-puppies, but nothing that remotely rivals the puppies’ coherence. In particular, correlations were low enough that even if voting patterns remained basically dispersed, raising the average works per ballot per category from 3 to 4 (33% more votes total) would probably have been as powerful in terms of promoting diverse finalists (that is, not all puppies) as adding over 25% more voters. In other words: if you want things you vote for to be finalists, vote for more things — vote for all the things you think may be worthy.
  • EPH would have resulted in 10 more non-puppy finalists overall; at least 1 non-puppy in each category (before accounting for eligibility and withdrawals).
  • SDV(*) would have resulted in 13 more non-puppy finalists overall.
  • Most other proportional systems would probably have resulted in 13 or 14 more.
  • The above numbers are based on assuming the same ballot set; that is, that voters would not have reacted to the different voting system by strategizing. If strategizing is not used unless it is likely to be rational, that is a pretty safe assumption with EPH; less so with other proportional systems. Thus, other systems could in theory actually lead to fewer non-puppy nominees / less diversity than EPH.

Feel free to promote this to a front page post if you want. Disclaimer: EPH is not intended to shut the puppies out, but merely to help ensure that the diversity of the nominees better reflects the diversity of taste of the voters.

(*) Editor’s note: I believe SDV refers to Single Divisible Vote.

Update 02/08/2016: Added to end of second bullet missing phrase, supplied by author. Corrected footnote, based on author’s comment.

Pixel Scroll 7/19

(1) Jim Davis, who was on the set while they shot the second episode of Star Trek: The Next Generation, recalls “Patrick Stewart’s trailer still had a handwritten sign on it (by him) that said ‘Unknown British Shakespearean Actor.’”

(2) The Catcher In The Rye bar in LA gives its drinks literary names. Here is a sample of what the menu has to offer.

THE RAVEN

Absinthe, Benedictine, Dry Vermouth, Orange

“Once upon a midnight dreary, while I pondered, weak and weary…” -Edgar Allan Poe

CLOCKWORK ORANGE

Templeton Rye, Sweet Vermouth, Aperol, Burnt Orange, Orange Bitters

“But what I do, I do because I like to do.” -Anthony Burgess

SLEEPY HOLLOW

Mount Gay Rum, Bitters, Simple Syrup, Pressed Apple Juice

“Don’t you ever go laughing at the Headless Horseman” -Washington Irving

But no Bradbury reference? That seems out of character for a book-themed enterprise based in his home town.

Compare this to the Literati Cafe which made the papers a few years ago by serving a cocktail named the Fahrenheit 451.

(3) Nicholas Whyte has updated his survey 2015 Hugo Awards: how some more bloggers are voting.

(4) Patrick May tested EPH with the 1984 Hugo data (scroll down to comment #299). I still got two Hugo nominations. What more do I need to know?

(5) I may have forgotten to mention that Sarah A. Hoyt and the Mad Genius club don’t write about Puppies most of the time. Dare I say that I usually enjoy the expositions about professional writing?

Consider Hoyt’s “Selling Books To Real People”:

This post has been prompted by my friend Amanda Green’s post on Amazon.  To whit, by the implication that Amazon killed Borders that others have flung up.

This is a touchy subject, because although I was informed that nice ladies don’t discuss politics, religion or coitus in public, I’ve found that the touchier subject is money: making it, keeping it, wanting it….

Did Amazon kill Borders?  Well, only if you look at it as assisted suicide.

Borders grew and became very big by having a system.  The system was ordering to the net.  They ordered only proven sellers.  The way they did this was by looking in the computer at the author’s name, and seeing how many of his hers or its (must be post binary) book they had sold.  Then they ordered just that number.

This system worked magnificently while Borders was a small bookstore, in a small town, and before the publishers tumbled onto it.  Two things Borders didn’t take into account: the variety of regional tastes and the corruption inherently possible in the system….

And this Mad Genius Club report, “How to work with artists,” based on the advice given to self-publishers by Sam Flegal, Libertycon artist GoH:

Just as we frequently say here that “It’s all in the contract!” and “You are not selling your book, you are licensing intellectual property!” Guess what? When dealing with artists, it’s all in the contract. And when you talk to them about using an image for a book cover, you’re not buying the work, you’re licensing intellectual property. Yep, that’s right: they’re just as concerned about licensing and IP rights when they talk to you as when you talk to a publisher… because in this case, they’re the IP creator, and you’re the publisher!

The shoe is now on the other foot. So, what terms should you offer the artist?

(6) Can it be that the makers of Sky Captain and the World of Tomorrow, pioneers of the “digital backlot” that became the model for producing superhero summer blockbusters, lost so much money they never got to reap the benefits of their own system?

But a few great reviews don’t make a difference if your numbers are bad, and Sky Captain’s were very bad. Cinemagoers, perhaps put off by its black and white visuals or comic-strip tone, stayed away: the film made just $15.5 million on its opening weekend. This would have been fantastic if the film had used the tiny budget for which the brothers had originally asked, but the reported cost of $70 million made its eventual worldwide takings of $58 million a catastrophe….

Kevin Conran has worked in the art department on films including Bee Movie and Monsters Vs. Aliens, and as a production designer on Dreamworks’ Dragons, a TV spin-off of the hit movie How To Train Your Dragon. As he muses on where the Sky Captain experience has led him, he says, “I think sometimes that there’s a world where we might have made this thing for $3-4 million and there would be a whole different story to tell.”

Kevin would never say this himself, but the Conrans’s contribution to cinema is huge. “You can absolutely draw a line from Sky Captain to the look and feel of many of the big blockbusters we see today,” says Ian Freer. “Its use of a digital backlot is now the dominant M.O. for production design. Films like 300, Sin City, Avatar and Alice In Wonderland have all created worlds built on the ideas put down by Conran.”

As much as the big budget movies have taken the techniques the Conrans developed, still very few people have really done what they set out to do: eradicate the need for giant budgets on fantasy films. Their plan was not to make things better for James Cameron or George Lucas, it was to give opportunity to the guys nobody had heard of – guys like them – and to have moviemaking be restricted only by your imagination not your bank balance.

“Conran crystallised the idea of the one man film studio, taken up by the likes of Robert Rodriguez and Gareth Edwards (director of Monsters and later Godzilla),” continues Freer. “But there are other ways in which Conran was ahead of his time. Sky Captain is a film built entirely on nerd love by a nerd director. With its in-jokes, old school visuals and pastiche of old genres, Sky Captain is the ultimate ‘geekgasm’ years before the word was invented.”

(7) Prediction: the Scooby-Doo & KISS: Rock & Roll Mystery will never be on Kyra’s bracket.
