E Pluribus Hugo Tested With Anonymized 2015 Data

By Jameson Quinn: [Originally left as a comment.] So, Bruce Schneier and I are working on an academic paper about the E Pluribus Hugo (EPH) proposed voting system. We’ve been given a data set of anonymized votes from 2015. I don’t want to give all the results away but here are a few, now that people are actually voting for this year’s Hugos:

  • A typical category had around 300 ballots which voted for more puppies than non-puppies, and about half of those ballots were for puppies exclusively. There were few ballots which voted for half or fewer puppies (typically only a few dozen). The average number of works per ballot per category was around 3.
  • There were some weak correlations among non-puppies, but nothing that remotely rivals the puppies’ coherence. In particular, correlations were low enough that even if voting patterns remained basically dispersed, raising the average works per ballot per category from 3 to 4 (33% more votes total) would probably have been as powerful in terms of promoting diverse finalists (that is, not all puppies) as adding over 25% more voters. In other words: if you want things you vote for to be finalists, vote for more things — vote for all the things you think may be worthy.
  • EPH would have resulted in 10 more non-puppy finalists overall; at least 1 non-puppy in each category (before accounting for eligibility and withdrawals).
  • SDV(*) would have resulted in 13 more non-puppy finalists overall.
  • Most other proportional systems would probably have resulted in 13 or 14 more.
  • The above numbers are based on assuming the same ballot set; that is, that voters would not have reacted to the different voting system by strategizing. If strategizing is not used unless it is likely to be rational, that is a pretty safe assumption with EPH; less so with other proportional systems. Thus, other systems could in theory actually lead to fewer non-puppy nominees / less diversity than EPH.

Feel free to promote this to a front page post if you want. Disclaimer: EPH is not intended to shut the puppies out, but merely to help ensure that the diversity of the nominees better reflects the diversity of taste of the voters.

(*) Editor’s note: I believe SDV refers to Single Divisible Vote.

Update 02/08/2016: Added to end of second bullet missing phrase, supplied by author. Corrected footnote, based on author’s comment.



407 thoughts on “E Pluribus Hugo Tested With Anonymized 2015 Data”

  1. As one of the sponsors of 4 and 6, I intend to propose an amendment if EPH passes, changing it to 5 and 6.

    That’s testable with real world data. Which counters the admins’ argument that there is no point releasing the data to the 4/6 proposers. (I don’t support handing out the data at all, but now that it has been given to one faction and politicized, perhaps something must be done to restore a degree of neutrality.)

  2. I have just read through this thread, and I note that there is at least one direct request for my comment, and also several issues on which I could offer useful comments without referring to anything I’ve learned from looking at the data I have. However, I will opt not to say anything further in this thread relating to EPH or voting systems in general. The only reason I’m posting here is now to say that I believe that most of these questions will be answered once the full analysis is shared; that is, before MACII.

  3. While I agree that it would be useful to see the results of combining EPH with 5 and 6, using the actual 2015 data, I am happy for Jameson’s team to carry out that analysis. I don’t see any benefit from having a different group model that.

    I’m also not in a hurry to see the results of that analysis, as long as it comes out far enough in advance of the Business Meeting for people to think it over.

  4. @Jameson, Since apparently I’m not the only one who proposed the unique ballot filter, I hope that your report includes the results of that combined with FPTP and EPH. The most compelling argument against the counter-intuitive nature of the method is its effectiveness. Unfortunately that may be an artifact of the ’84 data’s innumerable spelling errors and the limitations of artificial slates. Without having real world effectiveness to point to, it would be impossible to argue for effectively.

    Regardless, thanks for your work on this and I hope the haters don’t get you down.

  5. I don’t feel No Award is a defence against slating. I found the “victory” of No Award to be a great defeat. The purpose of No Award, in my view, is for fans to express that “this year, no work published (nominated or not) was worthy of the award.” I also feel that when there was one non-slated work and 4 slated, the automatic win for the non-slated was also a failure, and there will forever be an asterisk in people’s minds around those Hugos.

    You could view write in as “5 and 20” but as I have proposed it, it is not. I propose a slight alteration of the system so that the 20 are sorted. Today the 5 are unsorted. I do think unsorted is superior to sorted in terms of fairness, but I think the difference is minor enough to be a far lesser loss than what we have now. Under EPH we will see slates get 3 to 4 of the 5 nominees in the categories they care about. EPH 4/6 switches them to 3 of the 6 in most cases and is thus superior.

    If you want to think of write-in as 5/20, if they are sorted it turns into “5/however many the voter wants to consider.” And in particular, if the voter believes some works high among the top 20 were slated, I think the typical and rational voter turns this into their own “5/the-real-top-5” — which is precisely what we are looking for, it is the system we had before.

    If you really think that sorting the 20 hurts the award more than having 3-4 puppies out of the 5 or 6 nominees, I must admit I am surprised, but would like to hear reasoning. There are a few options one could apply to soften this, which is to say “the top 5 are listed unsorted, the bottom 15 are sorted.” This conveys to voters which works would have made the final ballot under the old system but no other information, and lets the voter know which would have been the “almost made.”

    Expressed as a 5/20 system (which I am happy to do to defuse those who feel that it’s somehow a logistic “nightmare” to count write-in)

    The number 20 is arbitrary. The real goal is, “Enough so that fans can judge for themselves who the real top 5 are.” That number is probably 9, but I am loath to code current understanding into a set of rules which take 2 years to change, which is why I like leaving it to the democratic action of the fans and depending on their intelligence.

    On the other hand, there is some math we can do. If a slate has half the nominators, for example, we can calculate how many of the top slots they might win, add 5 to it, and set that as our maximum number. That’s because if they get more than half, I concede them victory, and they will also select the Hugo winner too.

    Slightly more controversial could be to allow the Worldcon itself every year to set a dynamic number “between 5 and 12” and to set it on a per-category basis. I know many don’t like the committee exercising any human judgment, even though in the real world you only fight human attackers with human judgment. In particular, I like systems that, because they are dynamic, convince the attackers “you can’t ever win,” because that makes most attackers go away.

    EPH of course has other flaws: aside from its complexity, it slightly penalizes groups of works which have a natural fandom. This was discussed at length, and I do agree it’s minor (as are the flaws of providing a sorted list).

    I would state that in my mind the goal is, “To allow voters to select the Hugo from the set of 5 nominees that would have been produced by a fair nomination process based on the independent (no collusion) opinions of interested fans.”

    The secondary goal is “to make it clear to those who would attempt to game the system that they will fail, and thus make it less likely they will even try.”

    Do people feel these are not the right goals?

  6. @Nigel:

    They want to burn the Hugos down and make SJWs heads explode. Any good faith about honouring good SFF they love has yet to be earned.

    Yeah. First off, I specified Puppy leaders, who are professionals. The Puppy movements are writers’ revolts, not fan revolts, just as Delany predicted would happen in his “Racism and Science Fiction” essay. And considering that Vox Day’s RP1 announcement specifically advanced the goal of denying acclaim – and money – to “SJW” authors, and that John C. Wright’s announcement of the RP campaign led with an approving quote and link to that very argument, and that Larry Correia kicked this all off by trying to get himself an award, slagging off the writers and works that were winning in the process, the claim that Puppydom is somehow about fans honoring what they love is a vicious fucking lie.

  7. Brad Templeton: The purpose of No Award, in my view, is for fans to express that “this year, no work published (nominated or not) was worthy of the award.”

    I’m sure everyone votes No Award in accord with their own idea of what it means, however, I can’t adopt yours because in 2015 the nominating system wasn’t allowed to function in a way that I could reasonably think the finalists in some of these categories represented the best in the field. There were some categories where the nominees did not even reflect what the slaters considered best, merely having been listed for reasons of friendship or professional advancement.

  8. My personal take is that it would last about 20 seconds before the vast majority of attendees send it off to permanent la-la land.

    At the Business Meeting, nothing takes a mere 20 seconds.

    I think there’s a Rule of Order against it.

  9. Mike, I certainly agree that people used No Award in 2015 for other than the purpose I cite. I feel we used No Award because it was the only tool the rules offered, but I doubt if you had asked anybody 3 years ago what the purpose of No Award was they would have listed “defence against a slate taking all the slots” high or anywhere on their list. Yes, in a pinch, you can use No Award as a defence, but it is a sucky one.

    In particular, the presence of any slate candidate means that an otherwise presumably worthy candidate has been deprived of the nomination fans actually intended to grant. I am most sad for the writers for whom 2015 through 2017 were years of great work who had their chance for recognition taken from them. That the puppies were denied Hugo awards is a minor consolation. We destroyed the village in order to save it.

    What I propose is the only thing I have seen which gets us back to our goal. That fans see, and can vote on a reasonable set of candidates as would be chosen from the honest and independent views of the nominating fans. My estimates suggest there are between 1,000 and 1,800 puppies who were members of 2015 and can nominate at no extra cost in 2016. I don’t know if that number will be sustained for the 2017 nominations; I hope it isn’t. My own math and the results in the OP suggest that means 3-4 slate nominees per category. EPH dropped the number by around 1. 4/6 increases the non-slate number by 1. It’s a poor fix, and that’s before the attackers have had a chance to regroup and adjust their strategy as they will in 2017.

  10. Hmm. Here’s the germ of an idea that might be more palatable to EPH fans while still aiming at what I think is the goal. I don’t like it nearly as much because it attempts to use algorithms rather than democratic judgment. I don’t believe I saw it discussed.

    As we know, EPH attempts to penalize works which appear together with other works on too many ballots. If your ballot has too many winners on it, its entries count for less.
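
    To make the weighting concrete, here is a minimal Python sketch of an EPH-style count. It is an illustration under assumptions rather than the official procedure: each ballot carries one point split equally among its surviving works, the two works with the fewest points are compared each round, and the one with fewer nominations is dropped. The function name, the made-up ballots, and the alphabetical tie-break are all hypothetical.

```python
from collections import Counter
from fractions import Fraction


def eph_style_finalists(ballots, seats=5):
    """Simplified EPH-style elimination count.

    ballots: list of lists of work titles (one list per nominator).
    Tie-breaking here (alphabetical) is a placeholder assumption, not
    the rule in the actual proposal.
    """
    remaining = {work for ballot in ballots for work in ballot}
    while len(remaining) > seats:
        points = Counter()       # one point per ballot, split among survivors
        nominations = Counter()  # plain count of ballots listing the work
        for ballot in ballots:
            live = [w for w in set(ballot) if w in remaining]
            for work in live:
                points[work] += Fraction(1, len(live))
                nominations[work] += 1
        # Compare the two works with the fewest points ...
        lowest_two = sorted(remaining, key=lambda w: (points[w], w))[:2]
        # ... and eliminate whichever of them has fewer nominations.
        loser = min(lowest_two, key=lambda w: (nominations[w], points[w], w))
        remaining.remove(loser)
    return remaining


# Hypothetical data: 10 lockstep slate ballots versus 18 single-work ballots.
slate = [["A", "B", "C", "D", "E"]] * 10
others = [["X"]] * 6 + [["Y"]] * 5 + [["Z"]] * 4 + [["W"]] * 3
finalists = eph_style_finalists(slate + others)
print(sorted(finalists))
```

    With these made-up ballots a plain count would hand all five places to the slate (10 nominations each against at most 6), while the sketch leaves the slate two places and the independent works three.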

    The alternative is to define a formula which detects such grouping, but does not penalize any works. Rather it rewards others by expanding the number of nominees when it is detected.

    i.e., “If 2 or more works among the winning set appear together on a disproportionate number of ballots, then the number of nominees is increased until at least 5 works not meeting this test are on the ballot.”

    You then need to write a suitable test for appearing together on a disproportionate number of ballots. EPH is effectively such a test implemented as a weighting method. I would have to play around with suitable tests but my intuition is that one could be written. It might be more complex than EPH, which at first seems a flaw, but what is much less complex is that nominators have no need to understand it. They will know that the N nominees (sometimes just 5, perhaps as many as 10) were the top N nomination getters. There is no strategy to their nominating because of the formula.

    One minor negative consequence (which does not trouble me too much) is that this may well increase the number of slate nominees. I.e., let’s say there is a slate of 5, and the top 5 consist of 3 from that slate and 2 stronger works not from the slate. This algorithm would probably increase the total nominees to 10, including all 5 slate nominees and 5 non-slate. Fans would then vote as they saw fit.

    In a category with no slates, there would be only 5 nominees.

    I still prefer write-in because write-in just leaves the power in the hands of fans. It is not a response to slates, it is a response to _any_ attack. This approach above is a response to slates, and it is superior to EPH in that all the works the fans meant to nominate make the ballot and get a good shot at an award, instead of just half of them. But as a response only to slates, it does nothing about other attacks we have not yet considered.

  11. the unique ballot filter

    Can you explain why valid nominations should be thrown out because someone else posted the same works? It sounds like an excellent way to kill the Hugos.

  12. @p Nominations aren’t thrown out, extra copies of duplicate ballots are. Unlike say EPH, where every round throws out some ballots.

    Removing duplicates doesn’t appear to affect the results except for removing slates, which is a good result to many people.

    In the ’84 Best Novel category, for example:

    Startide Rising, 137 becomes 125
    The Robots of Dawn, 75 becomes 73
    Moreta: Dragonlady of Pern, 54 becomes 52
    Millennium, 52 becomes 50
    Tea with the Black Dragon, 55 becomes 49

    The same 5 nominees either way, but if 200 time traveling puppies put in ballots for Vox Day’s high school novel, they would become 1 vote and still wouldn’t place.

    Or say best Dramatic Presentation where there were many 1 or 2 nominee ballots and thus more combined.

    Return of the Jedi, 226 becomes 154
    The Right Stuff, 87 becomes 65
    WarGames, 85 becomes 74
    Brainstorm, 74 becomes 60
    Something Wicked This Way Comes, 50 becomes 42

    Again the same 5 nominees either way, but the 200 time-lost puppy ballots for Yor, Hunter from the Future are combined into 1 and fail to affect anything.

    Actually popular works, especially those supported by diverse sectors of fandom, aren’t much affected, but a slate’s influence is completely blunted. Hard to see how this would “kill” anything.
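
    The filter described above can be sketched in a few lines of Python. This is only an illustration: it assumes two ballots count as duplicates when they list exactly the same set of works in a category, and the function names and sample ballots are hypothetical.

```python
from collections import Counter


def tally(ballots):
    """Count how many ballots list each work (a work counts once per ballot)."""
    counts = Counter()
    for ballot in ballots:
        counts.update(set(ballot))
    return counts


def unique_ballot_filter(ballots):
    """Keep one copy of each distinct ballot, ignoring the order of entries."""
    return [sorted(b) for b in {frozenset(ballot) for ballot in ballots}]


# Hypothetical ballots: 200 identical slate ballots plus varied individual ones.
ballots = (
    [["Slate Novel 1", "Slate Novel 2"]] * 200
    + [["Startide Rising", "The Robots of Dawn"]] * 3
    + [["Startide Rising"], ["Startide Rising", "Tea with the Black Dragon"]]
)
before = tally(ballots)
after = tally(unique_ballot_filter(ballots))
print(before["Slate Novel 1"], after["Slate Novel 1"])      # 200 1
print(before["Startide Rising"], after["Startide Rising"])  # 5 3
```

    The work supported by varied ballots loses only its exact repeats, while the 200 identical ballots collapse to a single nomination.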

  13. Some things:

    a. Clearly this does not show what the result of EPH if actually implemented would be. It may happen that support for slates will decline significantly, in which case EPH will produce better results than this study shows. On the other hand, if Mr Day’s 525 supporters actually materialise (and it certainly hasn’t been proved that they won’t), and regular nominations don’t increase proportionately in all categories, then the results may be worse than this study shows. Greater diversity of nominations (whether that is, in itself, a good or a bad thing, on which views seem to differ) would also make matters worse. What we can say is that the improvement brought about by EPH in itself, while real, is not enough to turn the system into a satisfactory one. If results like those produced by this study continue for the next ten years – and we can’t be sure that they won’t – I don’t think we could really point to the Hugo process as revealing the best in science fiction.

    b. Of course, any improvement, even a small one, would make adopting the system worthwhile, if it had no disadvantages. It seems to me, though, that it does have disadvantages: (i) the perspicuity problem which Cally raised (ii) the Joe Smith problem – i.e. the way the system may boost the chances of fan-clubs and single issue campaigns (which I am not yet convinced isn’t a problem, but it will take me a bit of time to work out a clear way of explaining it) (iii) the way in which this system might benefit slates, because if they continue to do well under a system worked out to make voting fairer, they can say ‘Look, we were right! The works we support really are more popular and deserve to win!’. (Which wouldn’t be true – even if EPH were a totally proportional system, which it isn’t – because slate voting doesn’t represent people’s real preferences.) So we have to weigh the gain involved in adopting EPH against these things.

    c. Whenever I have raised questions about the effectiveness of EPH – not about whether it is a good thing, just about whether it does enough to solve our problems – people have replied ‘Yes, but it is better than what we have now, and no one has proposed a better idea’. And so far, that seems to be true. On the other hand, people have sometimes suggested that EPH buys us time in which we can try to think of ways of improving it. But we know some, at least, of the failure modes of EPH now, so we should be thinking now of ways of improving it. Brad’s proposal above might be such a way, but I will have to think about it more.

  14. Tasha Turner: Slating has happened in the past.

    When was this? There have been single-issue campaigns (the Scientology one being the most notable), but have there ever been slates, in the sense of an organised bloc vote for a group of works (which is the only thing that EPH can weaken)?

  15. @Brad Templeton: Interesting idea that I personally don’t think I’ve heard yet. I’m going to mull that one over for a bit.

  16. I would disagree that “any improvement, even a small one, would make adopting the system worthwhile.” Changing the Hugo rules is (intentionally) so hard that a poor measure probably blocks out others. You want to get as close to the goal as you can; you don’t get to easily iterate on any algorithmic approach.

    We can’t perfectly predict how the slates will do. They are, fortunately, not a unified front. In 2015, there look to have been around 300 to 350 nominators (out of around 2,000) with some slate adherence, but perhaps only 150 truly dedicated people who voted the pure slate. Of course, while there were 2,000 nominating ballots, most categories got around 1,000 nominations or less, which allows the group of 150 to 300 to dominate, under old Approval or under EPH.

    In 2016, with the huge influx of supporting memberships, I estimate there are 1,000 to 1,800 Sasquan members with slate affiliation with 5,000+ final ballot voters in the major categories. That number is capable of taking 3-4 nominations in categories they care about under EPH, and 4-5 (mostly 5) without it.

    The question is, will they cohere, and how much? 1,800 nominators can still heavily dominate. Will some large number of them say, “This was fun once but I’m bored now?” Will some actually realize they are not a positive force and stay away, or nominate independently? Will some, with the taste of blood, attempt to go even further?

    At the same time, what happens to all the voting newcomers who joined Sasquan to fight the puppies? Do they remain long term members of the Hugo community? Do they get bored and go away too?

    Remember, traditionally only a small minority of worldcon members bothered to nominate. Nominating is hard work. It requires dedicated fans who read a lot of stuff. A good nominator has read/experienced dozens of works and is able to fairly name a top 5. There are very few of those, and the number of those is probably not increased by this surge of anti-puppy supporting members.

    So it’s difficult to call. Nonetheless, solutions that depend on a puppy collapse or a truefen rally are risky.

  17. Kurt Busiek: At the Business Meeting, nothing takes a mere 20 seconds.

    I think there’s a Rule of Order against it.

    *snort*

    I did see at least one proposal at Sasquan which was seriously a “Don’t Blink Or You’ll Miss It” case.

  18. Steven desJardins on February 12, 2016 at 11:35 pm said:

    As one of the sponsors of 4 and 6, I intend to propose an amendment if EPH passes, changing it to 5 and 6. That should unambiguously increase the effectiveness of EPH in resisting block voting. The parliamentarian ruled at last year’s Business Meeting that that would be a lesser change and would not delay ratification.

    That was indeed the Parliamentarian’s opinion (it’s in the minutes), and I as Chair of the 2015 meeting agree with it; however, the final call on whether it actually is a lesser change can only be made by the 2016 meeting.

    Kurt Busiek on February 13, 2016 at 11:32 am said:

    At the Business Meeting, nothing takes a mere 20 seconds.

    I think there’s a Rule of Order against it.

    Well, twenty seconds is perhaps a bit much, but in 1993, four particularly egregiously bad proposals were dispensed with by Objection to Consideration in less than four minutes, even with additional procedural hurdles attempted by the single loose cannon (and the stooge he hauled along to second his proposals). When you’re being outvoted on non-debatable motions by a vote of everyone-2 on every motion, you lose very quickly.

    In the case cited, it would take around five minutes, inasmuch as both sides have two minutes to debate a Postpone Indefinitely. Objection to Consideration (no debate, 3/4 vote required) should be reserved for things that would be actively harmful to the organization to even discuss, as were three of the four proposals we killed back in 1993.

  19. Kevin, understood. As I understand it, as one of the makers of the motion, I have the right to be recognized first when the amendment comes up on the agenda. I had intended to begin by making a Point of Parliamentary Inquiry, explaining the amendment I intended to make, and asking for a ruling on whether the amendment constituted a lesser or greater change; then, presuming the ruling was as expected, immediately making the motion to amend. Is that the correct terminology and procedure?

  20. errhead, that didn’t answer my question:
    WHY do you think we should throw out valid nominating ballots just because more than one person nominated the same works?
    If you’re going to claim it doesn’t make a big change, then WHY should it be needed?

  21. On the other hand, if Mr Day’s 525 supporters actually materialise (and it certainly hasn’t been proved that they won’t), and regular nominations don’t increase proportionately in all categories, then the results may be worse than this study shows.

    If the 525 supporters do actually nominate then they would almost certainly have a majority of the nominating ballots cast. I don’t see how you can have a functioning system that doesn’t allow a group of nominators to dominate in that situation.

  22. Steven desJardins on February 13, 2016 at 3:47 pm said:

    Kevin, understood. As I understand it, as one of the makers of the motion, I have the right to be recognized first when the amendment comes up on the agenda. I had intended to begin by making a Point of Parliamentary Inquiry, explaining the amendment I intended to make, and asking for a ruling on whether the amendment constituted a lesser or greater change; then, presuming the ruling was as expected, immediately making the motion to amend. Is that the correct terminology and procedure?

    I think you are right on all counts, but do coordinate your actions with the other proponents of the motion, for only one of you gets preference in recognition before saying something like:

    “Mr. Chairman, as a parliamentary inquiry, if I were to make a motion to strike out ‘4’ and insert ‘5’ in the pending motion, and should the meeting adopt the amendment, would that be a lesser change and thus not require an additional year of ratification to the amended motion?”

    I don’t presume to speak for Jared, but assuming that he does state that it would be a lesser change (that cannot be appealed at that moment because it’s an opinion, not an actual ruling), you could then say:

    “Then Mr. Chairman, I move to amend the proposal by striking out ‘4’ and inserting ‘5.’”

    The amendment becomes the pending question and you get first crack at it because you made it. Assuming it passes, the pending question becomes the ratification vote. When the ratification vote comes up, a member could make a point of order that the change would require re-ratification, the Chair would rule it not well taken, and that could be appealed, in which case the meeting could overturn the chair’s decision and declare it to be a greater change and thus need re-ratification. I personally think that series of events is a bit unlikely, though.

  23. Aaron on February 13, 2016 at 4:47 pm said:

    If the 525 supporters do actually nominate then they would almost certainly have a majority of the nominating ballots cast.

    I don’t think that’s a valid assumption anymore, not with the huge upswing in interest and something in the neighborhood of 15,000 people eligible to nominate. (We don’t know the exact number of unique natural persons in the union of the 2015-17 Worldcon membership, but we know it’s more than 10,000, and it’s probably less than 20,000.)

    Until recently, you’re right that 500 people could (and did) swamp the nominations. But the conditions have changed, dramatically. We’ll not know for a while what the effect of having thousands of new potential nominators is.

  24. @p ????

    “except for removing slates,”
    “but if 200 time traveling puppies put in ballots for Vox Day’s high school novel, they would become 1 vote and still wouldn’t place.”
    “but the 200 time lost puppy ballots for Yor, Hunter from the Future are combined into 1 and fail to affect anything”
    “a slate’s influence is completely blunted.”

    TL;DR (again): It will prevent purely slated works from getting nominated no matter how many puppies attempt to stuff the ballot box.

  25. Kevin Standlee: Your observation about the wide number of potential nominators this year makes me wonder how many might have participated if sent PINs. Since I didn’t get a PIN yet as a Sasquan member I suppose only those asking for them will receive them.

  26. BTW, I am curious as to the motive to turn 4/6 to 5/6? It is my understanding that today, many nominators already do not enter a full 5 nominations in many categories; is there any data on this? If that’s the case, a switch to 5 nominations benefits only the most dedicated, and of course those following a slate. It also permits those following a slate to enter a few of their own choices, and enter more of the slate. The puppies are not a lockstep army. I estimate less than half of puppy affiliates stuck with the whole thing, but I do that without seeing the data.

    I presume the goal is to let those fighting slates enter more entries to try to push up the number of natural (independent) nominations, and that is reasonable if it works, I suppose.

    For each category, you want to look at what the “natural percentage” is for the top nominees. Consider Novelette for 2015 — around 1,000 nominations, and the natural nominees getting from 54 to 72 nominations. About 250 slaters nominated or 25%. Under EPH a rough prediction shows the slate will easily command 3 of the nominations, because 250/3 > 72. We’re close to the wire on the slate capturing 4 of them. 250/4 is 62.5 so “Each to Each” at 69 nominations probably gets in — but not if 13 of the ballots for the 2 top natural nominees had both, or if the slate gets a few more, or the natural gets a few less.
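
    The rough prediction above can be checked with a short Python sketch. It uses only the figures quoted in the comment (250 slate ballots, 69 nominations, 13 shared ballots) and assumes, for illustration, that every slate ballot lists the same works and that a natural work's other ballots list no other surviving work. The function names are hypothetical.

```python
from fractions import Fraction


def slate_points_per_work(slate_ballots, slate_works):
    """Points per slate work when every slate ballot lists the same works."""
    return Fraction(slate_ballots, slate_works)


def natural_points(nominations, ballots_shared_with_one_other):
    """Points for a natural work when some ballots also list one other survivor.

    Assumes (for illustration) that every other ballot for the work lists no
    other surviving work, so it contributes a full point.
    """
    return Fraction(nominations) - Fraction(ballots_shared_with_one_other, 2)


for k in (3, 4):
    print(k, float(slate_points_per_work(250, k)))

print(float(natural_points(69, 0)))   # 69.0: ahead of 62.5
print(float(natural_points(69, 13)))  # 62.5: level with a four-work slate
```

    This shows where the figure of 13 comes from: each ballot shared with one other surviving work costs half a point, and 69 minus 6.5 is exactly the 62.5 that a four-work slate would hold per work.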

    The slates will never act in lockstep, so the slate candidates don’t usually have exactly the same number of ballots, and as such, they will not be eliminated all together as some EPH advocates believed. Rather, in this case the slate would make a go at taking 4, and if it failed, the lowest slate entry would be eliminated, and the slate would very handily take 3. But it’s on the borderline of taking 4 in this and several other examples. The slate numbers we see may also consist of impure ballots, choosing 3 or 4 from the slate, not 5. If there were some of those, the slate is actually stronger than it looks, and stronger under EPH.

    Getting all 5 under EPH does require that the slate have around 5 times the support of the leading natural candidate. That is unlikely but not impossible. As such, under EPH they will know they can only get 4 in most categories, and if they are wise they will set their slate to 4 rather than 5, whether there are 4 or 5 nomination slots in the category.

  27. @Andrew M

    Tasha Turner: Slating has happened in the past.

    When was this? There have been single-issue campaigns (the Scientology one being the most notable), but have there ever been slates, in the sense of an organised bloc vote for a group of works (which is the only thing that EPH can weaken)?

    I was referring to single-issue campaigns which have been called slates in the past (I think? I could be confused). You’re correct, those won’t be affected by EPH. As far as I know, no single-issue campaign has come back a second year. They’ve gotten the message that No Award sent. SP/RP are the first to my knowledge to ignore the social sanctions (NA) and keep coming back making their slates bigger each year. Leading Worldcon to need to change things again.

    My understanding is No Award came to be after a single-issue campaign.

  28. In writing up my “expand the pool if the algorithm detects slates” approach I discovered a vulnerability. It is present in all algorithmic systems, including EPH, and it is hard to carry out, but it exists.

    To apply it, the slate needs to know its strength and the likely strength of natural nominees in the category. Using the example I gave above from Novelette, if it has 250 nominators and the natural strength will be 60 to 80 nominations, it has to make a decision of how many it can safely take, in this case probably 3. If it has strong loyalty from its members, it can then divide up its 3 choices among them, and tell them to cast single-vote ballots, so there are 83 single vote ballots for A, 83 for B, and 84 for C, and no algorithm can detect or respond to that.
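
    The arithmetic of that division is simple enough to show directly. The Python sketch below only reproduces the 250-into-3 split from the example. The function name and the work labels are hypothetical, and which work receives the extra ballot is arbitrary.

```python
def split_evenly(nominators, works):
    """Divide nominators among works so group sizes differ by at most one."""
    base, extra = divmod(nominators, len(works))
    return {work: base + (1 if i < extra else 0) for i, work in enumerate(works)}


groups = split_evenly(250, ["A", "B", "C"])
print(groups)
```

    Because every ballot in this scenario names a single work, each work's point total equals its nomination count, so there is no shared-ballot pattern for a counting rule to act on.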

    Now, I think this is a very hard attack to carry out. The members of the puppies are not that cohesive, fortunately. And it can be hard to predict your strength. Predict it wrong and you could get fewer than you could have. Under EPH you use the same strategy regardless of your strength.

    Nonetheless, it verifies my own instinct, that algorithms only fight the last war. It is still much better than EPH, but it is not as robust as more non-algorithmic approaches.

  29. It seems to me that the “Highlander” idea only works so long as the Puppies all nominate in perfect lockstep. If they know that it’s in force they can easily get around it by coordinating: each of them nominates a randomly-chosen 4 works out of the 5 on the slate, and then for the fifth item puts some one thing that is known to be unique. So “Highlander” is in practice the same as “4 and 5”.

  30. @David, The possibility of second generation slates with semi-random distributions was brought up earlier, and was one of the things I was hoping to make testable with my app. How well EPH would deal with this kind of slate is an interesting question.

    Of course, there is a major difference between slate voting being based on a publicly announced list of the suggested best, and each slate voter being assigned specific nominees by a centralized controller. It would remove any fig leaf of legitimacy from the voting bloc and make it clear that it is nothing but fraudulent gaming of the system. It would seem a much less likely role for people to voluntarily undertake, and orders of magnitude more work to coordinate. If there are enough minions who consciously choose to vote for whatever their master bids, there is likely no system they can’t game.

  31. Correct. All algorithmic approaches face the problem that the attacker can adapt to the algorithm. Particularly if it takes 2 years to change the algorithm.

    I have summed up my analysis and my best algorithmic proposal at: this page

    BTW, since EPH does cut the slates down to 3 or 4, it makes it easier for them to randomize without much coordination. There are even ways to get people to do that without being slaves, such as telling them it is in their interest to roll dice and take a role in the “fight against the SJWs.”

  32. Until recently, you’re right that 500 people could (and did) swamp the nominations.

    Read what Jameson Quinn wrote alongside Kempner’s analysis. The voting behavior of the puppy groups was, if anything, marginally more in lockstep than that of small groups of fans who have voted similarly to each other in the past.

    Just 150 people, coming from two different groups of fans acting independently (readers of Vox Popoli and the readers of Correia and the Band of Geniuses), “voted a slate.” They voted exclusively for items appearing on one or the other of those two recommendation lists. We DON’T know whether most of them read those things and actually loved them, or voted a five-item slate in perfect lockstep. My personal opinion is that if they had voted in lockstep, Mr. Quinn would most likely have said so in his executive summary.

    That’s why it is important to have the analysis checked by an independent team – to provide reassurance that no one puts English on the ball tossed to the Business Meeting.

    Yes, those 150, presumably eager to get “their” authors some recognition, nominated without being bothered to look for good stuff on their own. Yes, that pushed The Parliament of Beasts and Birds, A Single Samurai, Rolf Nelson and so forth onto the ballot. Yes, that’s bad, since Andy Weir and The Breath of War, etc., didn’t get the fair shot they deserved. Paulk at least is doing things much differently this time in recognition of that.

    But that wasn’t 500 people swamping the ballot.

    Whether 500 people might NOW choose to swamp the ballot, perhaps in reaction to how they feel they’ve been treated, is a different question.

  33. Kevin Standlee: Your observation about the wide number of potential nominators this year makes me wonder how many might have participated if sent PINS. Since I didn’t get a PIN yet as a Sasquan member I suppose only those asking for them will receive them.

    I had to email to get my PIN. But when it arrived, it didn’t work. Turns out it’s a duplicate PIN.

    It’s been several days and it still hasn’t been resolved.

    So yes, I can imagine people not putting up with the hassle. I’ve gone out of my way to read outside my usual authors/habits this past year, and so I’m going to nominate hell or high water. But I am underemployed and have the time to pursue a working PIN (after all, how better to de-stress while waiting to hear back about a second interview than chase down a working PIN?). How many people let it slip away?

  34. I read on Making Light that MACII has determined that there’s been a systemic problem with Hugo PINs and that they’re trying to fix it so that they can re-send PINs. I’m guessing what’s going on here is that no Worldcon has ever had to send out so many e-mails in so short a time, and they’re running afoul of mass e-mail traps.

  35. Just 150 people, coming from two different groups of fans acting independently (readers of Vox Popoli and the readers of Correia and the Band of Geniuses), “voted a slate.”

    Brian, why do you think there were only 150? I read it as about 140 Sad Puppies and 160 Rabid Puppies, though only about 200 total “hardcore” full slate.

    But by voting time, there were 586 who put Vox Day FIRST on their ballot for short-form editor, which means pretty committed to rabid. 1672 voters put a slate candidate first for Novelette (4 slate candidates) and 1842 put one first for Novella (no non-slate candidate). Yes, there were some fans who were in the “If it’s on the ballot, I will read it and rank it if it’s decent” camp, but the vast majority of voters appear to be those who bought supporting memberships to vote on one side or the other. (Voters jumped from a typical 1500-2000 to 6000.)

    I don’t know if all these new members (puppy and non) will participate at the same level, but if they do, there are between 1,000 and 1,800 puppies, which is enough to sweep everything in 2016, and to get 3-4 per category under EPH. This is also true if both factions reduce participation similarly, i.e. if half the puppies come back and also half the non-puppies come back.

  36. Brad Templeton,

    “Roughly half” of 300 ballots which had at least 50% items from one or the other of the two (SP and RP) recommendation lists had only items from the recommendation lists, and no additional items.

    Jameson Quinn doesn’t say how many of those were people who voted for Jim Butcher and nothing else, or how many just voted for the Analog stories and Black Gate, or how many liked Tuesdays with Molakesh the Destroyer plus Goodnight Stars plus Lines of Departure and nothing else, or how many voted for The Lego Movie and Guardians of the Galaxy and nothing else.

    Some fraction of those 150 “puppy rec only” voters recognized on the lists the names of works/authors they liked/had read and nominated them accordingly (without expending the effort to read widely and find their own cool stuff, which is the cultural expectation for Hugo nominators).

    Some fraction of those 150 went out of their way to additionally read the recommended works/authors and nominated if they liked them (without expending the effort to read widely and find their own cool stuff, which is the cultural expectation for Hugo nominators).

    Some fraction may have picked a bunch of things from the list they hadn’t read yet and nominated them (without expending the effort to read widely and find their own cool stuff, which is the cultural expectation for Hugo nominators).

    Some fraction may have voted in lockstep for a five-item “slate”.

    At the moment we just know the total number of ballots which had only items from the two puppy rec lists and no items not from the two puppy rec lists: roughly 150.

    As for over 500 people voting for Vox Day for editor, some of those people presumably actually liked Riding the Red Horse.

  37. To be clear, while the puppies are the enemy here, like all enemies, they don’t have Bond-villain meetings where they do evil-overlord laughs. Some are griefers, but many believe in what they are doing, while at the same time being obviously willing to read and vote for non-slate items. They are just happy, I suspect, to get a chance to put “their” kind of SF at the top by coordinating with others, not caring how that violates the system. It’s less clear to me what the 1,000-1,500 extra puppy supporters who appear to have bought supporting memberships feel.

    For this reason, there will be a natural ordering within the slates, which actually makes them work slightly better under EPH. Some EPH promoters imagined that a typical slate would consist of 500 identical ballots, which EPH would eliminate en masse. (In fact, this strange supposition remains on the EPH page.) In reality, the variation means the least popular of the slate choices is eliminated first, transferring its support to the remaining ones, and so on, until we reach the number roughly equal to the (slate size / mid-range natural nomination count).

    Of course, I do hope these estimates for numbers are wrong, or that they scatter. I don’t know how many of their number felt triumphant vs. defeated by the No Award results.

  38. If the 525 supporters do actually nominate then they would almost certainly have a majority of the nominating ballots cast.

    525 would not have been a majority, or near it, in Best Novel last year. It would have been close to one in all the other categories, though. While there will certainly be an increase in nominations this year, it will probably be very disproportionate between categories, with the biggest growth in those categories which are biggest already. There is certainly room for a result where slates get a sweep even without getting a majority, though – an increased slate support that’s much less than 525 would probably do that in the smaller categories.

    I don’t see how you can have a functioning system that doesn’t allow a group of nominators to dominate in that situation.

    Dominate, yes, but not necessarily sweep. With a genuinely proportional method (which I know there are perfectly good reasons we can’t actually have) a bare majority would take roughly half the ballot. With EPH it would be likely to take more than that, and not improbably everything.

  39. Some EPH promoters imagined that a typical slate would consist of 500 identical ballots, which EPH would eliminate en masse. (In fact, this strange supposition remains on the EPH page.)

    @Brad: EPH page? There’s an EPH page? Could you point to it?

    If you are referring to the Making Light threads, are you asserting that someone who there is reason to believe did good analysis said this weird thing? Jameson or Keith? It also matters when in the discussion this claim was made, since the proposal went through multiple iterations and multiple analyses. In point of fact, one of the selling points of EPH is that it doesn’t eliminate any ballots, ever.

  40. Yes, I refer to the ML main EPH page at Making Light which says in its last iteration:

    0. How does this system eliminate slate or bloc voting?
    It doesn’t, exactly, nor should a nominee be automatically eliminated just because it appears on a slate. On the other hand, any slate which nominates a full set of five nominees will find that each of its nominations only count 1/5 as much. With “non-slate” nominating, some of your nominees will be slowly eliminated, so your remaining nominees get more and more of your support. Since slate nominees tend to live or die together, they tend to eliminate each other until, in general, only one slate nominee remains. With a large enough support behind the slate (five times as much), the slate may still sweep a category; however, if that many voters support the slate, they arguably deserve to win, and no fair and unbiased system of nomination will prevent that. The answer in that case is, simply, to increase the general pool of voters. Regardless, with E Pluribus Hugo, slates will never receive a disproportionate share of the final ballot, as occurred in the 2015 Hugos.
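
    The mechanism described in that answer can be sketched in a few lines. This is a simplified illustration, not the official tally: the ballots are invented toy data, and breaking ties by points and then by name is an assumption of the sketch (the actual proposal has its own tie rules).

```python
from collections import Counter
from fractions import Fraction


def plurality(ballots, finalists=5):
    """Current rule: the works named on the most ballots become finalists."""
    counts = Counter(work for ballot in ballots for work in set(ballot))
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return set(ranked[:finalists])


def eph(ballots, finalists=5):
    """Simplified EPH: split each ballot's single point among its surviving
    works, then repeatedly eliminate one of the two lowest-scoring works."""
    remaining = {work for ballot in ballots for work in ballot}
    while len(remaining) > finalists:
        points = Counter()
        nominations = Counter()
        for ballot in ballots:
            live = set(ballot) & remaining
            for work in live:
                points[work] += Fraction(1, len(live))
                nominations[work] += 1
        # Selection phase: the two works with the fewest points.
        lowest_two = sorted(remaining, key=lambda w: (points[w], w))[:2]
        # Elimination phase: of those two, drop the one on fewer ballots.
        # Falling back to points and then name is an assumption of this sketch.
        loser = min(lowest_two, key=lambda w: (nominations[w], points[w], w))
        remaining.remove(loser)
    return remaining


# Toy data: 100 identical five-work slate ballots plus dispersed single-work ballots.
slate = [("S1", "S2", "S3", "S4", "S5")] * 100
natural = [("A",)] * 60 + [("B",)] * 55 + [("C",)] * 52 + [("D",)] * 45 + [("E",)] * 40
ballots = slate + natural

print(sorted(plurality(ballots)))  # ['S1', 'S2', 'S3', 'S4', 'S5']
print(sorted(eph(ballots)))        # ['A', 'B', 'C', 'S4', 'S5']
```

    In this toy data the slate sweeps under the current count and keeps two slots under the sketch. Each time a slate work is eliminated, the surviving slate works gain points, so how many survive depends on the slate’s size relative to the natural nominees.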

    Truth is, they’re not very fond of what I wrote about EPH and puppygate over at ML, and their style of attack did not bring out the best in me either, and so my attempts to point out that this was erroneous were not well received. A shame, really — I’ve been involved in online debates of every stripe for almost 40 years, but the character of conversation there surprised me, and my mismeasure of it meant this important message was not heard and I bear partial fault.

  41. @Brad Templeton
    I’ve gone and read the beginning of your discussion over on ML. You make your point, others point out a few reasons why it’s not a problem, you make your point again, rinse and repeat. I stopped reading after the third cycle.

    I agree with the counterpoints made. If some nominators decide to try strategic nominating by nominating only a single book, most years they’ll be depressed because their strategy didn’t work. In the rare years when a fan group gets a work/person on using that strategy, one of two things will happen:

    1. So what? It will be treated as a legitimate nomination and everyone is happy. Well I’m sure like Wheel of Time fans will kvetch but we always find something to kvetch about

    2. It will be treated as a single-issue campaign (if it looks like one, quacks like one) and NA’d at voting time, and fewer fans will do it in the future

    I think your frustration is from most EPH supporters not coming around to your way of thinking when it’s so obvious a flaw to you. I hate it when others don’t see what is obviously right to me. The world would be a better place if everyone agreed with me all the time. I’m not being sarcastic. If you search File770 you’ll find me saying that a number of times.

  42. Certainly not my goal to re-hash old threads here (or elsewhere.) My main goal is to find the best path to make the Hugo process robust against both the slate attack, and other attacks. Out in the real world, nobody imagines that you can be robust against attack through public algorithms that take 2 years to change. As a mathematician, I fully understand the attraction of algorithms and game theoretical approaches, but I know their limitations, and was surprised that others didn’t.

    But getting back to the main goal: one difference of opinion is that I feel that even one nominee getting on the ballot through collusion and displacing a legitimate winner is a failure, and surprisingly many seem to be more willing to consider that (or even two of them) to be acceptable losses. Perhaps some will say even 3 such incursions are acceptable losses, and thus accept EPH as adequate. That continues to seem wrong to me. I think people went into shock and said, “Ohmigod, they swept several categories, the sweep is what we must prevent.” As such, I believe the right course is to find a method where the fans use their own direct democratic will to fix any problem. I am not a general fan of direct democracy but I think it works here.

    No matter what algorithmic approach is applied, a group of about 10% of the nominators in almost every category can push a single nominee on, and push off somebody else. A group of 20% can put on two, no matter what the algorithm, if they are coordinated enough. You probably need 35% to put on 3 against strong algorithms. (EPH allows this without much coordination; other algorithms are a bit stronger.)

    It certainly seems possible to me that Vox Day might rally supporters who are 10% of the nominators, and that Torgersen might rally a different 10%, and get 2 on, and displace two of their enemies no matter what the algorithm is.

  43. Yeah your attack language is a big part of the problem. The whole enemy thing is also problematic.

    I think the biggest problem you’re having is understanding we are simply trying to limit how many places an individual slate can take.

    You are looking at the whole thing like a puppy, as a war. For the rest of us this is a short blip. You need to read John Scalzi on puppygate to get a better feeling for where the rest of us are coming from. Fandom: we are all part of it. Well, maybe not VD, but he could have been if he hadn’t decided destroying it would be more fun because once, 10+ years ago, his feelings got hurt by someone who initially was an ally.

    We don’t want to take measures where we identify bad people having wrong fun and throw them out. We just want to make sure there is room in the sandbox for all of us.

  44. I have to say I don’t read nearly so positive a view from Scalzi, or Martin or others who wrote on this, or most fans. The terms “attack” and such come from the computer security world and oddly, the academic and game theory worlds. There they are simply terms of art, not attributions of special malice.

    I do know that there are folks who are simply trying to limit how many places a slate can take, and aside from disagreeing with that, I do wonder if they would say that if offered a means to do much better than that. Perhaps they would. The proposal I made here earlier and at http://ideas.4brad.com/fears-confirmed-failure-fix-hugo-awards is crafted, I think, to satisfy that wish as well as the clear wish of thousands of Hugo voters that slates lose to No Award, and that a chance to vote for more natural nominees is strong. It is designed so that nobody is pushed off the ballot, not slaters, not natural nominees. To make room in the sandbox for all.

    One way I know I differ from some – I view the Hugos not as a contest or a vote, but as an attempt at a survey. A survey to learn as best we can what the aggregate view of fandom was on the best SF/etc. of the year. Surveys are not robust, though — an attack on them ruins them quickly. Those who see the Hugos as an election — and they are indeed run that way so I don’t blame people for that view — will be more tolerant of attempts to game the election within the rules. When a survey is gamed, you just say, “Oh well, that survey was invalid.” And you hand out No Award, as it turns out.

  45. Brad, didn’t we go through this entire argument last year on the EPH threads at Making Light? And you were told then that we weren’t going there.

    It’s 2783, and there are still arguments about How It Should Have Been Done.

  46. No, I am not interested in returning to Making Light or the arguments there. This thread here is based on the apparent confirmation of models which showed EPH would still mean 3-4 nominees in many categories would be decided by slates if the slates continue to match the proportions shown in the 2015 nominations and vote. I got the impression that those drafting and supporting EPH did so expecting it would perform better than that. If 3-4 puppies per category (instead of 4-5) is indeed a satisfactory outcome, then I would not expect any change of mind, of course. Is it satisfactory to you or others?

    Note that EPH does improve things, so it should be ratified with 4/6, but it should also be replaced for 2018 in my view as I don’t find 3, or even 2 slate displacements a satisfactory outcome, though 3 or 2 of 6 are more tolerable but still not satisfying.

    Of course, the models and the test referenced in the OP are based on slate strength of 2015. The strength in 2016 and 2017 is harder to predict. My goal is to find the best path going forward.

    In addition to the new information in the OP, I also have a new draft proposal referenced above, which may very well suit the desires of those who like EPH and who differ with me on the goals of the system.

  47. @Brad Templeton
    Yeah pretty much you are starting to restate what you said on ML which I’ve read. And yes you seem to view the Hugos very differently than many of us. Your thoughts in some ways line up more with puppy leader thinking just on the other side. There is a reason we use nominate, vote, and talk as if it were an election rather than a survey. This is an election. We give out awards. Or don’t. As appropriate.

    I know what a security attack is. People are not security attacks they are legal nominators and voters. We take people who give $$ as supporting members seriously as part of us even if we detest their behavior. We believe their nomination and vote has some value. It should be equal to their numbers though. Not outweigh everyone else.

    What we have in this thread is the beginning of an analysis which is incomplete and a bunch of people making lots of assumptions based on that. It’s rarely a good starting point.

  48. Yup, I know there are many views — and it doesn’t surprise me at all that people treat it as an election since it’s done like an election. But which of these two goals do you think best describes the Hugos?

    “We hope to learn what the aggregate of opinion of the Worldcon fans is on the best work of the year (if there was any), ideally an informed opinion where the fans have actually read/experienced as many of the works as practical. If you deliberately introduce error into that aggregation, you’ve acted unfairly”

    or

    “We want to have a contest, where works compete and politic and fans vote to decide which they want. If you play within the letter of the rules, and win, you’ve won fairly.”

    (The former is a summary of how somebody else put it on ML, but it’s also my view. Both are valid viewpoints, and they do lead to different conclusions, but I’m not at all sure that one view is particularly dominant.)

    People are not security attacks, but collusion in order to dominate the results (i.e. have them give the wrong answer to “what do the fans like this year?”) is an attack, in the game theoretical/security sense, and I think in the sense of many, or there would not have been so much fuss about it.

    I do await further details in the paper, of course. I already ran my own analysis last year based on other data and came to similar conclusions which are verified, but not completely, above. Though I concluded slightly more puppies than what is reported.

    But again, let’s move past all this, because it doesn’t seem to be very productive, and get to the most practical question. I believe that the proposal to expand the nominee pool in the presence of slating, rather than attempting to de-weight the ballots of slaters or others who vote for many in common, better meets both the goals I have espoused and the goals I have seen others espouse.

    In deference to Harry Harrison, let’s call my proposal, “Make Room, Make Room!” My prediction is that with a slate bloc similar to 2015, with 25% to 30% of nominators, the results are roughly as follows in several categories:

    Current: 4-5 slate candidates, sometimes 1 non-slate
    EPH: 3-4 slate candidates, 1-2 non-slate candidates
    EPH 4/6: 3-4 slate candidates, 2-3 non-slate candidates
    MRMR: 4-5 slate candidates, 5 non-slate candidates

    Based on 2015 voting, I think there would be a strong preference for the MRMR result. The voters who all voted No Award over the slate (a majority) would, I believe, have been very happy to find the 5 non-slate candidates available as choices, and would have read and voted for them, with some also reading and voting the slates, and they would have given them awards, rather than No Award.

    Do people disagree with that interpretation?

  49. Very late to the party here, but responding to a mention way upthread about write-in candidates and Worldcon Site Selection: Although write-in bids are allowed on a Worldcon site selection ballot, they cannot win unless they file valid papers as defined in the WSFS Constitution. The rules require counting all write-in bids, qualified or not, but if no qualified bid (None of the Above is a bid for counting purposes) has a majority at the end of the first round, all invalid candidates (non-qualified write-ins) are dropped simultaneously and their next preferences applied.
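
    The counting rule described above can be sketched roughly as follows. This is an illustration only: the bid names are hypothetical, and two details are assumptions of the sketch rather than statements about the WSFS rules, namely that a majority is measured against ballots still expressing a preference and that last-place ties are broken by name.

```python
from collections import Counter


def first_choice_counts(ballots, active):
    """Count each ballot toward its highest-ranked bid that is still active."""
    counts = Counter()
    for ranking in ballots:
        for bid in ranking:
            if bid in active:
                counts[bid] += 1
                break
    return counts


def site_selection(ballots, qualified):
    """Return the winning bid, or None if no bid ever reaches a majority."""
    qualified = set(qualified)
    active = {bid for ranking in ballots for bid in ranking}
    first_round = True
    while active:
        counts = first_choice_counts(ballots, active)
        total = sum(counts.values())
        for bid in active & qualified:
            if 2 * counts[bid] > total:
                return bid
        if first_round and not active <= qualified:
            # No qualified majority in round one: drop every unqualified
            # write-in at once and apply those ballots' next preferences.
            active &= qualified
        else:
            # Ordinary preferential elimination of the last-place bid.
            active.remove(min(active, key=lambda b: (counts[b], b)))
        first_round = False
    return None


# Hypothetical bids: "Gamma" is a write-in that may or may not have filed papers.
ballots = [["Alpha"]] * 40 + [["Gamma", "Beta"]] * 35 + [["Beta"]] * 25
print(site_selection(ballots, qualified={"Alpha", "Beta"}))           # Beta
print(site_selection(ballots, qualified={"Alpha", "Beta", "Gamma"}))  # Alpha
```

    With the same ballots, whether the write-in counts as qualified changes the winner in this toy data, since an unqualified write-in’s votes transfer immediately while a qualified one stays in the count.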

    This isn’t academic. The Hawaii in 1993 bid ran as a write-in, and placed second behind San Francisco and ahead of Zagreb and Phoenix, all three of which were on the ballot. The “I-95 in ’95” NASFiC bid filed a just-barely-compliant write-in bid at literally the last second, and they could have won that election. They didn’t (Atlanta/Dragon*Con won), but because RoadKillCon was a qualified bid, we didn’t drop their votes at the end of the first round of voting when no bid had a majority. That might have changed the result of the election. (We’ll never know; I destroyed the ballots after the election was final.)

    The point of all of this is that Site Selection has a qualification process for write-in bids that in general keeps the number of total candidates manageable. Write-in bids for the Hugo Awards would almost certainly not have such a qualification process; the existing Hugo Award nomination process is the qualification round that keeps the total number of candidates manageable.
