They’d Rather Free Ride: Hugos and Game Theory (Proposal Discussion Thread 4)

By Jameson Quinn

1. Intro and disclaimers

This is a summary of the discussion thread on the post by Kevin Standlee discussing three proposals for expanding the Hugo nominations process in order to help avoid problems like the current list of Hugo finalists.

In summarizing a thread that’s over 500 comments long, and that refers extensively to two other threads with hundreds of comments, I am of course going to simplify matters. In particular, I’m going to overemphasize the points of agreement, without listing every qualification, caveat, quibble, or outright objection that was brought up. Obviously you can read the thread yourself, but if you don’t, please imagine it to have all the disagreement and misunderstandings (as well as off-topic filk, execrable puns, cute references, hastily-constructed codes, etc.) that you’d expect.

2. Initial options: A+2, DN, and 3SV

Of the three initial proposals by Kevin, one of them (nicknamed “+2” or “A+2”) relied on giving new discretionary powers to Hugo administrators. In the discussion thread this idea encountered significant opposition, so it will not be discussed further here.
The other two proposals would both create a new intermediate round of voting, in which the initial votes have been used to create a “longlist” of 15 works which is publicized and voted on in some fashion in order to get the list of 5 finalists. In the Double Nomination with Approval Voting (“DN”) proposal, nominators would vote for (“approve”) the longlist items they liked the best; and in the 3-Stage Voting (“3SV”) proposal, a majority of nominators would be able to disqualify some of the 15 works, without affecting the ordering of the remaining works from the nominations phase.

3. Consensus: 3SV (+1?)

Many of the participants in the discussion began with a preference for DN; they felt better about voting for the longlist works they liked than about voting to reject the ones they felt were illegitimate. However, as the thread progressed, and the strengths and weaknesses of the two proposals were analyzed in more depth, the consensus shifted, and by the end of the thread proposals based primarily on 3SV were the clear winners. Many supported a proposal nicknamed 3SV+1, described below, which integrated some aspects of DN onto the base of 3SV.

4. Why 3SV and not DN?

Why did people’s preferences shift from DN to 3SV? Several reasons:

A. Under DN, voters would have just weeks to assimilate and vote on a list of 255 longlisted items. Many of the most careful nominators would barely vote for any; while the most prolific voters would probably be going mostly by kneejerk reactions. This is true for 3SV, too, but it is less of an issue as explained below.

B. DN conflates two questions: “Do I like this work and feel it may deserve a Hugo?” with “Do I feel that this work’s presence on the longlist or in the list of finalists would be a legitimate result of honest fan preference?” In 3SV, those questions are separate, and votes to disqualify a work are based on the second question alone — one which does not require fully reading/reviewing every longlist work.

C. Unlike DN, 3SV would deal decisively with the issue of “troll finalists”: that is, works promoted by slates explicitly in order that their shocking and/or offensive nature might cast discredit on the awards.

D. 3SV would be similar in spirit to the “no award” option, except that works thus eliminated would not take up space on the list of finalists, and awkward moments at the awards themselves would be minimized.

E. DN would open up new kinds of attacks on the list of finalists, such as actually increasing slate voters’ capacity to act as “kingmakers” and/or perform “area defense” against certain kinds of works. All they’d need is enough voting power to reverse the gap between two works which both have significant organic (non-slate) support. But under 3SV, actually eliminating a work would not be possible without a relatively high “quorum”* of voters, and we hope that community pressure would lead to a low background level of organic rejection votes, so a minority of slate voters would be unable to use rejection as a weapon.

So, tell me more about how 3SV would work

The details are still up for discussion, but the basic idea is as follows:

3-Stage Voting (3SV) adds a new round of voting to the Hugo Award process, called “semi-finals,” between the existing nominating ballot and the existing final ballot. In 3SV, the “longlist” of top 15 nominees (as selected by the same process by which the finalists will be selected; that is, EPH or EPH+ if those have passed) is listed in a way that doesn’t show how many nominations each received. Eligibility for this voting is being debated (see below), but the original proposal is that it would be restricted to members (supporting and attending) of the current Worldcon (not the previous and following Worldcons). Eligible voters are presented with this list, with a question on each of the fifteen semi-finalists in each category: “Is this work worthy of being on the Final Hugo Award Ballot?” with the choices being YES, NO, and ABSTAIN.

If a work gets more than a “quorum” of no votes, it is not eligible to become a finalist. There are several proposals for how to calculate the quorum. The formula may involve such things as the number of eligible voters, the number of “YES” and/or “ABSTAIN” votes, the turnout in round 1 or in previous years, etc. The idea is that the quorum should be high enough so that a minority of slate voters will be unable to reach it, but low enough that a clear majority of fans can pass it, even given reasonable turnout assumptions.

During the “semi-final” voting period, admins would also be checking the eligibility of the works on the longlist, and accepting withdrawals from the author (or other responsible party; henceforth, we’ll just say “author”) of those works. The admins would make a good-faith attempt to contact authors, but note that since the longlist is public, the admins may assume that non-responsive authors have heard of their presence on the longlist, and thus that any authors who do not explicitly withdraw their works would accept becoming finalists.

The finalists will be the top 5 works from the longlist that have not been declared ineligible by vote, found ineligible by the admins, or withdrawn by the authors. EPH or EPH+ would not be re-run after ineligible works or withdrawals.
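As a rough sketch of the selection rule just described (an illustration, not official procedure): assuming the longlist is already in rank order from the nominating count, picking the finalists is a filter followed by a cut at five. The names `LonglistEntry`, `status`, and `pick_finalists` are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class LonglistEntry:
    title: str
    rank: int                  # position from the nominating count (1 = strongest)
    status: str = "eligible"   # hypothetical values: eligible, rejected, ineligible, withdrawn


def pick_finalists(longlist, slots=5):
    """Return the top `slots` entries that are still eligible.

    The nominating count (EPH, EPH+, or the current system) is not re-run;
    the remaining works simply keep their original order.
    """
    ordered = sorted(longlist, key=lambda entry: entry.rank)
    return [entry for entry in ordered if entry.status == "eligible"][:slots]


# Made-up longlist of 15 works; two of the top three drop out.
longlist = [LonglistEntry(f"Work {i}", i) for i in range(1, 16)]
longlist[1].status = "withdrawn"   # Work 2
longlist[2].status = "rejected"    # Work 3
finalists = pick_finalists(longlist)
print([entry.title for entry in finalists])
```

In this made-up case the works ranked 6 and 7 move up to fill the vacated slots, which is the intended effect of not re-running the count.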

After the Hugos are awarded, admins would publish the usual statistics (that is, for EPH, the votes and points-when-eliminated for each of the members of the longlist). They would also publish the reason for ineligibility (voted out, ineligible, or withdrawn) for any work that otherwise would have been a finalist. They would also publish the anonymized set of vote totals for each longlist item in each category, where “anonymized” means that they would not indicate which title was associated with which vote total.

What’s this “3SV+1” you mentioned earlier?

The “+1” means that, in addition to 3SV, all voters who were eligible for round 1 may add a single nomination per category, for something from the longlist they did not already nominate, to their existing ballot, during the same “semi-final” voting period. These combined ballots would then be counted by the usual process (that is, the current system, EPH, or EPH+) to find the finalists.
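To make the +1 rule concrete, here is a minimal sketch of the validity check it seems to imply, for one voter in one category. It assumes a ballot is simply a set of titles; the function name `add_plus_one` is hypothetical, and enforcing "only one addition per category" is left to the caller.

```python
def add_plus_one(ballot, addition, longlist):
    """Return a new ballot with a single +1 nomination added.

    Assumed rules, taken from the description above: the added work must be
    on the longlist, and the voter must not have nominated it already.
    """
    if addition not in longlist:
        raise ValueError("+1 nomination must come from the longlist")
    if addition in ballot:
        raise ValueError("work is already on this ballot")
    return ballot | {addition}


# Made-up data: "X" was nominated in round 1 but did not make the longlist.
longlist = {"A", "B", "C"}
ballot = {"A", "X"}
combined = add_plus_one(ballot, "B", longlist)
print(sorted(combined))
```

The combined ballots would then go through whatever counting process is in force; nothing in this sketch models that count.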

How would all of this interact with EPH, EPH+, 4/6, and/or 5/6?

The short version is that without EPH a realistically-sized, well managed slate could hope to entirely take over the longlist in many categories; with EPH, it could take about 2/3 (around 10 slots); and with EPH+, it could take over about half (6-8 slots), or possibly a bit more with cleverer strategies (but not as much as EPH, even then). 4/6 or 5/6 don’t change that story by much, though they help a bit in keeping organic slots among the finalists in spite of “kingmaker” slates. And +1 helps push slates towards around 1-2 finalist slots; hurting them in the common case that they had been going to get more, but actually helping them if they had miscalculated and were heading for less than that.

Here’s a graph of how many slots slate nominators could have gotten in 2014, as a function of number of slate nominators, if they’d split 3 ways and had the same level of coordination as they did in 2015. Note that 2016 had many more nominators than 2014 so it would have taken more slate nominators to get the same effect.

[Graph: pseudographSL5 and 15]

Here’s a similar graph, but assuming the slate nominators are better coordinated:

[Graph: pseudographSLp5 and 15]

Here’s yet another graph, but assuming the slate voters split only 2 ways. In theory, this is better for them if there are fewer of them, but worse if there are more, because they max out at 10 slots. However, as you can see from the graph, it’s really not that much better even for small numbers; the random “bootstrap sampling” effects almost overwhelm any advantage:

[Graph: pseudographSLx5 and 15]

Have you thought of any downsides? What about…

Yes, kinda. We have people (both honest supporters and honest opponents) thinking of attacks. And there’s always room for more on this “red team”. So far, here are the criticisms we’ve come up with. First, for “3SV” (we’ll talk about “+1” below):

Couldn’t slate voters take over the shortlist? As the graphs show, “take over 6-10 slots of the longlist” is the only takeover scenario that we think is a concern if (as we expect) EPH or EPH+ is in place.

Wouldn’t this just increase negativity? There are several safeguards against this becoming merely an excuse for people to campaign against works they happen not to like. First and most important is social pressure; it should be clear from the outset that this is just a safeguard against outright bad faith, not a chance to express differences in taste, and I believe that any Worldcon members who promote disqualifying a work just because they don’t like it will not get much support. Second, there’s eligibility. Various rules, discussed below in “open issues”, have been proposed to prevent a campaign to bring in Worldcon outsiders after the longlist is public. Third, there’s the quorum; if participation in the second-round voting is low, it will not be enough to pass the threshold to eliminate any work. Fourth, there’s the relatively short period of the semifinals, also discussed below. And fifth, there’s the fact that elimination votes for a specific work would never be publicized; only anonymized distributions of votes for each category. (In some cases, of course, the identity of which work got a certain vote total would be easy to guess, but that would still be just a guess.)

Wouldn’t this fundamentally change the nature of the Hugos? They have already been changed by the slate. Many of the people in the discussion felt that this change, though it would not go back to exactly as before, would still be a change in the right direction.

Would this be more work for the administrators? In some ways, yes, of course. However, in at least one important way, it would actually simplify their lives. Since the longlist would be public, it would be much easier for them to contact authors. On a related note, authors could not leak their status as finalists, because until the list of finalists came out, they would know no more than the public at large.

Would this allow some unanticipated downside? Obviously, we can never rule that out 100%. However, we do think we’ve been pretty thorough at exploring all the angles. Again, you can read the thread and decide for yourself.

And, downsides for the +1 addition:

Would this be tough to administrate? Not if EPH or EPH+ were in place, since any program capable of doing either of these would already be able to associate multiple nominations with the same nominator and make sure that invalid votes, such as a single nominator nominating a given work multiple times, were not counted. It has been suggested that a proposal to institute +1 should say that this change will sunset (require re-approval) if EPH or EPH+ ever does.

What are the open questions/issues?

Eligibility: who should be eligible to vote on the semifinal round, and who should be able to add +1? The former question is more fraught. Several people said that they would want eligibility to be restricted enough that outsiders can’t come in and get memberships after the longlist is published. Others said that it is important to let the community respond and that if membership spikes on seeing the longlist that could be a healthy thing. One compromise proposal (suggested by yours truly) was that in order to be eligible, you would need to be a current member (attending or supporting) of this year’s Worldcon, AND also have been eligible to nominate (whether or not you actually did so). So if you were a member of the prior or following year’s worldcon, you could sign up after seeing the longlist; but not if you weren’t.
Quorum size/formula: This has been discussed at length, and several formulas have been proposed (see above).

+1 as attached or separate: The overall consensus seems to be that, if we propose +1, it should be in a separate proposal from 3SV, even though they share certain aspects (such as the concept of a semi-final round). However, there are varying opinions on whether +1 is a good idea, either in general or as a proposal for this year in particular.

Whether to try EPH+ this year: I’ll talk about this more in comments.

The admin discretion 14-18 longlist thing: I was thinking that, if the time between closing round 1 nominations and announcing the longlist is short, admins might have a hard time cleaning the data perfectly. In that case, it helps them be more certain of the list they publish if the next work below the list was not a near-tie with the lowest work on the list. To allow them to avoid such near-ties, especially in cases where they aren’t 100% sure they’ve cleaned the data perfectly, I suggested allowing them discretion to decide how long the longlist for each category would be, between 14 and 18 works.

http://file770.com/?p=29020&cpage=14#comment-436327: Define “nominating membership” in a way that legally allows for the possibility of a code of conduct that could lead to one year’s worldcon revoking voting privileges from a member of the prior year’s; in other words, closes the loophole whereby prior-year members are immune from any consequences for their actions.

Moderated Shortlist for Member Consideration: http://file770.com/?p=29020&cpage=16#comment-436437 This is an alternative to 3SV, where an administrative committee would tentatively suggest eliminating or adding certain works, and that suggestion would have to be ratified by an up-or-down vote of the members of the current Worldcon.

Where do we go from here?

A group led by Colin Harris that includes Kevin Standlee is writing up a proposal, and its members are surely watching these threads. When that proposal is ready, we could write up +1 as an amendment or as a standalone proposal. We also need to decide about EPH+ and deal with that. Anything else?

281 thoughts on “They’d Rather Free Ride: Hugos and Game Theory (Proposal Discussion Thread 4)”

  1. @Tasha Turner: I’m concerned about those tactics. Doxxing is not the only online harassment which hurts people.

    3SV’s crowdsourced moderation should not be underestimated. With the electorate responsible for the vote, the usual suspects are stuck sniffing each other’s rear ends and barking at imaginary Marxists. It’s magical.

  2. @Jameson Quinn:

    How about: the ballot consists of one “REJECT” checkbox per work. A work is rejected if it gets “REJECT” from over 20% of eligible voters AND from over 50% of those who submitted a vote.

    I’m against the 50% part because of vote decay across categories. This year’s nominations decayed from 3695 ballots (novel) to 1073 (fan artist) – that’s a 70% decay. If the same voting pattern occurred in round 2, rejections would not be possible in 7 categories.

    And about the “20% of eligible voters” part – I remember seeing somewhere that the number of eligible nominators this year was actually 2x the total ballot count. In that case the fan artist category might come perilously close to the 20% number.
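For reference, the two-part rule quoted above can be written as a small check. This is only a sketch under stated assumptions: "those who submitted a vote" is read here as all ballots cast in the semi-final round (not per category), and the function and parameter names are invented for illustration.

```python
def is_rejected(reject_votes, ballots_cast, eligible_voters,
                eligible_share=0.20, ballot_share=0.50):
    """Apply the quoted rule: REJECT from over 20% of eligible voters
    AND from over 50% of those who submitted a vote."""
    over_eligible = reject_votes > eligible_share * eligible_voters
    over_ballots = reject_votes > ballot_share * ballots_cast
    return over_eligible and over_ballots


# Hypothetical numbers: 5000 eligible voters, 2000 ballots cast.
print(is_rejected(1100, 2000, 5000))   # clears both thresholds
print(is_rejected(900, 2000, 5000))    # clears neither
print(is_rejected(1100, 3000, 5000))   # clears 20% of eligible, fails 50% of ballots
```

The third case is the one the "decay" objection is about: a category can clear the eligible-voter bar and still fail the majority-of-ballots bar if few of the submitted ballots touch that category.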

  3. Bartimaeus on May 24, 2016 at 5:41 pm said:

    And about the “20% of eligible voters” part – I remember seeing somewhere that the number of eligible nominators this year was actually 2x the total ballot count.

    I’m not sure what you mean. There were about 4000 nominating ballots cast in this year’s Hugo Awards nominations. I don’t have a firm figure, but depending on how much membership overlap there is between the 2015-2017 Worldcons, I estimate that there were between 16,000-20,000 people eligible to nominate this year. Voter turnout for nominations was therefore between 20-25%, which is good, historically speaking.

    The eligible electorate for the proposed 3SV semi-final is the voting membership of the current Worldcon only, not the union of the previous/current/following Worldcon, so it’s a smaller group of people.

    In that case the fan artist category might come perilously close to the 20% number.

    Using the actual numbers from the 2015 Worldcon, 20% of the eligible voters would be about 1,100 people. It would take 1,100 REJECT votes in any category, no matter how many nominations were cast. 20% of eligible voters is a very high bar actually. Go look at the detailed voting statistics from the 2015 Hugo Awards, showing how many people voted on each category, and bear in mind that even that huge voter turnout was only about 50% of the eligible voters. (And 50% is an incredibly good turnout, historically speaking.)

  4. @Jameson: Actually, that does bring up one aspect of your consolidated proposal that I’m uncomfortable with, and that is not publishing the rejection counts associated with specific works. I think that sunlight should be the default here, and hiding that information (even if partially) should have a stronger justification than “it might hurt some people’s feelings to know how many negative votes they or their works got.” If that information was publicly available, I would tend to trust the judgement of the crowd about whether negative voting was being misused in the future, and if so, whether the threshold should be adjusted at that time.

    I very much appreciate the work you and Bruce Schneier did in analyzing the effects of EPH and EPH+ on the recent ballots, and I appreciate the reasons for the NDA you had to agree to. But I wish there had been a way to respect the privacy concerns of individual voters while still making anonymized data more widely available for analysis by lots of people. I don’t like the idea of having to rely on a small select group of people who are the only ones who get to really know how well the rules we propose are working, and that despite the fact that I do trust the specific people who have been given access to that data this time. I expect this feeling is even stronger for those non-Griefers who might oppose EPH in good faith.

    So is there a good reason for keeping the rejection totals for specific works non-public? I think “protecting individual voters from retaliation” was a good reason for the administrators to be concerned about the possibility of information leakage from inadequately anonymized nominating ballots, but I don’t think that applies to rejection totals. We certainly don’t sugar-coat how non-winning finalists do against No Award today, win or lose. So I’m not sure what value there is in trying to protect authors from knowing how many people voted against them. Is there another reason for trying to keep rejection totals semi-secret that I’m missing?

  5. @Kevin Standlee
    Ah groups and places I’m not a part of. Simple is too easy.

    @Mokoto
    There are different proposals being discussed. I should have been clearer. We have too many different proposals running around.

    It’s the MSMC proposal where I’m concerned the Hugo Admins might come under harassment fire directly by griefers who are already targeting authors and editors they dislike.

  6. On rejection totals:

    First, let me explain the proposal I’d made more clearly, so I know we’re talking about the same thing. In my proposal, the data published would look something like the following:

    —-
    Category: Awesomest Fictional Dinosaur

    Longlist, in reverse EPH elimination order

    Work, nominations, points when eliminated, rank, eligibility
    Grimlock, 100, 100, 1, eligible
    Indominus Rex, 80, 70.5, 2, withdrawn
    Baby Bop, 60, 60, 3, rejected
    …..
    Barney, 55, 11, 15, rejected

    Number of elimination votes, from most to least: 1300, 1088, 1066, 1060, 430, 91, 5, 5, 5, 4, 4, 3, 3, 3, 1
    ——
    In the above case, there is clearly a 5-work slate of which Barney is the weakest member and Baby Bop is the strongest. We can surmise that Barney is probably the one that got 1300 rejection votes, and Baby Bop therefore probably got between 1060 and 1088. That’s really all we need to know in order to fine tune things in the future. It’s unclear which dino got 91 rejection votes but knowing that wouldn’t help anything useful.

    In terms of “attacks” that would be enabled by knowing which work got which total: I don’t think there really are any, except that publicizing this data would help griefers keep score / count coup.
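The anonymization described in this comment amounts to publishing the sorted totals with the titles dropped. A minimal sketch, with an invented function name and made-up data:

```python
def anonymized_totals(reject_votes_by_title):
    """Return rejection totals from most to least, with the titles removed."""
    return sorted(reject_votes_by_title.values(), reverse=True)


# Hypothetical per-work totals for one category (titles are placeholders).
totals = {"Work A": 91, "Work B": 1300, "Work C": 5, "Work D": 1066}
print(anonymized_totals(totals))
```

Because the published list is sorted by size rather than by longlist position, a reader can still guess at extreme cases, as the comment notes, but cannot read off the total for any particular work.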

  7. @Kevin Standlee: Oh I see, I was confusing eligible nominators with eligible voters. (That 2x figure was the ratio of eligible voters to voter turnout last year.) Then 20% poses no issue.

    (If anyone wants the numbers: last year the category with the lowest ballots was fan artist, with 3476 voters and 576 nominators. I’d expect the semi-final count to be in-between those numbers, but closer to the voter count, due to increased interest. The 20% cutoff for REJECT votes would have been ~1100 as Kevin Standlee noted.)

  8. @Jameson Quinn:

    In your proposal there are only REJECT checkboxes. So requiring a cutoff of “50% of those who submitted a vote” would pose an issue if any category had less than 50% participation. I’m calling this “decay”.

    Last year the fan artist category had 3,384 votes out of a total 5,950 votes; and 296 nominations out of a total of 2122 nominations. That’s 57% voting decay and 86% nomination decay. I’d again expect the semi-final decay to be in-between, but closer to the voting decay – so a 50% semi-final cutoff could work. But it’s a bit close for my liking.

  9. Errata: That should read 3476 voters and 296 nominators, in both my comments. The decays should be 42% (voting) and 86% (nominations). The conclusions hold.

    (I clearly need more coffee. I lost a version of my comment and hastily retyped it.)
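The "decay" figures in the erratum above can be reproduced with a one-line formula, taking decay to mean one minus the category's share of total ballots. The function name is invented; the inputs are the 2015 fan artist numbers as corrected in the erratum.

```python
def decay(category_ballots, total_ballots):
    """Fraction of ballots that did not include this category."""
    return 1 - category_ballots / total_ballots


voting_decay = decay(3476, 5950)       # final-ballot votes, fan artist vs. total
nomination_decay = decay(296, 2122)    # nominations, fan artist vs. total
print(f"voting decay: {voting_decay:.0%}, nomination decay: {nomination_decay:.0%}")
```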

  10. I don’t see a problem with having a toggle labeled “Accept/Reject”. Start it with all on the longlist toggled to “Accept”. Then there’s no ambiguity.

  11. I would have qualms about an Accept/Reject toggle: I would want an “Abstain” option for works where I genuinely have no opinion. (Say, a slated work that got good reviews, but which I haven’t personally read. I could be uncomfortable voting either explicitly to accept or explicitly to reject such a work.)

    I do want an explicit Accept option, though. Yes, it’s unlikely to ever matter, but on a purely theoretical basis I find the voting structure more appealing, and if it does ever matter I think I’ll be glad we have the rule in place.

  12. @Stoic Cynic
    At that point though the griefers have crossed into criminal activity and possibly criminal conspiracy.

    If Gamegate is any guide, they are crimes that are being committed with near impunity.

    If we’re taking the possibility of illegal actions into consideration we might as well shut WorldCon down. . . . Do we call the whole thing off?

    It’s not a binary, either-or situation. The enemy is tactically sophisticated, so it is prudent to consider likely attack vectors when examining possible rule changes, and work to minimize them.

    @ Joshua K.

    Yes, the Oscars and the Hugos are different in many ways. But the differences you are pointing out (for the most part) aren’t really germane to the question “are there advantages to having different nomination procedures for the different Hugo awards?” Short Story seems more vulnerable to gaming than Best Dramatic, for example. I don’t know the answer, but the question seems worth asking.

    Requiring works to be submitted for Hugo eligibility could be problematic in at least two ways: (a) the creators of some worthy works, especially in the Dramatic Presentation categories, might not care enough about the Hugos to submit them in advance of the nominations period (even if they might take an interest in the Hugos later upon being nominated);

    True. When the Dramatic awards were first presented, SF/F films were still mostly B-movies and were not mainstream. A Hugo may have tended to grant them legitimacy. Now, nominees for the award (long form, especially) are some of the highest-profile works that Hollywood has to offer. If Paramount doesn’t deign to submit the newest Star Trek movie for consideration, and the award ends up going to something lower profile (think of the recent Predestination, or 2004’s Primer), that would be different than what happens currently, but I can’t see that it is worse. The award would elevate the work, and possibly vice-versa.

    But maybe Paramount does submit Star Trek 3: Electric Boogaloo. Then their publicity machine gets behind the Hugo – a win-win.

    and (b) some fans might consider it egotistical to actively submit oneself for Hugo consideration in categories such as Fanzine, Fan Writer, and Fan Artist.

    Then they are at a disadvantage. Just like now – all else being equal, self-promoters are better known, and more likely to win.
    But it may be appropriate for 3rd parties to submit in these categories.

  13. Jameson,

    Sigh.

    When you have something to say besides “EPH bad, sad puppies misunderstood“, I’ll be here reading it and responding if appropriate.

    How about you try to think about what you said, and say it again in a form that means something specific (if in fact there is any specific potential problem you’re thinking of)?

    I’ve described multiple specific problems – start with my summary at the top of the thread – which you have ignored or less frequently countered with a personal attack. It seems you have aspirations to do better. I look forward to it.

    But if I expressed the point you just misunderstood poorly, I’ll try again. Ways to “attack” the Hugos have been in plain sight for decades. Anyone can buy themselves a Hugo – not literally in dollars, now, but in social currency on the internet. The reason they usually don’t is that anybody foolish enough to actually care about the outcome of the Hugo Awards wants it to mean something.

    Nobody’s obligated to vote according to Vox’s sample ballot any more than they’re obligated to vote for grrm’s picks, or Abigail Nussbaum’s. Those who did so don’t want to give all the Hugos to Castalia House until the end of time. Look again at Vox’s spread of recommended publishers. It is not their fondest dream that their authors will win your beloved award. That was the Sad Puppies who swept by accident, immediately altered their approach, and you forgot to thank them. They’re doing it to make a point.

    If you change the rules so that a slate of five things doesn’t accomplish much, they are going to relish the challenge. There are at a conservative estimate 700 voters allied (at least temporarily?) with Vox’s pirate crew who would love to puncture your sails right now. If you keep up this talk of blacklisting and banning, that number will go through the roof.

    And they’re not stupid. Even if they were, you’ve created a helpful diagram showing why, under your proposal, they ought to collaborate and generate a set of unique ballots designed to accomplish their goals instead of publishing one or two or three lists of five things on a blog and asking voters to follow one.

    Most importantly, if they do decide to game the system in earnest – which is what you are asking for, in my opinion – the victory condition would still be getting works that they like, which also have support from other fans, on the final ballot, and then playing kingmaker, so that the whole ship starts to tack in the direction they want to see it go in.

    If some of those disgruntled voters want to get a handful of highly provocative works on to the long list (under your system, those wouldn’t even need to crack the top 10), and goad you into banning them for it, that would be icing on the cake.

  14. ” Because shouting was all he could do by that point – and he switches to the Hugos instead.
    Shut down the capacity to do mischief and the griefers move on (aside from sporadic complaints).”

    As long as he can put things on the finals list or even create outrage, it is enough mischief. Beale is a stalker. See how long he has stalked Scalzi. There is no reason to believe he will stop harassing the Hugos unless his works can be dealt with quickly and efficiently at an early stage.

  15. Hampus Eckerman on May 24, 2016 at 11:16 pm said:
    As long as he can put things on the finals list or even create outrage, it is enough mischief.

    I suspect so. The threshold will be whether it makes lots of people talk about what he has done (or speculate on whether he was the one responsible). If so, then he (or an equivalent troll) will do it, assuming he has the resources to do so.

  16. You could also have this:

    1) A checkbox that is by default set to Accept.
    2) If you click once you remove the check. (i.e not accepted).
    3) If you click again you grey it (i.e abstain).
    4) If you click again you turn it to Accept again.

  17. I read the rules to the Oscars a few days ago. They are not applicable to the Hugos; they are a totally different ballgame. Unless we want very different rules for different categories, there is nothing of interest to be found there.

  18. I see that Brian Z still is totally unable to find any specific faults, it is only argle bargle.

  19. Good grief Hampus.

    I don’t need to mathematically prove for you that 3SV and EPH helps a minority of voters to shape the overall composition of the final ballot in order to play kingmaker and while also goading you into banning or worse by placing a couple provocative things in slots 10-15. Jameson drew you a map already.

    And that was assuming they were stupid enough to try to do it as inefficiently as humanly possible, by running two to three competing slates. And they’re smart.

  20. Gosh, BrianZ, it’s almost as if people are treating you like someone who’s spent months on end posting bullshit like EPH sux cuz it guarantees at least one slated work on a ballot! in an extended campaign of pro-Puppy trolling. It’s almost as if people think there’s no friggin’ point to taking anything you say at face value, or, indeed, as anything but Yet More Pro-Puppy Trolling. How sad for you.

  21. Brian Z:

    “I don’t need to mathematically prove for you that 3SV and EPH helps a minority of voters to shape the overall composition of the final ballot in order to play kingmaker and while also goading you into banning or worse by placing a couple provocative things in slots 10-15. “

    I’ve never heard that imaginary creations, fantasies and pure lies could be mathematically proven. I would be thrilled to see you try.

  22. Shorter Brian Z:

    “I still want Hugo voters to negotiate with terrorists.”

  23. @Hampus Eckerman:

    1) A checkbox that is by default set to Accept.
    2) If you click once you remove the check. (i.e not accepted).
    3) If you click again you grey it (i.e abstain).
    4) If you click again you turn it to Accept again.

    As someone who builds web apps and works with user interfaces all day – no, please no. No.

    It’s hard enough trying to explain to people that they don’t need to double click links to open a web page (many generally think everything needs double clicking). Add in the visually impaired and these sorts of interfaces are just bad news.

    (Also, I don’t think the implementation details of toggles vs checkboxes vs buttons, etc. really belongs in a rule proposal anyway. Only what the options are.)

  24. (Also, I don’t think the implementation details of toggles vs checkboxes vs buttons, etc. really belongs in a rule proposal anyway. Only what the options are.)

    I strongly agree. (I’d say “+1” if we hadn’t coopted that term to mean something else in this thread.)

  25. Steven desJardins on May 24, 2016 at 8:32 pm said:

    I would have qualms about an Accept/Reject toggle: I would want an “Abstain” option for works where I genuinely have no opinion….I do want an explicit Accept option, though. Yes, it’s unlikely to ever matter, but on a purely theoretical basis I find the voting structure more appealing, and if it does ever matter I think I’ll be glad we have the rule in place.

    Even though, with the threshold actually being 20% of the eligible members (not 20% of those people voting) choosing REJECT, votes for ACCEPT and ABSTAIN are functionally the same choice? Or if you could abstain by simply not picking anything at all?

    This is what I mean about people feeling more comfortable with certain labels even though the things attached to those labels are exactly the same thing.

    Ken Marable on May 25, 2016 at 6:33 am said:

    Also, I don’t think the implementation details of toggles vs checkboxes vs buttons, etc. really belongs in a rule proposal anyway. Only what the options are.

    Good grief, yes. Leave the implementation details to the Worldcons. Go look at the existing rules and notice how few of the things that are actually involved in the ballots are directly specified in the Constitution.

  26. @Tasha Turner: It’s the MSMC proposal where I’m concerned the Hugo Admins might come under harassment fire directly by griefers who are already targeting authors and editors they dislike.

    Yes, and also yes!

    Currently, puppies lash out at whatever looks responsible for their unhappiness. It could be a shoe, a paper bag, Marxists– All of the things that make puppies unhappy. If they weren’t sick, it would be silly.

    Once they have a name, they’ll focus everything on that name. There are many examples of how that turns out.

  27. Ways to “attack” the Hugos have been in plain sight for decades.

    Well, okee-dokee then.

    Since Brian is reduced to FUD about how EPH might not fix every problem that already exists in the current system (but watch out because it would Wake the Dragon!), this seems like a good Speaking of the Hugos, I think it’s worthwhile to explain the logic behind EPH+, rather than just saying “Sainte-Laguë and not D’Hondt”.

    Consider just the ballots in round 1. For simplicity’s sake, let’s say there are 3 easily-distinguished kinds of candidates — slate candidates, viable organic candidates,¹ and nonviable organic candidates — and that ballots can be divided into 3 corresponding piles. So, essentially all slate ballots nominate essentially all slate candidates and no viable organic candidates; viable organic ballots nominate one or more viable organic candidates along with perhaps some nonviable organic candidates; and nonviable organic ballots nominate exclusively nonviable organic candidates.

    There’s nothing the vote-counting system can do with the nonviable ballots. They might as well not exist. If the overall breakdown is 30:30:40 between slate/viable/nonviable, we might as well just have 50:50 slate/viable. In that case, we could say that the “effective proportion” of slate ballots is 50%.

    Both EPH and EPH+ are proportional voting systems. That means that the goal is that the breakdown between slate and viable finalists should proportionally approximately reflect the breakdown between those kinds of ballots. Of course, even in an ideal world, it can only be approximate. For instance, with 5 finalists, you can never have a 50:50 proportion; the closest you can get is 60:40 or 40:60.
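    The arithmetic in the last two paragraphs can be sketched in a few lines. This is only an illustration: the function names `effective_proportion` and `closest_splits` are made up for this example, and the only inputs are the 30:30:40 breakdown and the 5 finalist slots mentioned above.

```python
from fractions import Fraction

def effective_proportion(slate, viable, nonviable):
    """Share of slate ballots once nonviable ballots are set aside."""
    # `nonviable` is accepted only to make the point that it is ignored.
    return Fraction(slate, slate + viable)

def closest_splits(share, finalists=5):
    """Slate finalist counts whose share of the ballot is nearest to `share`."""
    gaps = {k: abs(Fraction(k, finalists) - share) for k in range(finalists + 1)}
    best = min(gaps.values())
    return [k for k, gap in gaps.items() if gap == best]

print(effective_proportion(30, 30, 40))  # 1/2
print(closest_splits(Fraction(1, 2)))    # [2, 3], i.e. 40:60 or 60:40
```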

    The basic purpose of this post is to explain the differences between EPH and EPH+. But in order to understand what follows, we need to understand what’s similar about them too. So I’ll digress about those similarities for the next few paragraphs.

    As succinctly as possible, how do EPH and EPH+ work? Each ballot has at most 1 point worth of voting power, which is divided up between the non-eliminated candidates it supports, depending on how many of them remain. At each step, you look at the two candidates with the fewest points, and eliminate whichever of them was nominated by fewer voters (even if it has more points). Then points are redistributed for the next step.
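    A rough sketch of that elimination loop, using the EPH split of one point per ballot. It is a simplification under stated assumptions, not the actual rule text: the name `eph_finalists` and the toy ballots are hypothetical, and ties are broken alphabetically purely to keep the example deterministic.

```python
from fractions import Fraction

def eph_finalists(ballots, seats=5):
    """Simplified EPH-style elimination; ties broken by name (an assumption)."""
    remaining = {c for ballot in ballots for c in ballot}
    while len(remaining) > seats:
        points = {c: Fraction(0) for c in remaining}
        nominations = {c: 0 for c in remaining}
        for ballot in ballots:
            live = [c for c in ballot if c in remaining]
            for c in live:
                # Each ballot's single point is split evenly among survivors.
                points[c] += Fraction(1, len(live))
                nominations[c] += 1
        # The two candidates with the fewest points face off...
        bottom_two = sorted(remaining, key=lambda c: (points[c], c))[:2]
        # ...and the one named on fewer ballots is eliminated.
        loser = min(bottom_two, key=lambda c: (nominations[c], points[c], c))
        remaining.remove(loser)
    return sorted(remaining)

# Toy data: 3 slate ballots, 2 ballots for A, 1 ballot for B, 2 finalist slots.
ballots = [("S1", "S2")] * 3 + [("A",)] * 2 + [("B",)]
print(eph_finalists(ballots, seats=2))  # ['A', 'S2']
```

    In this toy case the slate casts half the ballots and ends up with one of the two slots.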

    Why does EPH look at total nominations, and not just points, when eliminating a candidate? After all, between two candidates with similar number of points, one organic and one from a slate, the slate one will have more nominations. Isn’t taking nominations into account helping slates, exactly the thing we want to avoid?

    Here we come to the basic problem in voting system design: strategic voting. It’s easy to design voting rules which, for a given fixed set of ballots, give the kind of result you want. But if those rules are too ham-handed, they will cause voters to react strategically, and all too often the result is exactly the opposite of what you wanted.

    (I’ve been designing and discussing voting systems, on such venues as the electorama voting list, for just about 20 years now, and so I’m pretty familiar with the community and history of voting theory, both academic and amateur. Certainly, I’m aware that I’m not the first to recognize the principle I stated in the previous paragraph. However, I believe I am one of the first to state it in such clear and general terms. And it deserves a name. Thus, in the spirit of Stigler’s law, I hereby pretentiously dub it “the Quinn paradox of voting system design”.)

    In this case, the Quinn paradox applies to SDV,² the system which divides and redistributes points like EPH but simply eliminates bottom-up on points without looking at total nominations. SDV would appear to favor non-slate works. But if voters react to SDV with the (quite rational) strategy of bullet voting (nominating only one work), there will be more non-viable ballots, so the effective proportion of slate ballots will go up. A rule that is intended to benefit non-slate candidates ends up, due to strategic voting, benefitting slate ballots.

    Avoiding Quinn’s paradox is tricky. In my experience, the best way to do it is to be balanced and subtle. For every aspect of the rules that favors one side, there should be some aspect which favors the other side. The combination should lean in the direction you want, but still offer productive outlets to opposing tendencies. That’s the idea of looking at total nominations in EPH; by giving each ballot the full benefit of the doubt when choosing which of the two lowest candidates to eliminate, the system minimizes the incentive for bullet voting, and thus, as much as possible, helps minimize the problem of nonviable (long tail) ballots.

    So, digression over; let’s return to discussing the difference between EPH and EPH+. To talk about this, it’s useful to have a quick and clear way to express a voting system scenario. Let’s simplify by assuming there is no overlap in support between different organic candidates; this is unlikely to be strictly true, but it is close enough for our purposes. In that case, we can express the amount of support each group has using a simple series of numbers; the slate voters first, then the viable organic candidates in descending order of support, and finally an “x” to represent the fact that there are still nonviable organic candidates after that but for our purposes here it doesn’t matter how many there are. So, “25;10:8:7:x” would represent a situation where slate voters were half of the total, and the organic votes were divided between three viable candidates.

    Note that I’ve “normalized” by scaling the numbers so that they add up to 50, or 10 for each finalist slot. So it’s easy to see what the “fair” (that is, proportional) share of each group is here: two and a half finalist slots for the slate voters, one for the biggest organic candidate, and a little less than one for each of the next two organic candidates. Essentially, I’m assuming that there are always 50 meaningful (viable) ballots.

    What if one of the organic candidates has more than 10 in this system? In that case, the excess ballots above those needed to get “10” are just as irrelevant as the nonviable ballots. So to keep these extras from changing the numbers for other groups, we’ll express organic numbers above 10 as “1X”. Thus, “21;1X:10:9:x” is a situation where the slate has just over 2 proportional slots, and the organic candidates have, respectively, more than one; one; and just less than one. So “1X” means essentially: “10 meaningful ballots, plus X more which we’re ignoring”.
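    For anyone who wants to play with this notation, here is a small sketch that turns a scenario string into "fair" slot counts. The helper names `parse_scenario` and `fair_slots` are invented for this example; the only rules it encodes are the ones stated above (10 meaningful ballots per slot, "1X" counts as 10, and the trailing "x" is ignored).

```python
def parse_scenario(text):
    """Split 'slate;org1:org2:...:x' into (slate, [organic counts])."""
    slate_part, organic_part = text.split(";")
    organic = []
    for token in organic_part.split(":"):
        if token == "x":
            continue  # nonviable tail: ignored
        # "1X" means 10 meaningful ballots plus extras we are ignoring.
        organic.append(10 if token == "1X" else int(token))
    return int(slate_part), organic

def fair_slots(text, ballots_per_slot=10):
    """Proportional share of finalist slots for the slate and each organic candidate."""
    slate, organic = parse_scenario(text)
    return slate / ballots_per_slot, [n / ballots_per_slot for n in organic]

print(fair_slots("25;10:8:7:x"))   # (2.5, [1.0, 0.8, 0.7])
print(fair_slots("21;1X:10:9:x"))  # (2.1, [1.0, 1.0, 0.9])
```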

    Under EPH, each ballot always gives out one full point, between all the surviving candidates it supports. If there is one such candidate, it gets 1 point; if there are 2, each gets 1/2; for 3, each gets 1/3; and so on. This series is known in voting theory as “D’Hondt divisors”. Thus, in the “21;1X:10:9:x” scenario above, the slate candidates collectively would have 21 points at any step of the eliminations.

    As I explained in the digression above, when EPH pays attention to the total nominations, the direct effect is to favor the slate candidates. Essentially, this works to “round up” the slate’s proportional share; so 21 points would be enough to garner 3 finalists. But that’s not really what we want to have happen, is it? Instead of “rounding up” the slate candidates, we’d like to “round up” the non-slate ones.

    What would that imply? Say we had a scenario like 26;6:6:6:6:x. If we round each of the 6s to 10, that would leave just 10 for the slate.

    That’s where the Sainte-Laguë divisors for EPH+ come from. I’m skating over some tricky math here, but the key idea is that, as in the example just above, “rounding up” a small group can increase its voting power by as much as a factor of two; and so in order to achieve that “rounding up”, you should use divisors which are smaller than the D’Hondt ones by as much as a factor of two (in the limit as the number of slate candidates and the total number of finalists both go to infinity at the same proportional rate).
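    To make the "factor of two" concrete, here is a small comparison of per-candidate points on a single ballot. It assumes that the divisors meant here are the standard odd-number sequence, so a ballot with k surviving candidates gives each of them 1/(2k-1) points under EPH+; that reading, and the function names, are assumptions of this sketch rather than something spelled out above.

```python
from fractions import Fraction

def eph_weight(k):
    """Points each of k surviving candidates gets from one ballot under EPH."""
    return Fraction(1, k)

def eph_plus_weight(k):
    """Same under EPH+, assuming the odd-number sequence 1, 1/3, 1/5, ..."""
    return Fraction(1, 2 * k - 1)

for k in range(1, 6):
    ratio = eph_weight(k) / eph_plus_weight(k)  # equals (2k - 1) / k
    print(k, eph_weight(k), eph_plus_weight(k), ratio)
```

    The ratio in the last column starts at 1 and climbs toward 2 as k grows, without ever reaching it.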

    Skating over yet more tricky math, the moral of the story is: EPH guarantees that an ideal slate (perfect overlap within, zero overlap outwards) will get at least its proportional share rounded up and thus that organic candidates will get at least their proportional shares rounded down, while EPH+ guarantees that the set of viable organic candidates will get at least its proportional share rounded up and thus that the slate will get at least half its proportional share rounded up. These are the strongest proportionality guarantees possible; any system outside these limits would no longer be proportional.³

    So, what about Quinn’s paradox? Won’t EPH+ provoke exactly the problems it’s trying to fix? The answer is, probably not. To see why, we have to look at two possibilities.

    First, what if organic voters respond to EPH+ by bullet voting, whereas in EPH they would not have done so? This would, indeed, be a problem; it would tend to increase the number of nonviable organic ballots, and thus increase the power of the slate. But I don’t think this is likely to be widespread; though there are circumstances where bullet voting is a rational strategy under EPH+, they are rare and contrived enough that I think that most voters will simply nominate whatever they think is worthy.

    Second, what if slate voters respond to EPH+ by strategically dividing into sub-groups of bullet voters? For instance, what if the puppetmaster of the slate says: “If you were born on the 1st-10th of the month, vote for X; if it’s 11-20, vote for Y; and 21-31, vote for Z.” This would basically make EPH+ work similarly to EPH. But it requires tricky coordination, it can backfire badly if the puppetmaster overestimates the total strength of the slate, and it does remove the “rounding up” power of having artificially inflated total nominations for slate candidates. And even if everything works, it only gets EPH+ back to EPH, which isn’t the worst result in the world. So basically, I’m not worried about this.

    OK, that’s a lot of words, but I hope this helped clear up what was going through my mind as I designed EPH and EPH+, and made them more transparent and less “black boxes”. I’d be happy to answer any questions that this raises (though I reserve the right to ignore FUD.)

    ¹ I realize that when I begin by labeling certain candidates “viable” and “nonviable”, it may seem as if any logic that follows will only be circular. I hope that by the end of this, it’s clear why that’s not the case.

    ² In the EPH+ thread by Bruce Schneier here, there was one commenter who asserted that we should say “STV” instead of “SDV”. This is not true; STV, or single transferable vote, is a well-known system, but not what we’re talking about here. “SDV” stands for Single Divisible Vote; a far more obscure system than STV, but one that, unlike EPH, had been proposed before the puppies existed.

    ³ OK, fine; that statement contains a very tiny lie. But it would take about half a page of heavy math notation to even explain what that lie is, and even then, with a further such dense half page of math, I could rigorously prove why counterexamples to that statement will never make good voting systems. In other words: trust me, most people reading here don’t want me to go there. For those few who do, here’s a hint: you could get a very slightly stronger guarantee than Sainte-Laguë, and still remain proportional for the scenarios we’re talking about, if you used a sequence like 1, 1/4, 1/6, 1/8, 1/10; but those proportional guarantees would break down in the case where there are three or more competing slates of which two or more but less than all contain more than one candidate.

    (note: I mistakenly posted this comment under a typo name, and it got caught in moderation. Mike, please remove the duplicate; sorry.)

  28. Aak. Crap. The version that is not in moderation is a little bit worse than the version that is in moderation. And my edit button never showed up so I can’t fix it (typical!). So…. Mike, I know this is a headache, but can you put the version that I posted under the name “e” under my name, and delete the other version?

    (If you think that long explanation deserves a new front-page post, I can write it up as one, and I’d have a few extra things to say in such a post. Email me if you want me to do that.)

    ETA: Mike apparently already deleted the “e” version. If it’s gone, no big loss; the only differences were fixing the incomplete sentence in paragraph 2, and bolding about 4 key ideas in the rest of the text.)

  29. @Kevin Standlee

    Even though, with the threshold actually being 20% of the eligible members (not 20% of those people voting) choosing REJECT, votes for ACCEPT and ABSTAIN are functionally the same choice? Or if you could abstain by simply not picking anything at all?

    This is what I mean about people feeling more comfortable with certain labels even though the things attached to those labels are exactly the same thing.

    To be fair, if the ballot does contain all three options, it makes it possible for some future business meeting to make rule changes that would take advantage of them, based on actual data about how people used them in practice.

  30. @Jameson Quinn

    If it’s gone, no big loss; the only differences were fixing the incomplete sentence in paragraph 2, and bolding about 4 key ideas in the rest of the text.)

    I had no trouble following it, but I would be interested to see the proof that EPH really tends to round the slate up whereas EPH+ rounds it down. Perhaps you could share a link to something. Or just e-mail me a PDF, if I’m the only one who wants to see it. 🙂

  31. I don’t have that proof LaTeX’d up. But you’ve stated the thing to be proved a bit wrongly. It’s not that EPH+ rounds the slate down, it’s that (roughly speaking) it rounds the organic stuff up, so that the slate can end up with as little as (half, rounded up) of its share.

  32. @Brian Z
    I usually try to give you the benefit of the doubt, but I have to say that I’ve been unable to follow your arguments. I understand that you think that changes (even EPH) will give the slates more power, not less, but I can’t figure out why you think so.

    It might help if you offered links to some of the things you assert. E.g. you say that a “conservative estimate” of the number of slate voters is ~700, but I think 300 is a conservative estimate and 700 is laughable. If you provided a link, I might find a math error, or I might discover that we’re talking about the difference between hard-core slate voter vs. would-ever-vote-for-any-slate-candidate. Lacking that, it’s easier to assume that your information is incorrect and all the assumptions that go with it.

  33. Alright, edit button. You caused me to post duplicates, and I put up with it. You robbed me of my carefully-bolded main points, and I squealed only a little. But leave me unable to cull a stray apostrophe from “its”? This. is. war.

  34. @Jameson Quinn
    I don’t have that proof LaTeX’d up. But you’ve stated the thing to be proved a bit wrongly. It’s not that EPH+ rounds the slate down, it’s that (roughly speaking) it rounds the organic stuff up, so that the slate can end up with as little as (half, rounded up) of it’s share.

    Even a formal statement of the problem would help. I’m comfortable with set theory, abstract algebra, real analysis, and probability (including measure-theory). I don’t know the usual conventions of voting analysis, but I’d be happy to read a summary paper if that would help.

  35. Jameson Quinn: I have retrieved your extra apostrophe and return it to you for future use — ‘

  36. Even a formal statement of the problem would help. I’m comfortable with set theory, abstract algebra, real analysis, and probability (including measure-theory). I don’t know the usual conventions of voting analysis, but I’d be happy to read a summary paper if that would help.

    Sorry, for right now I have to retreat to nontraditional methods of proof. I hope I’ll be able to come back to this later, but for now, just imagine me waving my arms harder. (The proof I’m imagining is just some elementary set theory and algebra. The only non-trivial part is setting up productive notation so that it doesn’t take 3 lines to say the simplest thing. That’s something that’s manageable, only several hours of work, but it’s the kind of thing where just stating the problem cleanly takes 90% of the time.)

  37. Greg, thanks. I’ll respond in detail later. I didn’t say they give “slates” more power.

    Think about the fact that Jameson’s simulations assume the items on the “puppy slate” have no support from other fans. Does that jibe with what Vox did this year? Does it jibe with what you think he’ll do next year? How does that affect the simulations?

    The estimate 700 is the 810 who no awarded Graphic Novel minus the 80 or 90 who tended to do that anyway in recent years.

  38. I could run sims where the slate favors the even-numbered candidates according to the non-slate preferences, and see how much power that gives them over both longlist and shortlist. Posting graphs like that would be fair game under the NDA.

    I probably won’t get to that before tomorrow.

  39. Mokoto on May 25, 2016 at 7:39 am said:

    @Tasha Turner: It’s the MSMC proposal where I’m concerned the Hugo Admins might come under harassment fire directly by griefers who are already targeting authors and editors they dislike.

    Yes, and also yes!

    Currently, puppies lash out at whatever looks responsible for their unhappiness. It could be a shoe, a paper bag, Marxists– All of the things that make puppies unhappy. If they weren’t sick, it would be silly.

    Once they have a name, they’ll focus everything on that name. There are many examples of how that turns out.

    A panel would be a target potentially for griefers. However, admins in general already are a target – just not one that the griefers have exploited fully. I’m not keen on laying out nasty ways of disrupting the Hugos but suffice to say making being an admin a really shitty job is a way to do it for a group behaving sociopathically.

    Doesn’t a panel have the same issue? Possibly and a potential issue with any panel is nobody wanting to be on such a thing. However, it has some advantages:
    1. it could be the only task that people on the panel are doing (i.e. so griefers targeting it are disrupting the work of the panel but not disrupting the overall work)
    2. not everybody is equally vulnerable – though that possibly may produce a bias in the panel (and an unfortunate one, as people less vulnerable to the sort of attacks we are discussing may be more inclined to see attacks on others as relatively harmless)
    3. But the main thing is that panels can shut down griefing quicker than passing things on to members. Anything with a second stage means the griefers can draw out the hate for longer or leave somebody who is the target of the hate via some nominated work in the unenviable position of having fandom vote on whether the work attacking them attacks them enough to be thrown out.

  40. Personal nostalgia: here’s a link that proves I was already thinking in terms of the Quinn paradox as early as 2002. (As far as I remember, I thought of it in 2000, but I can’t prove that.)

    Context: without strategy, IRV is more favorable to extremists than Condorcet. But with voters strategizing to avoid spoiled center-squeeze elections, IRV can get stuck in the center so badly that it stays there even after the true center has moved.

  41. ACCEPT is only functionally equivalent to ABSTAIN if turnout is below 40%. It seems highly probable that turnout will remain below 40% for the foreseeable future, but I’m uncomfortable writing rules that assume “eh, things probably won’t change that much, good enough”. Might as well write a ruleset that works in the more general scenario.

  42. Brian Z on May 25, 2016 at 10:52 am said:

    Greg, thanks. I’ll respond in detail later. I didn’t say they give “slates” more power.

    Think about the fact that Jameson’s simulations assume the items on the “puppy slate” have no support from other fans. Does that jibe with what Vox did this year? Does it jibe with what you think he’ll do next year? How does that affect the simulations?

    That’s only because he was discussing simplified cases. That VD is already trying to make his votes more organic-like is a good sign of a robust reform.

    What I mean by that is that a good reform should result in people who are trying to circumvent the reform behaving MORE like people who aren’t. So if slaters make their slates less slate-like, then that is a plus even if the reform didn’t stop them completely.

  43. @Brian Z
    The estimate 700 is the 810 who no awarded Graphic Novel minus the 80 or 90 who tended to do that anyway in recent years.

    Alternate interpretation #1
    Organic/non-slate voters thought the field of Graphic Stories was pretty bad last year.

    Alternate interpretation #2
    Publicity brought in many new Hugo Voters. As a group, they disliked Graphic Stories much more than traditional Hugo Voters.

    There is no reason to assume that all 700 new Graphic Story No Award votes are slate-aligned.

    Any other process I can come up with to estimate the number of slate voters comes up with a number much less than 700. 700 is an outlier. It is not a “conservative” estimate — it is the opposite of conservative.

  44. @ Camestros
    However, admins in general already are a target – just not one that the griefers have exploited fully.

    Right now, actions taken by the admins apply evenhandedly to the field at large. Once they take a specific role in dealing with griefers, they become a much bigger target.

  45. @Brian Z

    The estimate 700 is the 810 who no awarded Graphic Novel minus the 80 or 90 who tended to do that anyway in recent years.

    Yeah, that’s an error. You need to look at two other factors. First, you want the percentage of voters who typically no-awarded Graphic Novel, not the absolute number. There were a lot more voters in 2015 than in 2014.

    Second, you want to look at how much NA voting increased due to the fact that the taboo against it vanished. For that, take a look at Best Fan Artist, where there were no slate nominees.

    When you combine those factors together, you get about 580, which is very close to the number of people who voted Vox Day for Best Editor. This is higher than the actual number of slate voters (which seems to be closer to 450), and I think that reflects the fact that there are people who broadly agree with Vox Day but who aren’t willing to let him dictate their votes overall.

    He appears to have lost about half of those people at the nomination stage this year, which really surprised me, given that they didn’t have to pay any money to do it. It might be that some of them had really believed they could win this thing and simply moved on to other projects when it became clear they couldn’t win the final vote. Or it could be that some were disappointed by the choice of works. (For example, a true homophobe should be upset by “Space Raptor,” which describes gay sex/bestiality as fun and pleasurable. A true burn-it-down firebrand would be unhappy about mainstream works on the ballot, since it means the fans don’t have to no-award so many categories.)

    To impact the final vote this year (and, by extension, the intermediate vote should 3SV pass), he would need to get people to purchase memberships. It will be very interesting to see if he can get more than 160 people to do that. We’ll know after the awards ceremony when the numbers are all released, but, frankly, I don’t see any evidence right at the moment that he’s capable of increasing the number of his followers at all, and I doubt that any action the business meeting takes is likely to change that.

  46. Bill on May 25, 2016 at 12:24 pm said:

    @ Camestros
    However, admins in general already are a target – just not one that the griefers have exploited fully.

    Right now, actions taken by the admins apply evenhandedly to the field at large. Once they take a specific role in dealing with griefers, they become a much bigger target.

    I think you are mistaken but I don’t want to disrupt the discussion further with what may be inadvertent concern trolling on my part.

  47. Seeing Jameson’s long post above led me to the Electorama wiki, and from that, a discussion of an election using IRV, where a candidate who took the second-highest number of 1st place votes in Round 1 ended up winning the election overall. (Burlington, VT used IRV in its 2009 mayoral election. In 2010 they repealed IRV.)

    This seems to have happened in 2012 (John W. Campbell Award) when E. Lily Yu got 274 1st place votes, yet defeated Brad Torgersen, who got 293 1st place votes.

    Was this widely (or narrowly) perceived as the “wrong” outcome? (and did it have something to do with Torgersen getting involved as a Puppy?)
