
The Seattle Worldcon 2025’s WSFS Division Head Cassidy, Hugo Administrator Nicholas Whyte, and Deputy Hugo Administrator Esther MacCallum-Stewart today announced their resignations from the committee in the following statement:
Effective immediately, Cassidy (WSFS DH), Nicholas Whyte (Hugo Administrator) and Esther MacCallum-Stewart (Deputy Hugo Administrator) resign from their respective roles from the Seattle 2025 Worldcon. We do not see a path forward that enables us to make further contributions at this stage.
We want to reaffirm that no LLMs or generative AI have been used in the Hugo Awards process at any stage. Our nomination software NomNom is well-documented on GitHub for anyone to be able to review. We firmly believe in transparency for the awards process and for the Finalists who have been nominated. We believe that the Hugo Awards exist to celebrate our community which is filled with artists, authors, and fans who adore the works of our creative SFF community. Our belief in the mission of the Hugo Awards, and Worldcon in general has guided our actions in the administration of these awards, and now guides our actions in leaving the Seattle Worldcon.
Cassidy
Nicholas Whyte
Esther MacCallum-Stewart
The Seattle Worldcon’s WSFS Division administers the Hugo Awards, Business Meeting, and Site Selection. The committee’s remaining WSFS Division leadership includes Deputy Division Heads Kathryn Duval and Rosemary Parks (who is also Site Selection Coordinator).
Once before, Nicholas Whyte was part of a group resignation from a Worldcon WSFS Division, in June 2021 when he was DisCon III’s WSFS Division Head (see “Another DisCon III Hugo Administration Team Resigns”).
See additional coverage here: “Responding to Controversy, Seattle Worldcon Defends Using ChatGPT to Vet Program Participants”, “Seattle 2025 Chair Apologizes for Use of ChatGPT to Vet Program Participants”, “Seattle Worldcon 2025 ChatGPT Controversy Roundup”, and “Seattle Worldcon 2025 Cancels WSFS Business Meeting Town Hall 1”.
They are extremely different. A search engine that doesn’t present an AI summary is just listing results with links. It isn’t replacing the need to go to those places to get the information you seek. LLMs gobble up vast amounts of information from others and present answers directly (some accurate, some inaccurate, and some completely made up) in a tone of complete confidence.
When people in this discussion tout AI’s ability to vet people, do they consider that AI is going to reflect conventional majority biases about what is controversial in someone’s background? As with the people of color who found that AI screeners rejected their job applications at stage one far more often, there’s a lot of potential harm in replacing human judgement with collective-wisdom-reinforcement machines.
It isn’t useless, but its usefulness is dangerously overrated because of the certainty of its tone. Judge it by asking about things you already know with high expertise. It’s like watching a movie about your profession. As a former newspaper reporter, I often can’t make it all the way through movies about journalism. There’s too much goofiness that bears no resemblance to reality.
bill:
Don’t you think this passage, while it happens to be built out of actual facts (hardly a guarantee for an LLM), is just a little bit misleading?
McCarty did not resign as Hugo administrator for the awards in question, which is absolutely what any human would understand that sequence of words to mean. He resigned from WIP, and maybe from other things I haven’t kept track of.
This is one of several ways in which LLMs simply don’t work.
I don’t think “help us or we’ll do it with AI” would attract more volunteers or reduce drama. It would make people mad the idea was even under consideration and generate lots of threads like this one. These kinds of controversies are demoralizing, both within the con and within the larger community the con serves.
The timing coincides with the negative reaction to AI vetting of Worldcon panelists.
But it also aligns with the traditional season of internal disputes over how to identify and handle bloc voting, and what can be said about it.
“We firmly believe in transparency…”
“Our belief in the mission of the Hugo Awards… now guides our actions…”
You all might be interested in this look at how an LLM fared at identifying SF stories (a few years ago): https://scifi.meta.stackexchange.com/a/13985/
In 20 cases, there was one success.
I’m sorry they resigned and am still mystified as to why (I’ve read the conjectures in comments). Thanks for all you’ve done, and I’m sorry this was the only solution you could find, Nicholas, Esther, & Cassidy!
To echo John B and Bill:
I also tested it on one of the models allegedly used, with the following prompt: “Tell me if [insert-name-here] has engaged in any public activities which could be in conflict with our CoC [link] or otherwise bring our event Worldcon [link] into disrepute and tell me if I should investigate deeper”
Each example I could think of gave me a pretty succinct yes/no response. Now sure, this could lead to false negatives where somebody problematic is not flagged, but I couldn’t find any examples where it didn’t nail it. I didn’t do a thousand, but I did a dozen or so of different types.
It was also clear when it didn’t think it could find enough information to be useful OR where it wasn’t clear if the person it did find was the right one – which included myself, who went through the con process.
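For anyone who wants to try the same kind of spot-check, here is a minimal sketch of how such a templated prompt could be scripted, assuming the OpenAI Python client; the model name, example URLs, and example name are placeholders of mine, not what was actually run:

# Hypothetical sketch of scripting the vetting prompt quoted above.
# The model name, URLs, and example name are placeholders, not the real test.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def vet_name(name: str, coc_url: str, event_url: str) -> str:
    prompt = (
        f"Tell me if {name} has engaged in any public activities which could be "
        f"in conflict with our CoC ({coc_url}) or otherwise bring our event "
        f"Worldcon ({event_url}) into disrepute, and tell me if I should "
        f"investigate deeper."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; use whichever model is being evaluated
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(vet_name("Jane Example", "https://example.org/coc", "https://example.org/worldcon"))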
Finally, ignoring the stupid and often incorrect “AI” summary in search results – anybody who thinks that the way results are identified and ranked now isn’t using LLMs underneath, at least on Google and Bing, hasn’t been paying attention to the mess that SEO is these days.
Final point – any reference to how an LLM did at ANYTHING even a few months ago is already hopelessly out of date – which includes that NYT article that came out yesterday.
For example purposes, one of the tests with ChatGPT with ‘search’ enabled was:
ChatGPT said:
5 listed items – (all of which were correctly identified and summarized)
I checked the sources referenced and they were all real and correct and included:
– NPR
– Publishers Weekly
– File770 (wretched hive of villainy that we all know it to be)
– Reddit
and… a blog I’d never heard of, but it did exist and did reference the public incidents – I tried a few others, but it was all much the same.
I don’t think they should have done it this way but asking if it works based on assumptions that you can easily test doesn’t help the discourse either.
What evidence do you have that any such internal dispute has taken place in the era since EPH was ratified?
And remember that the point of this isn’t to have the AI make the decision. It’s to have AI alert human beings as to the people who should be investigated further so the humans can make a decision. It’s using the LLM as a tool, not a free agent.
Also note that Dave’s carefully worded prompt got a better response than Bill’s. Like with any tool, skill of the operator matters. People can do a lot of random damage with a chainsaw, and they can also create works of art, depending on their skill level.
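To make that division of labor concrete, here is a minimal sketch, under my own assumptions, of what “LLM as a tool, not a free agent” could look like in code: the model only flags names for follow-up, and no one is excluded without a person reading the cited sources and deciding. The ask_llm function and the data shapes are hypothetical placeholders, not Seattle’s actual process.

# Hypothetical triage sketch: the LLM flags, humans decide.
# ask_llm and these data shapes are placeholders, not any con's real workflow.
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

@dataclass
class VettingResult:
    name: str
    flagged: bool                # did the model think there is anything to look at?
    summary: str = ""            # the model's short rationale
    sources: List[str] = field(default_factory=list)  # links a human must verify

def triage(names: Iterable[str], ask_llm: Callable[[str], VettingResult]) -> List[VettingResult]:
    """Return only the names a human reviewer should investigate further."""
    review_queue: List[VettingResult] = []
    for name in names:
        result = ask_llm(name)
        if result.flagged:
            review_queue.append(result)  # queued for human follow-up, not auto-excluded
    return review_queue                  # the include/exclude decision stays with people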
Andrew (not Werdna) – I have to say that, as a person who has been known to set quizzes from time to time, if I set some of those questions (Ask AI to Identify SF stories) for a non-artificial audience, I’d get run out of the bar!
“Searching for a novel about space exploration”???
Of the ones it got wrong where there was enough data to have a go, GPT-4o seemed to work, mostly, but some of those ‘incorrect’ answers I’d argue with because, honestly, the plot in the AI’s answer was close enough that a team would argue with a quizmaster over it due to vagueness!
@Daveon: You’re looking only at the titles to the questions, not the questions themselves.
“Searching for a novel about space exploration” is the title of this question:
I’d still argue that is pretty vague and could be argued about by people too, as are several of the others, where I’d reckon a team of four could come up with several answers based on the question 🙂
I’d have opted for Bova or Baxter as a human myself.
@rcade — you answered a different question than the one I asked. I was addressing the issue of AIs massively scraping copyrighted material, that is often referred to here as “stealing” the material. I think the Google search engines do the same thing (and evidence from the “Authors Guild v. Google” court case tends to confirm that).
Yes, the output of the two different systems is different. But both are useful to the person who is trying to identify controversies associated with individuals. Neither is appropriate for the actual decision of whether to include someone or not; that should be made by a person who can use judgment about the veracity of the issues identified, and if they amount to a reason to exclude someone.
Kathy Bond’s statement makes it clear that they were sceptical of the AI results and followed up with “normal” searches to confirm AI-generated results and to identify things that AI missed; that the use of AI was a “test”, meant to find out if it was useful, and was not how proposed panelists were vetted; and that no panelists were denied a panel spot based solely on AI results.
“its usefulness is dangerously overrated” — by whom? Seattle?
@clauclauclaudia — Yes, Gemini got a detail wrong, but it was accurate in identifying problems in a general way that one would want to know about before letting McCarty on a panel. Do you think that putting McCarty’s name into Google or DuckDuckGo or any other search engine would return 100% accurate information? Of course not.
I’ll say again, the problems with using AI are not obviated in any way by using routine search engines instead.
Cross-references to items pertaining to ChatGPT and Seattle 2025:
https://file770.com/responding-to-controversy-seattle-worldcon/
https://file770.com/seattle-worldcon-2025-tells-how-chatgpt-was-used/ — actually, this item has a “See additional coverage here” block at the bottom with more article references.
“Lawyers using LLMs to get cites to use in their briefs have had their licenses suspended or been disbarred, because the very plausible legal cites turned out to be completely fictional.”
I can find multiple cases where attorneys have been fined, and have had harsh things said to and about them by judges, for putting made-up stuff in their filings that had been generated by AI (although often it appears that the judges were actually mad not because an erroneous thing was put into a brief, but because, after being caught, the lawyer didn’t admit it and/or doubled down on the error, or blamed someone else, or was not contrite about having done so). In general, for the cases I saw, if an attorney used AI improperly but then accepted the blame and made good faith efforts to fix the problem, judges were pretty forgiving. In fact, in one case, the judge indicated that he was, in general, open to attorneys’ use of AI, calling it “the new normal.”
https://www.lawnext.com/2025/03/federal-judge-in-virginia-declines-to-sanction-lawyer-who-filed-ai-generated-erroneous-citations.html
Well, one was leaked by a whistleblower:
I will clarify: beyond the perennial bloc vote suspicions, problematic vetting and approval of finalists, generally, is a traditional and seasonal source of unfairness and lack of transparency.
The statement says AI was not used for vetting. Not that their resignation is unrelated to the vetting and approval of finalists – whether associated with bloc voting, or with scandals.
Pingback: Robot Hallucinations | Cora Buhlert
@Daveon: I almost hate to come back to this discussion because it seems like over the last 24 hours enough has happened to make this thread old news, but I think we’ll have to agree to disagree. The original question-poser for the question we’ve been discussing agreed that Russian Spring was the right answer (based on the argument* the human provided, which matched both the “Battlestar America” and the ice cream details to that book). In general, I wouldn’t expect a question from someone trying to identify an unknown book to be as well designed as a trivia question (for which the questioner knows the answer from the beginning).
I’m actually kind of surprised the LLM did so badly in that example, since googling “battlestar america” “Science fiction” in Google book search finds “Russian Spring” and reviews of “Russian Spring” straight away (I suspect that there’s more written about Bova’s series of novels than about the Spinrad, which may affect the results (the LLM didn’t even identify a particular book; it chose a novel series, by the way)).
*Unfortunately, I don’t have access to the actual text of the answer the LLM provided – it would be interesting to see what it wrote.