
1) Summer Fluff: Márquez/Tom of Finland

On Fire Island, Volume 1, "Bicentennial Men Summer"

Exhausted by the relentless catastrophizing on Substack, I turned back to pure gay fun: images of half-naked men, summertime vacation stories, and generative AI. No catastrophes, violence, or name-calling. Just a slow glide through what generative AI does, organized as a ‘sampler’. Not for everyone, but if you’re curious, watch and read on. I’m sad that there’s a dearth of gay material which shows men being men with men and so on. I’d like to fill that gap a bit.

The series “On Fire Island” starts with the premise that you can watch gay history evolve by following a group of men, “Tales of the City” style, through novels detailing their summer retreats from 1976 to 2025, a 50-year, post-Stonewall period. I took that premise, handed it to generative AI tools I built, and asked them to write and gently illustrate a series of books, to see what ‘latent’ history, pure pop-cultural history, resides in the most popular tool, OpenAI’s GPT-4o.

My toolkit is unique; I’ve not seen anyone generate or share full novels, whether in research or on sites like Medium, much less a 50-volume set of 300-page books. I call it a matrix of AIs which communicate with each other to generate and manage the central request, and gradually expand it to a target size. One cycle has an AI generating instructions for the next cycle of AIs to complete, expanding the text exponentially until it fulfills the ruleset I gave the tool. The tool, of course, was also written by GPT-4o: Python and bash scripts, which additionally orchestrate communication with Stable Diffusion to generate the illustrative images.
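
For the curious, a minimal sketch of that expand-until-target cycle, assuming the openai Python client; the prompts, the `ask` helper, and the target size are illustrative placeholders, not the toolkit’s actual code:

```python
# Minimal sketch of the expansion cycle: one AI pass writes material,
# the next pass expands each section, until the word target is reached.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def expand_book(storyline: str, target_words: int = 100_000) -> list[str]:
    # Cycle 0: turn the storyline into a set of sections to grow.
    outline = ask(f"Write a chapter-by-chapter outline for: {storyline}")
    sections = [s for s in outline.split("\n\n") if s.strip()]
    while sum(len(s.split()) for s in sections) < target_words:
        # Each cycle roughly doubles the text, so growth is exponential.
        sections = [
            ask("Expand this section to roughly twice its length, "
                f"keeping style and continuity:\n\n{s}")
            for s in sections
        ]
    return sections
```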

A 350-page novel dropped on Substack may not excite many people, so I requested a Python script to extract summaries of each chapter of each novel, along with whimsical gay touchstones, send part of them to ElevenLabs for narration, and then roll the bundle together with the base illustrations for each chapter into a light, eye-candy visual version of the book. The only thing not synthetic is the synthesizer background, music taken from porn films of the period, which I may adjust.
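
The narration step amounts to a call to the ElevenLabs text-to-speech REST endpoint, roughly like the sketch below; the voice ID, model ID, and summary text are placeholders, and the request shape should be checked against their current API docs:

```python
# Sketch: send one chapter summary to the ElevenLabs TTS API and save
# the returned MP3. Assumes ELEVENLABS_API_KEY is set in the environment.
import os
import requests

def narrate(text: str, voice_id: str, out_path: str) -> None:
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    resp = requests.post(
        url,
        headers={"xi-api-key": os.environ["ELEVENLABS_API_KEY"]},
        json={"text": text, "model_id": "eleven_multilingual_v2"},
    )
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(resp.content)  # audio/mpeg bytes

narrate("Chapter 1: the ferry crossing...", "VOICE_ID_HERE", "chapter_01.mp3")
```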

The ruleset for synthesizing the book and images for the series has a couple of constraints. One was to choose a writer for each book to guide the style. AIs are excellent ‘universal simulators’, and they follow a writing style, vocabulary, word choice and subthemes quite nicely. I’ll put the full range of the output on Amazon as I edit the finished books, but it reads fascinatingly. The second rule was choosing a different whimsical artist as the illustration source, one who would render men at the beach in bathing suits nicely. The writing for each subsection of the book is summarized visually and sent along to the Stable Diffusion tool, which generates 5-20 images per section. I manually edit these to remove horror-show pictures: multiple legs and arms, flesh looking like boiled meat, grotesquely fused faces and torsos. The tools are still a little ‘off’ for visuals, and I didn’t feel like running the results through a classifier to rate each image and automatically remove the damaged ones. Most image classifiers would see expanses of human skin in a skimpy bathing suit, deem the image pornographic, and refuse to classify it. C’est la vie.

Authors for the series range from William Burroughs to Virginia Woolf, Hemingway and Edward Gorey, Poe to Raymond Chandler, making each one peculiar in a fun way. Nothing like H.P. Lovecraft’s eldritch-gods language describing men at the beach with margaritas. Visuals for the series range from Richard Corben to R. Crumb, Artemisia Gentileschi to Nobuyoshi Araki. The limitations of image generation were a startling tour of art history and representation; I’ll mention the problems as I get to specific artists.

This first volume is styled after Gabriel García Márquez of ‘One Hundred Years of Solitude’, the magical-realist Latin American masterpiece. I paired it with images generated in the style of Tom of Finland, the noted gay artist of the period (1976). While synthetic Márquez generated relatively flexible output aligned to the ruleset, Tom of Finland didn’t.

Tom of Finland very rarely depicted night scenes, and unless prompts explicitly called for high chiaroscuro, nothing was ever dark or starlit, which hinted at the limitations of the image-generation systems. If they had no example inputs in a style or genre, they were very resistant to creating output aligned to a particular visual model. Tom of Finland originals are very stylized, the men somewhat metallic-looking due to his choice of medium (pencil, graphite) and his illustrative style. While the image generation caught the general look (pompadour) of the men, the flattened face of his recurring Kake character was never present; I got more vulpine Finland men.

What I share here is PG. While there is a lot of brouhaha about AI language tools being woke, well, they aren’t. If you don’t constrain generation to a specific idea set or language model, you will get a mishmash of everything under the sun, biased toward more recent training material, which carries two decades of woke. That’s easy to remove with alignment rules I supply in my toolkit. The books themselves have no constraint on content whatsoever; I model the works generally after gay pulp fiction, which is, to say the least, racy. I didn’t share the fully aligned output here; I’ll decide whether there’s interest in sharing some of the more interesting output.

Another claim is that these AIs don’t generate truth; people have complained about them ‘making shit up’ on the fly. Of course they do: they are neither search engines indexing information, nor do they generate facts, unless you specify it. For the gay anecdotes driving the character narrative for each year, I had to go through three cycles. First, generate whimsical gay cultural touchstones for the year. Then, for each result, send it back to an AI, ask if the result is true, and ask for a link to a page which can confirm the content. Then, systematically go through the results rated ‘true’ and their links, load the page text, and compare it with the anecdote to see if they are logically related and true. What a human would do. Still, I’m sure there are oddball elements. Fortunately I didn’t have to write the code to do all this; GPT-4o did all the coding and structured the results.
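
That three-cycle loop, sketched under the same assumptions as before; the prompts, the TRUE/FALSE answer format, and the helper names are mine for illustration, not the toolkit’s actual code:

```python
# Sketch of the three-cycle verification loop: generate candidate
# anecdotes, ask for a confirming URL, then check the page against
# the claim. Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI
import requests

client = OpenAI()

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def verify_touchstones(year: int) -> list[dict]:
    # Cycle 1: generate candidate anecdotes for the year.
    candidates = ask(
        f"List five whimsical gay cultural touchstones from {year}, one per line."
    ).splitlines()
    verified = []
    for claim in filter(None, candidates):
        # Cycle 2: ask the model to rate the claim and supply a source URL.
        answer = ask(
            f"Is this true? Answer TRUE or FALSE, then a confirming URL:\n{claim}"
        )
        if not answer.startswith("TRUE"):
            continue
        url = answer.split()[-1]  # crude: take the last token as the URL
        # Cycle 3: fetch the page and ask whether it supports the claim.
        try:
            page = requests.get(url, timeout=10).text[:5000]
        except requests.RequestException:
            continue
        judgement = ask(
            f"Does this page text support the claim?\nClaim: {claim}\nPage: {page}"
        )
        if judgement.strip().upper().startswith("YES"):
            verified.append({"year": year, "claim": claim, "source": url})
    return verified
```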

Enjoy watching the images cycle with the spoken narrative and its tidbits of gay culture, which are bizarre, funny, and occasionally moving.

If you’re interested, a bit more about AI.

Let’s go back in time 30 years. I have been experimenting with generative AI since roughly 1993, when I parsed Project Gutenberg books and built elaborate Markov chain models of English text; by incorporating euphemistically titled gay pulp fiction, I could generate quite humorous stories. Yes, you read that right: 1993. The result: imagine Alice in Wonderland meets Moby Dick and retreats into “Truckin’ Naked” land. I put it on a website which allowed you to select your source model (author set), give it a few words, and off it went. In LLM parlance, I had a model with a context window of 3 tokens and perhaps 100,000 parameters. OpenAI’s GPT-4o today has a context window of 128,000 tokens and a trillion parameters. Lovely. Unfortunately, computing capacity was limited in those dark ages, and nothing substantial could be generated in 1993/4.
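
For flavor, a toy version of that 1993-era setup: a word-level Markov chain with a 3-token context window, trained on any plain-text corpus (the corpus filename here is a placeholder):

```python
# Toy Markov chain text generator: map each word triple to the words
# that followed it in the corpus, then sample a random walk.
import random
from collections import defaultdict

def build_chain(text: str, order: int = 3) -> dict:
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain

def generate(chain: dict, order: int = 3, length: int = 50) -> str:
    out = list(random.choice(list(chain)))  # random starting triple
    for _ in range(length):
        choices = chain.get(tuple(out[-order:]))
        if not choices:
            break
        out.append(random.choice(choices))
    return " ".join(out)

with open("moby_dick.txt") as f:  # placeholder corpus file
    model = build_chain(f.read())
print(generate(model))
```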

Being a constant reader in technology and science, 6-7 years back I saw Large Language Models suddenly emerge, along with the computing capacity to generate really significant works. I used models published through Hugging Face (early GPT models), and with compute capacity volunteered by Google, I could generate entire novels. With AI now, significant image generation was also possible, so I could (re)construct novels, illustrations, covers, and so on. My first efforts allowed me to generate a novel in roughly 30 minutes, automatically typeset it, send it to an on-demand printer, and have a printed book in my hand in 7 days. My first generation trials produced around 65,000 books before I turned the machine off. I printed a dozen of them; my niece loved “Half Virgin”, from the Gay AI pulp series. I toyed with releasing them on Amazon under the pen-names of select senators, but thought better of it.

Nobody was generating entire books at that point, by the way. What I was doing experimentally was extremely “out there”.

Then came GPT-3.5, GPT-4 Turbo, GPT-4o, and a constellation of image generators. These tools gave me the ability to play with ideas at a scale and depth that was simply inconceivable 30 years ago, much less 5. I’ve generated a range of “AI Autobiographies” which are fascinating; they have an inherent narrative structure which makes them very coherent. I’ve generated other varieties of pulp fiction, romances, mysteries, adventure books, and slowly learned how to create large-scale coherence in a book. I’ve applied my toolkit to non-fiction, which creates startlingly good instructional and explanatory text. My toolkit generates entirely convincing scientific papers. I’ve tried screenplays, which are extremely easy: a typical screenplay is 100 pages of very simple narrative and dialogue. After a few hundred higher-quality books, I decided to try my hand at complex multi-volume narratives, to see if I could keep the arc of the narrative over thousands of pages, something similar to what in AI is called ‘attention’.

For the voice, ElevenLabs allows me to find a recording of a voice I like and clone it. Some of the narratives you’ll hear later go off the rails slightly, but that makes it more interesting to me in a way, not being perfect and sounding slightly crazed.

I don’t particularly like to write software. GPT-class generators have written all the Python and bash scripting required to communicate with the AI engines over the internet and to construct a model of each book as a gigantic JSON structure containing all the generation rules, visual rules, content iterations, and typesetting information needed to make a finished novel, or any rendering of the structure that is necessary.
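
A hypothetical, much-simplified shape for that JSON structure, just to give a feel for it; the field names are my guesses, not the toolkit’s actual schema:

```python
# Simplified sketch of the per-book JSON model: rules at the top,
# nested chapters/sections with text iterations and image references.
import json

book = {
    "title": "On Fire Island, Volume 1",
    "alignment_rules": {
        "author_style": "Gabriel García Márquez",
        "illustration_style": "Tom of Finland",
        "language": "English",
    },
    "chapters": [
        {
            "summary": "The ferry crossing, 1976.",
            "sections": [
                {"text": "...", "iterations": [], "images": ["ch01_s01_03.png"]},
            ],
        },
    ],
    "typesetting": {"trim": "6x9in", "font": "Garamond"},
}

with open("volume_01.json", "w") as f:
    json.dump(book, f, indent=2, ensure_ascii=False)
```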

This Volume 1 consists of roughly 3,000 files, the instructions and results of the matrix of AIs communicating with each other, which yields a 12.4 MB data structure for the finished book. Non-fiction books I’ve generated, with charts, diagrams, tables and other illustrations in the chapters and sections, are much larger. A chart is typically created by the AI writing a Python script which then renders the chart graphic. I found the best way to do this was for the tool to write 10 scripts in parallel, run them, and then compare the chart output for fidelity to the prompt (all AIs talking to AIs). With a normal non-fiction book using 10-20 graphs per chapter, this expands the auxiliary content quite a bit.
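
A sketch of that parallel chart step, under the same assumptions as the earlier snippets; the spec, the prompt wording, and the final scoring pass are placeholders:

```python
# Ask for several candidate plotting scripts, run each in its own
# temp directory, and keep those that actually produce an image; a
# second AI pass would then compare survivors against the prompt.
import subprocess
import tempfile
from pathlib import Path
from concurrent.futures import ThreadPoolExecutor
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def write_chart_script(spec: str, n: int) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user",
                   "content": f"Write a matplotlib script that saves chart_{n}.png. "
                              f"Output only Python code, no fences. Spec: {spec}"}],
    )
    return resp.choices[0].message.content

def run_candidate(code: str, n: int) -> Path | None:
    workdir = Path(tempfile.mkdtemp())
    script = workdir / f"cand_{n}.py"
    script.write_text(code)
    try:
        subprocess.run(["python", script.name], cwd=workdir, timeout=60)
    except subprocess.TimeoutExpired:
        return None
    png = workdir / f"chart_{n}.png"
    return png if png.exists() else None  # missing PNG = failed candidate

spec = "bar chart of ferry ridership to Fire Island, 1976-1980"  # placeholder
with ThreadPoolExecutor(max_workers=10) as pool:
    codes = list(pool.map(lambda n: write_chart_script(spec, n), range(10)))
charts = [p for n, code in enumerate(codes)
          if (p := run_candidate(code, n)) is not None]
print(f"{len(charts)} of 10 candidates produced a chart")
```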

It takes roughly an hour to generate a full 350-page book from the first prompt, typically a storyline. The final result is then taken by an (AI-written) program through Adobe InDesign, which generates a perfect print-ready file of completely professional quality. I’ll share some in upcoming postings. Watching the book being typeset is wonderful, somewhat like watching a washing machine, only extremely fast. Hypnotic.

How much of a role do I play? Anywhere from zero to full letter-for-letter generation. Zero is where I say ‘generate a storyline idea around <X>’ and the machine clicks into place. I may write the storyline. I usually specify what I call alignment rules: language, author style, character dialogue style, chapter style, section style, and mandatory or optional content at any level. I can specify the main characters in very fine-grained detail, or allow the toolkit to synthesize the backstories. I can write a book synopsis from the storyline, or let the toolkit do so, and then start expanding the content geometrically. I can stop the kit and edit at any stage. If it generates crap, I delete it and it starts over. I can edit the content at any stage and any level of completion. I can specify post-generative iterative refinements and edits as automated or manual. The main thing is that I just don’t have to write it all. In practice, I’ve found that I can record a conversation with someone, supply that as a core, and allow the kit to synthesize any kind of wrapper (screenplay, novel, biography, non-fiction) around it.

Is this good or bad? I don’t know yet. I love Nabokov, and there aren’t many novels he wrote in English. I had him write a gay version of Orpheus and Eurydice based on Donkey Kong and Super Mario dropping Molly and going to a rave, poor Super Mario searching, with the aid of several interesting men, to find him in the backrooms of the event, you know, like real life. Beautiful language. I could generate Nabokov novels to my heart’s content. I wrote a short-story collection for a friend about AI, written by Kafka, which was unsettlingly wonderful, in both German and English simultaneously, illustrated by Edward Gorey. I make no attempt to claim these are works of anything but AI, exploring ideas. But not everyone will be me. AI is a superb simulator, and there are things which weren’t meant to be simulated, perhaps.

So, in ending, this is what I call pure pop culture. If you’ve done actual research, as an anthropologist or historian, or in any endeavor where you must go to original sources of facts to examine a hypothesis or conjecture and do the actual work of science that requires texts, AI is, well, not there yet. The millions of pages of material in libraries are not digitized and have not been trained into AIs. Medical journals are locked behind paywalls, ephemera are barely catalogued in private collections, and hardcore engineering is likewise locked behind paywalls. I suspect we’ll see special-purpose trained or fine-tuned AIs in the future, costing a lot to access. But for a pop-cultural autobiography of, say, Harvey Milk, or Gertrude Stein, perfect. There’s just enough popular information in GPT-4o to synthesize interesting takes, and to play with pop ideas using tools that permute them in hundreds of ways to see how they play.

So this is Volume 1 of 50, which I’ll drop year by year over this summer, meant only for the pure, strange enjoyment of seeing what these engines do.
