
4) Summer Fluff: Sinclair Lewis/Peyo

On Fire Island, Volume 4: "Coming Out Training Summer"

“Coming Out” is a pivotal event in gay men’s lives, hard to describe to those who won’t have to go through it. In 1979, coming out was still fraught with fear and hesitation, and in its ineffable logic my Book Writing toolkit decided 1979’s Fire Island vacation would be devoted to training gay men to come out. This video is a synopsis of the 350-page AI-generated book, following our crew of 8 gay men, year 4 of 50, tracing out 5 decades of gay history. The book’s text style target is Sinclair Lewis, which I thought might yield an appropriately dramatic potboiler style, and the image style target is Peyo, the pen name of a Belgian cartoonist whose cuddly characters are more associated with Smurfs than gay men on a beach. Belgians are the best comic artists IMHO. It’s very fluffy viewing, enjoying lots of bulging muscular gay men on the beach, at discos, searching for treasure, the usual things you do at the beach in the summer. Enjoy!

As you follow these droplets of AI, you know that everything is AI-generated, from the voiceover, to the text and images, to the very software which wrote and rendered the videos itself. When I work with people and AI, it’s hard to describe to them how far it can go, beyond what people imagine - a 350-page novel! A two-hour-long audio book! Bing Bang Bong! It’s hard to imagine it can also write the software to generate the video using extracts from a book. Believe me, I couldn’t write video editing software to save my life, but there it is. I call it ‘firebook.py’: 5 modules which synthesize and sequence images from the original generation, with the typesetting, synchronized voice, even down to the little highlights that bounce over words. You just have to know what to ask for, in this case a kind of ‘Sampler’ of the technology.

Unbiased Images: I wrote briefly on the problem of European feature bias in Volume 2 of the sampler, in targeting paintings by Ford Madox Brown, the pre-Raphaelite. If the artist never made images with a given feature set, it’s near impossible to get that feature set out of the generator. The Peyo target spontaneously generates innumerable feature types of multiple ancestral origins, which is nice. In fact, virtually all late 20th-century cartoonists render people of almost all ancestries and feature types. It is unusual, however, to see Asian feature sets until much later, when Manga starts to become prevalent.

While Peyo images are great fun, they show a set of problems with the current ‘state of the art’ of the tool I use, called Stable Diffusion. It’s still astonishing to me that it can generate these images at all, much less track character attribute consistency over a few thousand different cycles of generation for visual illustrations aligned to the text. What does become a problem is differentiating Peyo from other illustration styles, and managing faces in densely populated images. I mentioned how ‘natural’ Tintype photographic images were in Volume 3; well, comic illustrations are the most natural of styles for non-photographic illustrations with Stable Diffusion. I suspect that, of the training images used to create Stable Diffusion, the predominant sets are comic illustrations in this style. Peyo style is predominant, but it often diffuses into generic muscular body/face pen-and-ink, similar to Steve Ditko (think Spider-Man) and others. You’ll see this in future volumes where what I call the ‘House Style’ illustration creeps into the output.

Another problem is getting the generation system to manage densely populated images. Faces drift very easily from human-featured into, well, horror-movie wax-melting models. The one thing I have to do with all illustration work is edit severely. There’s a cycle in the AI matrix around a book, and I insert visual summarization at the sub-chapter ‘scene’ level, with 20-50 distinct scenes. As the AI tool completes a scene, it generates a prompt for the visual summarizer AI, which then specifies characters and actions as input segments for a prompt to the Image Generator AI. To make videos which sort of animate and move gently, I ask for 5 versions of 4 images for each scene, varying the strength of characterization, so 5 versions x 4 images x 20 scenes x 9 chapters = 3600 raw images generated for a single book’s chapter illustrations. I use another tool written by gpt-4o to visually scan every single image, and tap a button to kill it if it’s garbled. Ugh, I hate the numbers = x 50 books. How much time have I spent! Where images morph very fast, I had to delete garbled bodies. But I left just a few three-legged, four-armed men, or two heads, or six fingers, or fascinating feet. This process is still very much at the early stages.
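To make those numbers concrete, here is a minimal Python sketch that enumerates one job per raw image. It assumes 20 scenes per chapter (the low end of the 20-50 range) and 9 chapters per book, which is how I read the figures above. The constant and function names are made up for illustration and are not taken from the actual toolkit.

```python
from itertools import product

# Figures quoted in the text. The names are illustrative only.
VERSIONS_PER_IMAGE = 5    # varying strength of characterization
IMAGES_PER_SCENE = 4
SCENES_PER_CHAPTER = 20   # assumption: low end of the 20-50 range
CHAPTERS = 9
BOOKS = 50


def raw_image_jobs(chapters=CHAPTERS, scenes=SCENES_PER_CHAPTER,
                   images=IMAGES_PER_SCENE, versions=VERSIONS_PER_IMAGE):
    """Yield one (chapter, scene, image, version) tuple per raw image."""
    return product(range(1, chapters + 1),
                   range(1, scenes + 1),
                   range(1, images + 1),
                   range(1, versions + 1))


per_book = sum(1 for _ in raw_image_jobs())
print(per_book)          # 3600 raw images to review for one book
print(per_book * BOOKS)  # 180000 across 50 books
```

At the high end of 50 scenes per chapter the same function gives 9000 images per book, which is why the manual review step hurts.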

Another problem is ‘dead eye’. It seems that if there are too many wide-open eyes looking towards the camera, the iris and/or pupil either become tiny or absent. It’s generally controlled enough for most of what appears here; it’s most disturbing when it appears in photographic rendering. The technology is still at early stages. Normally I use a multi-loop feedback process: I send the image to another AI and ask it to provide feature information - “are there anatomically wrong hands, legs, arms, feet, fingers…” - then, from the result, I can automatically cull bad images. However, that brings me to another very annoying bias.

The Skin/Flesh bias: Imaging systems are terribly afraid of sex; it’s almost as though they’re all “No Sex Please, We’re British”. While most tools I use can generate semi-nude bodies - you see them here - and sometimes fully nude - a few pop in to spice things up - if I send a classifier a body which isn’t basically 75% covered, it will refuse to classify the image, calling it ‘inappropriate’. Whatever that means. My guess is that these tools easily detect large expanses of smooth skin (see my smooth issue notes in Volume 2), cry ‘Danger Danger’, and that’s that. I do have text generators and image classifiers running on my laptop instead of the commercial ones, and the local ones don’t have these biases, but they’re teeth-gratingly slow. So I personally classify the images, delete the bad ones, and move on.
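The feedback loop and its failure mode can be sketched roughly like this in Python. Everything here is hypothetical: the `inspect` callable stands in for whatever second AI answers the anatomy question, its report format (a dict of defect flags) is my assumption, and a return value of `None` stands in for the checker refusing the image as ‘inappropriate’.

```python
# Hypothetical defect labels, taken from the question put to the checker.
DEFECT_KEYS = ("hands", "legs", "arms", "feet", "fingers")


def cull(images, inspect):
    """Sort image paths into keep / reject / manual-review piles.

    `inspect(path)` is an assumed stand-in for the second AI. It returns
    a dict of defect flags, or None when the checker refuses the image.
    """
    keep, reject, manual = [], [], []
    for path in images:
        report = inspect(path)
        if report is None:
            manual.append(path)   # refused: a human has to look at it
        elif any(report.get(key, False) for key in DEFECT_KEYS):
            reject.append(path)   # at least one anatomy defect flagged
        else:
            keep.append(path)
    return keep, reject, manual


def fake_inspect(path):
    """Canned answers so the sketch runs without any real model."""
    canned = {
        "scene01_a.png": {},                 # clean
        "scene01_b.png": {"fingers": True},  # six-finger special
        "scene01_c.png": None,               # refused as 'inappropriate'
    }
    return canned[path]


keep, reject, manual = cull(
    ["scene01_a.png", "scene01_b.png", "scene01_c.png"], fake_inspect)
print(keep, reject, manual)
```

On a beach book, the manual pile is where most of the images end up, which is the whole annoyance.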

I had hoped Sinclair Lewis would be more potboiler-ey than was the result, but it’s perhaps the period vocabulary and information structure he comes from:

In the tumultuous year of 1979, the legendary gay bar known as The Stud uprooted itself and found a new sanctuary within the vibrant heart of San Francisco, California. The air was thick with excitement; the city streets awash with colorful characters who flocked to this beacon of freedom. With neon lights gleaming and music pulsating through its walls, The Stud established its current residence, forever etched into the soul of a metropolis celebrated for its unabashed embrace of gay culture.

This has what I feel is a lot of residual AI hoo-hah, words like ‘etched’ and ‘unabashed’. Anyone who uses gpt-4o knows there’s a house style, and Sinclair Lewis is so close to that house style that tell-tale words pop up in the resulting text.

The beats were dying down, the lights flickering softer as the last disco ball spun its final twirl. Amid the laughter and the gradual calming of the music, Miguel, with that sly grin of his, tossed out a challenge that immediately captured everyone's attention.

“How about we strip the night away with a dip in the ocean?” he shouted, his voice ripe with mischief.

No one could ever refuse a Miguel challenge. As we all shed our layers, the cool night air kissed our bare skin, sending shivers that weren’t entirely from the cold. Our feet padded across the soft, fine sand, racing towards the beckoning ocean that glimmered under the moonlight like a vast, welcoming mystery.

One by one, everyone plunged into the gentle waves, the ocean swallowing our howls and laughter. The water was shockingly cold, biting at our warmth, but that only added to the thrill. We splashed and yelled, teasing and taunting in a way only old friends could.

I’m going to have to go back and read “Babbitt” again. Perhaps a 1920s writing style doesn’t suit aggressively sexual novels… Hmm.
