OpenAI will present their first-ever developer conference in San Francisco today. While I don't know what's on the agenda other than some "huge" announcements, I'm sure it will make waves. ChatGPT was released only eleven months ago and it's already had a profound impact. I chose this image because Midjourney added an extra finger on Altman's right hand, and I frequently wonder if he comprehends how much power he has and if he's interested in tipping the scales of justice.
Why I'm Dropping DALL-E 3
To be clear, OpenAI has done a great job with their integration of DALL-E 3 into ChatGPT. It's a great leap forward in terms of the evolution of image-prompting interfaces, as ChatGPT actually takes your suggestion and converts it into a better prompt. It's almost like working with an Art Director. And then you can continue to refine the image through an extended dialogue until you get (something close to) what you want. At $20/month for both tools, this is an incredible value, and if it fits your work needs, I would still give it my highest recommendation. The only real issue is that I like Midjourney better. Yes, I hate using Discord, but after nine months of struggling with all the silly little rules, I can produce a broad range of images really quickly and I can almost always get exactly what I want. Oh, and the new Style Tuner is awkward but really exciting; see below.
Midjourney’s New Style Tuner
So clearly the folks at Midjourney didn't get my memo that they need to make their tool more usable, but they did devise a way to create and apply consistent styles. I'll write a longer article later this week, but it goes something like this. Instead of writing /imagine, you begin your prompt with /tune and then enter your desired prompt. In this case I wrote: Scandinavian folk art illustration, linocut, ink. Midjourney will then ask how many variations you'd like to factor into your tuned style: a choice of 16, 32, 64, or 128. Note: if you select the first option, '16', you'll be shown a grid of 32 thumbnails, and the 16 you keep after choosing half of them are your style directions. MJ will then warn you that this will cost you some of your total fast hours. You can then select 'RAW' or 'Default' mode (Default is more 'stylish') and click Submit.
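For reference, the full sequence looks roughly like this (the exact menu wording may shift as Midjourney iterates on the feature):

/tune Scandinavian folk art illustration, linocut, ink
→ choose the number of variations (16, 32, 64, or 128)
→ choose Default or RAW mode
→ click Submit and confirm the fast-hours cost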
After a one-to-two-minute wait, you're directed to a web page where you select up to 16 images from the 32 possible style interpretations. You can then copy an alphanumeric code to add as a suffix to future prompts. That code is a 'fine-tune,' an aggregation of the styles you selected from MJ's initial output.
Now you can return to Midjourney and run a series of simple prompts plus that suffix, and the results should be stylistically similar: a cat --style 4HKRusE9NnXQ. You can continue to modify your style to push it in one direction or another and generate a new code if you like. So far I'm really impressed, and I think it's great when you need a large number of images in a consistent style. So how is this different from /prefer option set, which also allows for consistency among prompts with a uniform suffix? Because you can continue to fine-tune the result with a visual interface. A little awkward at first, but a nice move in the right direction.
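For example, reusing the code from my session across a few subjects:

a cat --style 4HKRusE9NnXQ
a rabbit --style 4HKRusE9NnXQ
an owl --style 4HKRusE9NnXQ

Each generation should read as part of the same stylistic family, even though the subjects differ.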
For example, I revisited the Style Tuner page, selected all the darkest examples of Scandinavian folklore, and generated a new suffix to apply to my previous prompts. The result was creepier than I expected, but it demonstrates the broad range of possibilities. Hopefully I can go back and find a balance between the two.
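The workflow is identical, just with the new suffix swapped in (shown here as a placeholder, since every tuning session generates its own code):

a cat --style <your-new-code>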
RunwayML Promises Higher Quality
RunwayML prompt: Drone footage, Oregon coast, crashing waves, golden hour.
RunwayML has released a major update to its text-to-video and image-to-video models that some filmmakers are calling a game-changing leap forward. If you check out the video in this Twitter thread, it really is amazing. Perhaps I don't know how to prompt as well, or they're cherry-picking the best of multiple generations, but my results don't look as good. Stay tuned; I have a few credits left and I'll try to generate something awesome. Pika Labs has a new update this week as well. If you like the horror genre, check out the winners of their Halloween AI Film Contest.
AI Product Design Course
These online classes from ELVTR are usually pretty expensive, so that's not what I'm pushing here. But if you are a product designer trying to transition into AI, you can scan and snipe a lot of good ideas from the syllabus. That's what I'm doing. Check it out. They also offer a Product Management for AI class, if that's your flavor.
Warning: don't spend a lot of time scanning that page for a price. You can only find out by having a salesperson call you back. Save time, scan the syllabus for project ideas, and just go build something.
Stability AI Announces Text to 3D
Do you work in 3D? Are you excited about the new possibilities of text-to-3D prompting? Apparently it's a thing now, and I've seen some convincing examples of it on YouTube. You can read about it here, and you'll also find a contact form that will get you early access if that's your thing.
That's a wrap! Short newsletter today. I have a bigger essay coming out on Tuesday, and I'm sure today's announcements will give me plenty to write about. I hope you're getting a lot out of this newsletter and sending it on to your friends.