10 Things I Learned About AI This Week

The days are longer...

Jun 12, 2023

As an experiment, I’m doubling the amount of content this week. I have a poll at the end of the newsletter if you’d like to share your thoughts.

1. Google has a free AI/ML course

Google has released a free Generative AI learning path that includes courses on topics such as Introduction to Generative AI, Large Language Models, and Image Generation. It is available to everyone, regardless of programming experience. Enroll if you are interested in learning about Generative AI. I’m going to start as soon as school is done for the summer. Message me if you want to chat about it.

2. Midjourney 6 Is Coming… and I’m Not Ready

I’ve barely caught up with versions 5 and 5.1, and now Midjourney is talking about version 6, which should include the following:
Higher Resolution: Expect higher-resolution images and possibly Midjourney’s own upscaler.
Improved Natural Language Processing: It will interpret text inputs better.
More Accurate and Detailed Images
Increased Degree of Control
Variation Change: A possible mode for higher and more subtle variations

There are also rumors of a possible “DragGAN” interface and out-painting. Read the whole article by Christie C. here.

3. Video Tool of the Week: Runway

I wanted to test the current state of the art in “text to video” and “image to video,” so I quickly mocked up this fake beer commercial. I’m not saying it’s brilliant, but the script was written by ChatGPT (with a little editing), the images were from Midjourney, the audio is from Eleven Labs, and the video was from text and image prompts in Runway’s new Gen-2 tool. It took me about an hour. If you’d like a deeper analysis of this tool, check out this article from TechCrunch.

4. Audio Tool of the Week: Eleven Labs

Eleven Labs was incredibly easy to get started with. I literally just copied my script from ChatGPT, selected a voice from a short list of samples, hit the play button to listen, and then hit the download button. The only challenge is that the free plan limits the character count, so I had to break up my 45-second script into 3 sections, but they all sounded great and were easily dropped into my video editor.
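If you’d rather not split a script by hand, here is a minimal Python sketch of one way to do it. It breaks the text between sentences so that each section stays under a character limit. The function name `split_script` and the `max_chars` parameter are my own illustration, not part of any Eleven Labs tool, and you would need to look up the actual limit for your plan.

```python
import re


def split_script(script: str, max_chars: int) -> list[str]:
    """Split a script into sections of at most max_chars characters.

    Sections only break between sentences, so no sentence is cut in half.
    max_chars is a hypothetical limit; check your plan for the real one.
    """
    # Split wherever whitespace follows a period, exclamation mark, or question mark.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())

    sections: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue  # an empty script yields one empty string; skip it
        if len(sentence) > max_chars:
            raise ValueError("A single sentence is longer than the character limit")
        # Try adding the sentence to the section being built.
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The section is full: save it and start a new one.
            sections.append(current)
            current = sentence
    if current:
        sections.append(current)
    return sections
```

Calling `split_script(my_script, max_chars=...)` with your own text and limit returns a list of sections that you can paste in one at a time.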

5. Adobe Express Gets Image Generation

Adobe Express (formerly Spark) is Adobe’s social media tool and collection of templates that make it easy for non-designers to quickly generate content. Watch the one-minute video here; it looks even more amazing now that it incorporates Photoshop’s Generative Fill.

Today we’re bringing the magic of Adobe Firefly directly into Adobe Express, letting users create stunning imagery through Text-to-Image and Text-Effects by prompting Express with their own words in natural, conversational language.

6. Getting Better Prompts Out of Midjourney

Frank Andrade (the PyCoach on Medium) has become one of my main sources of inspiration. If you’d like a deeper dive into understanding how Midjourney works, check out this article where he trains ChatGPT to write better Midjourney prompts. And if you’re just getting started with Midjourney, start with this beginner-friendly article.

7. Inspiration of the Week

If you’re one of my second-year students, you’ve already seen this because I shared it back in April. This is a real-world example of how a creative team used generative tools to create a campaign with real goals and metrics for retirement accounts for ING Poland. Check out Future Me; it’s beautiful work. Make sure you scroll down to the 2-minute video at the end.

8. Deeper Understanding

If you’re like me and you realize that you need to understand the foundation, not just learn how to use the tools, you might like this article, A Gentle Introduction to Large Language Models Without the Hype. It really helped clear up some difficult concepts about Artificial Intelligence, Machine Learning, and Neural Networks.

9. For the UX or Product Designers

In Product Design in the Age of AI, Kazden Cattapan discusses the coming changes we should expect, but points out that “understanding people, solving problems, and communication skills” will never go out of fashion.

As we enter this new era of computing, we’re seeing the rise of declarative UI — interfaces that allow a person to specify a desired outcome, leaving the rest to a system. Put another way, we’re iterating on human-computer interaction and making it easier for humans to get a computer to do what they want with less input, and the computers are understanding the way humans speak, rather than humans having to learn how a computer speaks.

10. Ethical Moment

If you’re concerned about the proliferation of deepfakes that is about to hit (is hitting) the internet, you might be relieved by this article from Vox that goes into detail about what Microsoft, Google, and Adobe are trying to do about it.

One novel approach — that some experts say could actually work — is to use metadata, watermarks, and other technical systems to distinguish fake from real. Companies like Google, Adobe, and Microsoft are all supporting some form of labeling of AI in their products.

Thanks for reading! If you find value in this content, you can do me a huge favor and share it with a friend!

Fadimantium
