Every video you see is now AI

OpenAI's Sora and Google's Gemini 1.5 announcements are shaping the next generation of AI models

Trey Layton
February 20, 2024

This is The Startup Breakdown, the newsletter where we learn, laugh, and love startups. By joining this growing community of hundreds of future startup aficionados (think I spelled that right?), you're getting a beachside view of the ocean that is the startup and VC scene. This ain’t your grandpa’s newsletter, so prepare yourself for an inbox full of 4/20 jokes and Succession references.

If you'd like to receive these newsletters directly in your inbox once a week, hit subscribe and never miss an email!

Love what you're reading? Craving even more startup goodness, in-depth news analysis, and maybe some extra memes? Click below to upgrade to our premium edition and become the startup guru you were born to be.

Happy Tuesday, folks.

Not gonna lie, had a pretty rough weekend mentally.

Just wanted to let you all know that if you’re going through it, you’re not alone.

Can’t tell you how much this community means to me.

Google Introduces Gemini 1.5, OpenAI Hijacks their Day

Google was giddy going into Thursday, knowing it was about to make AI nerds drool with its Gemini 1.5 announcement.

Then, OpenAI decided to crash the birthday party, open the birthday kid’s gifts, and take the birthday cake home before it was even cut.

Gemini is genuinely astounding, and its retrieval is something OpenAI may not be able to compete with.

1 million token context window (roughly 10 books of prompt space…)
99.7% retrieval success rate, better than most models with much smaller window sizes
Video analysis and comprehension
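
For a rough sense of where the "10 books" figure comes from, here's a back-of-the-envelope sketch. The words-per-token ratio and the average book length are my assumptions (common rules of thumb), not numbers from Google's announcement, so treat the result as a ballpark.

```python
# Back-of-the-envelope: how many books fit in a 1M-token context window?
# ASSUMPTIONS (rules of thumb, not from Google's announcement):
#   - about 0.75 English words per token
#   - about 75,000 words in a typical book
CONTEXT_TOKENS = 1_000_000
WORDS_PER_TOKEN = 0.75
WORDS_PER_BOOK = 75_000


def books_in_context(tokens: int) -> float:
    """Estimate how many average-length books fit in `tokens` tokens."""
    words = tokens * WORDS_PER_TOKEN
    return words / WORDS_PER_BOOK


if __name__ == "__main__":
    print(f"~{books_in_context(CONTEXT_TOKENS):.0f} books")
```

Change either assumption and the answer moves with it, which is why "roughly" is doing real work in that bullet.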

The context window and retrieval combo is a huge development for those who have experienced models getting lost and providing incorrect answers based on input data.

The question of whether OpenAI could technically match this accuracy, even on far fewer tokens, is irrelevant. The cost of doing so would be too great for even a company with OpenAI’s financial resources to bear.

Video support is also innovative, allowing users to feed in videos (the demo was a 45-minute movie) and get immediate summarization/analysis of the video contents.

The model trails GPT-4’s reasoning abilities, and unfortunately for Google, it’s called artificial intelligence, not artificial retrieval. Complex use cases are fairly limited for Gemini.

For good reason, there was immediate hype and praise for Google’s latest news, with some even suggesting that OpenAI’s lead may be slowly shrinking…

Sam Altman, however, decided to put these rumblings to rest immediately, launching OpenAI’s Sora, a text-to-video model that is going to supercharge deepfake production.

The model can turn simple text prompts into 60-second, detailed scenes complete with characters, changing camera angles, and even impressively accurate applications of the laws of physics.

Altman spent the rest of the afternoon producing videos for his followers, showcasing both the model’s unbelievable capabilities as well as Twitter’s weird, weird creative mind.

Once again, any sign of life from one of the other big tech giants hoping to compete at the LLM level was immediately squashed under the boots of OpenAI’s technical dominance.

Sora killed more than just the rumors of Google’s ascension, though. Hundreds of startups have been racing to be the first to build video models, for everything from simple product demos to more ambitious projects like full movies.

Many of these companies will not survive, and at this rate, no “AI” companies outside of Google/Mistral/Anthropic have the technical moat to stick around. OpenAI simply has too many resources, too much talent, and Sam Altman’s vision for a completely automated future.

Sora is empowering, however, for the companies using AI as a weapon to achieve their mission rather than as the mission itself. Every other week, companies that use AI get more capabilities at lower costs and with quicker turnarounds.

Even at Info, our app is about to get 100x better as the news content we produce gets a boost from Sora’s content generation.

Not every consequence of Sora will be positive, though, and deepfakes, which are already terrifying, will get 10 times better.

I’m excited to see the private solutions that emerge, such as those already being experimented with in the watermarking and cryptographic verification space.
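
To make "cryptographic verification" a little more concrete, here's a minimal sketch of the core idea: a publisher tags the exact bytes of a video, and anyone holding the key can check that nothing was changed. This is not how any particular product works. Real provenance schemes generally rely on public-key signatures and signed metadata; I'm using a shared-secret HMAC from Python's standard library as a stand-in so the example runs as-is. The key and the function names `sign_video` and `verify_video` are hypothetical.

```python
import hashlib
import hmac

# HYPOTHETICAL key for illustration only; a real publisher would keep this
# secret, or more likely use a public/private key pair instead.
PUBLISHER_KEY = b"demo-key-not-for-production"


def sign_video(video_bytes: bytes, key: bytes = PUBLISHER_KEY) -> str:
    """Return a hex tag that ties the key holder to these exact bytes."""
    return hmac.new(key, video_bytes, hashlib.sha256).hexdigest()


def verify_video(video_bytes: bytes, tag: str, key: bytes = PUBLISHER_KEY) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = sign_video(video_bytes, key)
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    original = b"pretend these are the bytes of a real video file"
    tag = sign_video(original)
    print(verify_video(original, tag))                 # True
    print(verify_video(original + b" (edited)", tag))  # False
```

Note what this does and doesn't prove: a valid tag says the file is unchanged since the publisher tagged it, not that the footage is real. That's the gap watermarking and provenance metadata are trying to close.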

Not sure when we will get access to Sora. But I am already looking forward to the projects that some of you very creative minds put together and fearing the inevitable fake videos of Elon asking for $DOGE donations to a specified wallet.

Have fun/stay safe out there.

Google's Gemini 1.5 astonishes with its advanced retrieval capabilities, challenging OpenAI's dominance. OpenAI counters with Sora, introducing next-level text-to-video transformations, intensifying the AI innovation race.

Have other feedback? Reply directly to this email and let me know!

Cheers to another day,

Trey
