[NetBehaviour] ChatGPT-4o: Breakthrough or Bust?

Mez Breeze via NetBehaviour Tue, 14 May 2024 03:50:35 -0700

Hihi All.

So I've just written this Patreon post
<https://www.patreon.com/posts/104222987> all about OpenAI's latest
release, ChatGPT-4o. I'm curious to see what peeps here make of it:


--

_ChatGPT-4o: Breakthrough or Bust?_

ChatGPT-4o [or Omni as the cool kids at OpenAI term it] sauntered into the
release-spotlight earlier today, with OpenAI writing on their website that
it’s:

*“…a step towards much more natural human-computer interaction—it accepts
as input any combination of text, audio, and image and generates any
combination of text, audio, and image outputs. It can respond to audio
inputs in as little as 232 milliseconds, with an average of 320
milliseconds, which is similar to human response time in a conversation.”*

Released alongside a bevy of slick promo videos parading its new multimodal
chops, ChatGPT-4o is touted as the next big thing. One such video shows one
version of GPT-4o narrating visual scenes to its visually impaired AI mate
[while it asks questions in turn], all while the human testers fidget
impatiently on the sidelines, barely masking their urge to fast-forward to
the good bits.

The grand unveiling of GPT-4o could’ve been lifted straight from a sci-fi
script. We watched [some in awe, some with cynicism] as these videos
attempted to paint a future where AIs chat about the décor in an empty room
and then sing a duet about the process. And yet it's hard not to chuckle at
the not-so-subtle desperation in the human testers, who seem hell-bent on
skipping/interrupting the more stilted voice-scripted parts of the AI
dialogue and hurrying to reach the 'money shot' of the demo [i.e. the
concrete feature they are trying to showcase]. It raises the question: was the
fanfare a bit premature? After all, this wasn’t the GPT-5 release party
everyone had RSVP’d to.

Diving into the nitty-gritty, the AI’s voice tech is frankly impressive,
reminding us of those Figure 01 Speech-to-Speech Reasoning clips
<https://www.youtube.com/watch?v=Sq1QZB5baNw&ab_channel=Figure> from a
while back. The voices are spot-on: the AI sounds like a person, with
natural human-emulated cadences, sub-vocalisations, and tonalities [the
laughter is weirdly realistic]. What is far less impressive is the
replication of bias in the gendered aspects of the simulated speech, with
the female voices being far more sexualised/flirty than the male ones. It's
especially disappointing given the potential ramifications for how this
will impact users and perpetuate current gender biases. In 2024, one would
hope we’d be past such gendered gimmicks [but….no].

Now let’s talk timing: releasing ChatGPT-4o for free might seem like a
clever move, but the cynics among us might sniff something fishy. Rumour
has it OpenAI’s nearly chewed through the entire web for data to feed its
language models, so why not throw open the gates and let the global crowd
serve up fresh fodder? It’s a clever ploy for scraping up every last
crumb of human interaction to power their data-hungry tech.

Let’s not be too harsh, though. OpenAI’s latest toy is a bit of a marvel:
in some ways it’s like watching a new species come to life. But as we ooh
and ahh over this latest OpenAI release, are we so dazzled by the prospect
of talking chat [and vision-processing] bots that we overlook the less
glamorous implications, like privacy erosion and data exploitation? [And
let’s not kid ourselves, wasn't everyone really hanging out for GPT-5?]

In essence ChatGPT-4o is a bit of a mixed bag. It's an impressive party
trick, sure, but when the release-glitter settles we’re left pondering what
it really brings to the table. It’s a step forward no doubt, but also a
sidestep, a fancy detour on the road to more profound innovations. The real
kicker isn’t what this AI can do but what it represents in the grand scheme
of things [all up a blend of breakthrough and bias, of marvels and missed
opportunities]. So let’s marvel at the spectacle but remain skeptical of
the smoke and mirrors. After all [in the current world of AI
release-hype] every new release is a double-edged sword, and it’s up to us
to suss out which edge to grab.
-- 
| mezbreezedesign.com

_______________________________________________
NetBehaviour mailing list
[email protected]
https://lists.netbehaviour.org/mailman/listinfo/netbehaviour
