The future is headless

Originally published on the GRID blog.

An image processing demonstration

Most AI interactions today happen in chat interfaces, but what if ChatGPT could use external tools to respond?

Let’s say I’m preparing a presentation about AI researchers. I need profile photos of Fei-Fei Li, Geoffrey Hinton, and Yann LeCun that:

Have a consistent aspect ratio
Share a similar brightness and contrast
Apply a subtle, cohesive filter

Instead of turning to purpose-built software like Preview or Photoshop, what if we could do this from within our AI assistant?

Let’s see how far we get with Photo GPT, a custom GPT I’ve equipped with API access to simple image processing capabilities:

In only a couple of minutes, I have three processed, uniform, ready-to-use photos, all transformed through natural language:

Fei-Fei Li, Geoffrey Hinton and Yann LeCun

This isn’t Photoshop by any means. It’s not meant to be. But for simple, everyday tasks, having seamless access to such tools from within our “interface for everything” will be faster, easier and more convenient.
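Photo GPT's actual backend isn't shown in this post, but the requirements listed earlier (a consistent aspect ratio and similar brightness) map onto very small functions. The following is a minimal sketch under simplifying assumptions: images are plain nested lists of grayscale values from 0 to 255, and the names `center_crop` and `match_brightness` are hypothetical, chosen only for illustration.

```python
from statistics import mean

# Assumption: an image is a list of rows of grayscale values in 0..255.
Image = list[list[float]]


def center_crop(img: Image, aspect_w: int, aspect_h: int) -> Image:
    """Return the largest centered crop with an aspect_w:aspect_h ratio."""
    height, width = len(img), len(img[0])
    # Largest integer multiple of the ratio that fits inside the image.
    scale = min(width // aspect_w, height // aspect_h)
    if scale < 1:
        raise ValueError("image is too small for the requested ratio")
    crop_w, crop_h = scale * aspect_w, scale * aspect_h
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]


def match_brightness(img: Image, target_mean: float) -> Image:
    """Shift all pixels so the mean brightness moves toward target_mean."""
    offset = target_mean - mean(p for row in img for p in row)
    # Clamp to the valid range; heavy clamping means the target is only approximated.
    return [[min(255.0, max(0.0, p + offset)) for p in row] for row in img]
```

Running every photo through the same crop ratio and the same target mean is what makes a set look uniform. A real service would more likely build on an imaging library and work with color channels, but the shape of the capability is the same: a small, stateless function with clear inputs and outputs.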

In addition to face detection and color manipulation, Photo GPT is capable of a small range of image transformations and application of Instagram-like filters:

It can even fill in meme templates:

The best way to explore it is to simply ask it what it can do.

Go ahead and try it for yourselves here: Photo GPT

The potential of headless software in AI

As this experiment shows, current integration options in ChatGPT leave a lot to be desired:

It doesn’t work with uploaded images
It doesn’t work with generated images
It can’t reliably use images from ChatGPT’s built-in Web search
The capability is only available in this custom GPT rather than across all ChatGPT conversations

Yet ChatGPT’s function calling is far ahead of Claude’s.

But this gives a glimpse of the future. Just as the Google search box became the gateway to the web, the leading LLM interfaces have a chance of becoming the interface to all things digital.

The power of AI isn’t only in making the models bigger and more capable. There is a huge and immediate opportunity in making them better at using external tools. Instead of trying to do everything, LLMs can act as an interface to a world of headless functions, APIs, and execution engines.

This is the promise of headless software: modular, specialized capabilities that are available to automation and AI assistants regardless of the user interface.
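To make the idea concrete, here is a rough sketch of how a headless capability could be exposed to an assistant: a plain function is registered under its name, and a dispatcher executes tool calls that arrive as JSON. Everything here is an assumption for illustration. The `tool` decorator, the `resize` function, and the `{"name": ..., "arguments": ...}` message shape are hypothetical and only loosely modeled on function-calling conventions, not any vendor's actual wire format.

```python
import json
from typing import Callable

# Registry of headless capabilities, keyed by function name.
TOOLS: dict[str, Callable[..., dict]] = {}


def tool(func: Callable[..., dict]) -> Callable[..., dict]:
    """Register a function so an assistant can call it by name."""
    TOOLS[func.__name__] = func
    return func


@tool
def resize(width: int, height: int, scale: float) -> dict:
    """Hypothetical capability: compute the dimensions of a scaled image."""
    return {"width": round(width * scale), "height": round(height * scale)}


def dispatch(call_json: str) -> str:
    """Run one tool call of the assumed form {"name": ..., "arguments": {...}}."""
    call = json.loads(call_json)
    func = TOOLS.get(call["name"])
    if func is None:
        return json.dumps({"error": f"unknown tool: {call['name']}"})
    return json.dumps(func(**call["arguments"]))


# The model decides *which* tool to call; the function itself has no UI at all.
print(dispatch('{"name": "resize", "arguments": {"width": 800, "height": 600, "scale": 0.5}}'))
```

The point of the design is that `resize` knows nothing about chat, buttons, or screens. Any client that can produce the JSON message, whether an LLM, a script, or a scheduled job, gets the same capability.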

Right now, the only way to get new functions into ChatGPT is to create an entirely new GPT. But what people actually need are capabilities that are useful and relevant to them, and that they can use across their entire AI experience.

And the inevitable app store to serve these capabilities.
