GPT-4 Launch Details
Yesterday, OpenAI released the GPT-4 Technical Report and gave a live developer demo showing the new model in action. Here is a short summary of the main launch details.
Multimodal
GPT-4 accepts both text and image inputs, but its output is text only.
New LLMs are increasingly moving in this direction. I had expected GPT-4 to be capable of image output as well as text, but it seems we will need to wait a bit longer for that.
The live developer demo showed some cool use cases for this, for example taking a photo of a hand-drawn UI wireframe and having GPT-4 generate the code for a simple application from it.
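As a rough illustration of how such a multimodal request might look, here is a minimal sketch using the OpenAI Python SDK. Note that image input was not generally available through the API at launch, so the model name, the mixed text-and-image message format, and the file name are assumptions for illustration rather than confirmed launch details.

```python
import base64
from openai import OpenAI  # assumes the OpenAI Python SDK (v1.x)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Load the hand-drawn wireframe photo and base64-encode it for the request.
with open("wireframe_sketch.png", "rb") as f:  # hypothetical file name
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Send one user message that mixes a text part and an image part.
response = client.chat.completions.create(
    model="gpt-4",  # placeholder; image input was not exposed via the API at launch
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Turn this wireframe into a simple HTML/CSS page."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)  # the generated code, returned as plain text
```

If the call succeeds, the response is plain text containing the generated code, matching the text-only output described above.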
Longer Context
A big improvement over earlier models is that GPT-4 can now handle inputs of up to 25,000 words in a single context.
This makes it far more capable at summarising large documents and letting the user query information within them.
The live demo again showed a new use case: the presenter pasted a lengthy technical documentation page into GPT-4 and asked it to help resolve a bug in their code using the documentation provided.
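To make that long-context workflow more concrete, here is a minimal sketch of the same pattern: pasting a documentation page and the buggy code into one prompt and asking for a fix. It assumes the OpenAI Python SDK; the file names and the large-context model name are assumptions, not confirmed launch details.

```python
from openai import OpenAI  # assumes the OpenAI Python SDK (v1.x)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical local copies of the documentation page and the buggy code.
docs = open("api_documentation.txt", encoding="utf-8").read()
buggy_code = open("my_script.py", encoding="utf-8").read()

response = client.chat.completions.create(
    model="gpt-4-32k",  # assumed large-context variant; availability varied at launch
    messages=[
        {"role": "system", "content": "You are a helpful programming assistant."},
        {
            "role": "user",
            "content": (
                "Here is the library documentation:\n\n"
                f"{docs}\n\n"
                "And here is my code, which raises an error:\n\n"
                f"{buggy_code}\n\n"
                "Using the documentation above, explain the bug and suggest a fix."
            ),
        },
    ],
)

print(response.choices[0].message.content)
```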
Human-Level Performance on Some Standardised Tests
OpenAI evaluated the new model on a number of standardised tests, such as the bar exam.
Impressively, GPT-4 scored in the top 10% of test takers on the bar exam, whereas ChatGPT scored in the bottom 10%. Other tests show similar improvements over ChatGPT.
Interestingly, GPT-4 with image input gave even better results on some tests.
Architecture & Training Data
In a departure from previous AI papers, where it has been normal to describe model architecture in detail, OpenAI chose not to give any such details this time, citing the “competitive landscape” and safety concerns.
The only information given is that GPT-4 is a “Transformer-style model” pre-trained on publicly available data and data licensed from third parties, and that it was then fine-tuned using Reinforcement Learning from Human Feedback (RLHF), in the same way ChatGPT was aligned.
Unfortunately, I suspect this lack of architectural detail will become the new normal as AI competition continues to heat up.
Limitations
OpenAI states that GPT-4 is much better than ChatGPT at returning factual responses and blocking undesirable content, making it more aligned.
It still ‘hallucinates’ at times, though. Could the better accuracy actually be more dangerous? The model is still not 100% factual, and the concern is that users may trust it more precisely because it makes errors less frequently.
GPT-4 and Bing
On another note: as rumoured, Microsoft has confirmed on its blog that Bing has been running on an early preview version of GPT-4, so you may have already interacted with GPT-4 over the last few weeks.