General-purpose ML Models

Recently, Large Language Models (LLMs for short) have made quite a splash. Most notably, OpenAI released its GPT-3 model in 2020.

The typical use cases

Large Language Models have been trained and commercialized by a few organizations. If you venture onto their websites, you’ll find samples of what these models are capable of. Some common examples:

Text summaries/classifications – Given an input text, you can ask the model to output a summary of the text, or assign it a category.
Chatbots – Given an input chat history, the model can return a response as if written by a person. That is, it follows the flow of the conversation and can be used as a chatbot.
Full-on text generation – Give the model a description of a short story, essay or article, and it can generate the text for you.
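
All three use cases come down to the same operation: send the model some text and let it continue. The sketch below shows how each task might be phrased as a plain-text prompt. The function names and prompt wording are made up for illustration and are not taken from any vendor's documentation; the call that actually sends the prompt to a model is left out on purpose.

```python
def summary_prompt(text: str) -> str:
    """Build a prompt asking for a one-sentence summary (illustrative wording)."""
    return f"Summarize the following text in one sentence.\n\nText: {text}\n\nSummary:"


def classification_prompt(text: str, categories: list[str]) -> str:
    """Build a prompt asking the model to pick one of the given categories."""
    options = ", ".join(categories)
    return (
        f"Assign the following text to one of these categories: {options}."
        f"\n\nText: {text}\n\nCategory:"
    )


def chat_prompt(history: list[tuple[str, str]], bot_name: str = "Assistant") -> str:
    """Render a chat history as text, ending where the bot's reply should begin."""
    lines = [f"{speaker}: {message}" for speaker, message in history]
    lines.append(f"{bot_name}:")  # the model is expected to continue from here
    return "\n".join(lines)
```

Calling chat_prompt([("User", "Hi!")]) gives a two-line string, "User: Hi!" followed by "Assistant:", and whatever the model writes next becomes the chatbot's answer. The only thing that changes between the three tasks is the text, not the model.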

The surprising use cases

After the first LLMs were released into private betas, it became apparent that the models “understood” more than freeform text. For instance, they could write and understand code. Very soon, blogs and technical social media were bursting with examples of people generating simple websites or using LLMs to solve programming exercises. This was all done simply by telling the model what to do, as if you were describing it to a person.

Other interesting use cases included chess. It turns out that in the process of training on data from the internet, the models had learned how to play chess to some degree. This is simply because there are text-based ways of representing chess games (used extensively before modern chess websites). In short, if it’s on the internet and exists in text format, chances are these models have been trained on it.
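
To make "chess as text" concrete, here is a small sketch that writes a list of moves as numbered movetext in algebraic notation, the style used in PGN files. Which formats the models actually saw during training is not stated here, so the choice of notation is an assumption for illustration, and to_movetext is a made-up helper name.

```python
def to_movetext(moves: list[str]) -> str:
    """Join moves in algebraic notation into numbered movetext.

    Moves alternate white, black, white, ... and each numbered entry
    holds one white move and, if present, the black reply.
    """
    parts = []
    for i in range(0, len(moves), 2):
        move_number = i // 2 + 1
        pair = " ".join(moves[i:i + 2])
        parts.append(f"{move_number}. {pair}")
    return " ".join(parts)


opening = ["e4", "e5", "Nf3", "Nc6", "Bb5"]
print(to_movetext(opening))  # prints: 1. e4 e5 2. Nf3 Nc6 3. Bb5
```

A game written this way is just another string. A model that has seen enough such strings can be asked to continue one, which amounts to suggesting the next move.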

Effects

These models have turned out to be general-purpose. The exact same model can be used to solve a whole range of problems, as long as the problems can be represented as text. For applicable problems, using an existing model lowers the cost of utilizing ML by orders of magnitude.

Already, at the time of writing, you might’ve started to see the effects. Chat and email clients can suggest entire responses for you. In programming text editors, there are now plugins that generate code from a comment describing what you want to do (or by inferring it from context). Expect this development to continue. LLMs will find their way into more and more corners of the software you use daily.