https://aws.amazon.com/what-is/foundation-models/
<aside> 💡
A foundation model is a large, pre-trained AI model designed to be adaptable to a wide range of downstream tasks. These models are trained on massive datasets and can then be fine-tuned for specific applications without needing to be retrained from scratch.
</aside>
Some examples are GPT-4, Claude, and Llama, as well as image-generation models like Stable Diffusion and DALL-E.
Since such models are trained to perform a wide variety of tasks, adapting one is a cheaper alternative to training a model from scratch for a specific task, say code generation.
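The pre-train-then-adapt pattern above can be sketched with a toy linear model in plain Python — no real foundation model involved, and the datasets and hyperparameters are purely illustrative. The point is that fine-tuning starts from already-learned weights and only needs a small task-specific dataset:

```python
import random

random.seed(0)

def train(weights, data, lr=0.01, epochs=200):
    """Fit y = w*x + b by per-sample gradient descent on squared error."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

# "Pre-training": learn a general trend (y ≈ 2x) from a broad dataset.
broad = [(x / 10, 2 * x / 10) for x in range(-50, 50)]
pretrained = train((random.random(), random.random()), broad)

# "Fine-tuning": a small, task-specific dataset (y ≈ 2x + 1), starting
# from the pretrained weights rather than from scratch.
task = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
fine_tuned = train(pretrained, task, epochs=500)

w, b = fine_tuned
print(round(w, 2), round(b, 2))  # weights adapted toward y = 2x + 1
```

Real fine-tuning works the same way in spirit: the pretrained weights encode general knowledge, so only a relatively small amount of task data and compute is needed to specialize the model.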
Foundation models are based on complex neural network architectures, including generative adversarial networks (GANs), transformers, and variational autoencoders (VAEs).