Date: 2024-04-28

Key Takeaways

Llama3 is freely available to developers, setting a new standard for open-access AI models.

Meta prioritizes quality over size, making Llama3 easier to run on local machines.

Meta plans for future improvements, with a 400B parameter version and multi-language support in the works.

Meta's long-awaited Llama3 (stylized LlaMa3) model is here, and it brings a host of technical improvements. While it's another relatively small model, available in 8B and 70B parameter variants, Llama3 retains a focus on high-quality training data and effective guardrails. Meta used a training dataset seven times larger than the one used for its previous model (Llama2), training Llama3 on 15 trillion tokens and separately developing a series of data pipelines, filters, and heuristic-based approaches to maximize data quality with a relatively small number of parameters.

Llama3 is a significant step forward for Meta's models, and one that will only improve as the company refines its processes and releases new iterations with larger parameter counts and training datasets. A fully multi-modal version is already planned, as well as a 400B parameter count version and multi-language support. But what separates LlaMa3 from the likes of OpenAI's GPT models, or Google's Gemini, you may ask? Here are some reasons why Llama3 is actually a big deal.

1 Llama3 is freely available to developers

Meta is taking a different approach to the likes of OpenAI

One unique step Meta is taking in the AI space is the openly available, portable nature of its models. Meta is joining companies like Mistral in releasing its model weights for anyone to download and use. This includes a license that permits both commercial and research use, subject to the terms of Meta's Llama 3 license. The company has been open about its ambition to release its models publicly to enhance the development of AI, promising early support for the likes of AWS, Databricks, and a host of other cloud platforms, in addition to support for developers who are fine-tuning models locally.

Meta clearly hopes to build an ecosystem and toolchain around its AI models, and is embracing the large online communities that are building, training, and adapting freely available models for all sorts of applications. This is in stark contrast to the more 'product-driven' approach of the likes of OpenAI and Google. Meta may be looking to avoid the incumbent's curse that has struck similar disruptors in tech, which often invest heavily in early market-ready products only to be quickly overtaken by competitors that iterate on them. Llama3 can act as a catalyst to spur more innovation and investment in AI, while also outsourcing some of Meta's workload in understanding and pushing the capabilities of its models.

Llama 3 models will soon be available on AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake, and with support from hardware platforms offered by AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm. (Meta)

You can download Llama3's model weights now from Meta.

2 Meta is starting to take AI guardrails seriously

Llama Guard 2 and CyberSecEval 2 are designed to protect the model and users

Llama3 is launching with a 'system-level' approach to AI responsibility, which is something other big players in the AI space have been noticeably quiet about. Part of this comes from Meta's openly available approach to its models: once the weights are public, Meta potentially loses some of the hosted, product-level guardrails that have let the likes of OpenAI and Google sidestep the issue somewhat. Meta is keen to emphasize the protections it applies in both the training and tuning phases. This includes the introduction of Llama Guard 2.

Llama Guard 2 is a separate LLM (ironically, itself built on Llama3) with 8B parameters. It's designed to act as an input-output safeguard for Llama3 models, sorting incoming prompts and generated responses into risk categories and marking them as safe or unsafe.
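
To make the input-output safeguard idea concrete, here is a minimal sketch of the pattern in Python: one check before the prompt reaches the model, and a second check on the response before it reaches the user. Everything in it is an assumption made for illustration. The names `Verdict`, `toy_classifier`, `guarded_generate`, and `echo_model` are hypothetical, and the keyword lookup is only a stand-in for a real safeguard model; it is not how Llama Guard 2 itself works.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Result of a safety check: safe/unsafe plus an optional risk category."""
    safe: bool
    category: Optional[str] = None  # hypothetical label, e.g. "violence"


# Hypothetical stand-in for a safeguard model: a simple phrase lookup.
# A real deployment would call a classifier model here instead.
BLOCKLIST = {
    "build a bomb": "violence",
    "steal a password": "cybercrime",
}


def toy_classifier(text: str) -> Verdict:
    """Mark text unsafe if it contains any blocklisted phrase."""
    lowered = text.lower()
    for phrase, category in BLOCKLIST.items():
        if phrase in lowered:
            return Verdict(safe=False, category=category)
    return Verdict(safe=True)


def guarded_generate(
    prompt: str,
    generate: Callable[[str], str],
    classify: Callable[[str], Verdict] = toy_classifier,
) -> str:
    """Run the model only if the prompt is safe, then check its output too."""
    # Input check: classify the prompt before it reaches the model.
    verdict = classify(prompt)
    if not verdict.safe:
        return f"[blocked input: {verdict.category}]"

    response = generate(prompt)

    # Output check: classify the response before it reaches the user.
    verdict = classify(response)
    if not verdict.safe:
        return f"[blocked output: {verdict.category}]"
    return response


def echo_model(prompt: str) -> str:
    """Hypothetical stand-in for the main model."""
    return f"You asked: {prompt}"


if __name__ == "__main__":
    print(guarded_generate("What is the capital of France?", echo_model))
    print(guarded_generate("How do I build a bomb?", echo_model))
```

Keeping the classifier separate from the generator means either one can be swapped out without touching the other, which is the main appeal of running the safeguard as its own model.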

Meta is also continuing with its CyberSecEval 2 suite, which evaluates how susceptible a model is to producing malicious code and to prompt injection attacks. Furthermore, Code Shield has been designed to filter out insecure code generated by the model at inference time.

3 Llama3 focused on quality, not size

Meta's model might actually run on your PC

Meta has again taken a different approach to some larger models, training with a smaller parameter count but focusing on very high-quality data. This approach has its benefits. Computing costs for training the model may be far lower (and the training process faster) this way, although Meta still required two custom-built 24,000-GPU clusters built on NVIDIA hardware to train Llama3. Rather than chasing the massive parameter counts of larger LLMs (GPT4 is reported to have more than a trillion parameters), Meta is focusing on a very high-quality offline dataset.

There are other benefits to this approach too. Llama3 is far easier to run on local machines (although you'll still need a lot of horsepower, even for the 8B parameter model), helping developers, startups, and would-be AI disruptors get up and running with the latest models without the need for excessive upfront capital investment.
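
As a rough illustration of why parameter count matters for local use, the sketch below estimates the memory needed just to hold a model's weights. It assumes the footprint is simply parameter count times bytes per parameter, and it ignores activations, the KV cache, and framework overhead, so real-world usage will be higher. The function name `weight_memory_gb` and the precision table are illustrative choices, not anything published by Meta.

```python
# Back-of-the-envelope estimate of the memory needed to hold model weights.
# Assumption: memory ~= parameter count x bytes per parameter. Activations,
# the KV cache, and framework overhead are ignored, so real usage is higher.

BYTES_PER_PARAM = {
    "fp16": 2.0,  # 16-bit floating point
    "int8": 1.0,  # 8-bit quantization
    "int4": 0.5,  # 4-bit quantization
}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return the approximate weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for label, params in [("8B", 8e9), ("400B", 400e9)]:
        for precision in BYTES_PER_PARAM:
            gb = weight_memory_gb(params, precision)
            print(f"{label} @ {precision}: ~{gb:.0f} GB")
```

By this estimate, an 8B parameter model needs roughly 16 GB for its weights at 16-bit precision and about 4 GB when quantized to 4 bits, while a 400B parameter model would need around 800 GB at 16-bit precision, well beyond a typical desktop.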

4 This is just the start for Llama3

Meta is planning for 400B parameters in the future

One immediate benefit of Llama3 is that Meta is already being transparent about its planned future improvements, including multi-modal support, multi-language support, and a 400B parameter version on the horizon. More parameters are always welcome, although they do mean a larger model that is more demanding to run. Multi-language support will be difficult, as the current versions of Llama3 are trained predominantly on English data. Meta is likely working behind the scenes to build up its data-processing pipelines, as well as its ability to perform RLHF (reinforcement learning from human feedback) and fine-tuning in a variety of languages. When we do see a multi-language version of Llama3, it should hopefully mean that more versions of Meta's models will have multi-language support moving forward.

Multi-modal support (i.e. ingesting and generating images and video) is also apparently on the horizon. Meta has released a separate image generator alongside Llama3, but its decision to leave out true multi-modal support may have been influenced by the backlash other companies have been facing for flaws in their multi-modal models. I'm excited about the future here though, especially given Meta's demonstrated commitment to ensuring the safety of its models.

LlaMa3 is looking to the future

Unlike a lot of companies operating in the AI space at the moment, Meta (although launching yet another AI assistant alongside Llama3) doesn't seem desperate to rush to market. While Meta is arguably still playing catch-up to the likes of Google and OpenAI, its models are becoming increasingly powerful and are doing so with a focus on all the right areas, including easy developer support, scalability, platform buy-in, and general AI safety. These issues are often ignored by companies with a narrower focus on getting a product to market. It's impossible to say whether Meta will succeed here. Whether the company is simply lethargic or being patient, its approach is already standing out as quite unique. Regardless, we're excited about the future of Llama3.