Claude 3 is now available for all Cody users
New Large Language Models (LLMs) launch nearly every week, and hardly a month goes by without a new model pushing performance benchmarks to new heights. We always want Cody to offer users the latest and greatest models, which is why LLM interoperability is a core feature of the product.
LLM interoperability helps with two things:
- Integrating with the latest models lets Cody become smarter and more capable shortly after those models are released
- As a user, you’re able to choose the models you prefer since Cody supports multiple options interchangeably
Earlier in March, we saw this trend continue with the announcement of the Claude 3 family of models, and today, we’re announcing that Claude 3 models are now available to all Cody users.
Why Claude 3?
Cody has supported Anthropic models since its first release, including Claude 2.0, Claude 2.1, and Claude Instant. In Anthropic’s recent announcement blog, they shared benchmarks showing that the Claude 3 family improves significantly on prior Claude models.
The Claude 3 family includes three models: Haiku (the fastest model), Opus (the most intelligent model), and Sonnet (which balances speed and intelligence). Anthropic’s blog goes into depth on where each of them shines, but we’ll share some specifics from their blog on why they’re good candidates for Cody:
- All Claude 3 models show better performance in code generation
- Sonnet is comparable in intelligence to Claude 2.1 while being 2x faster
- Opus is similar in speed to Claude 2.1 while being far more intelligent
- Opus excels at the Needle in a Haystack (NIAH) evaluation, which tests a model’s ability to recall information from a large amount of data. This maps directly to Cody’s need to pull relevant information out of the large amounts of code it uses as context
The Claude 3 models also offer a huge 200K token context window. Cody doesn’t use that entire window today; we cap Cody’s context at roughly 7K tokens so that less relevant context doesn’t drown out the most useful context and hurt response quality. Since Claude 3 performs far better on Needle in a Haystack and can more accurately recall relevant information from a large amount of context, we’ll be dramatically expanding Cody’s context window in the coming weeks.
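To make the token budget concrete, here’s a minimal sketch of what capping context at a fixed budget can look like. This is an illustrative assumption, not Cody’s actual implementation: the `Snippet` type, the `packContext` function, and the roughly-4-characters-per-token heuristic are all hypothetical.

```typescript
// Hypothetical sketch of budget-capped context assembly.
// Not Cody's actual implementation.

interface Snippet {
  path: string; // file the snippet came from
  text: string; // the code or docs to include as context
}

// Rough heuristic: ~4 characters per token for English text and code.
const approxTokens = (s: string): number => Math.ceil(s.length / 4);

// Pack snippets (assumed pre-sorted by relevance) until the
// token budget is exhausted, then stop.
function packContext(snippets: Snippet[], budget = 7000): Snippet[] {
  const packed: Snippet[] = [];
  let used = 0;
  for (const snippet of snippets) {
    const cost = approxTokens(snippet.text);
    if (used + cost > budget) break;
    packed.push(snippet);
    used += cost;
  }
  return packed;
}
```

With a small budget like this, ordering matters: the most relevant snippets have to come first or they never make it in. A much larger context window relaxes that constraint considerably.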
Tl;dr: the Claude 3 models are impressive, and they’re now another option that Cody users can select from as we continue to make Cody more customizable for devs’ preferences.
Cody’s performance with Claude 3
We rolled out Claude 3 models to our Cody Pro users in March, and early feedback has been extremely encouraging. Of the Cody Pro users who had previously selected non-default models for chat, roughly 55% switched to the new Claude 3 models in the month following launch.
We also recently launched an AI experiment tool called LLM Litmus Test to help you compare and contrast different models. You can test Claude 3 vs. Claude 2.1 or even Claude 3 vs. GPT-4 Turbo and generate shareable links with your results. Using the LLM Litmus Test, we can see some examples of Claude 3 Opus in action.
Example 1: Explaining a coding concept (React)
Here, you can see how Claude 3 Opus does a great job explaining coding concepts. Claude 3 explained how to use React’s useContext() as a step-by-step tutorial. Compare that to Claude 2.1, which dumped all of the example code into a single code block with less detailed instructions.
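For readers unfamiliar with the concept the models were asked about: useContext() lets a component read a value provided higher in the component tree without passing props through every level. Here’s a minimal sketch of the pattern; the `ThemeContext` and `ThemedLabel` names are our own illustrations, not taken from either model’s output.

```tsx
import React, { createContext, useContext } from "react";

// Create a context with a default value.
const ThemeContext = createContext<"light" | "dark">("light");

// Any descendant can read the nearest provided value,
// with no prop drilling through intermediate components.
function ThemedLabel() {
  const theme = useContext(ThemeContext);
  return <span>Current theme: {theme}</span>;
}

// The provider supplies the value to everything below it.
export function App() {
  return (
    <ThemeContext.Provider value="dark">
      <ThemedLabel />
    </ThemeContext.Provider>
  );
}
```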
Example 2: Explaining a coding concept (Axum)
We can also compare Claude 3 Opus with GPT-4 Turbo. We asked both models to explain another concept: creating a route with Axum. Both models provided a code block, but Claude 3 Opus did a distinctly better job of explaining the code step by step.