Generative AI models maintenance policy

This page describes the model maintenance policy for the Foundation Model APIs pay-per-token, Foundation Model APIs provisioned throughput, Batch Inference with ai_query, and Foundation Model Fine-tuning offerings.

To keep supporting state-of-the-art models, Databricks manages models through a lifecycle that progresses from update to deprecation to eventual retirement.

  • Update: Databricks applies incremental updates to a model to deliver optimizations. See Model updates.
  • Deprecated: A deprecated model is no longer recommended for new workloads, but remains available in workspaces with existing usage of the model. Workspaces that aren't using the model at the time of deprecation no longer have access to it.
  • Retired: A retired model is no longer accessible, and support for the model is fully discontinued. Any workload using the model stops working.

Model deprecation policy

When Databricks deprecates a model, that model is no longer recommended and is planned for retirement. Databricks announces retirement dates for deprecated models with the notification timelines summarized in the following sections. Retirement dates might be announced at the time of deprecation or at a later date. After the retirement date, the model is no longer accessible, and any workload using it stops working.

For deprecated and retired models and their announced retirement dates, see Deprecated and retired models. For partner models, see Partner model retirement policy.

Important

The deprecation policies for the Foundation Model APIs pay-per-token and Foundation Model Fine-tuning offerings apply only to supported chat and completion models.

Foundation Model APIs

The following table summarizes the deprecation policy for the Foundation Model APIs pay-per-token, Foundation Model APIs provisioned throughput, and Batch Inference with ai_query offerings.

Deprecation notification: Databricks takes the following steps to notify customers about a model deprecation:
  • In the Databricks UI, a warning message indicates that the model is deprecated.
  • The applicable documentation contains a notice that indicates the model is deprecated, along with a retirement date if one has been announced.

Transition to retirement: After deprecating a model, Databricks announces a retirement date three months or more in the future. During this transition period:
  • The model remains available only for workspaces with existing workloads that use it, until the announced retirement date.
  • The deprecated model isn't accessible from workspaces that weren't actively using it at the time of deprecation.
  • Customers with existing workloads should migrate to the recommended replacement model or sunset affected workloads.

On the retirement date: The model is no longer available for use and is removed from the product. Any existing workloads using the model stop working. Applicable documentation is updated to indicate that the model is no longer available, and to recommend a replacement model.

Partner model retirement policy

Partner models are models that third-party partners (specifically OpenAI, Anthropic, and Google) provide through Foundation Model APIs. For these partner models, Databricks generally follows the same deprecation timelines and policies described above.

However, partners might announce a retirement with less notice than the three-month transition period that Databricks publishes. In these cases, Databricks attempts to bridge the gap by temporarily redirecting requests for the retiring model to a similar version, so customers receive the full transition time.

For example, if a partner announces a model retirement with one month's lead time instead of three, Databricks redirects requests for that model for an additional two months to prevent immediate breakage and allow time for migration. After the full three-month period ends, queries to the retired model fail.
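The arithmetic in this example can be sketched in a few lines of Python. This is an illustration only, not Databricks code: the function name `redirect_months` and the constant `TRANSITION_MONTHS` are hypothetical, and the sketch assumes that lead times are counted in whole months and that a compatible replacement model is available for redirection.

```python
# Hypothetical helper that illustrates the redirect window; not a Databricks API.
TRANSITION_MONTHS = 3  # transition period that Databricks publishes


def redirect_months(partner_lead_months: int) -> int:
    """Return the months of redirection needed to reach the full transition period."""
    if partner_lead_months < 0:
        raise ValueError("lead time cannot be negative")
    # No redirection is needed when the partner already gives three months or more.
    return max(0, TRANSITION_MONTHS - partner_lead_months)


print(redirect_months(1))  # 2
print(redirect_months(3))  # 0
```

With one month of partner notice the sketch yields two months of redirection, which matches the example above.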

Note

This redirection can only occur if the replacement model has the same price and is backwards compatible. The replacement model is usually an incremental version of the same model, such as 3.1 replacing 3.0.

Foundation Model Fine-tuning

The following table summarizes the deprecation policy for Foundation Model Fine-tuning.

Deprecation notification: Databricks takes the following steps to notify customers about a model deprecation:
  • In the Experiments tab, a warning message appears in the drop-down menu for Foundation Model Fine-tuning that indicates that the model is deprecated.
  • The applicable documentation contains a notice that indicates the model is deprecated, along with a retirement date if one has been announced.

Transition to retirement: After deprecating a model, Databricks announces a retirement date three months or more in the future. During this transition period, customers should migrate their workloads to a recommended replacement model, or delete the affected endpoint.

On the retirement date: The model is no longer available for use and is removed from the product. Applicable documentation is updated to recommend using a replacement model.

Model updates

Databricks might ship incremental model updates to deliver optimizations. When Databricks updates a model, the endpoint URL remains the same, but the model ID in the response object changes to reflect the date of the update. For example, if Databricks ships an update to meta-llama/Meta-Llama-3.3-70B on 3/4/2024, the model ID in the response object becomes meta-llama/Meta-Llama-3.3-70B-030424. Databricks maintains a version history of the updates. For details, contact your Databricks account team.
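Client code that logs or compares model IDs may want to separate the base model name from the update date. The following Python sketch shows one way to do that. It assumes the suffix is always six digits in MMDDYY order, which is inferred from the single example above (reading 3/4/2024 as month/day/year) and is not a documented format. The function name `split_model_id` is hypothetical.

```python
import re
from datetime import date
from typing import Optional, Tuple

# Assumption: an updated model ID ends in "-MMDDYY", as in the example above.
_SUFFIX = re.compile(r"^(?P<base>.+)-(?P<mm>\d{2})(?P<dd>\d{2})(?P<yy>\d{2})$")


def split_model_id(model_id: str) -> Tuple[str, Optional[date]]:
    """Split a model ID into its base name and update date, if a date suffix is present."""
    match = _SUFFIX.match(model_id)
    if match is None:
        return model_id, None
    try:
        updated = date(2000 + int(match["yy"]), int(match["mm"]), int(match["dd"]))
    except ValueError:
        # Six trailing digits that do not form a valid date: treat them as part of the name.
        return model_id, None
    return match["base"], updated


base, updated = split_model_id("meta-llama/Meta-Llama-3.3-70B-030424")
print(base, updated)  # meta-llama/Meta-Llama-3.3-70B 2024-03-04
```

Comparing base names rather than full IDs keeps an incremental update from looking like a switch to a different model.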

Deprecated and retired models

The following sections list models that are deprecated (no longer recommended for new workloads) or retired (end-of-life and no longer available). Retirement dates for deprecated models are announced at least three months in advance.

Foundation Model APIs retirements

The following table shows model retirements, their retirement dates, and recommended replacement models to use for Foundation Model APIs pay-per-token and provisioned throughput serving workloads. Databricks recommends that you migrate your applications to use replacement models before the indicated retirement date.

Note

OpenAI and Google Gemini models are only available through ADI Services, provided by Databricks.

Partner model | Retirement date | Recommended replacement model
OpenAI GPT-5.1 Codex Max | Pay-per-token: July 16, 2026 | OpenAI GPT-5.5
OpenAI GPT-5.1 Codex Mini | Pay-per-token: July 16, 2026 | OpenAI GPT-5.4 Codex Mini
OpenAI GPT-5.2 Codex | Pay-per-token: July 16, 2026 | OpenAI GPT-5.5
Anthropic Claude 3.7 Sonnet | Pay-per-token: April 12, 2026 | Use the latest Claude Sonnet model
Gemini 3 Pro | Provisioned throughput: March 26, 2026 | Gemini 3.1 Pro. To allow more time for migration, between March 26, 2026 and June 7, 2026, API calls to Gemini 3 Pro will be temporarily redirected to Gemini 3.1 Pro. The pricing for both models is identical.

Open model | Retirement date | Recommended replacement model
Meta Llama 3.1 405B | Pay-per-token: February 15, 2026. Provisioned throughput: May 15, 2026 | OpenAI GPT OSS 120B
DBRX / DBRX Instruct | Pay-per-token: April 30, 2025. Provisioned throughput: December 19, 2025 | Pay-per-token: Meta-Llama-4-Maverick. Provisioned throughput: Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size.
Mixtral 8x7B / Mixtral-8x7B Instruct | Pay-per-token: April 30, 2025. Provisioned throughput: February 27, 2026 | Pay-per-token: Meta-Llama-4-Maverick. Provisioned throughput: Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size.
Meta Llama 3 (70B) | Pay-per-token: July 23, 2024 (Meta-Llama-3-70B-Instruct); December 11, 2024 (Meta-Llama-3.1-70B-Instruct). Provisioned throughput: February 27, 2026 | Pay-per-token: Meta-Llama-4-Maverick. Provisioned throughput: Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size.
Meta Llama 3 8B | Provisioned throughput: February 27, 2026 | Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size.
Meta Llama 2 70B / Meta-Llama-2-70B-Chat | Pay-per-token: October 30, 2024. Provisioned throughput: February 27, 2026 | Pay-per-token: Meta-Llama-4-Maverick. Provisioned throughput: Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size.
Meta Llama 2 13B | Provisioned throughput: February 27, 2026 | Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size.
Meta Llama 2 7B | Provisioned throughput: February 27, 2026 | Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size.
Mistral 7B | Provisioned throughput: February 27, 2026 | Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size.
MPT 30B / MPT 30B Instruct | Pay-per-token: August 30, 2024. Provisioned throughput: December 19, 2025 | Pay-per-token: Meta-Llama-4-Maverick. Provisioned throughput: Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size.
MPT 7B / MPT 7B Instruct | Pay-per-token: August 30, 2024. Provisioned throughput: December 19, 2025 | Pay-per-token: Meta-Llama-4-Maverick. Provisioned throughput: Comparable model on the same offering, like Llama 3.2, 3.3, or 4 model of similar size.

If you require long-term support for a specific model version, Databricks recommends using Foundation Model APIs provisioned throughput for your serving workloads.

Foundation Model Fine-tuning retirements

The following table shows retired model families, their retirement dates, and recommended replacement model families to use for Foundation Model Fine-tuning workloads. Databricks recommends that you migrate your applications to use replacement models before the indicated retirement date.

Model family | Retirement date | Recommended replacement model family
DBRX | April 30, 2025 | Llama-3.1-70B
Mixtral | April 30, 2025 | Llama-3.1-70B
Mistral | April 30, 2025 | Llama-3.1-8B
Meta-Llama-3.1-405B | January 30, 2025 | Llama-3.1-70B
Meta-Llama-3 | January 7, 2025 | Meta-Llama-3.1
Meta-Llama-2 | January 7, 2025 | Meta-Llama-3.1
Code Llama | January 7, 2025 | Meta-Llama-3.1

Find workloads that use retired models

Use the following query to find workloads that use deprecated or retired models and to identify their owners. Replace <retired-model-name> with all or part of the model name.

-- Summarize serving requests per requester, endpoint, and served entity.
SELECT
  eu.requester,
  se.endpoint_name,
  se.entity_name,
  COUNT(*) AS request_count,
  SUM(eu.input_token_count) AS total_input_tokens,
  SUM(eu.output_token_count) AS total_output_tokens,
  MIN(eu.request_time) AS first_request,
  MAX(eu.request_time) AS last_request
FROM system.serving.endpoint_usage eu
JOIN system.serving.served_entities se
  ON eu.served_entity_id = se.served_entity_id
-- entity_name is lowercased before matching, so write the pattern in lowercase.
WHERE LOWER(se.entity_name) LIKE '%<retired-model-name>%'
GROUP BY eu.requester, se.endpoint_name, se.entity_name
ORDER BY request_count DESC
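To check several models at once, the single LIKE condition can be replaced with several conditions joined by OR. The following Python sketch builds that WHERE clause from a list of names. The function name `retired_model_filter` is hypothetical, and the sketch assumes that a lowercase substring of each model name appears in entity_name. It escapes single quotes but leaves the LIKE wildcards `%` and `_` unescaped, so treat it as a convenience for ad hoc analysis with trusted input, not as a general-purpose query builder.

```python
from typing import Iterable


def retired_model_filter(model_names: Iterable[str]) -> str:
    """Build a WHERE clause that matches any of the given model-name substrings."""
    clauses = []
    for name in model_names:
        # Lowercase to match LOWER(se.entity_name); double single quotes for SQL literals.
        pattern = name.lower().replace("'", "''")
        clauses.append(f"LOWER(se.entity_name) LIKE '%{pattern}%'")
    if not clauses:
        raise ValueError("provide at least one model name")
    return "WHERE " + "\n   OR ".join(clauses)


print(retired_model_filter(["DBRX", "Mixtral-8x7B"]))
```

The printed clause can replace the WHERE line in the query above. The rest of the query stays the same.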