How to Migrate From OpenAI API to Perplexity Sonar

If you currently use the OpenAI API for text generation or search-augmented completions, you may want to switch to Perplexity Sonar for lower cost, built-in web grounding, and real-time citations. The OpenAI API and Perplexity Sonar both support a chat completions endpoint, but the request format, authentication, and response structure differ in several key ways. This article explains the exact changes you need to make in your code, the differences in model capabilities, and the common pitfalls to avoid during migration.

Key Takeaways: Migrating From OpenAI to Perplexity Sonar

  • Endpoint change: Replace https://api.openai.com/v1/chat/completions with https://api.perplexity.ai/chat/completions.
  • Authentication header: Use Authorization: Bearer YOUR_PERPLEXITY_API_KEY instead of your OpenAI key.
  • Model name: Set model to sonar-pro or sonar-reasoning-pro for production use.


Differences Between OpenAI API and Perplexity Sonar API

The Perplexity Sonar API is designed to be compatible with the OpenAI chat completions format, but several differences exist. The most important difference is that Perplexity Sonar automatically searches the web for each request and returns citations in the response, so you do not need to implement a separate retrieval-augmented generation pipeline. The API also supports parameters such as search_domain_filter and search_recency_filter to restrict which sources are used; citations are returned by default, so a separate return_citations flag is no longer required. The pricing model adds a per-search-request fee on top of token charges, which can still reduce total cost for applications that would otherwise pay for a separate search API.

Request Format Differences

Both APIs use a JSON payload with messages and model fields, and both accept a stream parameter for server-sent-event streaming. However, sampling parameters such as temperature are not honored identically across all Sonar models, so check the specific model documentation for supported parameters and ranges before relying on them.
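As a sketch of how little the request body changes, the two payloads below differ only in the model field (the model names and the question are illustrative):

```python
# Sketch: the payload structure is nearly identical between the two APIs;
# only the model identifier (plus any Perplexity-specific search
# parameters you choose to add) needs to change.
openai_payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
}

perplexity_payload = {
    "model": "sonar-pro",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
}

# Compare the two payloads field by field.
changed = {k for k in openai_payload if openai_payload[k] != perplexity_payload[k]}
print(changed)  # -> {'model'}
```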

Response Structure Changes

The OpenAI response includes a choices array with a message object. Perplexity Sonar returns the same structure but adds a citations field to the response body listing the sources used to ground the answer. If you previously parsed only the content field, you must update your code to optionally read the citations as well; check the current API reference for the exact shape of each citation entry.

Steps to Update Your Code for Perplexity Sonar

Follow these steps to modify an existing OpenAI API client to work with Perplexity Sonar. The examples use Python with the requests library, but the same logic applies to any programming language.

  1. Change the API endpoint URL
    Replace the OpenAI base URL with the Perplexity endpoint. Use https://api.perplexity.ai/chat/completions for all requests.
  2. Update the authentication header
    Replace your OpenAI API key with a Perplexity API key. Set the header to Authorization: Bearer pplx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx. Obtain your key from the Perplexity API dashboard.
  3. Set the correct model name
    Change the model field to one of the following: sonar-pro for standard search-augmented generation, sonar-reasoning-pro for deeper reasoning with citations, or sonar-deep-research for multi-step research queries. For testing, you can use sonar or sonar-reasoning.
  4. Remove unsupported parameters
    Audit your existing payload. Sampling parameters like temperature and top_p are generally accepted, but OpenAI-specific fields such as logit_bias or user may not be. Remove anything not listed in the Perplexity API documentation; depending on the parameter, unsupported fields are either silently ignored or rejected with an error.
  5. Add optional search parameters
    To control the search behavior, add the search_domain_filter parameter with an array of domains, such as ["wikipedia.org"]. To restrict results to recently published sources, set search_recency_filter to a time window such as "month", "week", "day", or "hour".
  6. Parse the citations from the response
    After receiving the response, extract the citations field from the response body. Iterate over the citations to display source URLs alongside the generated text. If your application previously used a separate search API, you can now remove that call.
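A payload that applies the optional search parameters from step 5 might look like the following sketch (the domain and recency values are illustrative, not recommendations):

```python
# Illustrative payload combining the optional search parameters
# described above; adjust domains and recency to your use case.
payload = {
    "model": "sonar-pro",
    "messages": [
        {"role": "user", "content": "Summarize recent changes to HTTP/3."}
    ],
    # Restrict sources to specific domains (illustrative value).
    "search_domain_filter": ["wikipedia.org"],
    # Only consider sources published within the last month.
    "search_recency_filter": "month",
}
```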

Example Python Code Snippet

Here is a minimal example that sends a query to Perplexity Sonar and prints the answer with citations:

import requests

url = "https://api.perplexity.ai/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_PERPLEXITY_API_KEY",
    "Content-Type": "application/json"
}
payload = {
    "model": "sonar-pro",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
}

response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()  # fail fast on auth or rate-limit errors
data = response.json()

answer = data["choices"][0]["message"]["content"]
# Citations are returned at the top level of the response body,
# typically as a list of source URLs.
citations = data.get("citations", [])

print(answer)
for citation in citations:
    print(citation)


Common Mistakes When Migrating to Perplexity Sonar

Several issues arise when developers switch from OpenAI to Perplexity without adjusting their assumptions. The following problems are the most frequent.

Using the Wrong Model Name

The OpenAI API uses model names like gpt-4 or gpt-3.5-turbo. Perplexity Sonar uses different model identifiers. If you keep the old model name, the API rejects the request with an invalid-model error. Always set the model to sonar-pro or sonar-reasoning-pro for production.
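A simple guard can catch a leftover OpenAI model name before any request goes out. The set below mirrors the Sonar models named earlier in this article; the lineup can change, so keep it in sync with the official model list:

```python
# Sonar model identifiers mentioned in this article; verify against
# the current Perplexity documentation before relying on this set.
KNOWN_SONAR_MODELS = {
    "sonar",
    "sonar-pro",
    "sonar-reasoning",
    "sonar-reasoning-pro",
    "sonar-deep-research",
}

def validate_model(model: str) -> str:
    """Raise early if a stale OpenAI model name slipped through."""
    if model not in KNOWN_SONAR_MODELS:
        raise ValueError(
            f"{model!r} is not a known Sonar model; did you forget to "
            "replace an OpenAI model name such as 'gpt-4'?"
        )
    return model
```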

Forgetting to Handle the Citations Field

If your application expects the response to contain only the content string and ignores the citations field, you lose the source links. Update your response parser to check for the presence of citations and display them if available. If you do not need citations, you can still ignore the field, but you miss one of the key benefits of Perplexity Sonar.
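A defensive parser can read the citations without breaking when the field is absent. The sketch below assumes citations arrive as a top-level list in the response body (verify the exact shape against the current API reference):

```python
def extract_answer_and_citations(data: dict) -> tuple[str, list]:
    """Pull the answer text and any citations from a chat-completions response.

    Works whether or not the citations field is present, so the same
    parser can handle OpenAI-style and Sonar-style responses.
    """
    answer = data["choices"][0]["message"]["content"]
    citations = data.get("citations", [])
    return answer, citations

# Illustrative response body, not a captured API payload.
sample = {
    "choices": [{"message": {"content": "Paris"}}],
    "citations": ["https://en.wikipedia.org/wiki/Paris"],
}
print(extract_answer_and_citations(sample))
```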

Rate Limits and Quotas Differ

Perplexity Sonar rate limits are based on requests per minute, not tokens per minute. If your application sends many small requests, you may hit the limit faster than with OpenAI. Check the API documentation for your plan’s specific limits and implement exponential backoff in your retry logic.
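The retry logic can be as simple as the sketch below, which backs off exponentially on HTTP 429 responses. The delay schedule and retry count are illustrative, not values recommended by the Perplexity documentation:

```python
import time

import requests

def backoff_delays(max_retries: int) -> list[float]:
    """Exponential delay schedule in seconds: 1, 2, 4, 8, ..."""
    return [float(2 ** attempt) for attempt in range(max_retries)]

def post_with_backoff(url, payload, headers, max_retries=5):
    """POST with exponential backoff on HTTP 429 (rate limit) responses."""
    for delay in backoff_delays(max_retries):
        response = requests.post(url, json=payload, headers=headers)
        if response.status_code != 429:
            return response
        time.sleep(delay)
    raise RuntimeError(f"still rate limited after {max_retries} retries")
```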

System Prompts Behave Differently

The system message in Perplexity Sonar is respected, but the model may override the system instructions with web search results. If you need strict adherence to a system prompt, test your prompt with the specific Sonar model you plan to use. Some models, like sonar-reasoning-pro, may prioritize reasoning over strict instruction following.

The following table summarizes the key differences:

Item                  | OpenAI API                         | Perplexity Sonar API
----------------------+------------------------------------+--------------------------------------
Endpoint              | api.openai.com/v1/chat/completions | api.perplexity.ai/chat/completions
Authentication        | Bearer sk-…                        | Bearer pplx-…
Typical model         | gpt-4o                             | sonar-pro
Web search built-in   | No (requires plugin or tool)       | Yes (automatic)
Citations in response | Not included                       | Included in citations field
Streaming support     | Full SSE streaming                 | Supported via stream parameter
Pricing basis         | Per token                          | Per token plus per-request search fee

Migrating from the OpenAI API to Perplexity Sonar requires updating your endpoint URL, authentication header, model name, and response parser. The most significant change is the addition of automatic web search and citations. Test your integration with a single query before rolling out to production. For advanced use cases, explore the search_domain_filter and search_recency_filter parameters to refine the search behavior. If you encounter unexpected behavior, check the Perplexity API changelog for updates to model capabilities.
