When your application calls the Perplexity API, a network request may hang or fail to complete within an acceptable time. This leaves your code waiting indefinitely, blocking user interactions or other critical processes. API timeouts happen when the server does not respond before your client gives up waiting. This article explains how to set request timeouts, catch timeout errors, and implement retry logic in your code.

Key Takeaways: Handling Perplexity API Timeouts

Setting a timeout in the request: Prevents your code from waiting forever by specifying a maximum wait time in seconds.
Catching timeout exceptions: Use try/catch blocks in your programming language to handle timeout errors gracefully.
Implementing retry with exponential backoff: Automatically retry failed requests after increasing wait intervals to reduce server load.

Why Perplexity API Timeouts Occur

An API timeout occurs when the client sends a request but does not receive a response within a set time limit. This can happen for several reasons:

Network latency: Slow internet connections or high latency between your server and the Perplexity API endpoint.
Server overload: Perplexity may experience high traffic, causing delayed responses.
Large payloads: Sending very long prompts or requesting extensive context can increase processing time.
Rate limiting: Exceeding the API rate limit may cause the server to delay or drop requests.

Understanding these causes helps you decide on appropriate timeout values and retry strategies. A timeout is not necessarily a permanent failure; it signals that the request should be retried or handled differently.

Steps to Set Timeouts and Handle Errors

The following steps show how to implement timeout handling in Python using the requests library, which is commonly used for Perplexity API calls. Adapt the logic to your programming language of choice.

Step 1: Install the Requests Library

If you have not installed the requests library, run:

pip install requests

Step 2: Set a Timeout on the API Call

Add a timeout parameter to your request. The value is in seconds. Without it, requests waits indefinitely by default. 30 seconds is a reasonable starting point for the Perplexity API.

import requests

url = "https://api.perplexity.ai/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}
data = {
    "model": "sonar-pro",
    "messages": [{"role": "user", "content": "Explain quantum computing"}]
}

try:
    response = requests.post(url, headers=headers, json=data, timeout=30)
    response.raise_for_status()
    print(response.json())
except requests.exceptions.Timeout:
    print("Request timed out after 30 seconds.")
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")

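The timeout parameter tells requests to stop waiting if the connection cannot be established, or if the server sends no data, for 30 seconds. It is not a cap on the total time a response may take to download. When the limit is reached, a requests.exceptions.Timeout exception is raised.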

Step 3: Implement Retry with Exponential Backoff

A single timeout does not mean the request will always fail. Use a retry loop that waits longer between each attempt.

import time
import requests

url = "https://api.perplexity.ai/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}
data = {
    "model": "sonar-pro",
    "messages": [{"role": "user", "content": "Explain quantum computing"}]
}

max_retries = 3
base_delay = 2  # seconds

for attempt in range(max_retries):
    try:
        response = requests.post(url, headers=headers, json=data, timeout=30)
        response.raise_for_status()
        print("Success:", response.json())
        break
    except requests.exceptions.Timeout:
        wait = base_delay * (2 ** attempt)
        print(f"Timeout on attempt {attempt + 1}. Retrying in {wait} seconds...")
        time.sleep(wait)
    except requests.exceptions.RequestException as e:
        print(f"Non-timeout error: {e}")
        break
else:
    print("All retries exhausted. Request failed.")

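This code makes up to 3 attempts, waiting 2, 4, and 8 seconds after the first, second, and third timeout. If every attempt times out, the else branch of the for loop runs and prints a failure message. A non-timeout error stops the loop immediately without retrying.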
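
When many clients retry on the same schedule, their retries can arrive at the server together. A common refinement is to cap the delay and add random jitter. The following is a minimal sketch of that idea, not part of the original loop; the function name backoff_delay and the max_delay and rng parameters are illustrative assumptions.

```python
import random

def backoff_delay(attempt, base_delay=2.0, max_delay=30.0, rng=random.random):
    """Return a jittered delay in seconds for a zero-based attempt number."""
    # Exponential growth (base_delay, 2x, 4x, ...) capped at max_delay.
    ceiling = min(max_delay, base_delay * (2 ** attempt))
    # "Full jitter": choose a random point between 0 and the ceiling so that
    # clients retrying at the same moment spread out their next requests.
    return rng() * ceiling
```

In the retry loop, time.sleep(backoff_delay(attempt)) could take the place of the fixed wait. The rng parameter exists only so the function can be tested with a predictable value.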

Step 4: Handle Timeouts in Asynchronous Code

If you use asyncio and httpx, set a timeout on the client.

import httpx
import asyncio

async def call_perplexity():
    url = "https://api.perplexity.ai/chat/completions"
    headers = {
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    }
    data = {
        "model": "sonar-pro",
        "messages": [{"role": "user", "content": "Explain quantum computing"}]
    }
    async with httpx.AsyncClient(timeout=30.0) as client:
        try:
            response = await client.post(url, headers=headers, json=data)
            response.raise_for_status()
            print(response.json())
        except httpx.TimeoutException:
            print("Request timed out.")

asyncio.run(call_perplexity())

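Passing timeout=30.0 to AsyncClient sets each of httpx's timeout phases (connect, read, write, and waiting for a connection from the pool) to 30 seconds. It is not a limit on the total time of the request. Adjust the value based on your use case.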
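
If you need a hard cap on the total wall-clock time of an asynchronous call, one option is to wrap the coroutine in asyncio.wait_for from the standard library. The sketch below shows the pattern with a stand-in coroutine; with_deadline and slow_call are hypothetical names, and slow_call only simulates a slow API call with asyncio.sleep.

```python
import asyncio

async def with_deadline(coro, total_seconds):
    """Await coro, but give up once total_seconds of wall-clock time pass."""
    try:
        return await asyncio.wait_for(coro, timeout=total_seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the inner coroutine before raising.
        return None

async def slow_call(delay):
    # Hypothetical stand-in for `await client.post(...)`.
    await asyncio.sleep(delay)
    return "done"

async def demo():
    fast = await with_deadline(slow_call(0.01), total_seconds=1.0)
    slow = await with_deadline(slow_call(1.0), total_seconds=0.05)
    return fast, slow
```

Returning None on a timeout keeps the sketch short. Real code would more likely re-raise or return a distinct sentinel, since None could also be a legitimate result.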

Common Issues After Setting Timeouts

Timeout Value Too Short

If your timeout is set too low, even normal requests may fail. For Perplexity API calls, a timeout of at least 20 seconds is a sensible floor. Increase it to 60 seconds for long prompts or large context windows, and adjust based on the response times you actually observe.

Retry Loop Runs Forever

Without a maximum retry count, your code may retry indefinitely. Always set a limit, such as 3 or 5 attempts. Use exponential backoff to avoid overwhelming the server.
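
One way to make the limit hard to forget is to put it in a small helper that every call goes through. This is a sketch under stated assumptions: call_with_retries is a hypothetical name, and it catches Python's built-in TimeoutError as a stand-in. With the requests library you would catch requests.exceptions.Timeout instead, because that exception does not inherit from TimeoutError.

```python
import time

def call_with_retries(operation, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Run operation(); retry on TimeoutError up to max_attempts total tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except TimeoutError:
            # Out of attempts: re-raise instead of sleeping pointlessly.
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff before the next try.
            sleep(base_delay * (2 ** attempt))
```

Because the loop is bounded by range(max_attempts), it cannot run forever, and the final failure surfaces as an exception the caller has to handle.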

Timeout Not Raised on Slow Network

Some libraries have separate timeouts for connection and read phases. For example, requests allows a tuple (connect, read). Set both to avoid hanging on a slow network.

response = requests.post(url, headers=headers, json=data, timeout=(10, 30))

This sets a 10-second connection timeout and a 30-second read timeout.

Rate Limiting Causes Repeated Timeouts

If you hit the Perplexity rate limit, the server may delay or drop requests. Check the response headers for rate limit information. Implement a delay between requests or use a queue system.
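
As an illustration of reading rate limit information from a response, the sketch below assumes the server signals rate limiting with the standard HTTP 429 status and may include a Retry-After header. Whether the Perplexity API sends that header is an assumption here, so check the responses you receive. The function name retry_after_seconds is hypothetical.

```python
def retry_after_seconds(status_code, headers, default=5.0):
    """Return how long to pause before retrying, or None if no pause is needed."""
    if status_code != 429:
        return None
    # A plain dict lookup is case-sensitive; the headers object on a
    # requests response is case-insensitive.
    raw = headers.get("Retry-After")
    if raw is None:
        return default
    try:
        return max(0.0, float(raw))
    except ValueError:
        # Retry-After may also be an HTTP date; this sketch falls back to the default.
        return default
```

With requests, this could be called as retry_after_seconds(response.status_code, response.headers) before deciding whether to sleep and retry.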

Timeout Strategies Comparison

Strategy	Description	Best Use Case
Fixed timeout	Set a single timeout value for all requests	Simple applications with consistent response times
Exponential backoff with retry	Retry with increasing wait intervals	Handling transient network issues or server overload
Separate connect and read timeouts	Set different limits for connection and data transfer	Unstable networks where connection may succeed but data transfer stalls
Circuit breaker	Stop making requests after repeated failures for a period	Preventing cascading failures in microservices
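
The circuit breaker row in the table can be sketched as a small class. This is a minimal illustration, not a production implementation: the class name, the thresholds, and the injectable clock parameter are all assumptions made for the example.

```python
import time

class CircuitBreaker:
    """Block calls for `cooldown` seconds after repeated consecutive failures."""

    def __init__(self, failure_threshold=3, cooldown=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed (calls allowed)

    def allow_request(self):
        if self.opened_at is None:
            return True
        # After the cooldown, let a trial request through ("half-open").
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

Calling code would check allow_request() before each API call, then call record_success() or record_failure() depending on the outcome. If the trial request after the cooldown fails, record_failure() reopens the circuit for another cooldown period.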

Now you can add timeout handling to your Perplexity API calls. Start by setting a 30-second timeout on every request. Then implement a retry loop with exponential backoff for transient failures. For production systems, consider using a circuit breaker pattern to avoid overwhelming the API during extended outages.
