Streaming Perplexity API responses using Server-Sent Events allows your application to display real-time answers as they are generated instead of waiting for the full response. SSE is a standard protocol that sends data from the server to the client over a single HTTP connection. This article explains how to configure your API request to enable streaming and how to handle the incoming event stream in JavaScript and Python. You will learn the exact parameters, event types, and code patterns required for a working implementation.
Key Takeaways: Streaming Perplexity API With Server-Sent Events
- API request parameter
stream: true: Enables Server-Sent Events mode for real-time token delivery. - Event types
text_chunkanddone: Parse these events to capture partial answer text and detect stream completion. - Standard SSE client (JavaScript
EventSourceor Pythonrequestswith streaming): Use these tools to consume the event stream without custom parsing.
Understanding Server-Sent Events for the Perplexity API
Server-Sent Events is a web standard that lets a server push data to a client over a long-lived HTTP connection. Unlike WebSockets, SSE is unidirectional — the server sends data only. The Perplexity API supports SSE when you set the stream parameter to true in your request body. Instead of receiving a single JSON response, the server sends multiple events, each containing a fragment of the generated text. This approach reduces perceived latency for users because they see the answer appear word by word.
The SSE stream uses the standard text/event-stream content type. Each event has a data field and an optional event field. Perplexity emits two event types: text_chunk for each piece of generated text and done to signal the end of the stream. The data field for text_chunk events contains a JSON object with the partial answer, while the done event includes the full answer and metadata.
What You Need Before You Start
To stream responses from the Perplexity API, you need a valid API key from the Perplexity developer portal. You also need a development environment that can make HTTP requests and parse SSE streams. This guide covers JavaScript in the browser and Python 3.7 or later. No additional libraries are required for the basic implementation, but you may use the eventsource polyfill for older browsers or the sseclient library in Python for convenience.
Steps to Enable Streaming and Parse Server-Sent Events
The following steps show how to set the stream parameter and handle the incoming events in JavaScript and Python. Both methods follow the same logical flow: open a connection, listen for events, process each text chunk, and close the stream when the done event arrives.
Method 1: JavaScript in the Browser
- Create an EventSource object
Use theEventSourceconstructor to connect to the Perplexity API endpoint. Pass the API endpoint URL and your API key as a query parameter or set theAuthorizationheader. TheEventSourceconstructor does not support custom headers, so you must include the API key in the URL as a query parameter if your backend proxy forwards it. Alternatively, use thefetchAPI with a ReadableStream for full header control. - Listen for the
text_chunkevent
Attach an event listener to theEventSourceobject for thetext_chunkevent. Inside the callback, parse theevent.dataproperty as JSON. Extract thetextfield from the parsed object and append it to the display element in your UI. Each event contains a small piece of the answer. - Listen for the
doneevent
Add a second event listener for thedoneevent. When this event fires, parse the JSON data to retrieve the full answer and any metadata such as usage statistics or citations. After processing, calleventSource.close()to terminate the connection. - Handle errors
Use theonerrorcallback on theEventSourceobject to detect connection failures or server errors. ThereadyStateproperty indicates the connection status. Reconnect logic can be implemented by creating a newEventSourceafter a delay.
Method 2: Python With the Requests Library
- Send a POST request with stream enabled
Use therequests.postmethod with thestream=Trueparameter. Set theAuthorizationheader toBearer YOUR_API_KEY. Include thestream: truefield in the JSON body along with the model name and prompt. - Iterate over the response lines
Useresponse.iter_lines()to process each line of the SSE stream. Each line that starts withdata:contains a JSON payload. Skip lines that are empty or start withevent:if you prefer to parse the event type from the data field. - Parse each data line
Strip thedata:prefix and parse the remaining string as JSON. Check thetypefield of the parsed object. If the type istext_chunk, extract thetextfield and yield or print it. If the type isdone, break the loop and store the final answer and metadata. - Close the response
After the loop ends, callresponse.close()to free the connection. Alternatively, use a context manager with thewithstatement to ensure the response is closed automatically.
Common Issues and Limitations When Streaming
EventSource Does Not Support Custom Headers
The browser EventSource API does not allow setting custom HTTP headers, such as Authorization. To work around this, you must proxy the request through your own backend server that adds the API key. Alternatively, use the fetch API with a ReadableStream and parse the SSE manually. The fetch approach gives you full control over headers and request body.
Streaming Stops Before the Answer Is Complete
If the connection drops or the server times out, the stream may end prematurely. Implement reconnection logic that detects the done event or the onerror callback. In Python, catch requests.exceptions.ChunkedEncodingError and retry the request with the same prompt. The Perplexity API does not support resuming a partial stream, so you must restart the request.
Events Arrive Out of Order
SSE events are delivered in the order the server sends them. Network buffering or proxy servers may reorder packets, but the event stream itself preserves order. If you observe out-of-order text, verify that you are not processing events from multiple concurrent connections. Each stream is independent and sequential.
Perplexity API Streaming vs Non-Streaming Response
| Item | Streaming (SSE) | Non-Streaming |
|---|---|---|
| Response delivery | One event per token or small chunk | Single JSON response after full generation |
| Perceived latency | Low — user sees text appear immediately | Higher — user waits for complete answer |
| Connection duration | Open until all events are sent | Closed after the single response |
| Error handling | Partial output may be lost on failure | No partial output — retry the whole request |
| Client complexity | Requires event parsing and reconnection logic | Simple — parse one JSON object |
Streaming is best for chat interfaces and real-time displays. Non-streaming is simpler for batch processing or server-side aggregation where latency is not critical.
Conclusion
You can now stream Perplexity API responses using Server-Sent Events by setting stream: true in your request and parsing the text_chunk and done events. In JavaScript, use the EventSource API or fetch with a ReadableStream for full header control. In Python, use the requests library with stream=True and iterate over lines. As a next step, implement reconnection logic to handle network interruptions gracefully. For advanced use, combine streaming with the citation parameter to receive citation data in the done event.