API Reference
Peerwave offers a chat endpoint that is mostly compliant with OpenAI's chat API, making it a drop-in replacement for most applications. Access LLMs with both streaming and non-streaming responses.
Basic Chat Request
Send messages to an LLM and get a response. This endpoint is synchronous and returns a single, complete response.
Try Basic Chat Request
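Below is a minimal sketch of a non-streaming request using fetch (Node 18+). The base URL, the `Authorization` header scheme, and the `PEERWAVE_API_KEY` environment variable are assumptions for illustration; substitute your own host and credentials.

```javascript
// Minimal non-streaming chat request.
// The base URL and bearer-token auth are assumptions; adjust for your setup.
const response = await fetch("https://your-peerwave-host/api/chat", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${process.env.PEERWAVE_API_KEY}`, // assumed auth scheme
  },
  body: JSON.stringify({
    model: "cheapest", // a meta model; any name from /api/models also works
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Write a haiku about distributed computing." },
    ],
  }),
});

const result = await response.json();
console.log(result);
```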
Response Format
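Because the endpoint is mostly OpenAI-compatible, a non-streaming response should resemble the sketch below. The exact fields and the example model name are illustrative assumptions, not a guarantee:

```json
{
  "model": "llama3.1:8b",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Packets drift downstream / nodes answer one another / the network exhales"
      },
      "finish_reason": "stop"
    }
  ]
}
```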
Parameters
model (required)
The model to generate a completion with. Available models can be listed with the /api/models endpoint. Several meta models are also available that describe classes of models without naming a specific one (see Meta Models below).
messages (required)
An array of message objects. Message objects have the following fields:
- `role` - Can be `system`, `user`, `assistant`, or `tool`
- `content` - The content of the message
- `images` - A list of base64-encoded images for multimodal models
- `tool_calls` - A list of tools the model wants to use (usually generated by the model)
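For example, a messages array that includes an image for a multimodal model might look like this (the base64 payload is truncated for brevity):

```json
[
  { "role": "system", "content": "You describe images concisely." },
  {
    "role": "user",
    "content": "What is in this picture?",
    "images": ["iVBORw0KGgoAAAANSUhEUgAA..."]
  }
]
```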
format (optional)
Currently only `json` is supported to request JSON-formatted responses.
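For example, a request body asking for a JSON-formatted answer:

```json
{
  "model": "cheapest",
  "format": "json",
  "messages": [
    { "role": "user", "content": "List three planets as a JSON array." }
  ]
}
```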
Streaming Chat Request
LLM responses can take a while to complete, especially for large generated responses (e.g. "generate 100 haikus"). To start seeing responses immediately, use the /api/chat/stream endpoint. This endpoint takes the same parameters as /api/chat.
JavaScript Streaming Example
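The sketch below reads the newline-delimited stream with fetch in Node 18+ (or any runtime whose ReadableStream is async-iterable). As above, the base URL and auth header are assumptions:

```javascript
// Stream a chat completion and print each JSON line as it arrives.
const response = await fetch("https://your-peerwave-host/api/chat/stream", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${process.env.PEERWAVE_API_KEY}`, // assumed auth scheme
  },
  body: JSON.stringify({
    model: "fastest", // meta model: the provider that can serve tokens most quickly
    messages: [{ role: "user", content: "Generate 100 haikus." }],
  }),
});

const decoder = new TextDecoder();
let buffered = "";

// The body is newline-delimited JSON: parse each complete line as it arrives.
for await (const chunk of response.body) {
  buffered += decoder.decode(chunk, { stream: true });
  const lines = buffered.split("\n");
  buffered = lines.pop(); // keep any partial line for the next chunk
  for (const line of lines) {
    if (!line.trim()) continue;
    console.log(JSON.parse(line));
  }
}
```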
The response is newline-delimited JSON. Every line follows the same format except the last message, which indicates completion and reports how many credits the interaction consumed:
Streaming Response Format
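A sketch of what the stream might look like; the exact field names, including how the final line reports completion and credits, are assumptions:

```json
{"model": "llama3.1:8b", "message": {"role": "assistant", "content": "Packets"}, "done": false}
{"model": "llama3.1:8b", "message": {"role": "assistant", "content": " drift"}, "done": false}
{"model": "llama3.1:8b", "done": true, "credits": 12}
```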
Get Available Models
The available models on the network change as providers connect and disconnect. To get the currently available models, use the `/api/models` endpoint.
Try Get Available Models
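A minimal sketch, again assuming a bearer-token `Authorization` header:

```javascript
// List the models currently available on the network.
// The base URL and auth header are assumptions; adjust for your setup.
const response = await fetch("https://your-peerwave-host/api/models", {
  headers: { "Authorization": `Bearer ${process.env.PEERWAVE_API_KEY}` },
});
console.log(await response.json());
```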
Response Format
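Treat the sketch below as illustrative only; the field names and example entries are assumptions, and the actual listing may carry additional details such as pricing or provider information:

```json
{
  "models": [
    { "name": "llama3.1:8b" },
    { "name": "mistral:7b" },
    { "name": "qwen2.5:14b" }
  ]
}
```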
Meta Models
Meta models describe classes of models without specifying them by name. For example, the `cheapest` meta model always resolves to the cheapest model currently available.
This is useful when you want to use the network without worrying about which models are currently available or have sufficient capacity. When you use a meta model, the response indicates which model was actually used to complete the request.
The following meta models are available:
- `cheapest` - Always returns the cheapest available model
- `fastest` - Picks the provider that can serve tokens most quickly
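For example, you might request the `cheapest` meta model and then check which concrete model served the request. The base URL, auth header, and the field carrying the resolved model name are assumptions:

```javascript
// Ask for whatever model is currently cheapest; the response reports
// which concrete model actually handled the request.
const response = await fetch("https://your-peerwave-host/api/chat", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${process.env.PEERWAVE_API_KEY}`, // assumed auth scheme
  },
  body: JSON.stringify({
    model: "cheapest",
    messages: [{ role: "user", content: "Say hello." }],
  }),
});

const result = await response.json();
console.log(result.model); // e.g. "llama3.1:8b" -- field name is an assumption
```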