POST /v1/inference/completions
curl -X POST https://api.gravixlayer.com/v1/inference/completions \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $GRAVIXLAYER_API_KEY" \
  -d '{
    "model": "meta-llama/llama-3.1-8b-instruct",
    "prompt": "Hello! Tell me about AI."
  }'
{
  "data": {
    "id": "<string>",
    "object": "text.completion",
    "created": 123,
    "model": "<string>",
    "choices": [
      {
        "text": "<string>",
        "index": 123,
        "logprobs": {},
        "finish_reason": "<string>"
      }
    ]
  }
}
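The same request can be built with the Python standard library. This is a minimal sketch, not an official client: the endpoint URL, headers, and payload come from the curl example above, while the `build_request` helper name is ours.

```python
import json
import os
import urllib.request

API_URL = "https://api.gravixlayer.com/v1/inference/completions"

def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the same POST request as the curl example above."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_request(
    os.environ.get("GRAVIXLAYER_API_KEY", ""),
    {
        "model": "meta-llama/llama-3.1-8b-instruct",
        "prompt": "Hello! Tell me about AI.",
    },
)
# Send only when a valid API key is set:
# body = json.loads(urllib.request.urlopen(req).read())
```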

Authorizations

Authorization · string · header · required

API key authentication. Get your API key from the Gravix Layer Dashboard.

Body · application/json

model · string · required

Model identifier

Example: "meta-llama/llama-3.1-8b-instruct"

prompt · string · required

Prompt to complete

max_tokens · integer · default: 16

Maximum tokens to generate

temperature · number · default: 1

Sampling temperature. Required range: 0 <= x <= 2

top_p · number · default: 1

Nucleus sampling. Required range: 0 <= x <= 1

n · integer · default: 1

Number of completions

stream · boolean · default: false

Whether to stream the response

logprobs · integer | null

Include top N log probabilities

echo · boolean · default: false

Echo the prompt in response

stop

Stop sequences

presence_penalty · number · default: 0

Presence penalty. Required range: -2 <= x <= 2

frequency_penalty · number · default: 0

Frequency penalty. Required range: -2 <= x <= 2

best_of · integer · default: 1

Generate N completions, return the best

logit_bias · object

Modify token probabilities

user · string | null

User identifier
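The documented ranges for the optional sampling parameters can be checked client-side before sending. A minimal sketch, assuming only the field names and ranges in the table above; the `validate_payload` helper itself is ours, not part of the API:

```python
# Documented ranges for the optional sampling parameters above.
RANGES = {
    "temperature": (0, 2),        # default 1
    "top_p": (0, 1),              # default 1
    "presence_penalty": (-2, 2),  # default 0
    "frequency_penalty": (-2, 2), # default 0
}

def validate_payload(payload: dict) -> list:
    """Return a list of range violations; empty means the payload passes."""
    errors = []
    for field, (lo, hi) in RANGES.items():
        if field in payload and not (lo <= payload[field] <= hi):
            errors.append(f"{field} must be in [{lo}, {hi}]")
    return errors

payload = {
    "model": "meta-llama/llama-3.1-8b-instruct",
    "prompt": "Hello! Tell me about AI.",
    "max_tokens": 64,
    "temperature": 0.7,
    "top_p": 0.9,
    "stop": ["\n\n"],
}
assert validate_payload(payload) == []
assert validate_payload({"temperature": 3.5}) == ["temperature must be in [0, 2]"]
```

Fields left out of the payload fall back to the documented defaults, so only the parameters you want to override need to be sent.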

Response

Completion response

data · object
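A response of the shape documented here can be unpacked as below. The sample values (`id`, timestamp, completion text) are illustrative placeholders, not real API output; the field layout follows the response schema above.

```python
import json

# Illustrative response following the documented schema.
raw = """
{
  "data": {
    "id": "cmpl-123",
    "object": "text.completion",
    "created": 1700000000,
    "model": "meta-llama/llama-3.1-8b-instruct",
    "choices": [
      {"text": "AI is ...", "index": 0, "logprobs": null, "finish_reason": "stop"}
    ]
  }
}
"""

data = json.loads(raw)["data"]
first = data["choices"][0]
print(first["text"])           # the generated completion
print(first["finish_reason"])  # why generation stopped
```

When `n` or `best_of` is greater than 1, `choices` holds multiple entries, each with its own `index`.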