server : update readme
ggml-ci
ggerganov committed Dec 17, 2024
1 parent 400a5a1 commit 2230786
Showing 1 changed file with 41 additions and 1 deletion.
42 changes: 41 additions & 1 deletion examples/server/README.md
@@ -761,6 +761,8 @@ curl http://localhost:8080/v1/chat/completions \

### POST `/v1/embeddings`: OpenAI-compatible embeddings API

This endpoint requires that the model use a pooling type other than `none`.

*Options:*

See [OpenAI Embeddings API documentation](https://platform.openai.com/docs/api-reference/embeddings).
@@ -793,7 +795,45 @@ See [OpenAI Embeddings API documentation](https://platform.openai.com/docs/api-r
}'
```

When `--pooling none` is used, the server will output an array of embeddings - one for each token in the input.
### POST `/embeddings`: non-OpenAI-compatible embeddings API

This endpoint supports `--pooling none`. When it is used, the response contains the embeddings for all input tokens.
Note that the response format differs slightly from that of `/v1/embeddings`: it does not have the `"data"` sub-tree, and the
embeddings are always returned as a vector of vectors.
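
To illustrate the difference between the two response shapes, here is a minimal Python sketch of a client-side helper that accepts either one and always hands back a vector of vectors. The function name `token_vectors` is hypothetical and not part of the server. The OpenAI-style shape (a `"data"` list whose items carry a flat `"embedding"` vector) is assumed from the OpenAI Embeddings API documentation.

```python
from typing import Any, Dict, List


def token_vectors(response: Any) -> Dict[int, List[List[float]]]:
    """Map each input's "index" to a list of vectors, for either response shape.

    Hypothetical helper: `response` is the already-parsed JSON body.
    """
    # OpenAI-style bodies wrap the items in a "data" list;
    # the /embeddings body is assumed to be the bare list itself.
    items = response["data"] if isinstance(response, dict) else response

    result: Dict[int, List[List[float]]] = {}
    for item in items:
        emb = item["embedding"]
        # A flat (pooled) vector is wrapped so callers always see a vector of vectors.
        if emb and not isinstance(emb[0], list):
            emb = [emb]
        result[item["index"]] = emb
    return result
```

With a helper like this, downstream code does not need to know which endpoint produced the body.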

*Options:*

Same as the `/v1/embeddings` endpoint.

*Examples:*

Same as the `/v1/embeddings` endpoint.

**Response format**

```json
[
  {
    "index": 0,
    "embedding": [
      [ ... embeddings for token 0 ... ],
      [ ... embeddings for token 1 ... ],
      [ ... ],
      [ ... embeddings for token N-1 ... ]
    ]
  },
  ...
  {
    "index": P,
    "embedding": [
      [ ... embeddings for token 0 ... ],
      [ ... embeddings for token 1 ... ],
      [ ... ],
      [ ... embeddings for token N-1 ... ]
    ]
  }
]
```
```
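
As a rough sketch of how a client might consume this format, the Python snippet below averages the per-token vectors of each input into a single vector. The helper name `mean_pool` is hypothetical, and averaging on the client is only one possible way to collapse token embeddings; it is not claimed to match any pooling the server performs.

```python
from typing import Dict, List


def mean_pool(response: List[dict]) -> Dict[int, List[float]]:
    """Average the token embeddings of each input in a parsed /embeddings response.

    Hypothetical helper: `response` is the parsed JSON list shown above.
    """
    pooled: Dict[int, List[float]] = {}
    for item in response:
        tokens = item["embedding"]  # one vector per input token
        n = len(tokens)
        # zip(*tokens) walks the vectors dimension by dimension;
        # an empty token list simply yields an empty result.
        pooled[item["index"]] = [sum(dim) / n for dim in zip(*tokens)]
    return pooled


example = [{"index": 0, "embedding": [[1.0, 2.0], [3.0, 4.0]]}]
print(mean_pool(example))
```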

### GET `/slots`: Returns the current slots processing state
