Completions
Example section for showcasing API endpoints
Quick Example
curl -X POST https://fastapi.mymagic.ai/v1/completions \
  -H 'Authorization: Bearer <your personal access token>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "<model>",
    "question": "<your question>",
    "list_inputs": ["<input1>", "<input2>"],
    "storage_provider": "<your storage provider>",
    "bucket_name": "<your bucket name>",
    "session": "<your session name>",
    "max_tokens": "<number of tokens to output>",
    "system_prompt": "<your system prompt>",
    "role_arn": "arn:aws:iam::<your aws account ID>:role/<your s3 access role>",
    "region": "<the region your bucket is in>",
    "return_output": "<boolean indicating whether to return the output>",
    "input_json_file": "<name of the input JSON file in your bucket>",
    "structured_output": "<JSON schema for the response output>"
  }'
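The structured_output field takes a JSON Schema that constrains the shape of the model's response. As a minimal sketch (the schema and its field names here are hypothetical, not prescribed by the API), you might pass something like:

{
  "type": "object",
  "properties": {
    "answer": { "type": "string" },
    "confidence": { "type": "number" }
  },
  "required": ["answer"]
}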
The API currently supports the following LLMs:
- Llama3-70b (replace <model> with llama3_70b)
- Llama2-70b (replace <model> with llama2_70b)
- Llama2-7b (replace <model> with llama2_7b)
- CodeLlama-70b (replace <model> with codellama_70b)
- Mixtral-8x7 (replace <model> with mixtral_8x7)
- Mistral-7b (replace <model> with mistral_7b)
All our models are quantized and optimized for inference.
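For instance, a filled-in request using Llama3-70b against an S3 bucket might look like the following. Every value here is an illustrative placeholder, and sending max_tokens and return_output as native JSON types is an assumption:

curl -X POST https://fastapi.mymagic.ai/v1/completions \
  -H 'Authorization: Bearer pat_abc123' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "llama3_70b",
    "question": "Summarize each document in one sentence.",
    "storage_provider": "s3",
    "bucket_name": "my-inference-bucket",
    "session": "my_session",
    "max_tokens": 256,
    "role_arn": "arn:aws:iam::123456789012:role/my-s3-access-role",
    "region": "us-east-1",
    "return_output": true
  }'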
AWS S3
Please use the S3 access role you created in the previous step. Also, put your files for batch inference in a folder called <personal_access_token>/<session_name> in your S3 bucket. If you name your session my_session, then you should put your files in the folder <personal_access_token>/my_session.
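For example, with the AWS CLI configured, you could upload a local folder of input files to that location like this (the bucket name, token, and local folder are placeholders):

aws s3 cp ./batch_inputs s3://my-inference-bucket/pat_abc123/my_session/ --recursive

With --recursive, the AWS CLI copies the contents of ./batch_inputs into the <personal_access_token>/my_session prefix.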
GCS
To use GCS, you will need to set up a service account with the necessary permissions to access the bucket. Place your files for batch inference in your bucket, ideally under a folder named <personal_access_token>/<session_name>. If you name your session my_session, then you should put your files in the folder <personal_access_token>/my_session.
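For example, with the Google Cloud SDK installed and your service account authenticated, you could upload the input files like this (the bucket name, token, and local folder are placeholders):

gsutil cp ./batch_inputs/* gs://my-inference-bucket/pat_abc123/my_session/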