Quick Example

curl -X POST https://fastapi.mymagic.ai/v1/completions \
  -H 'Authorization: Bearer <your personal access token>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "<model>",
    "question": "<your question>",
    "list_inputs": ["<input1>", "<input2>"],
    "storage_provider": "<your storage provider>",
    "bucket_name": "<your bucket name>",
    "session": "<your session name>",
    "max_tokens": <number of tokens to output>,
    "system_prompt": "<your system prompt>",
    "role_arn": "arn:aws:iam::<your aws account ID>:role/<your s3 access role>",
    "region": "<the region your bucket is in>",
    "return_output": <boolean indicating whether to return the output or not>,
    "input_json_file": "<the name of the input json file in your s3 bucket>",
    "structured_output": "<json schema for the response output>"
  }'
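
For reference, a filled-in request might look like the example below. The token, bucket name, role ARN, and the "s3" value for storage_provider are illustrative assumptions rather than values confirmed by this documentation; substitute your own settings.

curl -X POST https://fastapi.mymagic.ai/v1/completions \
  -H 'Authorization: Bearer pat_abc123' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "llama3_70b",
    "question": "Summarize each document in two sentences.",
    "list_inputs": ["doc1.txt", "doc2.txt"],
    "storage_provider": "s3",
    "bucket_name": "my-inference-bucket",
    "session": "my_session",
    "max_tokens": 512,
    "system_prompt": "You are a concise assistant.",
    "role_arn": "arn:aws:iam::123456789012:role/MyMagicS3Access",
    "region": "us-east-1",
    "return_output": false
  }'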

Response

The API will respond with a job ID:

{
  "status": "Submitted",
  "job_id": "<unique job id>"
}
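
In scripts, it is convenient to capture the returned job_id for the status check described below. A minimal sketch using jq (request.json is a hypothetical file holding the JSON body shown above; jq is assumed to be installed):

# Submit the job and extract the job_id from the JSON response.
job_id=$(curl -s -X POST https://fastapi.mymagic.ai/v1/completions \
  -H 'Authorization: Bearer <your personal access token>' \
  -H 'Content-Type: application/json' \
  -d @request.json | jq -r '.job_id')
echo "Submitted job: $job_id"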

Checking Job Status

To check the status of your job, use the following endpoint:

curl -X GET https://fastapi.mymagic.ai/v1/job_status/<job_id> \
  -H 'Authorization: Bearer <your personal access token>'

The response will indicate the status of your job:

{
  "status": "<job status>"
}

Possible status values are: "Queued", "Processing", "Waiting for instance", "Completed", or "Error".

If the status is "Completed", the response will include additional fields:

{
  "status": "Completed",
  "task_id": "<task id>",
  "result": {
    // The result of the processed job
  }
}

If the status is "Error", the response will include an error message:

{
  "status": "Error",
  "error": "<error message>"
}
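
Because jobs can sit in the queue or wait for an instance, a common pattern is to poll the status endpoint until the job reaches a terminal state. A minimal polling sketch in bash (the 10-second interval is an arbitrary choice, and jq is assumed to be installed):

#!/usr/bin/env bash
# Poll the status endpoint until the job completes or fails.
job_id="<your job id>"
token="<your personal access token>"

while true; do
  status=$(curl -s "https://fastapi.mymagic.ai/v1/job_status/$job_id" \
    -H "Authorization: Bearer $token" | jq -r '.status')
  echo "Job $job_id: $status"
  case "$status" in
    Completed|Error) break ;;
  esac
  sleep 10
done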

Supported Models

Currently, the API supports the following LLMs:

  • Llama3-70b (replace <model> with llama3_70b)
  • Llama2-70b (replace <model> with llama2_70b)
  • Llama2-7b (replace <model> with llama2_7b)
  • CodeLlama-70b (replace <model> with codellama_70b)
  • Mixtral-8x7B (replace <model> with mixtral_8x7)
  • Mistral-7b (replace <model> with mistral_7b)

All our models are quantized and optimized for inference.

Storage Providers

AWS S3

Use the S3 access role you created in the previous step, and place your files for batch inference in a folder named <personal_access_token>/<session_name> in your S3 bucket. For example, if you name your session my_session, put your files under <personal_access_token>/my_session.
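
Uploading input files with the AWS CLI might look like the following sketch (it assumes the CLI is configured with credentials that can write to the bucket; the local file names are placeholders):

# Upload batch-inference inputs under <personal_access_token>/<session_name>.
aws s3 cp inputs/doc1.txt "s3://<your bucket name>/<personal_access_token>/my_session/doc1.txt"
aws s3 cp inputs/doc2.txt "s3://<your bucket name>/<personal_access_token>/my_session/doc2.txt"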

GCS

To use GCS, you will need to set up a service account with the permissions required to access the bucket. Place your files for batch inference in the bucket, ideally under a folder named <personal_access_token>/<session_name>. For example, if you name your session my_session, put your files under <personal_access_token>/my_session.
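
Uploading to GCS with gsutil might look like the following sketch (it assumes gsutil is authenticated as the service account; the local file names are placeholders):

# Upload batch-inference inputs under <personal_access_token>/<session_name>.
gsutil cp inputs/doc1.txt "gs://<your bucket name>/<personal_access_token>/my_session/doc1.txt"
gsutil cp inputs/doc2.txt "gs://<your bucket name>/<personal_access_token>/my_session/doc2.txt"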