Deploying Models
Step-by-step guide to creating your first deployment.
Deploying your model on PropulsionAI makes it available for real-time use, whether for internal tools or customer-facing applications. Follow these steps to deploy your model:
Go to the "Deployments" section in your project dashboard.
Click "New" to start setting up a new deployment.
On the deployment setup page, fill out the following details:
Base URL: This is the endpoint where your model will be accessible. You can customize the suffix if needed.
Hardware: Select the hardware that best suits your model’s needs. The options include:
L4
V100
T4
A100
H100
The right choice depends on the size and complexity of your model. Keep in mind that different hardware configurations come with different costs.
Model Version: Choose the model version you want to deploy. This could be a fine-tuned version or a base model if no further customization is needed.
Replicas:
Minimum Replicas: Set the minimum number of replicas (instances) that should always be running. Start with 0 if you're unsure, as this will save costs when the model isn’t actively in use.
Maximum Replicas: Set the maximum number of replicas to handle increased load when your model is in high demand.
Note: When a request arrives and a new instance has to start, the instance can take up to 120 seconds to boot.
Record Conversations: Toggle this option if you want to record all interactions with your model.
Dataset: If you're recording conversations, you’ll need to select or create a dataset where these interactions will be stored.
Important: Ensure you have the necessary consents if you plan to record conversations. This is especially critical for production environments, where data privacy laws may apply. Proceed with caution.
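Before filling out the form, it can help to write the settings down and check that they are consistent. The following is a minimal sketch of such a check. The `DeploymentSettings` class and its field names are hypothetical and only mirror the form fields described above; they are not part of any PropulsionAI API. Only the hardware names come from this guide.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hardware names as listed in this guide; everything else is illustrative.
HARDWARE_OPTIONS = {"L4", "V100", "T4", "A100", "H100"}


@dataclass
class DeploymentSettings:
    """Hypothetical mirror of the setup form, not a PropulsionAI object."""
    url_suffix: str
    hardware: str
    model_version: str
    min_replicas: int = 0          # 0 saves cost while the model is idle
    max_replicas: int = 1
    record_conversations: bool = False
    dataset: Optional[str] = None  # needed only when recording is on

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the settings agree."""
        problems = []
        if self.hardware not in HARDWARE_OPTIONS:
            problems.append(f"unknown hardware: {self.hardware}")
        if self.min_replicas < 0:
            problems.append("min_replicas must be 0 or greater")
        if self.max_replicas < self.min_replicas:
            problems.append("max_replicas must not be below min_replicas")
        if self.record_conversations and not self.dataset:
            problems.append("recording conversations requires a dataset")
        return problems
```

For example, `DeploymentSettings("my-model", "T4", "v1").validate()` returns an empty list, while enabling `record_conversations` without naming a dataset reports a problem.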
After configuring the deployment settings, click "Deploy" to launch your model.
Once the deployment shows the "Running" status, you can try out the model by clicking the "Try in Playground" button.
Your model will now be accessible through the provided API endpoint.
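If the minimum replica count is 0, the first request after an idle period may have to wait for an instance to boot, which this guide puts at up to 120 seconds. This guide does not say how the endpoint responds during that window, so the sketch below assumes the client has to retry until a response arrives. The `send` callable is hypothetical: it stands for whatever code performs a single request against your endpoint and returns the response body, or `None` if the deployment is not ready yet.

```python
import time
from typing import Callable, Optional


def call_with_cold_start_retry(
    send: Callable[[], Optional[str]],
    max_wait_s: float = 120.0,      # boot time stated in this guide
    poll_interval_s: float = 5.0,   # assumed pause between attempts
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call `send` until it returns a response or `max_wait_s` elapses."""
    deadline = clock() + max_wait_s
    while True:
        response = send()
        if response is not None:
            return response
        if clock() >= deadline:
            raise TimeoutError(f"no response within {max_wait_s} seconds")
        sleep(poll_interval_s)
```

The `sleep` and `clock` parameters exist only so the helper can be exercised without real waiting; in normal use the defaults are enough.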
Once deployed, monitor your model’s performance and resource usage. Adjust the number of replicas or switch hardware as needed to optimize costs and performance.
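As a rough starting point for adjusting replicas, the number of requests in flight at any moment is approximately the request rate multiplied by the average latency (Little's law). The sketch below turns that into a replica estimate. It is a back-of-the-envelope calculation, not a description of how PropulsionAI scales deployments, and `concurrent_requests_per_replica` is an assumed figure you would have to measure for your own model and hardware.

```python
import math


def estimate_replicas(
    requests_per_second: float,
    avg_latency_s: float,
    concurrent_requests_per_replica: int,  # assumed, measure it yourself
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Estimate replicas needed for a steady load, clamped to the limits."""
    # Little's law: average requests in flight = arrival rate * latency.
    in_flight = requests_per_second * avg_latency_s
    needed = math.ceil(in_flight / concurrent_requests_per_replica)
    return max(min_replicas, min(needed, max_replicas))
```

For instance, 10 requests per second at 0.5 seconds each means about 5 requests in flight; if one replica handles 2 at a time, the estimate is 3 replicas. If the estimate keeps hitting the maximum, that suggests raising the maximum or moving to faster hardware.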
With these steps, you’ve successfully deployed your model on PropulsionAI. Your model is now ready to handle real-time requests, and you can easily scale its deployment based on your specific needs.