Deploying Models
Step-by-step guide to creating your first deployment.
Deploying your model on PropulsionAI makes it available for real-time use, whether for internal tools or customer-facing applications. Follow these steps to deploy your model:
1. Navigate to the Deployments Section
Go to the "Deployments" section in your project dashboard.
2. Create a New Deployment
Click "New" to start setting up a new deployment.
3. Configure the Deployment
In the deployment setup page, you will need to fill out the following details:
Base URL: This is the endpoint where your model will be accessible. You can customize the suffix if needed.
Hardware: Select the hardware that best suits your model’s needs. The options include:
L4
V100
T4
A100
H100
The right hardware depends on the size and complexity of your model. Keep in mind that different hardware options come with different costs.
Model Version: Choose the model version you want to deploy. This could be a fine-tuned version or a base model if no further customization is needed.
Replicas:
Minimum Replicas: Set the minimum number of replicas (instances) that should always be running. Start with 0 if you're unsure, as this will save costs when the model isn’t actively in use.
Maximum Replicas: Set the maximum number of replicas to handle increased load when your model is in high demand.
Note: When no instance is running (for example, with Minimum Replicas set to 0), a new instance can take up to 120 seconds to boot after a request is received.
Record Conversations: Toggle this option if you want to record all interactions with your model.
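The settings above can be captured in a small structure and sanity-checked before you fill in the form. This is only an illustrative sketch: `DeploymentConfig`, its field names, and the validation rules are assumptions made for this example and are not part of any PropulsionAI API. Only the hardware names come from the list above.

```python
from dataclasses import dataclass

# Hardware options listed in this guide.
HARDWARE_OPTIONS = ("L4", "V100", "T4", "A100", "H100")


@dataclass(frozen=True)
class DeploymentConfig:
    """Hypothetical container for the form fields; not a PropulsionAI API object."""

    url_suffix: str
    hardware: str
    model_version: str
    min_replicas: int = 0
    max_replicas: int = 1
    record_conversations: bool = False

    def validate(self) -> list:
        """Return a list of problems; an empty list means the values are consistent."""
        problems = []
        if self.hardware not in HARDWARE_OPTIONS:
            problems.append(f"unknown hardware: {self.hardware}")
        if self.min_replicas < 0:
            problems.append("min_replicas must be 0 or greater")
        # Assumed rule: at least one replica must be allowed to serve traffic.
        if self.max_replicas < 1:
            problems.append("max_replicas must be at least 1")
        if self.min_replicas > self.max_replicas:
            problems.append("min_replicas cannot exceed max_replicas")
        return problems


config = DeploymentConfig(
    url_suffix="support-bot",     # hypothetical suffix
    hardware="T4",
    model_version="my-model-v1",  # hypothetical version name
    min_replicas=0,               # scale to zero when idle to save costs
    max_replicas=2,
)
print(config.validate())  # prints []
```

Writing the values down this way makes it easier to spot inconsistent choices, such as a minimum replica count higher than the maximum, before you deploy.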
4. Deploy the Model
After configuring the deployment settings, click "Deploy" to launch your model.
Once the deployment shows the "Running" status, you can try the model right away by clicking the "Try in Playground" button.
Your model will now be accessible through the provided API endpoint.
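A client calling the endpoint should allow for the boot time of up to 120 seconds mentioned in the note on replicas. The sketch below shows one way to do that with the Python standard library. The request payload (`{"prompt": ...}`), the `Authorization: Bearer` header, the choice of retryable status codes (502, 503, 504), and all function names are assumptions for illustration; check your deployment's details page for the actual request format.

```python
import json
import time
import urllib.error
import urllib.request

COLD_START_BUDGET_S = 120  # boot time quoted in this guide


def build_request(endpoint_url, api_key, prompt):
    """Build a JSON POST request. The payload shape and auth header are assumptions."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def is_retryable(err):
    """Treat gateway-style HTTP errors and connection problems as 'still booting'."""
    if isinstance(err, urllib.error.HTTPError):
        return err.code in (502, 503, 504)  # assumed cold-start signals
    return isinstance(err, OSError)  # URLError and timeouts are OSError subclasses


def send_with_cold_start_retry(send, budget_s=COLD_START_BUDGET_S, pause_s=5.0,
                               sleep=time.sleep, clock=time.monotonic):
    """Call send() until it succeeds or the cold-start budget is used up."""
    deadline = clock() + budget_s
    while True:
        try:
            return send()
        except OSError as err:
            # Give up on non-retryable errors or when the next pause would pass the deadline.
            if not is_retryable(err) or clock() + pause_s > deadline:
                raise
            sleep(pause_s)


def call_deployment(endpoint_url, api_key, prompt):
    """Send one prompt to the deployment, tolerating a cold start."""
    request = build_request(endpoint_url, api_key, prompt)

    def send():
        with urllib.request.urlopen(request, timeout=COLD_START_BUDGET_S) as resp:
            return json.loads(resp.read())

    return send_with_cold_start_retry(send)
```

To use it, pass the Base URL from your deployment and your API key to `call_deployment`. Client errors such as 401 or 404 are raised immediately rather than retried, since waiting will not fix them.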
5. Monitor and Manage Your Deployment
Once deployed, monitor your model’s performance and resource usage. Adjust the number of replicas or switch hardware as needed to optimize costs and performance.
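When adjusting the maximum number of replicas, a rough starting point can be derived from your observed traffic. The sketch below uses Little's law (requests in flight = arrival rate × average latency) as a back-of-the-envelope estimate. The function name, the `concurrency_per_replica` parameter, and the 20% headroom default are assumptions for illustration, not PropulsionAI settings; treat the result as a first guess to refine with monitoring.

```python
import math


def estimate_max_replicas(peak_rps, avg_latency_s, concurrency_per_replica=1, headroom=1.2):
    """Estimate how many replicas are needed to serve peak traffic.

    peak_rps: peak requests per second observed while monitoring.
    avg_latency_s: average time one request takes, in seconds.
    concurrency_per_replica: requests one replica can serve at once (assumed).
    headroom: safety factor applied on top of the estimate (assumed default).
    """
    in_flight = peak_rps * avg_latency_s  # Little's law
    return max(1, math.ceil(in_flight * headroom / concurrency_per_replica))


# Example: 4 requests/s at 1.5 s each, 2 concurrent requests per replica.
print(estimate_max_replicas(4, 1.5, concurrency_per_replica=2))
```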
With these steps, you’ve successfully deployed your model on PropulsionAI. Your model is now ready to handle real-time requests, and you can easily scale its deployment based on your specific needs.