Deploying Models

Step-by-step guide to creating your first deployment.

Deploying your model on PropulsionAI makes it available for real-time use, whether in internal tools or customer-facing applications. Follow these steps to deploy your model:

1. Navigate to the Deployments Section

  • Go to the "Deployments" section in your project dashboard.

2. Create a New Deployment

  • Click "New" to start setting up a new deployment.

3. Configure the Deployment

On the deployment setup page, fill out the following details:

  • Base URL: This is the endpoint where your model will be accessible. You can customize the suffix if needed.

  • Hardware: Select the hardware that best suits your model’s needs. The options include:

    • L4

    • V100

    • T4

    • A100

    • H100

    The hardware selection will depend on the size and complexity of your model. Be mindful that different hardware configurations come with varying costs.

  • Model Version: Choose the model version you want to deploy. This could be a fine-tuned version or a base model if no further customization is needed.

  • Replicas:

    • Minimum Replicas: Set the minimum number of replicas (instances) that should always be running. Start with 0 if you're unsure, as this will save costs when the model isn’t actively in use.

    • Maximum Replicas: Set the maximum number of replicas to handle increased load when your model is in high demand.

    Note: An instance can take up to 120 seconds to boot after a request is received, so with a minimum of 0 replicas the first request may wait that long for a response.

  • Record Conversations: Toggle this option if you want to record all interactions with your model.
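The settings in step 3 can be summarized as a small data structure with a few consistency checks. The sketch below is illustrative only: `DeploymentConfig`, its field names, and the placeholder values are assumptions made for this example and are not part of a PropulsionAI SDK. The hardware list is the one given above.

```python
from dataclasses import dataclass
from typing import List

# Hardware options listed in this guide.
HARDWARE_OPTIONS = ("L4", "V100", "T4", "A100", "H100")


@dataclass(frozen=True)
class DeploymentConfig:
    """Hypothetical container mirroring the fields of the deployment form."""

    url_suffix: str
    hardware: str
    model_version: str
    min_replicas: int = 0
    max_replicas: int = 1
    record_conversations: bool = False

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the values are consistent."""
        problems = []
        if self.hardware not in HARDWARE_OPTIONS:
            problems.append(f"unknown hardware: {self.hardware!r}")
        if self.min_replicas < 0:
            problems.append("min_replicas must be 0 or greater")
        if self.max_replicas < 1:
            problems.append("max_replicas must be at least 1")
        if self.min_replicas > self.max_replicas:
            problems.append("min_replicas cannot exceed max_replicas")
        return problems


config = DeploymentConfig(
    url_suffix="support-bot",        # placeholder suffix
    hardware="L4",
    model_version="my-finetune-v1",  # placeholder version name
    min_replicas=0,                  # scale to zero when idle to save costs
    max_replicas=2,
)
print(config.validate())
```

Writing the values down this way before filling in the form makes it easy to catch mismatches, such as a minimum replica count that exceeds the maximum.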

4. Deploy the Model

  • After configuring the deployment settings, click "Deploy" to launch your model.

  • Once the deployment shows the "Running" status, you can try out the model by clicking the "Try in Playground" button.

  • Your model will now be accessible through the provided API endpoint.
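Because an instance can take up to 120 seconds to boot, a client calling the endpoint may want to retry for at least that long before giving up. The following is a minimal sketch of such a retry loop. The request itself is left as a hypothetical `send` callable, since the request format of the endpoint is not described in this guide, and the 5-second polling interval is an arbitrary choice.

```python
import time
from typing import Callable, Optional


def wait_for_response(
    send: Callable[[], Optional[str]],
    max_wait_seconds: float = 120.0,
    poll_interval_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `send` until it returns a result or the boot window has elapsed.

    `send` is a hypothetical callable that performs one request against the
    deployment endpoint and returns the response body, or None if the
    deployment is not ready yet. Only time spent sleeping between attempts
    is counted toward `max_wait_seconds`.
    """
    waited = 0.0
    while True:
        result = send()
        if result is not None:
            return result
        if waited >= max_wait_seconds:
            raise TimeoutError(f"no response after {waited:.0f} seconds")
        sleep(poll_interval_seconds)
        waited += poll_interval_seconds


# Demo with a stand-in for the real request: "not ready" twice, then a reply.
attempts = {"count": 0}


def fake_send() -> Optional[str]:
    attempts["count"] += 1
    return "ok" if attempts["count"] >= 3 else None


# Sleeping is skipped in the demo so that it finishes immediately.
print(wait_for_response(fake_send, sleep=lambda seconds: None))
```

In real use, `send` would wrap the HTTP call to your deployment's endpoint and return None on the errors that indicate the instance is still starting.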

5. Monitor and Manage Your Deployment

  • Once deployed, monitor your model’s performance and resource usage. Adjust the number of replicas or switch hardware as needed to optimize costs and performance.
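When adjusting replica counts, a rough estimate can be a useful starting point. The sketch below applies Little's law (average requests in flight = arrival rate × time per request) and clamps the result to the configured minimum and maximum. The function name, its parameters, and the example numbers are assumptions for illustration; measure your own traffic and latency before relying on any figure.

```python
import math


def estimate_replicas(
    requests_per_second: float,
    seconds_per_request: float,
    concurrent_requests_per_replica: int,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Estimate the replica count needed for a given steady load."""
    # Little's law: average number of requests being processed at once.
    in_flight = requests_per_second * seconds_per_request
    needed = math.ceil(in_flight / concurrent_requests_per_replica)
    # Stay within the configured replica bounds.
    return max(min_replicas, min(needed, max_replicas))


# Example numbers (assumed): 6 requests/s, 2 s per request,
# 4 concurrent requests per replica, replica bounds of 0 and 5.
print(estimate_replicas(6, 2.0, 4, min_replicas=0, max_replicas=5))  # prints 3
```

If the estimate regularly hits the maximum, that suggests raising the maximum replica count or moving to faster hardware.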


With these steps, you’ve successfully deployed your model on PropulsionAI. Your model is now ready to handle real-time requests, and you can easily scale its deployment based on your specific needs.
