Enabling a self-hosted model for the Migration Portal AI Copilot v1.3

You can use a self-hosted AI Factory model to serve the AI Copilot. This example uses NVIDIA NIM to serve the requests and Llama 3 to process them and generate answers.

Warning

There are significant safety implications to consider when using self-hosted models with Migration Copilot.

The models provided by third-party vendors like OpenAI and Azure OpenAI include content filtering and other safeguards that are designed to reduce the risk of the model responding to, generating, or contributing to unsafe content. When you use self-hosted models, these additional protections are no longer present.

In addition, because you are hosting the models, you now bear responsibility for the risks and potential liability associated with any unsafe behavior.

Prerequisites

Prepare the resources your environment requires to deploy the Migration Portal AI Copilot with a self-hosted solution.

  • You have administrative access to the Hybrid Manager (HM) environment.

  • Your organization has created a chat completions model and a text embeddings model with the Hybrid Manager's AI Factory and has provided the endpoints for each model, which you can set as environment variables.

    export COMPLETIONS_SVC=llama-3-3-nemotron-super-49b-v1
    export EMBEDDINGS_SVC=llama-3-2-nv-embedqa-1b-v2
    
    export COMPLETIONS_ENDPOINT=$(kubectl get inferenceservice $COMPLETIONS_SVC -o jsonpath='{.status.url}')
    export EMBEDDINGS_ENDPOINT=$(kubectl get inferenceservice $EMBEDDINGS_SVC -o jsonpath='{.status.url}')
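
    Optionally, verify that each endpoint is reachable and serves the OpenAI-compatible API. The /v1/models route in this sketch is an assumption based on that API surface; adjust it if your deployment differs:

    # List the models each inference service reports.
    curl -s "${COMPLETIONS_ENDPOINT}/v1/models"
    curl -s "${EMBEDDINGS_ENDPOINT}/v1/models"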

Enabling the AI Copilot

  1. Check if the edb-migration-copilot namespace exists:

    kubectl get namespaces edb-migration-copilot

    The namespace is created during the installation of the Hybrid Manager. If you are enabling the AI Copilot before installing the HM, you must create the namespace in advance.

  2. If the edb-migration-copilot namespace doesn't exist yet, create it:

    kubectl create ns edb-migration-copilot

  3. Set the following environment variables, which you use in the next step to point the secret at the model endpoints:

    export OPENAI_API_BASE=${COMPLETIONS_ENDPOINT}/v1
    export OPENAI_EMBEDDINGS_API_BASE=${EMBEDDINGS_ENDPOINT}/v1
    export OPENAI_API_KEY=<openai api key> # set to a placeholder value like `noop` if the models are deployed in a way that doesn't require a key

    Note

    The AI Copilot uses OpenAI-compatible APIs to communicate with all models, including self-hosted ones. This is why some configuration parameters contain openai in their names, even when you're using a different model to serve queries.
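
    For illustration, a chat completion request to the self-hosted endpoint has the same shape as a request to OpenAI. This is a minimal sketch, assuming the completions service serves the model named earlier:

    # Example OpenAI-compatible chat completion request. The model name
    # is an assumption; substitute the one your service actually serves.
    curl -s "${OPENAI_API_BASE}/chat/completions" \
        -H "Authorization: Bearer ${OPENAI_API_KEY}" \
        -H "Content-Type: application/json" \
        -d '{"model": "llama-3-3-nemotron-super-49b-v1", "messages": [{"role": "user", "content": "Hello"}]}'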

  4. Create the ai-vendor-secrets secret and configure it to point at the models' endpoints:

    kubectl create secret generic ai-vendor-secrets \
        --namespace=edb-migration-copilot \
        --type=opaque \
        --from-literal=AI_VENDOR=NIM \
        --from-literal=RAGCHEW_OPENAI_API_BASE="${OPENAI_API_BASE}" \
        --from-literal=RAGCHEW_OPENAI_EMBEDDINGS_API_BASE="${OPENAI_EMBEDDINGS_API_BASE}" \
        --from-literal=OPENAI_API_KEY="${OPENAI_API_KEY}"
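
    Optionally, confirm that the secret exists and contains the expected keys (kubectl describe shows key names and sizes, not values):

    kubectl describe secret ai-vendor-secrets -n edb-migration-copilot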
  5. Create a new file called migration-portal-values.yaml with the following Helm value, which overrides the default AI vendor secrets with the secret you created in the previous step:

    parameters:
      edb-migration-copilot:
        ai_vendor_secrets: ai-vendor-secrets

  6. Update the Hybrid Manager installation file to include the AI Copilot configuration. To do this, either update the YAML values you used for installation or run the helm upgrade command with the AI Copilot configuration parameters.
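
    For example, a helm upgrade invocation might look like the following sketch. The release name, chart reference, and namespace are placeholders for the values from your Hybrid Manager installation:

    # Placeholders: substitute your actual release name, chart, and namespace.
    helm upgrade <release-name> <chart-reference> \
        -n <hm-namespace> \
        -f migration-portal-values.yaml \
        --reuse-values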

  7. Restart the edb-migration-copilot services to trigger a reconciliation of the new values with the system.

    kubectl rollout restart deployment edb-migration-copilot -n edb-migration-copilot
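
    To verify, you can watch the rollout complete before using the AI Copilot. This assumes the copilot runs as a Deployment with the name used in the previous command:

    kubectl rollout status deployment edb-migration-copilot -n edb-migration-copilot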

Additional configuration for air-gapped installations (experimental)

When running in an air-gapped environment, Migration Copilot will fail when it tries to fetch the pre-trained tokenizer data from Hugging Face Hub. Set the following parameter to use a local snapshot of tokenizer data instead:
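
The parameter goes in migration-portal-values.yaml. Based on the parameter names in the note below, and mirroring the nesting of the ai_vendor_secrets override from the previous section (an assumption), the entry would look like this sketch:

    parameters:
      edb-migration-copilot:
        # Placement under parameters is assumed, mirroring the
        # ai_vendor_secrets override shown earlier.
        airgapped_mode: '"true"'
        tokenizer_model: nvidia/llama-3.3-nemotron-super-49b-v1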

Then restart the edb-migration-copilot services, as in step 7 of the previous section, to apply the change.

Important

Migration Copilot ships only with tokenizer data for the nvidia/llama-3.3-nemotron-super-49b-v1 pre-trained tokenizer. Setting airgapped_mode: '"true"' with tokenizer_model set to any other model causes Migration Copilot to fail.