GPU Recommendations for Default NIM Models v1.3

Overview

Within Hybrid Manager, two components are the primary consumers of AI models:

  • PG.AI Knowledge Base (AIDB Postgres extension) for creating and maintaining AI Knowledge Bases.
  • PG.AI GenAI Builder (containerized Griptape) for building agentic AI assistants.

Default NIM Models

Model type         NIM model                      NVIDIA NIM documented resource requirements
Text completion    llama-3.3-70b-instruct         4 × L40S
Text embeddings    arctic-embed-l                 1 × L40S
Image embeddings   nvclip                         1 × L40S
OCR                paddleocr                      1 × L40S
Text reranking     llama-3.2-nv-rerankqa-1b-v2    1 × L40S

Minimum GPU Requirement

Based on the default models above, running all five concurrently requires at least 8 × L40S GPUs (4 + 1 + 1 + 1 + 1).
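
The total follows from summing the per-model requirements in the Default NIM Models table. The short Python sketch below reproduces that arithmetic. It assumes each model gets dedicated GPUs with no sharing between models; the names DEFAULT_NIM_GPUS and total_gpus are illustrative and are not part of Hybrid Manager or NIM.

```python
# GPU counts per default NIM model, copied from the table above (unit: L40S GPUs).
DEFAULT_NIM_GPUS = {
    "llama-3.3-70b-instruct": 4,       # text completion
    "arctic-embed-l": 1,               # text embeddings
    "nvclip": 1,                       # image embeddings
    "paddleocr": 1,                    # OCR
    "llama-3.2-nv-rerankqa-1b-v2": 1,  # text reranking
}


def total_gpus(models):
    """Sum the GPUs needed to run the given models concurrently.

    Assumption: every model occupies its own GPUs (no GPU sharing).
    """
    return sum(DEFAULT_NIM_GPUS[name] for name in models)


if __name__ == "__main__":
    # All five default models running at the same time.
    print(total_gpus(DEFAULT_NIM_GPUS))
```

Passing a subset of model names gives the requirement for a deployment that does not need every default model, for example total_gpus(["arctic-embed-l", "nvclip"]).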

Cloud Mappings

  • AWS EKS: use a node group with 2 × g6e.12xlarge nodes (4 × L40S GPUs per node, 8 in total).
  • GCP GKE: use a node pool with 2 × a2-highgpu-4g nodes (4 × A100 GPUs per node, 8 in total).

Note: GCP does not offer L40S GPUs. The recommended A2 nodes with A100 GPUs are supported and documented for the NIM models listed above.
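
The node counts in the cloud mappings can be checked by dividing the 8-GPU minimum by the GPUs available on each node and rounding up. The sketch below does this in Python. The per-node GPU counts (4 for g6e.12xlarge, 4 for a2-highgpu-4g) come from the cloud providers' published instance specifications rather than from this document, and the names GPUS_PER_NODE and nodes_needed are illustrative only.

```python
import math

# GPUs per node for the recommended instance types.
# Assumption: values taken from AWS and GCP instance specifications.
GPUS_PER_NODE = {
    "g6e.12xlarge": 4,   # AWS, L40S GPUs
    "a2-highgpu-4g": 4,  # GCP, A100 GPUs
}


def nodes_needed(required_gpus, instance_type):
    """Return the smallest node count that provides at least required_gpus."""
    return math.ceil(required_gpus / GPUS_PER_NODE[instance_type])


if __name__ == "__main__":
    for instance_type in GPUS_PER_NODE:
        # 8 is the minimum GPU requirement stated above.
        print(instance_type, nodes_needed(8, instance_type))
```

Rounding up matters when the GPU requirement is not a multiple of the per-node count: a requirement of 5 GPUs would still need 2 nodes of either type.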