Use Cases and Personas for AI Factory on Hybrid Manager v1.3
Overview
AI Factory on Hybrid Manager supports a broad range of organizational use cases on sovereign AI infrastructure. Organizations use these capabilities to build intelligent applications while retaining complete control over data and models within their own Kubernetes environments.
Key Use Cases
Internal Knowledge Management
Organizations deploy AI assistants that provide employees with natural language access to internal documentation, policies, and institutional knowledge.
Implementation Pattern
Financial services firms index compliance documentation, regulatory filings, and internal procedures into knowledge bases. Employees query these systems using conversational interfaces, receiving accurate answers with source citations while maintaining data sovereignty.
Technical Components
- Knowledge bases indexing SharePoint, Confluence, and database content
- LLM endpoints for query understanding and response generation
- Retrieval systems with role-based filtering for access control (sketched below)
See Knowledge Base Creation for implementation guidance.
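To make these components concrete, here is a minimal client-side sketch of a role-filtered knowledge base query. The endpoint URL, payload fields (`top_k`, `filter`), and response shape are illustrative assumptions, not the product API; consult Knowledge Base Creation for the actual interface.

```python
import requests

# Hypothetical retriever endpoint for illustration only; see
# Knowledge Base Creation for the actual Hybrid Manager API.
RETRIEVER_URL = "https://hm.example.internal/api/v1/knowledge-bases/compliance-docs/query"

def query_knowledge_base(question: str, user_roles: list[str]) -> list[dict]:
    """Retrieve passages the caller is allowed to see, with source citations."""
    response = requests.post(
        RETRIEVER_URL,
        json={
            "query": question,
            "top_k": 5,
            # Role-based filtering: only return chunks tagged with a role
            # the requesting employee actually holds.
            "filter": {"allowed_roles": user_roles},
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["results"]

for hit in query_knowledge_base("What is our data retention policy?", ["compliance"]):
    print(hit["text"][:120], "-", hit["source"])
```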
Customer Service Automation
AI-powered customer support systems handle routine inquiries while escalating complex issues to human agents.
Implementation Pattern
E-commerce companies deploy assistants that access order systems, product catalogs, and return policies through secure APIs. These assistants resolve 60-70% of customer inquiries automatically while maintaining full audit trails for quality assurance.
Technical Components
- REST API tools for order management system integration (sketched below)
- Knowledge bases containing product documentation and FAQs
- Thread tracking for conversation history and escalation workflows
See Customer Service Agent Quickstart for a working example.
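As a hedged illustration of the first component, the sketch below pairs a tool function that calls an order management REST API with a JSON-schema description the assistant can use to decide when to invoke it. The order system URL, routes, and the exact tool-definition format are assumptions; see the Customer Service Agent Quickstart for the supported format.

```python
import requests

ORDERS_API = "https://orders.example.internal/api"  # hypothetical order system

def get_order_status(order_id: str) -> dict:
    """Tool implementation: look up an order in the order management system."""
    resp = requests.get(f"{ORDERS_API}/orders/{order_id}", timeout=10)
    resp.raise_for_status()
    return resp.json()

# JSON-schema style description the assistant uses to decide when to call
# the tool and how to fill its arguments. Field names are illustrative.
GET_ORDER_STATUS_TOOL = {
    "name": "get_order_status",
    "description": "Fetch the current status and shipping details for an order.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string", "description": "Customer order ID"},
        },
        "required": ["order_id"],
    },
}
```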
Enterprise RAG Systems
Retrieval-augmented generation (RAG) applications ground LLM responses in organizational data, reducing hallucination and keeping answers traceable to authoritative sources.
Implementation Pattern
Healthcare organizations implement RAG systems over clinical guidelines, research papers, and treatment protocols. Medical professionals receive evidence-based recommendations with citations to source materials, supporting clinical decision-making while ensuring traceability.
Technical Components
- Vector embeddings stored in PostgreSQL with pgvector
- Hybrid search combining semantic and keyword matching (sketched below)
- Reranking models for result optimization
See RAG Pipeline Documentation for detailed architecture.
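The hybrid search component can be sketched directly in SQL over pgvector. The example below assumes an illustrative `documents` table with an `embedding vector(1024)` column and a `tsv` tsvector column generated from the content; the 0.7/0.3 weights and the cosine-distance cutoff are tuning choices, not platform defaults.

```python
import psycopg  # psycopg 3

def hybrid_search(conn, query_text: str, query_embedding: list[float], k: int = 10):
    """Blend pgvector cosine similarity with PostgreSQL full-text rank.

    Assumes an illustrative schema:
      documents(id, content text, embedding vector(1024),
                tsv tsvector)  -- tsv generated from content
    """
    # pgvector accepts a bracketed text literal cast to ::vector.
    emb = "[" + ",".join(str(x) for x in query_embedding) + "]"
    sql = """
        SELECT id, content,
               0.7 * (1 - (embedding <=> %(emb)s::vector))            -- semantic
             + 0.3 * ts_rank(tsv, plainto_tsquery('english', %(q)s))  -- keyword
               AS score
        FROM documents
        WHERE tsv @@ plainto_tsquery('english', %(q)s)
           OR (embedding <=> %(emb)s::vector) < 0.5  -- cosine-distance cutoff
        ORDER BY score DESC
        LIMIT %(k)s
    """
    with conn.cursor() as cur:
        cur.execute(sql, {"emb": emb, "q": query_text, "k": k})
        return cur.fetchall()
```

A reranking model can then re-score the top candidates before they reach the LLM; reciprocal rank fusion is a common alternative to the weighted sum used here.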
AI-Enabled APIs
Organizations expose AI capabilities through secure APIs for integration with existing applications.
Implementation Pattern
Software companies provide intelligent features within their products by deploying model endpoints accessible through standard REST interfaces. These APIs handle tasks like document summarization, sentiment analysis, and content generation while maintaining predictable performance characteristics.
Technical Components
- InferenceServices exposing OpenAI-compatible endpoints (sketched below)
- API gateways with authentication and rate limiting
- Load balancing across model replicas for high availability
See Model Serving Guide for deployment instructions.
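Because the endpoints are OpenAI-compatible, standard client libraries work unchanged. In this sketch the base URL, token, and model name are placeholders for whatever your InferenceService and API gateway actually expose; see the Model Serving Guide for the real values.

```python
from openai import OpenAI

# Placeholders: point base_url at your InferenceService's OpenAI-compatible
# route and use a token issued by your API gateway, not an OpenAI key.
client = OpenAI(
    base_url="https://models.example.internal/v1",
    api_key="YOUR_GATEWAY_TOKEN",
)

document_text = "Quarterly revenue grew 12% on strong services demand..."

response = client.chat.completions.create(
    model="llama-3-8b-instruct",  # whatever model the endpoint serves
    messages=[
        {"role": "system", "content": "Summarize the user's document in three bullet points."},
        {"role": "user", "content": document_text},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```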
User Personas
Platform Engineers
Platform engineers manage AI Factory infrastructure, ensuring reliable model serving and optimal resource utilization.
Primary Responsibilities
- Configure GPU nodes and resource allocation
- Deploy and monitor InferenceServices
- Manage model library and registry connections
- Implement security policies and network configurations
Key Workflows
- Provision GPU resources following GPU Setup Guide (see the node-inventory sketch below)
- Configure model registries per Model Library Documentation
- Monitor inference performance using Observability Tools
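As a small aid to the first workflow, the sketch below uses the official Kubernetes Python client to report allocatable GPUs per node before InferenceServices are scheduled. It assumes the standard `nvidia.com/gpu` resource name advertised by the NVIDIA device plugin.

```python
from kubernetes import client, config

# Verify GPU capacity before deploying InferenceServices. Run with a
# kubeconfig that can list nodes; use load_incluster_config() in-cluster.
config.load_kube_config()
v1 = client.CoreV1Api()

for node in v1.list_node().items:
    gpus = (node.status.allocatable or {}).get("nvidia.com/gpu", "0")
    if gpus != "0":
        print(f"{node.metadata.name}: {gpus} allocatable GPU(s)")
```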
Data Scientists
Data scientists use AI Factory to deploy models and build intelligent applications without managing the underlying infrastructure.
Primary Responsibilities
- Create and optimize knowledge bases
- Configure retrieval strategies for RAG applications
- Evaluate model performance and accuracy (see the evaluation sketch below)
- Design assistant behaviors and tool integrations
Key Workflows
- Build knowledge bases following Knowledge Base Management
- Configure retrievers using Retriever Creation Guide
- Deploy assistants through Assistant Configuration
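A lightweight way to evaluate retrieval quality is hit rate at k: the fraction of test questions whose known-relevant document appears in the top k results. The sketch below assumes a hand-curated evaluation set and any retriever callable; neither is a platform API.

```python
def hit_rate_at_k(retriever, eval_set, k: int = 5) -> float:
    """Fraction of questions whose gold document appears in the top-k results.

    `retriever` is any callable returning ranked document IDs; `eval_set` is
    a list of (question, gold_doc_id) pairs curated by the data scientist.
    """
    hits = 0
    for question, gold_doc_id in eval_set:
        top_ids = retriever(question)[:k]
        if gold_doc_id in top_ids:
            hits += 1
    return hits / len(eval_set)

# Example: compare two retriever configurations on the same evaluation set.
# score_semantic = hit_rate_at_k(semantic_retriever, eval_set)
# score_hybrid = hit_rate_at_k(hybrid_retriever, eval_set)
```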
Application Developers
Application developers integrate AI capabilities into business applications through APIs and SDKs.
Primary Responsibilities
- Integrate model endpoints into applications
- Implement conversation management using threads
- Build custom tools for assistant capabilities
- Handle error conditions and fallback strategies
Key Workflows
- Access model endpoints via KServe Integration
- Implement conversation tracking using Thread Management (sketched below)
- Create custom tools following Tool Development Guide
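A minimal sketch of thread-based conversation tracking: create a thread on first contact, then post every subsequent message to it so the assistant sees the full history and escalation workflows can replay it. The endpoint paths and field names here are illustrative assumptions; see Thread Management for the actual schema.

```python
import requests

API = "https://genai.example.internal/api/v1"  # hypothetical endpoint
HEADERS = {"Authorization": "Bearer YOUR_TOKEN"}

def send_message(thread_id: str | None, text: str) -> tuple[str, str]:
    """Send a user message, creating a thread on first contact.

    Returns (thread_id, assistant_reply). Field names are illustrative.
    """
    if thread_id is None:
        r = requests.post(f"{API}/threads", headers=HEADERS, json={}, timeout=30)
        r.raise_for_status()
        thread_id = r.json()["id"]

    r = requests.post(
        f"{API}/threads/{thread_id}/messages",
        headers=HEADERS,
        json={"role": "user", "content": text},
        timeout=60,
    )
    r.raise_for_status()
    return thread_id, r.json()["content"]

thread, reply = send_message(None, "Where is my order 1234?")
print(reply)
```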
Business Analysts
Business analysts use Gen AI Builder's visual interfaces to create AI applications without coding expertise.
Primary Responsibilities
- Define assistant behaviors and knowledge domains
- Configure data source connections
- Test and validate assistant responses
- Monitor usage patterns and effectiveness
Key Workflows
- Design assistants using visual builder interface
- Connect data sources through configuration wizards
- Test interactions and refine behaviors iteratively
Industry-Specific Implementations
Financial Services
Use Case: Regulatory compliance assistant analyzing transaction patterns and generating required reports.
Components
- Knowledge base of regulatory documentation
- Tools accessing transaction databases
- Audit logging for compliance tracking
Healthcare
Use Case: Clinical decision support system providing evidence-based treatment recommendations.
Components
- RAG over medical literature and guidelines
- Integration with electronic health records
- Strict access controls for patient data
Manufacturing
Use Case: Maintenance assistant diagnosing equipment issues using sensor data and technical manuals.
Components
- Knowledge base of equipment documentation
- Real-time data integration from IoT sensors
- Predictive maintenance model endpoints
Implementation Considerations
Scalability Planning
Different use cases require varying levels of scale:
- High-volume customer service: Multiple model replicas with auto-scaling
- Internal knowledge systems: Moderate scaling with caching optimization
- Specialized assistants: Single instances with dedicated resources
Security Requirements
Use cases determine security configurations:
- Public-facing applications: External ingress with authentication
- Internal systems: Cluster-local access with RBAC
- Regulated industries: Enhanced audit logging and encryption
Performance Optimization
Optimize based on use case characteristics:
- Real-time applications: Low-latency model serving with edge caching
- Batch processing: Higher throughput with larger batch sizes
- Interactive assistants: Balanced latency and throughput
Getting Started by Use Case
For Knowledge Management
- Review Knowledge Base Concepts
- Follow Data Source Integration
- Implement Hybrid Search
For Customer Service
- Start with Assistant Concepts
- Configure Rulesets for behavior control
- Deploy using Quickstart UI
For API Development
- Deploy models via InferenceService Creation
- Access endpoints following Python Client Guide
- Implement monitoring per Observability Documentation
Additional Resources
Architecture references, implementation guides, and related documentation are linked from the relevant sections above.