January 16, 2025
Key Responsibilities:
Model Deployment
Dockerize and deploy multi-agent LLM instances, ensuring compatibility with LangGraph pipelines.
Optimize model serving for low-latency, high-throughput applications, leveraging GPU/TPU resources efficiently.
Implement scalable model deployment strategies for concurrent multi-agent tasks.
Monitoring and Logging
Establish end-to-end monitoring for the AI platform, tracking system health, latency, and failure rates.
Monitor LLM-specific metrics such as token latency, memory consumption, and prompt-response accuracy.
Develop drift detection mechanisms for model inputs and outputs to ensure sustained performance.
Pipeline Automation
Automate training, evaluation, and deployment workflows for LangGraph-enabled pipelines.
Build and maintain CI/CD pipelines for integrating multi-agent frameworks with backend services.
Automate versioning and rollback mechanisms for LLMs, ensuring seamless updates.
Infrastructure Management
Collaborate with DevOps teams to scale Kubernetes clusters for LangGraph chains and WebSocket-heavy APIs.
Optimize resource allocation for shared GPU/TPU inference loads across agents.
Implement caching strategies for high-reuse LLM queries and shared agent tools.
Collaboration and Documentation
Document multi-agent system workflows, LangChain integration, and API usage for internal and external stakeholders.
Collaborate with backend engineers to align model endpoints with application requirements, and with QA teams to resolve deployment issues.
Provide guidelines for extending LangGraph tools and chains with custom agent implementations.
Qualifications:
Experience deploying (Ray Serve, vLLM, ...) and optimizing (quantization, ONNX, ...) LLMs in production (e.g., OpenAI, Hugging Face models).
Proficiency with containerization (Docker) and orchestration (Kubernetes, Helm).
Familiarity with LangChain, Web3 tools, or multi-agent systems is a strong plus.
Strong understanding of CI/CD tools (e.g., GitHub Actions, Jenkins).
Excellent communication skills and ability to document complex systems clearly.
What we offer:
- Remote work with occasional travel within and outside Russia
- Exciting, growing international start-up with the ambitious goal of making headway in a revolutionary, multi-billion-dollar industry
- Pay in USDT
- Scrum/agile environment
- Highly skilled engineering team
- Attractive compensation plus token allocations
- Paid vacation and public holidays