Senior DevOps Engineer IRC291687
GlobalLogic
- Kyiv
- Permanent position
- Full-time
- 5+ years of experience in data center networking, high-performance computing networking, or AI cluster networking
- Hands-on experience with Cumulus Linux, especially on the Spectrum-X platform
- Understanding of Adaptive Routing and its use cases, congestion control, and ECMP
- Experience with network performance analysis tools, telemetry, and troubleshooting congestion issues
- Familiarity with data center networking architecture for AI workloads
- Experience creating/supporting large-scale GPU clusters
- Experience working with AI training or inference workloads
- Data center networking
- Cumulus Linux
- ECMP
- Congestion control
- Adaptive routing
- GPU clusters
- AI training workloads
- Configure, analyze, and optimize Adaptive Routing behavior on NVIDIA Spectrum-X switches
- Investigate and determine factors affecting packet eligibility for Adaptive Routing
- Troubleshoot network congestion, performance bottlenecks, and packet flow inefficiencies
- Work closely with compute, infrastructure, and DevQA teams to ensure optimal network performance
- Support performance testing of AI workloads and validate network performance under load
- Analyze telemetry and performance metrics to improve routing efficiency and cluster stability
- Assist in defining best practices for high-performance AI networking infrastructure
- Provide technical expertise and guidance to engineering and operations teams