As a NOC Engineer, you'll play a vital role in ensuring our systems run seamlessly to support critical business operations. You'll monitor and maintain our Linux-based infrastructure, troubleshoot issues in real time, and work directly with Kubernetes and high-performance computing environments. Your work will be essential in keeping our production workloads stable, performant, and ready for the demands of enterprise AI solutions.
Responsibilities:
- Monitor HPC and Kubernetes clusters, responding promptly to alerts and managing incidents to maintain system stability.
- Troubleshoot and resolve issues across hardware, software, and networking.
- Develop and automate processes to streamline operations, increase uptime, and improve system performance.
- Collaborate with global engineering teams to enhance infrastructure resilience and scalability.
Qualifications:
- Proficient in Linux administration with hands-on experience in Kubernetes and container orchestration.
- Strong troubleshooting skills, able to resolve issues under pressure and identify root causes.
- Driven to learn and grow in a dynamic, high-speed environment.
About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.
Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
Please see our privacy policy at https://www.together.ai/privacy
What We Do
Together AI is a research-driven artificial intelligence company. We contribute leading open-source research, models, and datasets to advance the frontier of AI. Our decentralized cloud services empower developers and researchers at organizations of all sizes to train, fine-tune, and deploy generative AI models. We believe open and transparent AI systems will drive innovation and create the best outcomes for society.