Director, AI Operations & Optimization

Posted Yesterday
2 Locations
In-Office
118K-206K Annually
Senior level
Analytics • Consulting
The Role
Lead enterprise AI runtime operations, reliability, and optimization. Establish AIOps/LLMOps practices, observability, incident management, and operational tooling. Drive performance, scalability, governance, and cross-functional coordination while developing teams and enabling production AI transformation.
Summary Generated by Built In

When you’re the best, we’re the best. We foster an environment where employees feel engaged, satisfied, and able to contribute their unique skills and talents while living and working as their authentic selves. We provide extensive opportunities for personal and professional development in an inclusive environment, building both employee competence and organizational capability to fuel exceptional performance now and in the future.

Summary

In this role, the Director, AI Operations and Optimization will lead the operationalization, reliability, optimization, and continuous improvement of enterprise AI capabilities across Vizient. This leader is responsible for establishing scalable AI runtime operational practices, advancing AIOps and LLMOps capabilities, implementing observability and monitoring frameworks, and driving operational excellence for production AI solutions. The Director will oversee the operational support and continuous improvement of AI-powered applications, agentic workflows, and reusable AI platform capabilities while ensuring reliability, governance, security, and performance at enterprise scale. Through cross-functional collaboration and strong operational leadership, this role will help enable Vizient’s enterprise AI transformation strategy by delivering sustainable, scalable, and responsible AI operations.

Responsibilities:

AI Runtime Operations & Reliability

  • Lead enterprise AI operational activities, including runtime monitoring, operational support, incident management, production reliability, and operational continuity for AI-powered applications and intelligent automation solutions.
  • Establish, implement, and continuously improve AI operational practices, including AIOps and LLMOps processes, runtime observability, operational telemetry, drift detection, release coordination, support workflows, and operational readiness activities.
  • Drive runtime stability and service reliability initiatives through production monitoring, escalation management, root cause analysis, operational playbooks, and service continuity practices.
  • Support enforcement of runtime governance standards, operational safeguards, human oversight controls, and secure operationalization practices for enterprise AI solutions.
  • Ensure operational excellence across AI environments through proactive monitoring, issue prevention, and continuous service improvement efforts.

AI Optimization & Operational Maturity

  • Lead initiatives focused on runtime efficiency, operational scalability, inference utilization, supportability, performance optimization, and sustainable AI operations.
  • Support the implementation and optimization of reusable operational patterns, observability frameworks, support standards, telemetry pipelines, operational tooling, and AI support capabilities.
  • Promote standardized operational processes, scalable support models, automation opportunities, and continuous improvement initiatives across AI operations functions.
  • Drive operational maturity by identifying opportunities to enhance performance, reduce operational risk, and improve support effectiveness.

Cross-Functional Operational Coordination

  • Partner closely with AI Engineering & Delivery, AI Governance, AI Quality Engineering, Automation, Architecture, Platform Engineering, Security, Infrastructure, and business stakeholders to ensure operational readiness and runtime reliability.
  • Coordinate operational execution activities across AI operations teams, including operational planning, vendor and contractor management, issue prioritization, escalation management, knowledge transfer, and delivery continuity.
  • Support operational assessments, production readiness reviews, implementation planning, runtime support strategies, and modernization initiatives for prioritized AI capabilities.
  • Collaborate with technical and business leaders to align operational practices with enterprise AI objectives and service expectations.

Leadership, Communication & Team Development

  • Lead, mentor, and develop operations managers, engineers, analysts, and contractor resources while fostering a high-performing, collaborative, and continuously learning culture.
  • Provide clear communication regarding operational performance, runtime risks, service reliability concerns, optimization opportunities, engineering tradeoffs, and strategic recommendations.
  • Establish accountability for operational outcomes while promoting operational discipline, innovation, and continuous improvement.
  • Research and evaluate emerging AI operational technologies, observability platforms, automation capabilities, optimization techniques, and runtime management practices to drive innovation and operational effectiveness.

Qualifications

  • Bachelor’s degree in Computer Science, Information Systems, Engineering, Technology Management, or a related field preferred.
  • 8+ years of experience in AI operations, software engineering, platform operations, engineering delivery, DevOps, Site Reliability Engineering (SRE), infrastructure operations, or related enterprise technology functions required.
  • 3+ years of experience leading operational teams, engineering support organizations, platform operations, or large-scale technology initiatives required.
  • Hands-on experience supporting, operationalizing, monitoring, or optimizing production AI solutions utilizing large language models (LLMs), APIs, agentic workflows, orchestration frameworks, and modern AI engineering practices required.
  • Strong experience implementing and scaling operational support models, observability practices, incident management processes, DevOps methodologies, runtime operations, or enterprise operational frameworks required.
  • Experience with observability platforms, monitoring tools, incident management processes, runtime operations, CI/CD pipelines, and production support practices required.
  • Experience leading distributed teams, managing contractors and vendors, and delivering operational initiatives within complex and evolving environments required.
  • Experience with cloud platforms, APIs, data integration technologies, automation frameworks, monitoring solutions, DevOps tools, and modern operational toolsets required.
  • Strong analytical, problem-solving, communication, presentation, stakeholder management, and cross-functional collaboration skills required.
  • Demonstrated ability to manage multiple priorities in fast-paced, evolving, and operationally dynamic environments required.
  • Experience supporting enterprise-scale AI, automation, digital transformation, or platform modernization initiatives preferred.
  • Knowledge of AI governance, responsible AI principles, operational risk management, and production AI lifecycle management preferred.

Estimated Hiring Range:

At Vizient, we consider skills, experience, and organizational needs in our compensation approach. Geographic factors may adjust the range estimate, and hires typically fall below the top of the range. Compensation decisions are tailored to individual circumstances. The current salary range for this role is $117,600.00 to $206,000.00.

This position is also incentive eligible.

Vizient has a comprehensive benefits plan! Please view our benefits here:

http://www.vizientinc.com/about-us/careers

Equal Opportunity Employer: Females/Minorities/Veterans/Individuals with Disabilities

The Company is committed to equal employment opportunity for all employees and applicants without regard to race, religion, color, gender identity, ethnicity, age, national origin, sexual orientation, disability status, veteran status, or any other category protected by applicable law.

Vizient Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Vizient and has not been reviewed or approved by Vizient.

  • Leave & Time Off Breadth: Time off allowances are generous, with ample PTO and holidays available from day one. Separate paid volunteer days add additional protected time away.
  • Flexible Benefits: Flexibility to work from anywhere for part of the year and hybrid options support work-life balance. Policies enabling remote periods complement the broader PTO structure.
  • Retirement Support: Retirement benefits include a competitive 401(k) company match alongside HSA contributions on eligible plans. These elements strengthen long-term financial security within the total rewards package.

The Company
HQ: Irving, TX
5,661 Employees
Year Founded: 1977

What We Do

Vizient, Inc., the nation’s largest health care performance improvement company, serves more than 50% of the nation’s acute care providers, which includes 97% of the nation’s academic medical centers, and more than 20% of ambulatory care providers. Vizient provides expertise, analytics and advisory services, as well as a contract portfolio that represents more than $130 billion in annual purchasing volume. Vizient is based in Dallas and has offices in 20 metropolitan areas across the United States. We have 4,000 employees with a breadth of expertise, experience and compassion, who are eager to develop and implement solutions that advance health care for the greater good.
