Principal, IT Software Engineer 2 - AIOps Lead

Sorry, this job was removed at 08:17 a.m. (CST) on Thursday, May 15, 2025
2 Locations
In-Office or Remote
141K-256K Annually
Consumer Web • Digital Media • Information Technology • News + Entertainment • On-Demand
DIRECTV is changing the way the world experiences entertainment.
The Role

DIRECTV is seeking an AIOps Lead (Principal, IT Software Engineer 2) who will play a crucial role in driving the adoption and execution of Artificial Intelligence for IT Operations (AIOps) practices across the organization. This individual will lead observability standards, AIOps initiatives, and automation-first strategies, leveraging AI and machine learning technologies to optimize IT operations, detect anomalies, improve system performance, and automate incident and problem management processes.

The ideal candidate will have a strong background in IT operations, SRE, DevOps, and software development; a deep understanding of observability platforms and AIOps tools; and the ability to lead cross-functional teams to drive innovation in IT operations automation and monitoring.

Here’s what you’ll do:

Team Leadership and Guidance:

  • Lead projects for a team of 3-4 NPW engineers dedicated to improving stability, observability, and operational efficiency.
  • Serve as technical lead for a team that designs and develops end-to-end solutions, managing dependencies and cross-team impacts.
  • Provide hands-on guidance and support to team members (50% hands-on, 50% managerial).
  • Lead a team of AIOps engineers and specialists, ensuring their development, coaching, and alignment with organizational goals.
  • Develop and report on team performance KPIs.
  • Foster a culture of continuous learning and DevOps excellence through regular technical sessions and internal workshops.
  • Participate actively in the development community (Business Unit), promoting best practices by educating peers.
  • Manage risk and request help from leadership, when necessary, to meet commitments or change directions.

Observability and AIOps Strategy and Execution:

  • Define and implement an observability and AIOps strategy aligned with business objectives and an autonomous IT operations vision.
  • Plan short-term (sprint-to-sprint) and long-term (multiple PI) initiatives, organizing work and designs to meet the long-term target.
  • Implement and optimize AI and machine learning algorithms to detect performance anomalies, predict outages, automate incident response, and improve overall operational efficiency.
  • Implement automated workflows for proactive issue resolution, reducing manual intervention and improving operational agility.
  • Seek opportunities to improve processes and take an automation-first approach.
  • Lead the evaluation, selection, and deployment of AIOps platforms and tools.
  • Design and implement cost-efficient observability and AIOps solutions across cloud and on-premises environments using a mix of commercial, open-source, and CNCF solutions.
  • Leverage data analytics and monitoring systems to generate actionable insights that improve system health, application performance, and availability.
  • Develop internal resources and training materials to ease the adoption and implementation of AIOps tools and practices.

Cross-functional Collaboration:

  • Work closely with IT operations, DevOps, SRE and application development teams to identify pain points and automate processes with AIOps tools and techniques.
  • Present findings, improvements, and key metrics to senior management and stakeholders.

Automation and Process Improvement:

  • Leverage scripting, AI/ML, and automation skills to support an automation-first approach.
  • Embed Observability and AIOps capabilities into reusable platform services by utilizing DevOps, CI/CD, and IaC tools and practices like Terraform, Jenkins, GitHub, ArgoCD, Harness and Ansible.

Technical Implementation and Management:

  • Establish and enforce observability standards, policies, and best practices across the enterprise.
  • Ensure compliance with regulatory and security requirements.
  • Plan and migrate legacy tools and functions to the new AIOps approach.
  • Develop and maintain AIOps dashboards, extensions, applications, and workflow automation.
  • Integrate AIOps with tools like Jira, ServiceNow, MS Teams, Slack, xMatters, Confluence/wiki/KB, and Moogsoft/BigPanda.
  • Set up and manage observability stacks for cloud monitoring (AWS, Azure), VMs, Kubernetes, and various databases.
  • Optimize naming conventions, management zones, alerting profiles, and tagging to align with business processes.

Performance Monitoring and Reporting:

  • Analyze and report on observability metrics, KPIs, Service Level Indicators (SLIs), and Service Level Objectives (SLOs).
  • Develop and recommend baseline monitoring thresholds, SLOs, and error budgets to drive continuous improvement in MTR and availability.

What you’ll need to be successful:

Educational and Professional Experience:

  • Bachelor’s degree in computer science, engineering, or a related field.
  • 5-7 years of experience required (7+ years preferred) in IT operations, DevOps, or site reliability engineering, with at least 2 years in AIOps-related roles.
  • Strong experience with AIOps tools such as Moogsoft, BigPanda, Splunk, Dynatrace, Datadog, ServiceNow, xMatters or similar.
  • Solid understanding of machine learning algorithms and their application in IT operations.
  • Hands-on experience with cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).

Technical Skills:

  • 3+ years of experience with Dynatrace SaaS, DQL, and Logs on Grail or similar.
  • Strong scripting/automation skills in Python, Perl, Shell, and JavaScript.
  • Experience with automation, DevOps, GitOps, CI/CD, and IaC tools (Terraform, Jenkins, GitHub, Ansible).
  • Experience integrating and automating ITSM tools like ServiceNow, xMatters, PagerDuty, and Jira.
  • Hands-on experience in building and operating open-source observability tools like ELK, Grafana, Prometheus, Fluentd, Fluent Bit, Loki, OpenTelemetry, OpenSearch, and Thanos.
  • Experience in designing and implementing observability and AIOps solutions for complex, distributed systems.
  • Ability to diagnose and troubleshoot complex distributed systems handling high-volume transactions (both frontend and backend).
  • Experience with operating systems (Linux and Windows), Java, Node.js, ReactJS, databases (Oracle, Cassandra), Kafka, MuleSoft, Salesforce, and networking.
  • Expertise in incident management, monitoring systems, and ITSM processes.

Leadership and Communication:

  • 2+ years of experience leading engineering teams in Observability, SRE, Platform, Infrastructure, or Application organizations.
  • Excellent communication, collaboration, and problem-solving skills.
  • Proficient in developing and maintaining technical documentation, runbooks, and processes.
  • Proven track record of driving change and innovation in a fast-paced, dynamic environment.

May require a background check due to job duties requiring routine access to DIRECTV and DIRECTV customers’ proprietary data. Qualified applicants with arrest and conviction records will be considered for employment in accordance with local ordinances and state law.

This role may require occasional travel, less than 5%.

This is a remote position that can be located anywhere in the United States.

A career with us comes with big rewards:

DIRECTV's compensation structure is designed to be market-competitive and fully supports efforts to attract and retain employees. It is the company's policy to offer pay that is competitive with other employers in the local market. Our salary ranges are determined by role, level, and location.

The Base Salary range displayed below reflects the minimum and maximum target salary for each of DIRECTV's 4 (four) US Labor Market Zones. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.

DIRECTV WAGE ZONES: $140,790 - $255,530

Low (N1): $140,790 - $211,090

Mid (N2): $148,200 - $222,200

High (N3): $163,020 - $244,420

Top (N4): $170,430 - $255,530

Your recruiter can share more about the specific salary range for your preferred location during the hiring process.

Please note that the salary ranges reflect base salary only and do not include bonus or benefits. Taken together, these make up a pretty impressive total compensation package.

Apply today!

The Company
HQ: El Segundo, CA
12,000 Employees
Year Founded: 1994

What We Do

DIRECTV is changing the way the world experiences entertainment. Innovation powers all that we do, and our groundbreaking solutions deliver compelling entertainment experiences to millions of customers.

Why Work With Us

DIRECTV has been at the forefront of entertainment for nearly three decades. We're now entering the new era of DIRECTV to create the best entertainment and communications experience in the world. At DIRECTV our amazing people, combined with a culture that thrives on collaboration and creativity, are the foundation that create a great place to work.
