Principal Software Engineer, DevOps

Posted 8 Days Ago
Ann Arbor, MI
In-Office
180K-210K Annually
Senior level
Artificial Intelligence • Software • Energy • Utilities
The Role
The role involves designing, building, and operating a platform for data from edge AI devices, ensuring systems are scalable and reliable while mentoring engineers. Responsibilities include managing containerized applications using Kubernetes, implementing infrastructure as code with Terraform, and supporting compliance and continuous improvements while collaborating with cross-functional teams.
Summary Generated by Built In
Utilidata is a fast-growing NVIDIA-backed edge AI company enabling greater visibility and control of power utilization in energy-intensive infrastructure, like the electric grid and data centers. Karman, the company’s distributed AI platform powered by a custom NVIDIA module, is transforming the way utility companies operate the grid edge and will enable data centers to unlock more compute for the same provisioned power.
We are seeking a DevOps Engineer to help design, build, and operate Utilidata’s off-device platform that ingests, processes, and serves data flowing from edge AI devices. The role will build and maintain infrastructure across on-premises and cloud environments, bridging edge deployments with cloud-based data processing to support analytics, operations, and ML workloads at scale. This is a hands-on development role with technical leadership responsibilities and company-wide impact. This engineer will architect and maintain the systems that keep our platform running, set technical direction for infrastructure and deployment practices, and mentor engineers. This engineer will partner closely with on-device and ML teams to ensure our off-device platform is resilient, well-instrumented, and ready to scale. This is a remote position based in the United States, working with distributed teams across the country.
Responsibilities
  • Oversee the deployment and management of containerized applications using Kubernetes, ensuring optimal performance and availability
  • Contribute to strategic planning regarding how the infrastructure solutions evolve to match the requirements of Data Center partners
  • Lead the design, implementation, and maintenance of scalable and reliable systems on AWS and/or on-premises
  • Utilize Terraform for infrastructure as code to automate the provisioning and management of cloud resources
  • Monitor system performance and uptime, ensuring systems meet established service level objectives (SLOs)
  • Support SOC2 security compliance requirements for data handling
  • Mentor and guide team members in DevOps practices, promoting a culture of reliability and excellence
  • Advocate for automation of operational tasks to enhance efficiency and reduce manual intervention
  • Collaborate with cross-functional teams to build and maintain CI/CD pipelines
  • Troubleshoot and resolve complex production issues, conducting root cause analysis and implementing corrective actions
  • Participate in on-call rotations and incident response teams
  • Assist in capacity planning, performance tuning, and technical decision-making
  • Drive continuous improvement initiatives for processes and infrastructure
Minimum Qualifications 
  • 8+ years of development experience, including extensive experience in platform engineering, SRE, or distributed systems, with clear senior or principal-level impact
  • Experience designing and operating infrastructure across on-premises and cloud environments
  • Strong proficiency in container orchestration, particularly Kubernetes
  • Strong proficiency with AWS services and architecture
  • Hands-on experience with Terraform for infrastructure automation
  • Familiarity with monitoring tools (Prometheus, Grafana, or similar) and observability best practices
  • Excellent problem-solving skills, leadership abilities, and attention to detail
  • Strong communication and collaboration skills, with experience in driving technical outcomes
  • Willingness to travel up to 20% of time
Enhanced Qualifications (Nice to Have) 
  • Bachelor's degree in Computer Science, Engineering, or a related field
  • Experience supporting or enabling MLOps platforms, model deployment pipelines, or ML-adjacent infrastructure
  • Experience with AI workload scheduling using Kubernetes
  • Knowledge of Apache Spark for large-scale data processing
  • Knowledge of database technologies (SQL, NoSQL)
  • Understanding of networking concepts and security best practices
Salary Range: $180,000 to $210,000 base compensation, depending on experience, plus stock options. Salary will be commensurate with an individual's skills, training, and years of experience, and in line with internal compensation bands.
Location: This position can be performed remotely from anywhere in the United States.
Our Commitments:
Utilidata values the diversity of our team. We provide equal employment opportunities without regard to race, color, religion, creed, sex, gender, sexual orientation, gender identity or expression, national origin, age, physical disability, mental disability, medical condition, pregnancy or childbirth, genetics, genetic information, marital status, status as a covered veteran, or any other basis protected by applicable federal, state, and local laws.
We are committed to:
  • Creating a diverse and inclusive workplace that is welcoming, supportive, affirming and respectful
  • Empowering employees to solve problems and work together to make a difference
  • Providing mentorship and growth opportunities as part of a collaborative team
  • A flexible work environment with flexible paid time off
  • Competitive compensation and benefits, including health, dental, vision, and employer-match 401k

Top Skills

Spark
AWS
Grafana
Kubernetes
NoSQL
Prometheus
SQL
Terraform
The Company
Providence, RI
77 Employees
Year Founded: 2012

What We Do

Utilidata is an AI-powered technology company that is working with NVIDIA to create the next generation of AI-embedded infrastructure, starting with the electric grid. Karman, our distributed AI platform, operates on our custom NVIDIA module, makes data available for accelerated computing at the edge, and trains AI models locally.

Karman is embedded in grid devices - starting with smart meters - to transform the way utility companies operate. As the electric grid becomes more complex with the rapid increase of electric vehicles, distributed solar, batteries, heat pumps and extreme weather, utilities need real-time visibility of grid conditions and dynamic, software-defined infrastructure. Karman provides real-time visibility and AI at the grid edge so utilities can better utilize customer energy resources, reduce power outages, and enable quicker storm recovery.

We are a mission-driven, collaborative, and adaptive team working to do what’s right, even when it’s hard. With backgrounds in electrical engineering, power systems engineering, software engineering, data science, and energy policy, we bring a unique perspective on the solutions the energy industry needs.

We are committed to ensuring a diverse, inclusive, and flexible workplace where employees are provided mentorship and growth opportunities and are empowered to solve problems as part of a collaborative team.
