Senior Site Reliability Engineer

Hiring Remotely in USA
Remote
Senior level
Artificial Intelligence
The Role
As a Senior Site Reliability Engineer, you will ensure high availability of services, optimize performance, manage cloud infrastructure, and develop deployment tools.
About the Company

Clarifai is a leading compute orchestration AI platform specializing in computer vision and generative AI. We empower organizations to transform unstructured image, video, text, and audio data into actionable insights, significantly faster and more accurately than manual processes. Founded in 2013 by Matt Zeiler, Ph.D., Clarifai has been at the forefront of AI innovation since winning the top five places in image classification at the 2013 ImageNet Challenge. Our diverse, globally distributed team operates across the United States, Canada, Estonia, Argentina, and India.

We have secured $100M in funding, including a $60M Series C round, backed by industry leaders such as Menlo Ventures, Union Square Ventures, Lux Capital, NEA, LDV Capital, Corazon Capital, Google Ventures, NVIDIA, Qualcomm, and Osage.

Clarifai is proud to be an equal-opportunity workplace committed to building and maintaining a diverse and inclusive team.

Your Impact

Clarifai’s platform is a Kubernetes-native distributed system that requires the orchestration of many components. Efficiently serving and training large neural networks presents unique design and infrastructure challenges.

You will be critical to solving these challenges both in the cloud and in on-premises environments. Additionally, you will be responsible for our broader cloud infrastructure, development tools, and environments.

The Opportunity
  • Ensure the smooth operation and high availability of Clarifai's core services
  • Monitor system performance, identify bottlenecks, and implement optimizations to enhance reliability and efficiency
  • Develop Kubernetes resources and custom tooling for seamless cloud and on-premises deployments
  • Design and implement scalable, secure, and cost-effective infrastructure solutions
  • Partner with teams across the organization to identify and solve engineering challenges
Requirements
  • BS/BA in Computer Science or related degree
  • Good knowledge of cloud providers (AWS, GCP or similar)
  • Expertise with Kubernetes (EKS, GKE, self-hosted) and Infrastructure as Code using Terraform and Helm
  • Solid understanding of web and networking (HTTP, TLS, DNS, certificates, etc.)
  • Experience with CI/CD pipelines using tools such as GitHub Actions, ArgoCD, and Atlantis
  • Strong interpersonal skills working with teams across different time zones and regions
Great to Have
  • Knowledge of basic microservice architecture principles
  • Familiarity with security best practices for cloud-based systems
  • Experience with relational databases, message queues, and key-value stores
  • Experience writing Python, Go, or any other popular programming language
  • Familiarity with any RPC framework
  • Experience developing and building custom Kubernetes operators

Top Skills

ArgoCD
AWS
GCP
GitHub Actions
Go
Helm
Kubernetes
Python
Terraform

The Company
San Francisco, CA
100 Employees
Year Founded: 2013

What We Do

We help organizations transform unstructured images, video and text data into structured data, significantly faster and more accurately than humans would be able to do on their own. Founded in 2013 by Matt Zeiler, Ph.D., Clarifai has been a market leader in AI since winning the top five places in image classification at the 2013 ImageNet Challenge. Clarifai continues to grow with more than 90 employees and offices in Wilmington, Delaware, San Francisco, and Tallinn, Estonia.
