Senior SRE Engineer

Posted 19 Days Ago
Hiring Remotely in East End, St. Croix
Remote
170K-210K Annually
Senior level
Machine Learning
The Role
Seeking a talented Senior SRE/Deployment Engineer to build, deploy, and maintain products across various platforms, including multi-cloud, on-premises, and bare-metal deployments. Responsibilities include developing and maintaining deployment options, resolving infrastructure bugs, and collaborating with cross-functional teams for successful system advancements. Required skills include proficiency in Linux, AWS, GCP, Azure, Kubernetes, Docker, scripting, configuration management tools, monitoring applications, and excellent communication skills. Remote work in the USA with a competitive salary and benefits package offered.
Summary Generated by Built In
About Comet

Comet is building the development platform for teams who want to ship robust, reliable, and responsible AI applications. Opik, our open source LLM evaluation framework, has quickly become one of the most popular tools in the space. Our experiment management platform is used by data scientists at companies like Uber, Netflix, and Etsy. Tens of thousands of researchers, engineers, academics, and hobbyists use Comet every day to build the future of AI. 

Working at Comet will give you access to the most exciting work being done in all areas of machine learning. Some of the top researchers and companies working on self-driving cars, drug discovery, particle research, diffusion models, and LLMs use Comet every day. Your work has the potential to accelerate the development of some of the most impactful technology in the world, and you will be doing it alongside a team of passionate, caring individuals. If that sounds exciting, Comet is the right place for you.

Comet is backed by more than $63 million in venture capital funding and powers some of the best machine-learning teams in the world, including Netflix, Uber, Etsy, and Mobileye. We are a remote-first company with offices in New York City (USA) and Tel Aviv (Israel).

You are:

We are seeking a talented Senior SRE Engineer to join our team and help build, deploy and maintain our products across various platforms, including multi-cloud, on-premises, and bare-metal deployments.

Responsibilities:

  • Develop and maintain all deployment options for Comet, including multi-cloud, on-premises, and bare-metal deployments, using either single-server Linux installations or container tooling such as Kubernetes and Helm charts.
  • Utilize Helm charts to package, configure, and deploy Kubernetes applications efficiently.
  • Quickly identify and resolve infrastructure bugs, ensuring high system availability and reliability.
  • Work closely with customers to understand their deployment needs and provide effective support for deploying and maintaining Comet on their infrastructure. This role is customer-facing.
  • Drive the success of system advancements by collaborating with cross-functional teams, including development, support, and other teams, to ensure seamless integration and successful deployment of new features and updates.

Requirements:

  • Must have proven customer-facing and customer support experience, capable of assisting clients with varying levels of technical expertise.
  • 5+ years of experience with Linux system internals, scripting, and configuration management tools (Bash/Python/Ansible).
  • 5+ years of experience running production systems in the cloud, such as AWS, GCP, or Azure, and using containerization technologies such as Kubernetes and Docker to deploy and manage applications.
  • 5+ years of experience with cloud-based infrastructure services such as EC2, RDS, S3, and VPC, and with related tools such as CloudFormation and Terraform.
  • 5+ years of experience with monitoring applications such as Prometheus, Grafana, or ELK stack.
  • 5+ years of experience using Helm charts to package, configure and deploy Kubernetes applications.
  • Excellent communication skills, both verbal and written, to effectively collaborate with team members and clients.
  • Passionate about troubleshooting and investigating in unfamiliar environments.


What We Offer:

  • Competitive base salary: $170K-$210K, based on proven experience, skills, and location.
  • Competitive benefits package.
  • Flexible working hours and remote work options.
  • Opportunities for professional growth and development.
  • A collaborative and innovative work environment.
  • The chance to work with cutting-edge technologies and projects.
  • This role will be fully remote in the USA, working with a global team (large presence in the US, Tel Aviv, and Europe). Some flexibility with work hours is required.




Comet is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees without regard to race, religion, color, sex, gender identity, gender expression, sexual orientation, national origin, ancestry, citizenship status, uniform service member status, marital status, pregnancy, age, medical condition, physical or mental disability, genetic information/characteristics, and any other characteristic protected by State or Federal law.

Top Skills

Ansible
Bash
Docker
Kubernetes
Linux
Python
The Company
HQ: New York, NY
87 Employees
On-site Workplace
Year Founded: 2017

What We Do

Comet is a meta machine learning platform designed to help AI practitioners and teams build reliable machine learning models for real-world applications by streamlining and connecting the machine learning model lifecycle. By leveraging Comet, users can employ machine learning experiment tracking to track, compare, explain and reproduce their models. Backed by thousands of users and multiple Fortune 100 companies, Comet provides insights and data to build better, more accurate AI models while improving productivity, collaboration and visibility across teams.
