Site Reliability Engineer

Hiring Remotely in Bangalore, Bengaluru Urban, Karnataka
In-Office or Remote
Mid level
Software • Big Data Analytics
The Role
As an SRE at ScyllaDB, you will manage cloud operations, enhance the reliability and performance of Scylla Cloud, and automate tasks through scripting.
Summary Generated by Built In

ScyllaDB is seeking experienced and dynamic individuals to join our Cloud Operations & Site Reliability Engineering (SRE) team. As a Scylla Cloud Operations & SRE Engineer, you will play a vital role in maintaining the operational excellence of our cutting-edge NoSQL database platform, Scylla Cloud. Leveraging your expertise in cloud infrastructure, AI, and system operations, you will ensure the reliability, scalability, and performance of our cloud offerings. If you are passionate about working in a fast-paced environment, collaborating with cross-functional teams, and driving continuous improvement, this role is tailored for you.

Applicants for this position should be able to start their workday at any time between 00:00 and 10:00 GMT.

Responsibilities:

  • Collaborate with the Support & DevOps teams to ensure the smooth day-to-day operation of Scylla Cloud.
  • Monitor system health, troubleshoot issues, and proactively address any operational challenges.
  • Act as a liaison with the Support Organization to address cloud platform-related issues.
  • Respond to tasks and tickets escalated by Support Staff, and collaborate to ensure timely resolutions.
  • Develop and maintain a comprehensive runbook that can be leveraged by Support Staff to troubleshoot and resolve common issues, improving efficiency in issue resolution.
  • Create scripts and automation solutions to streamline operational tasks and enhance efficiency.
  • Contribute to the development of automation strategies for cloud infrastructure management.
  • Assist with and perform migrations of ScyllaDB clusters between clouds and accounts.
  • Assist with and perform upgrades for Scylla Cloud, including Scylla database versions, OS upgrades, and security patches.
  • Collaborate with DevOps/Cloud Engineering to ensure seamless upgrade processes.
  • Participate in scaling Scylla Monitor and Scylla Manager servers up and down based on demand. Employ proactive monitoring strategies to identify and address potential performance bottlenecks and resource constraints.
  • Collaborate with the Cloud Engineering team to define and create feature requests that enhance the functionality and performance of Scylla Cloud.
  • Conduct regular cluster health and performance audits, identifying areas for optimization. Implement strategies to enhance the efficiency and reliability of Scylla Cloud clusters.
  • Work closely with the Customer Success team to ensure that provisioned resources align with customer needs and purchased packages. Provide insights into potential scaling opportunities and usage optimization.
  • Demonstrate a deep understanding of public cloud environments (AWS, GCP, Azure), Kubernetes, Linux system operations, and NoSQL database deployment/management. Apply this knowledge to resolve complex technical challenges.
  • Use scripting languages such as Python and Bash, together with automation tools such as Terraform and Ansible, to build tooling that enhances operational efficiency.
  • Collaborate closely with Support and Engineering teams to address issues, drive improvements, and implement customer-focused solutions.
  • Utilize AI effectively and securely to optimize tasks and automation.

Requirements:

  • 3+ years of experience in public cloud platforms (AWS, GCP, Azure).
  • 3+ years of Linux system operations and metrics analysis.
  • Availability to begin work between 00:00 and 10:00 GMT.
  • Strong scripting skills in Python and Bash.
  • Experience with reporting and visualization tools such as Splunk, Grafana, Prometheus, and Kibana.
  • Excellent written and verbal English communication skills.
  • Exceptional organizational skills and ability to manage multiple projects concurrently.
  • Ability to work both independently and collaboratively within cross-functional teams.
  • Strong problem-solving skills, especially under pressure.
  • Eagerness to continuously learn and adapt to emerging technologies.
  • Familiarity with container technologies like Docker and Kubernetes.
  • Familiarity with automation tools such as Ansible and Terraform.

Nice to Have:

  • Experience with AI-assisted scripting/coding using tools such as Cursor, Windsurf, Kiro, Antigravity, or Claude Code.
  • Proficiency with automation tools such as Ansible and Terraform.
  • 3+ years of Argo Workflow or Jenkins experience.
  • Proven expertise in NoSQL database deployment, management, and data modeling.

If you are passionate about contributing to the success of ScyllaDB's cloud offerings and thrive in a dynamic and collaborative environment, we invite you to join our Cloud Operations & SRE team. Your technical expertise, problem-solving skills, and dedication will play a crucial role in ensuring the reliability and performance of Scylla Cloud for our global customer base.


Top Skills

Ansible
AWS
Azure
Bash
GCP
Grafana
Kibana
Kubernetes
Linux
NoSQL
Prometheus
Python
Splunk
Terraform

The Company
Herzliya
213 Employees
Year Founded: 2013

What We Do

ScyllaDB is the database for data-intensive apps that require high performance and low latency. It enables teams to harness the ever-increasing computing power of modern infrastructures, eliminating barriers to scale as data grows. Unlike any other database, ScyllaDB is built with deep architectural advancements that enable exceptional end-user experiences at radically lower costs. Over 300 game-changing companies like Disney+ Hotstar, Expedia, FireEye, Discord, Crypto.com, Zillow, Starbucks, Comcast, and Samsung use ScyllaDB for their toughest database challenges. ScyllaDB is available as free open source software, a fully supported enterprise product, and a fully managed service on multiple cloud providers. For more information: ScyllaDB.com
