Site Reliability Engineer (RCV Team)

Posted 19 Days Ago
Hiring Remotely in Bulgaria
Remote or Hybrid
Mid level
Artificial Intelligence • Cloud • Events • Productivity • Software • Business Intelligence • Conversational AI
Trusted AI communications.
The Role
As a Site Reliability Engineer, you will manage AWS cloud infrastructure, support incident management, and implement observability and security best practices.
Summary Generated by Built In

Say hello to opportunities.

If you’re looking to be part of what’s next in communication, you’re in the right place.

At RingCentral, we believe the best customer experiences happen when humans and AI work together. Our agentic voice AI portfolio—AIR, AVA, and ACE—brings together automation, assistance, and insights across the entire conversation lifecycle. The result? More seamless, intelligent experiences for businesses everywhere.

With $2.5B+ in ARR and $250M invested in R&D annually, we’re building the future of AI-powered business communications.
 

About RingCentral Video

RingCentral Video (RCV) is a robust AI-powered video conferencing and collaboration platform that provides a full range of solutions for team collaboration at any scale. It offers a comprehensive solution that covers the entire video communications lifecycle.

The platform's extensive suite of capabilities supports video meetings, conferences, hybrid workspaces, and meeting room integration, ensuring a seamless experience whether the audience is working online, in-office, or in a hybrid format. AI capabilities include automatic transcriptions, instant meeting summaries, contextual notes, live captions, and intelligent noise reduction, making every meeting productive and inclusive.

Position Overview

As a Site Reliability Engineer for RCV, you'll be responsible for the reliability and performance of the video communications platform. You'll take part in incident management, proactively address observability gaps, support software delivery, ensure the safe and predictable transition of changes from development to production, and build a self-healing infrastructure. To achieve this, we are looking for a responsible engineer who takes initiative.

Responsibilities:

  • Manage geo-distributed cloud infrastructure on AWS and EKS, using IaC (Terraform) and GitOps (FluxCD) to ensure scalability;

  • Participate in an on-call rotation of 2 weeks on (12 hours per day, in primary/backup roles) followed by 3 weeks off, to ensure continuous production support and timely response to operational needs;

  • Participate in service capacity planning, software performance analysis, and system configuration;

  • Design, consult on, re-platform, and refactor the observability of the current cloud infrastructure (Prometheus, Grafana, VictoriaMetrics, centralized logging and alerting);

  • Participate in release management, working closely with development teams to implement GitOps principles in release processes and manage CI/CD pipelines using GitLab CI;

  • Conduct blameless post-mortems to learn from incidents and prevent their recurrence;

  • Develop and test disaster recovery plans and runbooks to ensure business continuity;

  • Implement security best practices and controls in the infrastructure to meet compliance standards and prepare for audits.

Requirements:

  • Cloud & Infrastructure: Work with AWS production environments - read and write Terraform configurations, understand IaC principles;

  • Kubernetes: Manage Kubernetes clusters - troubleshoot pod failures, set resource limits, work with scaling, understand networking;

  • CI/CD: Create and maintain CI/CD pipelines (GitLab CI is preferable);

  • Observability: Manage monitoring stacks (Prometheus, Grafana) - write PromQL queries, create dashboards, configure effective alerts;

  • Troubleshooting: Debug performance issues in distributed systems - analyze network traces, read application logs for root cause analysis;

  • Performance: Identify and eliminate bottlenecks - interpret metrics, optimize resource allocation and costs;

  • Incident Management: Participate in incident response - quickly localize problems, coordinate with other teams through war rooms/incident channels, document event timelines.

Nice to have:

  • A reliability-oriented mindset with a focus on designing and building resilient architectures;

  • In-depth troubleshooting - ability to use and implement profiling tools (mostly APM);

  • Previous SRE experience or knowledge, giving you a heightened awareness of what data to collect, how to display it, and how users can benefit from it;

  • A deep understanding of Kubernetes. This is one of our core tools, and the better you understand it, the more valuable that understanding is;

  • Hands-on practice with Istio/Gloo;

  • Knowledge of scripting languages such as Python or Go;

  • Understanding the principles and limitations of caching mechanisms (Redis);

  • Experience with messaging queues (Strimzi Kafka);

  • Familiarity with SQL and NoSQL database management systems (Aurora, DocumentDB).

What we offer: 

  • Well-coordinated professional team.

  • Cutting-edge technologies, interesting and challenging tasks, a dynamic project, and great opportunities for self-realization and professional and career growth.

  • Additional Health and Life Insurance Package.

  • Employee Assistance Program.

  • 25 vacation days.

  • 102,26 EUR/200 BGN Digital Food Vouchers.

  • 61,36 EUR/120 BGN gross Working Expenses Allowance, paid as part of the salary.

RingCentral’s work culture is the backbone of our success. And don’t just take our word for it: we are recognized as a Best Place to Work by BuiltIn, the Top Work Culture by Comparably and hold local BPTW awards in every major location. Bottom line: We are committed to hiring and retaining great people because we know you power our success.

About RingCentral

RingCentral is a global leader in agentic voice AI–powered business communications, delivering an integrated platform for business phone, SMS, contact center, workforce engagement management, video collaboration, and messaging. As the communications layer connecting businesses and customers, RingCentral is the front door of business communication and is in the advantageous position to apply AI at every phase of the conversation journey — before, during, and after each interaction. Our agentic AI portfolio includes autonomous voice-first AI agents that automate calls, assist in the moment, and analyze every interaction – enabling businesses to work smarter, respond faster, and connect more meaningfully with their customers. Visit ringcentral.com to learn more.
RingCentral is an equal opportunity employer that truly values diversity. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. We are committed to providing reasonable accommodations for individuals with disabilities during our application and interview process. If you require such accommodations, please click on the following link to learn more about how we can assist you.


The Company
HQ: Belmont, CA
7,000 Employees
Year Founded: 2003

What We Do

RingCentral is a global leader in AI-powered trusted business communications, contact center, revenue intelligence, video and hybrid event solutions. RingCentral empowers businesses with conversation intelligence and unlocks rich customer and employee interactions to provide insights and improved business outcomes.

Why Work With Us

Innovation isn't just a buzzword—it's the core and heart of everything we do. We believe that groundbreaking ideas emerge from every corner of our organization. Our biggest strength? We are not all the same. At RingCentral, our commitment to fostering a culture of curiosity and inclusivity is what sets us apart.


