System Reliability Engineer / T2 Support Engineer

Teleport

Reposted 4 Days Ago

Gurugram, Haryana, IND

In-Office

Mid level

Software

The Role

The System Reliability Engineer ensures the reliability and stability of production systems through monitoring, incident response, and operational changes while collaborating closely with technical teams.

Summary Generated by Built In

About the Role

We are looking for an engineer who enjoys understanding how systems behave in real production environments, not just writing features. This role is responsible for maintaining the reliability, stability, and smooth functioning of our live platform running on Google Cloud.

You will act as the first technical owner of production systems — monitoring services, investigating alerts, resolving issues, and performing controlled configuration and operational changes. This role works closely with backend developers, QA, and infrastructure teams to prevent incidents and reduce downtime.

This is not a call-center support role and not a pure development role — it is a hands-on technical position focused on debugging, incident handling, and system operations.

Tech Stack

Google Cloud Platform (Compute, Logging, Monitoring)
Java (Spring Boot based microservices)
MongoDB
Apache Kafka (event-driven architecture)
Redis cache
Linux servers

Key Responsibilities

Production Monitoring & Alert Handling

Monitor application health, latency, errors, consumer lag, database connections, and resource utilization
Acknowledge and investigate monitoring alerts
Perform first-level troubleshooting and stabilize services
Identify whether an issue is infrastructure-, application-, database-, or messaging-related

Incident Response

Participate in on-call rotation
Diagnose production incidents and restore services with minimal downtime
Safely restart services, scale instances, or roll back deployments when required
Communicate incident status to stakeholders

Technical Support & Operational Changes

Handle technical support tickets requiring engineering understanding
Update configurations and feature flags
Manage scheduled jobs / cron triggers
Trigger or replay events in Kafka
Assist with minor Java configuration/code fixes when needed
Coordinate production releases

Database & Messaging Operations

Investigate MongoDB performance issues and slow queries
Monitor and resolve Kafka consumer lag and stuck messages
Manage Redis cache behavior (TTL, eviction, connection issues)

Logs & RCA

Analyze logs and metrics to determine root cause of issues
Prepare basic Root Cause Analysis (RCA) reports
Suggest preventive actions to reduce recurring incidents

Requirements

Required Skills

Core Technical Skills

Good understanding of Linux commands and server behavior
Experience analyzing application logs and debugging runtime issues
Basic Java knowledge (stack trace reading, configuration changes, rebuild & deploy)
Practical experience with MongoDB (indexes, connections, slow queries)
Understanding of Kafka concepts (consumer, offset, lag, partitions)
Basic Redis knowledge (caching behavior, TTL)

Cloud & Tools

Hands-on experience with any cloud platform (GCP preferred / AWS acceptable)
Experience using monitoring tools (GCP Monitoring, Prometheus, Grafana, ELK, or similar)
Understanding of REST APIs and HTTP status codes

What We Expect From You

Ability to investigate problems logically rather than randomly restarting services
Comfort working with live production systems
Willingness to participate in on-call support
Strong ownership mindset and attention to detail
Good communication during incidents

Good to Have

Experience in e-commerce, fintech, logistics, or high-traffic systems
Exposure to CI/CD pipelines and deployments
Basic scripting (Shell or Python)
Experience writing RCA documents

Experience

3 – 6 years of relevant experience in production support, application support, SRE, DevOps operations, or similar roles.

Benefits

Why Join Us

Direct exposure to real distributed systems
Hands-on production debugging experience
Opportunity to learn system architecture deeply
Close interaction with development and platform teams

Important Note

This role involves handling live production systems and occasional on-call responsibilities. Candidates interested only in feature development or pure infrastructure automation may not find this role suitable.

The Company

HQ: Oakland, CA

74 Employees

Year Founded: 2015

What We Do

Teleport allows engineers and security professionals to unify access for SSH servers, Kubernetes clusters, web applications, and databases across all environments.