SRE/ELK Stack Engineer - Azure

Chennai, Tamil Nadu, IND
In-Office
Senior level
Information Technology • Professional Services • Consulting
The Role
Assess and modernize an ELK-based observability platform on Azure by designing OpenTelemetry-based logging, metrics, and tracing across microservices and AKS. Implement telemetry pipelines, integrations with ELK/OpenSearch/Azure Monitor, define SLIs/SLOs and alerting, create dashboards and runbooks, and improve cost, retention, and operational intelligence. Assist with AIOps/anomaly detection and DORA metric integration.
Company Description

About CRUX

CRUX is one of the leading information technology companies. Through its Global Network Delivery Model, Innovation Network, and Solution Accelerators, CRUX focuses on helping global organizations address their business challenges effectively.

CRUX continues to invest in new technologies, processes, and people that help its customers succeed. From generating novel concepts through CRUX’s R&D and academic alliances to drawing on the expertise of key partners, it keeps clients operating at the very edge of technological possibility.

CRUX’s highly skilled, dedicated IT professionals, together with its subsidiaries and joint ventures, provide customized IT solutions for several industries, drawing on a broad range of technical expertise and experience.

CRUX

Client satisfaction is our utmost priority. We will identify the right vendor with the right talent, capable of handling any job you require. We will manage the project for you and make sure that all your requirements are met. We work for you.

We believe that every IT & ITES project is unique and cannot be generalized. In this model, the client gains by working with industry pioneers at relatively lower cost, and toward the end of the development life cycle the technology is transferred, which adds value to the local content.

CRUX offers a wide variety of services. Match your business needs to our capabilities. Our professional staff are highly qualified to assist companies in any area related to their information systems environment.

Job Description

The customer currently uses the ELK stack. The goal is to standardize and modernize logs, metrics, and traces using OpenTelemetry while improving visibility, reliability, and operational intelligence.

Observability Architecture & Modernization

·       Assess the existing ELK-based observability setup and define a modern observability architecture

·       Design and implement standardized logging, metrics, and distributed tracing using OpenTelemetry

·       Define observability best practices for cloud-native and Azure-based applications

·       Ensure consistent telemetry collection across microservices, APIs, and infrastructure

Logging, Metrics & Tracing

·       Instrument applications using OpenTelemetry SDKs (Spring Boot, .NET, Python, JavaScript, as applicable)

·       Support Kubernetes and container-based workloads (if applicable)

·       Configure and optimize log pipelines, trace exporters, and metric collectors

·       Integrate OpenTelemetry with ELK / OpenSearch / Azure Monitor / other backends

·       Define SLIs, SLOs, and alerting strategies

·       Integrate GitHub and Jira metrics into the observability platform as DORA metrics

Operational Excellence

·       Improve observability performance, cost efficiency, and data retention strategies

·       Create dashboards, runbooks, and documentation 

AI-based Anomaly Detection & Triage (Good to Have)

·       Design or integrate AI/ML-based anomaly detection for logs, metrics, and traces

·       Work on AIOps capabilities for automated incident triage and insights

Required Technical Skills

Core Observability

·       Strong hands-on experience with ELK Stack (Elasticsearch, Logstash, Kibana)

·       Deep understanding of logs, metrics, traces, and distributed systems

·       Practical experience with OpenTelemetry (Collectors, SDKs, exporters, receivers)

Cloud & Platforms

·       Strong experience integrating Microsoft Azure with the observability platform

·       Experience integrating Kubernetes / AKS with the observability platform is a strong plus

·       Knowledge of Azure monitoring tools (Azure Monitor, Log Analytics, Application Insights)

Soft Skills

·       Strong architecture and problem-solving skills

·       Clear communication and documentation skills

·       Hands-on mindset with an architect-level view

Good to Have / Preferred Skills

·       Experience with AIOps / anomaly detection platforms

·       Exposure to tools like Prometheus, Grafana, Jaeger, OpenSearch, Datadog, Dynatrace, New Relic (any)

·       Experience with incident management, SRE practices, and reliability engineering

Additional Information

Experience Level: 5-8 Years

Location: Chennai

Skills Required

  • Hands-on experience with ELK Stack (Elasticsearch, Logstash, Kibana)
  • Practical experience with OpenTelemetry (Collectors, SDKs, exporters, receivers)
  • Instrumenting applications with OpenTelemetry SDKs (Spring Boot, .NET, Python, JavaScript)
  • Strong experience with Microsoft Azure and Azure monitoring tools (Azure Monitor, Log Analytics, Application Insights)
  • Experience with Kubernetes / AKS and container-based workloads
  • Deep understanding of logs, metrics, traces, and distributed systems
  • Define SLIs, SLOs, and alerting strategies
  • Knowledge of integrating GitHub and Jira metrics as DORA metrics
  • 5-8 years of relevant experience (SRE/observability)
  • Strong architecture, problem-solving, communication, and documentation skills
  • Experience with AIOps or anomaly detection platforms
  • Exposure to Prometheus, Grafana, Jaeger, OpenSearch, Datadog, Dynatrace, New Relic
  • Experience with incident management, SRE practices, and reliability engineering
The Company
102 Employees

What We Do

CRUX is a leading information technology company that helps global organizations address business challenges effectively using its Global Network Delivery Model, Innovation Network, and Solution Accelerators. The company provides customized IT solutions across various industries, leveraging a highly skilled team of IT professionals and continuous investment in new technologies, processes, and people to ensure client success.
