Elastic Stack Engineer

Sorry, this job was removed at 04:52 p.m. (CST) on Friday, Dec 12, 2025
Hiring Remotely in Gauteng, ZAF
Remote
Information Technology
The Role

Job Description: Elastic Stack Engineer (Search & Observability)

Role Overview

As an Elastic / Observability & Security Platform Engineer, you will lead the design, implementation, monitoring and continuous improvement of our Elastic-based observability and security stack. You will take ownership of detection rules, watchers, ML models, health monitoring of data streams, alerting frameworks, and tracking of data pipeline latency and integration times. You will work closely with data engineers, security operations, platform engineering and business units to ensure robust real-time monitoring, anomaly detection, alerting, and data integration observability.

Key Responsibilities


Architect, deploy, configure and optimise the Elastic Stack (Elasticsearch, Kibana, Beats, Logstash, Elastic Machine Learning, Elastic Watcher/Alerting).

Develop and maintain JSON-based configuration files, logic and pipelines for detection rules, watchers and alerting states.

Design, build and operationalise machine-learning jobs within Elastic ML (e.g., anomaly detection, forecasting, classification) for observability/security use cases.

Monitor, maintain and improve the health and performance of data streams (logs, metrics, events, traces) ingested into the Elastic cluster: ensure data freshness, minimal latency, correct mapping, index lifecycle management (ILM), shard management, and cluster health.

Implement and maintain alerting/notification frameworks: watchers/triggers, custom alert logic via JSON, and integration with downstream systems (Slack, Teams, PagerDuty, email, webhook).

Track and report on the integration time between upstream data sources and the Elastic ingestion pipeline (i.e., latency from source → pipeline → index → availability), and diagnose and mitigate delays or bottlenecks.

Develop dashboards, visualisations and reports in Kibana to communicate KPIs and SLAs (data ingestion, alert response, model accuracy) and to drive continuous improvement.

Collaborate with data engineering, DevOps, security operations (SecOps), SRE and business stakeholders to define requirements and deliver effective observability/security solutions.

Establish best practices, standards and documentation for JSON rule configs, watchers, ML jobs, dashboarding and monitoring.

Participate in incident-response processes: support triage and root-cause analysis, and feed learnings back into detection rules, ML jobs and monitoring.

Stay up to date and contribute to improving the Elastic ecosystem in our environment: new features, upgrades, tuning, cost optimisation, and benchmark/scale testing.

Required Skills & Experience


Strong hands-on experience with the Elastic Stack (Elasticsearch, Kibana, Beats, Logstash or equivalent ingestion pipelines). You should be comfortable deploying, configuring and operating production Elastic clusters.

Proficiency in writing and using JSON configurations and logic for detection rules, watchers, alerting frameworks, and monitoring pipelines.

Experience building and operationalising Elastic Machine Learning jobs (anomaly detection, forecasting, classification) and interpreting model output for observability/security use cases.

In-depth experience monitoring and maintaining the health of high-volume data streams (log, metric, event and tracing data), with attention to data latency, ingestion batching, pipeline failures, index lifecycle, and cluster resource optimisation.

Experience designing end-to-end alerting workflows (trigger logic, thresholds, multi-condition rules, escalation, notification integration).

Experience tracking and measuring integration times (data latency from source ingestion to availability in index/dashboards) and implementing improvements to reduce that latency.

Strong scripting or programming ability (e.g., Python, Bash, or similar) to automate tasks, integrations or alert logic.

Strong analytical and problem-solving skills: ability to diagnose ingestion, pipeline and cluster issues, trace the chain of events to root causes, and propose mitigations.

Excellent communication skills: able to articulate detection logic, ML model results, data-latency issues and dashboards to technical and non-technical stakeholders.

Good understanding of DevOps/SRE practices (CI/CD, Infrastructure as Code, Monitoring, Logging, Alerting).

Ability to document clearly: JSON rule setups, watchers, dashboards, models, runbooks.

Bachelor’s degree in Computer Science or Information Systems, or equivalent relevant industry experience.

Desirable / Bonus Skills


Experience with Elastic Security (formerly Elastic SIEM) use cases.

Experience with other observability/tracing stacks (OpenTelemetry, Jaeger, Prometheus, Grafana) and integrating them into Elastic.

Knowledge of cloud environments (AWS, Azure, GCP) and experience managing Elastic clusters in cloud or hybrid deployments.

Experience with large-scale index management, shard tuning, ILM policies, cluster scaling, and cost optimisation.

Experience with advanced ML techniques (unsupervised learning, time-series forecasting, advanced feature engineering) applied to observability/security.

Knowledge of security operations (SecOps) and detection use cases: threat hunting, anomaly detection, SOC workflows.

Familiarity with infrastructure instrumentation (logs, metrics, traces) and analysing telemetry from microservices/distributed systems.


The Company
9 Employees
Year Founded: 2017

What We Do

We are an IT consulting company specializing in data engineering, data science & advanced analytics, cloud computing consulting services and data pipeline automation. We were established in 2017, are headquartered in South Africa and have over 100 professionals on board. Our main differentiator is a flexible approach to constantly changing business requirements and needs. Our highly qualified engineers and data scientists provide insightful expertise that helps us deliver real added value to our clients.
