Job Description: Elastic Stack Engineer
(Search & Observability)
Role Overview
As an Elastic / Observability & Security Platform Engineer, you will lead the design, implementation, monitoring and continuous improvement of our Elastic-based observability and security stack. You will take ownership of detection rules, watchers, ML models, health monitoring of data streams, alerting frameworks, and tracking of data pipeline latency and integration times. You will work closely with data engineers, security operations, platform engineering, and business units to ensure robust real-time monitoring, anomaly detection, alerting, and data integration observability.
Key Responsibilities
• Architect, deploy, configure and optimise the Elastic Stack (Elasticsearch, Kibana, Beats, Logstash, Elastic Machine Learning, Elastic Watcher/Alerting).
• Develop and maintain JSON-based configuration files, logic and pipelines for detection rules, watchers and alerting states.
• Design, build and operationalise machine-learning jobs within Elastic ML (e.g., anomaly detection, forecasting, classification) for observability and security use cases.
• Monitor, maintain and improve the health and performance of data streams (logs, metrics, events, traces) ingested into the Elastic cluster: ensure data freshness, minimal latency, correct mappings, index lifecycle management (ILM), shard management, and cluster health.
• Implement and maintain alerting and notification frameworks: watchers/triggers, custom alert logic in JSON, and integration with downstream systems (Slack, Teams, PagerDuty, email, webhooks).
• Track and report on the integration time between upstream data sources and the Elastic ingestion pipeline (i.e., latency from source → pipeline → index → availability), and diagnose and mitigate delays or bottlenecks.
• Develop dashboards, visualisations and reports in Kibana to communicate KPIs and SLAs (data ingestion, alert response, model accuracy) and to drive continuous improvement.
• Collaborate with data engineering, DevOps, security operations (SecOps), SRE and business stakeholders to define requirements and deliver effective observability and security solutions.
• Establish best practices, standards and documentation for JSON rule configurations, watchers, ML jobs, dashboarding and monitoring.
• Participate in incident-response processes: support triage and root-cause analysis, and feed learnings back into detection rules, ML jobs and monitoring.
• Stay up to date with the Elastic ecosystem and contribute to improving it in our environment: new features, upgrades, tuning, cost optimisation, and benchmark/scale testing.
Required Skills & Experience
• Strong hands-on experience with the Elastic Stack (Elasticsearch, Kibana, Beats, Logstash or equivalent ingestion pipelines); you should be comfortable deploying, configuring and operating production Elastic clusters.
• Proficiency in writing and using JSON configurations and logic for detection rules, watchers, alerting frameworks, and monitoring pipelines.
• Experience building and operationalising Elastic Machine Learning jobs (anomaly detection, forecasting, classification) and interpreting model output for observability and security use cases.
• In-depth experience monitoring and maintaining the health of high-volume data streams (log, metric, event and trace data), with attention to data latency, ingestion batching, pipeline failures, index lifecycle, and cluster resource optimisation.
• Experience designing end-to-end alerting workflows (trigger logic, thresholds, multi-condition rules, escalation, notification integration).
• Experience tracking and measuring integration times (data latency from source ingestion to availability in indices and dashboards) and implementing improvements to reduce that latency.
• Strong scripting or programming ability (e.g., Python, Bash, or similar) to automate tasks, integrations or alert logic.
• Strong analytical and problem-solving skills: ability to diagnose ingestion, pipeline and cluster issues, trace the chain of events to root causes, and propose mitigations.
• Excellent communication skills: able to explain detection logic, ML model results, data latency issues and dashboards to technical and non-technical stakeholders.
• Good understanding of DevOps/SRE practices (CI/CD, Infrastructure as Code, monitoring, logging, alerting).
• Ability to document clearly: JSON rule setups, watchers, dashboards, models, runbooks.
• Bachelor’s degree in Computer Science or Information Systems, or equivalent relevant industry experience.
Desirable / Bonus Skills
• Experience with Elastic Security (formerly Elastic SIEM) use cases.
• Experience with other observability/tracing stacks (OpenTelemetry, Jaeger, Prometheus, Grafana) and integrating them into Elastic.
• Knowledge of cloud environments (AWS, Azure, GCP) and experience managing Elastic clusters in cloud or hybrid deployments.
• Experience with large-scale index management, shard tuning, ILM policies, cluster scaling, and cost optimisation.
• Experience with advanced ML techniques (unsupervised learning, time-series forecasting, advanced feature engineering) applied to observability and security.
• Knowledge of security operations (SecOps) and detection use cases: threat hunting, anomaly detection, SOC workflows.
• Familiarity with infrastructure instrumentation (logs, metrics, traces) and analysing telemetry from microservices and distributed systems.
What We Do
We are an IT consulting company specialising in data engineering, data science and advanced analytics, cloud computing consulting services, and data pipeline automation. We were established in 2017, are headquartered in South Africa, and have over 100 professionals on board. Our main differentiator is a flexible approach to constantly changing business requirements and needs. Our highly qualified engineers and data scientists provide insightful expertise that helps us deliver real added value to our clients.