Senior Site Reliability Engineer - Observability

2 Locations
In-Office
Senior level
Financial Services
The Role
Own reliability and scalability of on-prem observability platforms (ELK, Grafana); handle production escalations, capacity planning, SLOs, onboarding, automation, IaC (Terraform/Helm/Ansible), upgrades, security hardening, and platform modernization.
Summary Generated by Built In
Job Description:

About the Role:

We are looking for a Senior SRE to join our Platform Engineering team as the operations owner of our observability platforms. You’ll be responsible for the reliability, scalability, and continued evolution of the tools that give our engineering organization visibility into everything they build and run. The current observability platform consists primarily of an on-premises ELK (Elasticsearch, Logstash, Kibana) Stack and Grafana, with some exposure to New Relic and SolarWinds.

This is a hybrid role: roughly half your time will be spent on steady-state operations and platform support, and the other half on engineering projects that meaningfully advance the platforms you support. It’s a great fit for someone who is genuinely motivated by the pursuit of excellence – not just sustaining what works but relentlessly refining it. You take pride in the platforms you own, and that pride drives you to keep improving them, whether that means tightening an SLO, eliminating a source of toil, or building something that gives teams faster insight into their systems.

What You’ll Work On:

Operations & Reliability (~ 50%)
  • Serve as a primary escalation point for production support involving the ELK Stack, Grafana, and New Relic 
  • Own platform health, capacity planning, and performance tuning for on-premises observability infrastructure – including Elasticsearch cluster management, index lifecycle policies, and retention strategies 
  • Monitor and maintain SLOs for the observability platforms, ensuring the tools engineers depend on are highly available and performant 
  • Support engineering teams in onboarding to observability platforms – helping teams instrument their applications, build dashboards, and define meaningful alerts 
  • Manage patching, upgrades, and configuration management across the observability stack 
  • Collaborate with security to harden platform configurations and manage software vulnerabilities 
  • Contribute to on-call rotations and maintain runbooks and escalation procedures 
 Platform Engineering (~ 50%) 
  • Design and build tooling/automation to reduce toil and improve the experience for teams using observability platforms 
  • Lead or contribute to platform modernization initiatives – e.g., improving ingestion pipelines, scaling platform capacity, standardizing Grafana dashboard and alerting patterns, or evaluating new capabilities within the existing stack 
  • Develop and maintain infrastructure-as-code (Terraform, Helm, Ansible, etc.) for platform components 
  • Build and enforce standards around logging, metrics, and alerting that help engineering teams adopt observability best practices at scale 
  • Participate in design reviews and contribute to the overall platform roadmap 
  What We’re Looking For: 
  • Bachelor’s degree in a technical field or equivalent practical experience 
  • 5+ years of experience in SRE, DevOps, or platform engineering roles 
  • Deep hands-on experience with the ELK Stack – Elasticsearch cluster operations, Logstash pipeline development, Kibana, and index lifecycle management 
  • Strong experience with Grafana, including data source integrations, dashboard design, and alerting 
  • Solid understanding of observability principles 
  • Experience operating on-premises infrastructure, including capacity planning, server management, and the operational tradeoffs compared with managed cloud services 
  • Proficiency in Python for automation and tooling; familiarity with shell scripting 
  • Strong Linux systems knowledge and comfort working with configuration management tools (e.g., Ansible, Chef, Puppet) 
  • Demonstrated ability to drive incidents to resolution and communicate clearly under pressure 
  • A bias toward automation and a low tolerance for repetitive manual work 
 Nice to Have:  
  • Experience with Prometheus 
  • Experience with New Relic administration or APM instrumentation 
  • Familiarity with log shipping agents and pipeline tools such as Beats, Fluentd, or Fluent Bit 
  • Experience with distributed tracing tools like OpenTelemetry 
  • Exposure to cloud-based observability offerings and experience thinking through hybrid strategies 
  • Prior experience building or governing observability standards across a large engineering organization 

Dimensional offers a variety of programs to help take care of you, your family, and your career, including comprehensive benefits, educational initiatives, and special celebrations of our history, culture, and growth.

It is the policy of the Company to provide equal opportunity for all employees and applicants.  The Company recruits, hires, trains, promotes, compensates, and administers all personnel actions without regard to actual or perceived race, color, religion, religious practice, creed, sex, sex stereotyping, pregnancy (which includes pregnancy, childbirth, and medical conditions related to pregnancy, childbirth, or breastfeeding), caregiver status, gender, gender identity, gender expression, transgender identity, national origin, age, mental or physical disability, ancestry, medical condition, marital status, familial status, domestic partnership status, military or veteran status or service, unemployment status, citizenship status or alienage, sexual orientation, status as a victim of domestic violence, status as a victim of stalking, status as a victim of sex offenses, genetic information, political activities or recreational activities, arrest or conviction record, salary history, natural hairstyle or any other status protected by applicable law except as otherwise required or permitted by law or regulation applicable to the Company or its affiliates. 

Dimensional Fund Advisors Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Dimensional Fund Advisors and has not been reviewed or approved by Dimensional Fund Advisors.

  • Wellbeing & Lifestyle Benefits: Work–life balance is portrayed as notably strong and is often framed as part of the total package. Feedback suggests lifestyle and culture help offset only average cash levels for many roles.
  • Healthcare Strength: Health coverage is described as comprehensive, including medical, dental, vision, mental‑health support, and HSA/FSA options. An on‑site clinic in Austin and wellness programs are cited as meaningful adds to the overall offering.
  • Leave & Time Off Breadth: PTO and holidays are characterized as generous, with parental leave and adoption assistance highlighted as part of the package. Feedback suggests time‑off policies are a consistent bright spot across locations where available.

The Company
HQ: Austin, TX
1,490 Employees
Year Founded: 1981

What We Do

Dimensional is a leading global investment firm that has been translating academic research into practical investment solutions since 1981. Guided by a strong belief in markets, we help investors pursue higher expected returns through a systematic investment process that integrates research insights with advanced portfolio design, management, and trading while balancing tradeoffs that can impact returns. Dimensional is headquartered in Austin, Texas, and has 14 global offices across North America, Europe, and Asia. As of March 31, 2022, Dimensional manages $659 billion for investors worldwide.
