Senior Monitoring Engineer

Fort Worth, TX, USA
In-Office
Senior level
Fintech
The Role
Design, implement, and maintain enterprise monitoring and observability solutions (metrics, logs, traces). Build alerts, dashboards, OpenTelemetry instrumentation, automation, and collaborate with application and support teams to reduce MTTR and improve reliability.
Summary Generated by Built In

We’re seeking a Senior Monitoring Engineer to join a high‑performing Monitoring Engineering team in a fast‑paced finance technology organization. You’ll design, develop, and maintain monitoring and observability solutions that keep core applications and infrastructure healthy and visible. In close partnership with application, platform, and development teams, you will implement alerting systems, dashboards, correlations, and automation—driving reliability, reducing MTTR, and elevating operational awareness.

Critical thinking, system analysis, and proactive troubleshooting are essential to success in this role.

Key Responsibilities

Design, Build, and Maintain Monitoring & Observability Solutions

  • Develop and maintain instrumentation, telemetry, and alerting for the Enterprise Monitoring Center using industry‑leading tools, such as:
    • Grafana
    • OpsRamp
    • AppDynamics
    • Elastic Stack
    • BigPanda
    • AWS CloudWatch
    • Azure Monitor
  • Implement Observability best practices, ensuring comprehensive coverage of metrics, logs, and traces across critical systems.
  • Integrate and manage OpenTelemetry for distributed tracing and telemetry data collection, enabling end‑to‑end visibility of business‑critical transactions.

Collaboration & Project Participation

  • Collaborate with application development teams to define and document observability requirements for each project or release.
  • Participate in complex initiatives, ensuring accurate and actionable monitoring and tracing are in place for every step of business‑critical workflows.

Alerting & Escalation Process

  • Define and maintain standardized alert payloads per engineering guidelines, ensuring alerts are actionable.
  • Partner with Level 2 and Level 3 support teams to reflect process changes in monitoring dashboards.
  • Maintain and optimize thresholds, ensuring seamless escalations via BigPanda as the central alert hub.

Dashboard Creation & Maintenance

  • Create and maintain intuitive, actionable dashboards for the Enterprise Monitoring Center and other finance teams.
  • Ensure dashboards are effectively monitored by Level 1 teams, presenting clear, actionable data that reduces MTTR.

System Validation, Documentation & Automation

  • Develop and maintain automation scripts to enhance monitoring efficiency and improve team quality of life.
  • Proactively identify process improvements and learning opportunities; drive continuous improvement.

Automation & Quality‑of‑Life Improvements

  • Contribute to the automation of monitoring, alerting, and operational tasks to streamline workflows and improve overall system reliability.

Qualifications

Education

Bachelor’s in Computer Science, IT, or related field.

Experience

  • Minimum 4 years in a technology organization, with ≥1 year hands‑on engineering experience in monitoring or production operations.

Required Skills

  • Strong experience developing instrumentation and alerting for large, complex environments.
  • Expertise in at least four monitoring solutions such as OpsRamp, Grafana, AppDynamics, Elastic Stack, InfluxDB, or BigPanda.
  • Hands-on experience with Observability concepts and frameworks, including metrics, logs, and traces.
  • Working knowledge of OpenTelemetry for distributed tracing and telemetry data collection.
  • Experience with dashboard creation, alert management, and tool configuration.
  • Excellent verbal and written communication—able to present complex technical issues to both technical and non‑technical stakeholders.
  • Strong problem‑solving and troubleshooting in high‑pressure environments.
  • Ability to prioritize and manage multiple tasks in a deadline‑driven setting.
  • Proven collaboration with cross‑functional teams in large, complex IT environments.
  • Experience with scripting (e.g., Bash, PowerShell) and proficiency in one programming language (e.g., Python, C family, JavaScript).
  • Experience designing and implementing scalable, reliable monitoring solutions.
  • Experience with agile software development methodologies.
  • Familiarity with problem diagnosis, performance tuning, capacity planning, and configuration management across the stack, with a focus on continuous improvement.

Preferred Qualifications

  • Experience querying, manipulating, and visualizing time‑series data.
  • Familiarity with Infrastructure as Code tools (e.g., Ansible, Terraform).
  • Strong understanding of how to create actionable, digestible visualizations for Level 1 monitoring teams.
  • Working knowledge of REST APIs, JSON, and ServiceNow.
  • Experience with cloud monitoring—particularly AWS or Azure.

Who We Are

OneMain Financial (NYSE: OMF) is the leader in offering nonprime customers responsible access to credit and is dedicated to improving the financial well-being of hardworking Americans. Since 1912, we’ve looked beyond credit scores to help people get the money they need today and reach their goals for tomorrow. Our growing suite of personal loans, credit cards and other products helps people borrow better and work toward a brighter future.

Driven collaborators and innovators, our team thrives on transformative digital thinking, customer-first energy and flexible work arrangements that grow lives, careers and our company. At every level, we’re committed to an inclusive culture, career development and impacting the communities where we live and work. Getting people to a better place has made us a better company for over a century. There’s never been a better time to shine with OneMain.

Because team members at their best means OneMain at our best, we provide opportunities and benefits that make their health and careers a priority. That’s why we’ve packed our comprehensive benefits package for full- and some part-timers with:

  • Health and wellbeing options including medical, prescription, dental, vision, hearing, accident, hospital indemnity, and life insurances
  • Up to 4% matching 401(k)
  • Employee Stock Purchase Plan (10% share discount)
  • Tuition reimbursement
  • Paid time off (15 days’ vacation per year, plus 2 personal days, prorated based on start date)
  • Paid sick leave as determined by state or local ordinance, prorated based on start date
  • Paid holidays (7 days per year, based on start date)
  • Paid volunteer time (3 days per year, prorated based on start date)

OneMain Holdings, Inc. is an Equal Employment Opportunity (EEO) employer. Qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship status, color, creed, culture, disability, ethnicity, gender, gender identity or expression, genetic information or history, marital status, military status, national origin, nationality, pregnancy, race, religion, sex, sexual orientation, socioeconomic status, transgender or on any other basis protected by law.

OneMain Financial Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about OneMain Financial and has not been reviewed or approved by OneMain Financial.

  • Retirement Support: Retirement benefits include a 401(k) with a dollar-for-dollar match up to 4% after six months, supporting long-term savings. An employee stock purchase plan with a 10% discount adds another ownership-oriented reward element.
  • Leave & Time Off Breadth: Time-off offerings include vacation that can grow to five weeks, paid holidays, personal days, sick time, and three paid volunteer days. Paid parental leave is offered at 100% pay for six weeks, adding to the overall leave mix.
  • Flexible Benefits: A broad menu of benefits spans HSAs/FSAs, disability coverage, life and long-term care solutions, tuition reimbursement, and voluntary options like pet insurance and legal assistance. Backup child/elder care and adoption assistance further widen the set of selectable supports.

The Company
Baltimore, Maryland
5,386 Employees
Year Founded: 1912

What We Do

OneMain provides personal loans with one-on-one, local service at branches nationwide. Our personalized loan solutions offer customers a simple and straightforward loan application, fixed rates, fixed payments, clear terms and multiple payment options.
