Production Support Lead

Sorry, this job was removed at 6:30 a.m. (CST) on Friday, June 2, 2023

Introduction

NYDIG is a leading technology and financial services firm accelerating the Bitcoin future. We believe that Bitcoin is not just a new asset class but a potentially powerful force for good. NYDIG’s focus is two-fold. First, we aim to provide the best investor solution platform with the most sophisticated suite of products to corporations, asset managers, institutions, and other sophisticated investors. Second, we aim to democratize access by providing a technology platform capable of powering embedded bitcoin products and services for any financial institution. Our team is a group of proven innovators with deep domain expertise across finance and technology. We look for optimistic, passionate, low-ego, excellence-driven people who want to work together on creating impactful solutions. This is a rare opportunity to join a rapidly growing firm innovating in an exciting and dynamic industry.

Description

NYDIG is looking for an experienced, hands-on Production Engineering Lead with a background in Site Reliability/DevOps engineering. This person should have a passion for providing superior system availability and a first-class customer support experience. We are looking for candidates who can lead a 24/7 support organization and drive reliability and performance at massive scale by mastering the full depth of the stack. As a Production Support Lead, you will have the opportunity to tackle complex problems of scale that are unique to fintech companies while using your expertise in the delivery and support of critical services.

Responsibilities

  • Increase operational efficiencies to proactively reduce and mitigate production incidents
  • Provide on-call leadership to mitigate critical incidents
  • Lead the team in producing runbooks and support documentation
  • Lead a team of experienced support engineers to meet or exceed expectations on incident SLAs
  • Understand the full technology stack of systems in the assigned domain
  • Build a high-performing team of support engineers across several geographical locations to provide 24x7 support for systems, with an ever-watchful eye on their availability, latency, performance, and capacity
  • Collaborate with other tech leads and support teams to ensure integrated end-to-end availability, reliability, and performance
  • Define support strategies for systems in the Cloud (AWS)
  • Influence resiliency and scalability in production environments in Amazon Web Services and other cloud platforms
  • Identify and drive resolution on monitoring and alerting gaps
  • Lead a team to design, write and deliver technical and process automation to improve the availability, scalability, latency, and efficiency of NYDIG’s services
  • Solve problems relating to mission-critical services and build automation to prevent problem recurrence, with the goal of an automated response to all non-exceptional service conditions
  • Engage in service capacity planning and demand forecasting, software performance analysis and system tuning
  • Experience utilizing monitoring solutions, such as New Relic, Splunk, and/or Datadog, to reduce outage detection time
  • Identify and remediate risk to critical and non-critical system KPIs
  • Familiarity with application architectures and networking
  • Familiarity with DevOps and/or Site Reliability Engineering concepts and principles
  • Familiarity with automation of routine maintenance tasks and common issues
  • Understanding of Unix/Linux systems from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols along the way
  • Networking: knowledge and understanding of network theory, such as different protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing

Requirements

  • Bachelor's or Master's degree in a Computer Science-related field
  • 8+ years of related experience
  • 3+ years of experience working in an AWS environment
  • 2+ years of experience with scripting language(s) such as Python to debug and optimize code and automate routine tasks
  • 1+ years of experience with Splunk, Datadog or New Relic monitoring and alerting

Perks & Benefits

  • Highly competitive compensation package
  • Generous benefits package including Unlimited PTO
  • 401k program with company match
  • Employer sponsorship for personal/professional development programs (the sky’s the limit!)
  • Flexible unmetered Parental Leave policy

Exceptional benefits package with:

  • $1/month premiums for you and your family
  • HSA plan option with employer funding
  • Dedicated benefit concierge
  • Free One Medical membership