The Senior Site Reliability Engineer is a senior engineering role responsible for defining and delivering the reliability and scalability of our SaaS platform. This person will play a critical role in advocating for, designing, and implementing technology solutions that scale our cloud platform to support our rapidly growing customer base. This position also plays an integral role in defining and assessing the organization's goal of a highly evolutionary architecture that helps build scalable and resilient services.
What your day could consist of:
- Be the go-to person for all reliability engineering matters across projects, incidents, and technical issues
- Establish engineering excellence in SRE by driving observability, scalability, high availability, reliability and sustainability of the platform
- Be available for major incidents impacting the platform and customers
- Drive and contribute to infrastructure as code projects
- Work on the AWS public cloud with a focus on developing and providing self-service infrastructure
- Work closely with geographically distributed product, security, and platform engineering teams to support and elevate them on technical challenges and process improvements
- Guide SRE on implementing automation and efficiencies in managing infrastructure patching to achieve compliance and upgrades to eliminate tech debt
- Measure and guide intervention against toil to ensure engineering time is protected for the SRE, DBRE, Security, Product Engineering, and DevOps Engineering teams in the company
- Design, implement, and integrate management solutions to effectively manage the public cloud implementation (Docker, Kubernetes, Service Mesh) and a monolithic application deployed in multiple regions across the globe, ensuring reliability, elasticity, performance, and security
- Support teams coming up to speed on new services they own
- Establish and mature standards and integrations for infrastructure management domains: logging, monitoring, configuration management, and orchestration
- Develop cloud and container management platform standards and capabilities, gain insight into the workflows of the Product Development, Engineering, and Operations teams, ensure platform relevance, and drive adoption
- Collaborate with technical leadership and staff engineers across the organization to build a platform that caters to the evolving needs of product engineering and SaaS delivery
- Govern, optimize, and rightsize AWS and other SaaS tools
What is needed:
- 7+ years of software, DevOps, or site reliability engineering experience
- 3+ years of public cloud experience with a combination of cloud-native and open source tools
- Mandatory experience driving or contributing to infrastructure as code
- Mandatory experience with EKS or native Kubernetes workloads
- Exposure to moving monolithic applications to Kubernetes is a plus
- Experience with Pulumi, CloudFormation, Terraform, or similar IaC tools
- Experience coding with Python, Ruby, PHP, Go, or shell scripting
- Experience working with distributed in-memory datastores like Redis and Memcached
- Experience working with globally distributed teams supporting multi-region instances
- Experience with operating systems and with hosting PHP/Python/MySQL-based SaaS applications
- Experience with high-volume, low-latency, high-throughput services
- Experience designing and implementing various access control models, including authentication and authorization
- Experience in AWS IaaS and PaaS services is highly preferred
- Good communication and collaboration skills
- Self-motivated, with a strong sense of ownership of tasks
- Ability to lead and mentor 1-3 engineers working on focused SRE activities
We are a category-defining Customer Experience Automation Platform (CXA) that helps over 150,000 businesses in 170 countries meaningfully engage with their customers. The platform gives businesses of all sizes access to 600+ pre-built automations that combine email marketing, marketing automation, CRM, and machine learning for powerful segmentation and personalization across social, email, messaging, chat, and text.
As a global multicultural company, we are proud of our inclusive culture which embraces diverse voices, backgrounds, and perspectives. We don’t just celebrate our differences, we believe our diversity is what empowers our innovation and success. You can find out more about our DEI initiatives here.
As one of the fastest-growing SaaS companies in the world, we are scaling rapidly to keep up with market demand. We are growing all of our teams and looking for people who share our values, deliver innovation frequently, and will join us in our mission to grow our customer base from 150,000 today to millions. We were ranked the #4 Best Place to Work on Built In Chicago in 2021, were named a best workplace for remote employees by Quartz, have received recognition as a great place to work across all of our regions, and continue to be globally recognized for our employee-centric culture.
Perks and benefits:
ActiveCampaign has an employee-first culture. We take care of our employees at work and outside of work. Some of our most popular benefits include:
- Comprehensive health and wellness benefits (including no premiums for employees on our HSA plan, telehealth and tele-mental health, and access to the Calm app for meditation)
- Open paid time off
- Generous 401(k) matching with no vesting
- Generous stipend to outfit your remote office
- Career growth, including access to personal and professional coaching. We take a proactive approach to diversity and inclusion, offer parental leave and career pathing, and support employees’ ongoing learning and development through Udemy and access to life coaches via Modern Health
ActiveCampaign is an equal opportunity employer. We recruit, hire, pay, grow, and promote regardless of gender, race, color, sexual orientation, religion, age, protected veteran status, physical and mental abilities, or any other identities protected by law.
Our Employee Resource Groups (ERGs) strive to foster a diverse, inclusive environment by supporting each other, building a strong sense of belonging, and creating opportunities for mentorship and professional growth for their members.