Reliability Support Specialist at 6sense (Remote)
6sense helps B2B marketing and sales organisations fully understand the complex ABM buyer journey. By combining intent signals from every channel with the industry’s most advanced AI predictive capabilities, it is finally possible to predict account demand and optimize demand generation in an ABM world. Equipped with the power of AI and the 6sense Demand Platform™, marketing and sales professionals can uncover, accelerate, and capture buyer demand to drive more revenue.
Reliability Support Specialists at 6sense are integral members of our Reliability team. They work with Engineering teams to diagnose and fix issues so that our services and infrastructure stay fast, stable, and scalable.
The Reliability team focuses on the automation, integration, operation, and overall improvement of our monitoring, logging, and alerting services to ensure we can deliver product quickly, safely, and reliably.
Own our monitoring, logging, and alerting tools used by the overall Software Engineering team in order to ensure we are meeting reliability requirements
Learn and adopt technologies that may aid in solving our challenges
Support the overall Software Engineering team in monitoring and alerting on any issues they may encounter
Help respond to service issues and determine how to automatically alert the responsible parties, with context, so that the service owner can act as a self-sufficient first responder
Act as first responder to issues with shared infrastructure and escalate to other team members as necessary
Work with other teams to put automatic resolutions in place and reduce the need for human response
Participate in on-call rotations to monitor platform/infrastructure issues
Shift: 12/7 shifts. 1 week on, 1 week off. On-call during the week on.
2+ years in a reliability or technical support-related role
Proficient with ANSI SQL (reading and writing queries)
Must have strong problem-solving and analytical skills and the ability to self-manage
Experience with monitoring REST APIs and web services
Experience with high-availability systems
Experience with leveraging and configuring observability systems such as Datadog, Grafana, Grafana Loki, Prometheus, Sumo Logic
Experience with monitoring relational databases such as MySQL, Aurora/RDS MySQL, PostgreSQL, etc.
2+ years of experience with Linux/Unix system administration
Experience with monitoring Hadoop ecosystems (e.g. Hadoop, Hive, Presto)
Experience monitoring and analysing services/applications in a service-oriented architecture at the network/server level as well as in containerised environments (such as Kubernetes and Docker)