ITOC Engineer II
Our Opportunity:
We are seeking a highly motivated IT Operations Center Engineer II to be part of our team. ITOC Engineer IIs are a cross between systems and software engineers and are responsible for all observability aspects of Chewy’s e-commerce platform. We're looking for engineers who want to support infrastructure software health by developing monitoring strategies and performing advanced troubleshooting and operational tasks. Come help us build a bigger and better Chewy as an ITOC Engineer II. You will be part of a small family within Chewy that has a huge impact on our incredible growth. Ideal candidates can discuss complex technical concepts with a diverse audience across all areas of the organization. They remain calm under pressure and always strive to add structure to high-pressure, fast-paced tasks or projects.
What You'll Do:
- Focus on service stability and reliability by working with application owners to set SLOs.
- Maintain a complete understanding of operational tools and concepts, such as alerting, monitoring, logging and health checks.
- Identify observability requirements during the application onboarding phase.
- Be a technology and DevOps evangelist for the rest of the company
- Configure monitoring tools, including Datadog APM/Synthetic/RUM, AWS CloudWatch, Splunk and others, for optimal observability of Chewy’s e-commerce environment.
- Function as the ITOC SME for our monitoring infrastructure to aid teams as needed for onboarding and developing monitoring for new and existing services.
- Develop the material for and conduct team training to keep the ITOC up to date on the latest technology and best practices.
- Participate in and develop projects for the observability needs of internal Chewy teams, while identifying and creating opportunities to improve our processes and procedures to further raise the bar.
- Troubleshoot advanced issues impacting platform functionality.
- Interface, as primary POC, with third-party vendors and internal teams to maintain a highly efficient platform, enhancing ROI.
- Participate in an on-call rotation to provide 24/7/365 after-hours response to Major Incidents.
- Automate manual tasks using specified tools within our environment, such as Terraform, Jenkins and AWS technologies.
- Other duties as assigned
What You'll Need:
- At least 5 years of experience in an IT Operations Center or similar environment.
- Hands-on experience with orchestration and system configuration tools such as Ansible, Puppet, Chef, Terraform (preferred), etc.
- At least 5 years of experience building and managing applications in public cloud platforms like AWS (preferred), GCP or Azure
- Expert in building and maintaining highly available applications including redundancy, failover, scalability, monitoring and performance
- Highly skilled in identifying performance bottlenecks, identifying anomalous system behavior, and resolving root cause of service issues
- Solid understanding of, and experience with, web services, databases and related infrastructure/architecture
- Experience working with the open-source community (troubleshooting, patch submission, etc.)
- Demonstrated 5+ years of Linux System Administration
- Experience with CI tools such as Bamboo, Jenkins, CircleCI
- Ability to organize, troubleshoot and continuously learn
- Previous experience working within controls such as SOX, PCI, etc.
- Position may require travel
Bonus:
- AWS Certified Solutions Architect
- Advanced Terraform knowledge and orchestration using Jenkins
- Datadog integration expertise for containers
- Linux certification.
- ITIL v4 certified.
- Splunk Core certified.
- Scripting in Python, Bash, PowerShell, or similar.
- Bachelor’s degree in a related field
If you have a disability under the Americans with Disabilities Act or similar law, and you need an accommodation during the application process or to perform these job requirements, or if you need a religious accommodation, please contact [email protected].
If you have a question regarding your application, please contact [email protected].
Chewy is committed to equal opportunity. We value and embrace diversity and inclusion of all Team Members.
To access Chewy’s Privacy Policy, which contains information regarding information collected from job applicants and how we use it, please visit: https://www.chewy.com/app/content/privacy.