What You'll Be Doing
- Work across multiple functional teams to assess, design, build, and maintain a highly fault-tolerant, elastic infrastructure of tools and automation in the cloud.
- Create deployments, services, and other resources on Kubernetes clusters.
- Design, build, test, deploy, and automate stable/scalable services for the internal engineering team and end users.
- Champion a flawless Service Level Agreement (SLA), aiming for five-nines (99.999%) availability.
- Be available on-call during your shift to handle any P0 incidents and help bring the systems back online.
- Create and manage CI/CD pipelines for automated testing, deployment, and any other use cases.
- Continuously monitor all the services and drive performance tuning.
- Maintain and improve our existing software engineering tools with upgrades and installations.
- Integrate secure solutions and compliance management, including identity and access management (IAM) and role-based access control (RBAC) systems.
- Debug, troubleshoot, and resolve system-level scale, performance, and automation problems.
- Provide multi-tier support to engineering and non-engineering stakeholders.
- Check in code to GitHub repositories and perform code reviews for your fellow team members.
What You Should Have
- Bachelor’s degree in computer programming, computer science, or a related field.
- 5+ years of experience in a DevOps or Site Reliability Engineer role.
- A mix of consumer technology and SaaS experience is ideal.
- Production experience running and maintaining Kubernetes (k8s) deployments and services.
- Production experience with Docker.
- Experience building continuous integration and continuous deployment (CI/CD) pipelines.
- Able to write Bash and/or Python scripts.
- Ability to own and be responsible for the projects you will be working on.
We'll Be Excited If You Have
- Experience working with AWS cloud infrastructure and their various services.
- Fluency in Terraform/Terragrunt and in writing Infrastructure as Code (IaC).
- Experience with and a thorough understanding of Linux operating systems.
- Experience with high-traffic monitoring systems.
- Experience implementing logging (Grafana/Prometheus), telemetry (New Relic), and tracing is ideal.
- Experience with Nginx deployments.
- Experience working closely with SQL and NoSQL databases and executing zero-downtime database upgrades.
- A keen eye for security and experience building robust, secure systems.
- Excellent verbal, written, and interpersonal communication skills.
- Comfortable with fast-paced change: ability to demonstrate comfort with ambiguity, adapt quickly and be effective in new situations in a highly dynamic setting.
- Data-driven but also imaginative and intuitive in coming up with ideas and solutions.
- Proven ability to balance multiple priorities in a collaborative team environment.
What We Do
Firework is the world's leading immersive digital transformation and engagement platform with shoppable video, live streaming commerce, and monetization capabilities.
Powering over 600 direct-to-consumer brands, retailers, and media publishers worldwide, Firework brings TikTok-like interactive video experiences to your own websites and apps. We enable customers to create and host native, shoppable video content for engaging product discovery, seamless shopping experiences, and a deeper emotional connection with consumers. The company is backed by IDG Capital, Lightspeed Venture Partners, and GSR Ventures, with over $90 million in capital raised to date. We have offices in the US (SF and NYC), Toronto, Poland, Slovakia, Brazil, and China.
Why Work With Us
We are a diverse team where everyone belongs. We are creative, curious, and cool in a nerdy way. We believe in growth, in results, in each other, and that perfection is a work in progress. We are just the right amount of extra and want to change the digital game.









