About the Role
We are seeking a seasoned SRE / DevOps Manager to lead our reliability and operations engineering team. You will be responsible for ensuring the scalability, security, and performance of our infrastructure while fostering a culture of automation, ownership, and continuous improvement.
Key Responsibilities
Team Leadership
- Manage and mentor a team of SRE and DevOps engineers.
- Drive hiring, onboarding, and professional development.
- Set clear goals and performance metrics.
Reliability & Incident Management
- Own system uptime, performance, and reliability.
- Lead incident response and root cause analysis.
- Define and monitor SLAs, SLOs, and SLIs.
Infrastructure & Automation
- Oversee cloud infrastructure (Azure).
- Implement Infrastructure as Code (IaC) using Terraform or similar tools.
- Drive automation of CI/CD pipelines and operational tasks.
- Build and manage a DevSecOps process that integrates with CI/CD pipelines on Azure DevOps, GitLab, and similar platforms.
Monitoring & Observability
- Implement and maintain monitoring, alerting, and logging systems.
- Use tools such as Datadog, Prometheus, Grafana, or the ELK stack.
Security & Compliance
- Ensure infrastructure security and compliance with industry standards.
- Collaborate with InfoSec teams on audits and vulnerability management.
Cross-functional Collaboration
- Work closely with software engineering, product, and QA teams.
- Advocate for DevOps and SRE best practices across the organization.
Qualifications
- 10+ years of experience in DevOps, SRE, or infrastructure engineering.
- 2+ years in a leadership or managerial role.
- 3+ years of experience with cloud platform deployments.
- 3+ years of experience working with MongoDB and Cosmos DB.
- Strong experience with cloud platforms (AWS, GCP, Azure).
- Proficiency in scripting languages (PowerShell, Python, Bash, Go).
- Hands-on experience with Kubernetes, Docker, and CI/CD tools.
- Excellent communication and leadership skills.
Preferred Qualifications
- Experience with compliance frameworks (SOC 2, ISO 27001).
- Familiarity with Agile and DevOps methodologies.
- Certifications in cloud technologies or DevOps practices.
Benefits/Perks
- Hybrid work opportunity
- Competitive salary
- Employer-matched 401(k) plan
- Attractive paid time off policy
- Career growth and development opportunities
What We Do
Upshop is the first total store operations platform synchronizing Fresh, Center, eCommerce, and DSD solutions to make retail operations simplified, smart, and more connected.
Upshop has been pioneering total store operations technology for over 30 years, delivering SaaS-based solutions that offer a simplified, smarter, more connected experience to retail store associates. The business leveraged the technology of leading products FreshIQ®, ShopperKit, Date Check Pro, and Itasca Retail's Magic Inventory Intelligence to synchronize one platform, providing retailers the visibility needed to increase sales, cut waste, and streamline labor. More than 150 retail chain accounts trust our software in more than 30,000 stores across 9 countries and 3 continents.






