- Design, build, and maintain scalable and reliable infrastructure.
- Collaborate with engineering teams to ensure systems are designed with reliability and scalability in mind.
- Evaluate and integrate new technologies to enhance our infrastructure.
- Implement and maintain monitoring and alerting systems to detect and respond to issues promptly.
- Lead incident response efforts, ensuring quick resolution and effective communication.
- Conduct post-incident reviews and drive improvements based on findings.
- Architect and build automation projects from scratch (preferably in Python or Go) to reduce day-to-day SRE toil.
- Create Bash scripts to automate manual activities such as upgrades, status checks, and deployments.
- Develop and maintain infrastructure as code (IaC) using tools such as Terraform, Ansible, or similar.
- Automate repetitive tasks and processes to improve efficiency and reduce manual intervention.
- Collaborate with cross-functional teams to deliver high-quality products and services.
- Mentor and guide junior SREs and other team members.
- Advocate for best practices in reliability engineering across the organization.
- Drive initiatives to improve service reliability, capacity, and performance.
- Participate in capacity planning and disaster recovery exercises.
- Stay current with industry trends and emerging technologies.
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
- At least 8 years of industry experience as a Software Engineer, SRE, or Platform Engineer.
- At least 3 years of experience as a Platform Engineer or SRE.
- Proven experience in managing large-scale, mission-critical infrastructure.
- Deep understanding of Linux/Unix systems and networking.
- Proficiency in at least one programming language (e.g., Python, Go, Java).
- Intermediate to expert skill in Bash scripting.
- Experience with cloud platforms (AWS, Azure, GCP), containers (Docker), and container orchestration (Kubernetes).
- Strong knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Familiarity with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI).
- Excellent problem-solving skills and a proactive attitude.
- Strong communication and collaboration skills.
- Ability to work independently and as part of a team.
- Demonstrated leadership and mentoring abilities.
- English Proficiency Assessment (25 mins)
- Technical Assessment (45 mins)
- Recruiter Screen (30 mins)
- Technical Interview (30-45 mins)
What We Do
Gigster lives and breathes to unleash innovation at a global scale. This mission is only realized by enabling a fluid workforce that both democratizes and elevates all that is possible. By offering fully managed, on-demand teams across multiple disciplines such as AI, ML, Blockchain, IoT, and others, Gigster helps customers innovate at the speed of light.
Founded in 2014, Gigster has completed over 5,000 projects with some of the largest companies in the world.
A third-party research firm, Constellation Research, completed a study showing that Gigster’s model achieves 30% efficiency in staffing, 60+% lower delivery risk, and a 3.6x higher customer satisfaction score than other software development firms.