At Infobip, we dream big. We value creativity, persistence, and innovation, passionately believing that it is through teamwork that we can all reach greater heights.
Since 2006, we have been innovating at the edge of technological possibilities and are now shaping global communications of the future. Through 75+ offices on six continents, Infobip’s platform is used by almost 80% of the population, making it the largest network of its kind and the only full-stack cloud communication platform globally.
Join us on our mission to create life-changing interactions between humans and online services with new and unseen solutions.
Job description:
As a part of Reliability Operations, you will work in a team that strives to identify, respond to, and mitigate platform incidents. If a platform incident occurs, you and your team will be the first responders, involving the responsible individuals in mitigation and driving the resolution.
Your job will include improving the observability of our platform and collaborating with other engineers on common mitigation tactics. Automation is a big part of the job: we strive for meaningful alerting rather than being alerted to every small glitch, so fine-tuning existing alerts and improving our processes are among our priorities.
Does your eye twitch when something breaks, and do you already have a list of possible improvements in your head? This is the place you're looking for.
What you will do:
- Be a first responder to platform alerts
- Monitor our products for issues, prioritize and triage them, and assess client impact
- Detect issues, identify them (affected systems, locations, responsible teams) and respond in a timely manner by utilizing runbooks
- Clearly communicate (summarize) and escalate platform incidents to responsible individuals
- Actively contribute to current runbooks and create new ones
- When an incident is reported, be the driver of the incident resolution (incident commander)
- Based on alerts, try to prevent an issue from becoming an incident
More about you:
- You have an engineering or support background and a passion for IT, with at least 1 year of prior experience in the same or a similar role
- You have experience with tools for monitoring systems (Grafana, Prometheus, New Relic, Graylog, Kibana, Elasticsearch, OpenSearch…)
- You have a strong system-thinking and problem-solving mindset
- You are genuinely interested in how things work, and driven when they don’t
- You have strong analytical and investigative skills combined with the ability to navigate through substantial amounts of data to gather critical information in a timely manner
- You are genuinely interested in site reliability and want to learn about mitigation tactics
- Hands-on knowledge of system administration tasks is an advantage, but not a prerequisite
- You can speak fluently to clients and colleagues alike, and have a great command of English
- You can exhibit an advanced level of teamwork, excellent communication skills and a high degree of independence
- You are efficient in execution and inclined toward continuous improvement, experimentation, and self-education
A bit more on what kind of people we are looking for:
- tech savvy
- curious with attention to detail
- critical thinkers
- system knowledge and a holistic view
- enjoy troubleshooting
- responsible
- clear communicators
- problem solvers
- willing to teach / mentor others
Infobip employees are people with diverse backgrounds, characteristics, and experiences who share the same passion and talent that help us achieve our mission. That's why Infobip is committed to creating a diverse workplace and is proud to be an equal-opportunity employer.
All qualified applicants will receive consideration for employment without regard to race, color, ancestry, religion, age, sex, sexual orientation, gender, gender identity, national origin, citizenship, disability, veteran status, or any other part of one's identity.
What We Do
Infobip helps businesses build connected experiences across all stages of the customer journey. Accessed through a single platform, Infobip’s omnichannel engagement, identity, user authentication and contact center solutions help businesses and partners overcome the complexity of consumer communications to grow business and increase loyalty.
We work with large organizations, including seven of the world’s 10 biggest brands, across sales and marketing, operations, human resources, IT and security, and customer service. Our mobile engagement solutions help optimize operational functions, enhance internal and external communications, improve customer experiences, reduce support costs, generate new revenue, and gain a competitive advantage.
Whether two-factor authentication for high-tech retailers, emergency alerts for global giants, or mobile-giving solutions for large charities, Infobip offers the scale, service flexibility, reliability, and heritage to provide interactive solutions for today and in the future.
Companies choose Infobip for our domain expertise, service flexibility, demonstrated performance and reliability, global scale, and corporate maturity.
Why Work With Us
We work with some of the biggest enterprises in the world to make their customers’ lives better. But we’re small enough that every person counts. We’ve got a passion for our technology to rival any start-up. Our people are the best and most professional in the world. But we’re a suits-and-bureaucracy-free zone.