Monitoring & Observability
- Design, implement, and maintain monitoring and observability solutions using tools like Prometheus, the Grafana stack (Loki/Grafana/Tempo/Alertmanager), Datadog, and OpenTelemetry.
- Define and implement SLOs, SLIs, and error budgets to measure system reliability.
- Develop and optimize dashboards, alerts, and reports for system performance and business metrics.
Alerting & Incident Management
- Design actionable alerting strategies to minimize noise and improve MTTR.
- Integrate alerting systems with Jira.
- Establish and refine runbooks for on-call teams to handle alerts efficiently.
- Empower teams to own their observability coverage and incident response practices.
Performance Optimization
- Analyze system performance metrics, identify bottlenecks, and implement optimizations to improve system efficiency, scalability, and cost-effectiveness.
- Help conduct load testing and capacity planning to ensure systems can handle peak traffic loads.
Automation and Tooling
- Identify opportunities for automation and develop tools to streamline operational processes, such as failover, configuration management, and monitoring.
- Implement monitoring and alerting systems within automations to detect and resolve issues proactively.
Collaboration and Communication
- Collaborate closely with cross-functional teams, including software engineers, operations, and infrastructure teams, to understand system requirements, provide technical guidance, and drive solutions.
- Communicate effectively to stakeholders about system changes, incidents, and improvements.
- Promote and spread SRE principles and practices across the company.
Qualifications
- Proven experience as a Site Reliability Engineer or similar role.
- Proficiency in logging, metrics, and tracing frameworks (Datadog, Loki, Prometheus, OpenTelemetry).
- Experience with cloud platforms (Azure preferred) and infrastructure-as-code tools (e.g., Terraform).
- Strong programming and scripting skills (Python, Bash).
- Proficiency in containerization technologies and orchestration tools (Docker, Kubernetes).
- Understanding of Linux-based systems, networking, and security principles related to containerized applications.
- Strong problem-solving and troubleshooting skills, with a passion for identifying and resolving complex technical issues.
- Excellent communication and collaboration abilities.
- Ability to thrive in a fast-paced, constantly evolving environment.
- Experience with PostgreSQL monitoring and optimization (nice to have).
Our Tech Stack
- Azure as an infrastructure provider. We are reviewing secondary cloud options.
- Docker + Kubernetes for microservice orchestration, with an Istio service mesh
- PostgreSQL for the relational database, Elasticsearch for indexing, Redis for caching
- Datadog, Grafana and OpenTelemetry for observability
- GitHub for version control and CI (with our own runners)
- CD: Harness and FluxCD
- Terraform and Terragrunt for IaC
- Python and Bash for scripting infrastructure
- React - We’re all in on React – we maintain multiple single-page React apps
- TypeScript – 99% of our codebase is TypeScript
- Latest .NET version for our backend services
- GraphQL - Our standard for API communication is GraphQL, served by our .NET backend
Our Values
- We innovate with purpose
- We focus on outcomes vs. output
- We believe diverse and inclusive teams fuel innovation
- We are humble yet candid
- We do right by the customer
What We Offer
- Unlimited vacation
- Meal vouchers paid in full by the company
- Multisport card contribution
- Pension contributions
- Language courses
- Centrally located office in the heart of Brno
- Bi-weekly team lunches provided by the company
- Tech courses and conferences
- Top of the line MacBook
- Company team building events
- Flexible working hours and the possibility to work from home
What We Do
Capital Markets Gateway (CMG) is a financial technology firm that is modernizing the equity capital markets (ECM). CMG connects investors and underwriters via a neutral platform that delivers integrated ECM data and analytics, unrivaled transparency, and workflow efficiencies. Providing a digital system of record for firm-wide deal activity, CMG helps clients make more timely, better-informed decisions. Launched in 2017 by a team of ECM practitioners, the CMG platform is currently relied upon by 15 investment banks and nearly 100 buy-side firms representing $12 trillion in AUM.







