- Observability Expertise:
- Expert in implementing and integrating observability solutions for services using tools such as Vector, Fluentd, OpenTelemetry, Prometheus, Loki, VictoriaMetrics, Thanos, Tempo, OpenSearch, and Grafana
- Deep understanding of service reliability, KPIs, and metrics
- Ability to optimize large-scale telemetry pipelines and backends
- Strong in distributed computing fundamentals, storage technologies, and time series databases
- Contributions to open source projects are a plus
- Kubernetes and Cloud Experience:
- Professional experience running Kubernetes in on-premises and cloud environments
- Hands-on production experience in designing and managing Kubernetes clusters
- Good understanding of on-premises and cloud platforms
- Programming Experience:
- Programming knowledge in languages such as Go or Rust
- Able to write libraries and contribute to open source
- Ready to start immediately and make an impact from day one
- 4-8 years of experience in SRE, particularly in implementing observability at scale
- Strong troubleshooting skills for resolving system issues in production environments
- Implementation experience with SRE concepts such as SLIs and SLOs
- Ability to represent the organization, collaborate with, and coach customer teams
- Passion for sharing knowledge through technical writing and speaking at community events and conferences
What We Do
CloudRaft is a trusted problem solver for startups and Fortune 500 companies. Our team crafts cutting-edge AI Cloud, GPU Cloud, and cloud native solutions. We specialize in DevOps & cloud consulting, observability, and enterprise-grade support for open source technologies like PostgreSQL and ClickHouse. With our expertise, businesses can confidently navigate their digital transformation journey.

Our Specialization:
- AI Cloud, GPU Cloud, AI Infrastructure, Enterprise AI, and Generative AI: Empowering businesses with advanced AI capabilities that enhance decision-making and operational efficiency.
- Cloud Native Solutions, Kubernetes Consulting: Crafting scalable, resilient cloud environments that adapt to your business needs.
- DevOps & Cloud Consulting: Streamlining development and operations through best practices in DevOps and cloud strategies.
- DevSecOps & Security: Ensuring robust security measures are integrated seamlessly into every stage of development.
- Observability: Providing deep insights into system performance to ensure optimal functionality and quick resolution of issues.
- Enterprise-grade Support for Open Source Technologies: Offering expert support for tools like Thanos, Prometheus, ArgoCD, PostgreSQL, and ClickHouse, ensuring your open-source projects thrive.

We are committed to helping businesses navigate their digital transformation journey with confidence. Visit us at www.cloudraft.io







