Responsibilities
- Infrastructure Leadership: Design and build major new infrastructure components and platforms to support Raya's growing needs
- Kubernetes & Container Orchestration: Lead our Kubernetes strategy, designing and implementing container orchestration solutions that optimize for various application workloads
- Performance Optimization: Design and optimize infrastructure for maximum application performance, focusing on memory management, resource allocation, network traffic optimization, and system efficiency
- Reliability Engineering: Implement SLOs, monitoring, and observability solutions to ensure high reliability of our platform
- Cloud Engineering: Apply your in-depth knowledge of AWS to design scalable, resilient architectures across multiple regions
- Incident Response: Participate in on-call rotations and lead complex infrastructure incident resolution and post-incident analysis
- System Evolution: Thoughtfully improve existing infrastructure through incremental enhancements while respecting operational constraints
- Deployment Automation: Enhance our CI/CD pipelines and deployment strategies to enable faster, safer releases
- AI-Enhanced Workflows: Integrate AI tools and capabilities into infrastructure workflows to automate complex tasks, enhance decision-making, and maximize operational efficiency
- Infrastructure Security: Collaborate with security teams to implement secure-by-design infrastructure
- Cost Optimization: Design cost-effective infrastructure solutions and implement optimization strategies
- Team Mentorship: Contribute to engineering excellence by mentoring other infrastructure engineers
Qualifications
- A BS/MS in Computer Science, Engineering, Systems Administration, or a related technical field (professional experience can substitute for candidates with non-engineering educational backgrounds)
- 6-8+ years of hands-on experience with infrastructure engineering, with a track record of designing and implementing scalable infrastructure solutions
- Strong expertise in Kubernetes and Docker, with experience designing and managing production container orchestration environments
- Demonstrated expertise in AWS and infrastructure-as-code tools (Terraform, CloudFormation, Pulumi, Ansible)
- Experience with performance tuning and optimization of both infrastructure and applications
- Experience with monitoring and observability tools (Datadog, Prometheus, Grafana)
- Proficiency in scripting and automation (Python, Bash, Go, Ruby)
- Experience working with and incrementally improving established infrastructure environments
- Strong collaborative instincts, emphasizing open communication, transparency, and cross-team interaction
Desired Qualifications
- Background in SRE (Site Reliability Engineering) practices
- Experience using AI tools to enhance infrastructure workflows, automate tasks, and improve operational efficiency
- Knowledge of database administration and optimization (PostgreSQL, MongoDB, Redis, Elasticsearch)
- Experience with multi-regional/global infrastructure deployment and operations
- Track record of successfully modernizing legacy infrastructure components
- Strong understanding of Node.js performance characteristics and experience optimizing infrastructure for Node.js workloads, including memory management, CPU utilization patterns, and scaling considerations
- Proficiency in application profiling and performance analysis tools
- Experience with network infrastructure and security
- Experience with infrastructure security and compliance controls
- Understanding of cost optimization strategies in cloud environments
- Experience with service mesh technologies (Istio, Linkerd, Consul)
- Experience with other cloud platforms (GCP, Azure) in addition to AWS
- Familiarity with disaster recovery planning and implementation
What Sets You Apart
- Reliability Focus: You have a passion for building systems that are resilient, self-healing, and maintainable
- Performance Optimizer: You have a keen eye for identifying bottlenecks and optimizing system performance at all levels of the stack
- Problem Solver: You excel at diagnosing complex infrastructure issues and implementing effective solutions
- Application-Aware Infrastructure: You design infrastructure that maximizes application performance by understanding how applications actually behave in production, with particular expertise in Node.js workloads
- Pragmatic Innovator: You balance working within existing constraints with strategically introducing improvements where they'll have the most impact
- Context Builder: You take time to understand existing systems and historical decisions before proposing changes, appreciating the journey that led to the current state
- Infrastructure Vision: You think beyond immediate needs to design infrastructure that can scale and evolve with Raya's growth
- Automation Mindset: You're driven to automate manual processes and eliminate operational toil
- AI-Powered Efficiency: You're excited about leveraging AI to amplify your capabilities, automate complex workflows, and achieve greater scale and impact in your infrastructure work
- Impact-driven: You prioritize infrastructure initiatives that maximize impact, aligning with Raya's overarching goals
- Growth-oriented: You have a perpetual learner's mindset, stay open to challenges, and always seek opportunities to expand your technical horizons
What We Do
Raya is a private community that fosters quality, real-world connection. We believe that value-driven communities create a strong sense of belonging, inspiring more learning, sharing, and compassion.
Why Work With Us
We’re a mission-driven team that cares deeply about working together to change the digital landscape. Everyone here is encouraged to learn, take part in professional development opportunities, and have some fun too. We regularly hold team events, including karaoke, game nights, and hot pot dinners, to get to know each other better.