Rogo

Site Reliability Engineer (SRE)

Sorry, this job was removed at 06:18 p.m. (CST) on Monday, Aug 04, 2025

New York City, NY

In-Office

Software • Analytics

The AI platform for financial services.

The Role

We're building AI thought partners to make people smarter and more creative, accelerating the creation and sharing of knowledge in financial services. We're unabashedly ambitious, and we're dead set on building the biggest Financial AI company in the world. Our team is lean, smart, and endlessly curious.

What You Will Own

Infrastructure Management: Design, deploy, and maintain cloud infrastructure on AWS and/or Azure, ensuring high availability and resilience.
Monitoring and Performance: Implement and manage monitoring solutions using Datadog to proactively identify and address system issues.
Container Orchestration: Manage Kubernetes clusters, utilizing Helm for package management and deployment automation.
Automation and Scripting: Develop and maintain Infrastructure as Code (IaC) using tools like Terraform, and create automation scripts in Bash or Python to streamline operations.
Collaboration: Work closely with development and operations teams to propagate DevOps culture, share best practices, and ensure seamless integration and deployment processes.
Incident Response: Troubleshoot and resolve complex cross-platform issues related to OS, networking, and databases in a cloud-based environment.
Documentation: Maintain comprehensive documentation of system configurations, procedures, and troubleshooting guides.

What You Will Need

Bachelor’s degree in Computer Science, Information Technology, or a related field.
Experience
- 3-5 years of hands-on experience with AWS and/or Azure cloud platforms, including services like EC2, S3, VPC, and Lambda.
- 2-3 years of experience managing Kubernetes clusters in production environments.
- 2-3 years of experience with Helm for Kubernetes package management.
- 2-3 years of experience with Datadog or similar monitoring tools.
- 3-5 years of experience with Linux system administration and shell scripting.
- 2-3 years of experience with Infrastructure as Code (IaC) tools like Terraform.
Skills
- Proficiency in scripting languages such as Bash and Python.
- Strong understanding of networking fundamentals, including TCP/IP, DNS, and load balancing.
- Experience with CI/CD pipelines and tools like Jenkins, GitLab CI, or GitHub Actions.
- Experience with cloud-native security best practices and compliance frameworks.
- Excellent problem-solving skills and the ability to navigate complex challenges effectively.
- Strong communication and collaboration skills.

Bonus

Experience with MLOps monitoring and observability.
Experience with PostgreSQL, Elasticsearch, and vector databases such as Qdrant or similar technologies.
Experience with monitoring and security tools such as Datadog, AWS GuardDuty, CloudWatch, and CloudTrail.
Certifications in AWS, Azure, or Kubernetes.
Experience with other cloud platforms like Google Cloud Platform (GCP).
Experience with distributed tracing and observability tools.

Who You Are

You thrive in fast-paced environments. You are high-intensity and care a lot about what you do, and you're ecstatic to work at a start-up.
You are ambitious. You have fun solving problems that others think are impossible.
You are curious. You find joy in learning about AI, technology, and finance.
You are an owner. You are autonomous, self-directed, and comfortable working with ambiguity.
You are collaborative, organized, and thoughtful.

Why Join Rogo?

Exceptional traction: strong PMF with the world's largest investment banks, hedge funds, and private equity firms.
World-class team: we take talent density seriously. We like working with incredibly smart, driven people.
Velocity: we work fast, which means you learn a lot and constantly take on new challenges.
Frontier technology: we're developing cutting-edge AI systems, pushing the boundaries of published research, redefining what's possible, and inventing the future.
Cutting-edge product: our platform is state-of-the-art and crazily powerful. We're creating tools that make people smarter, reinventing how you discover, create, and share knowledge.

The Company

HQ: New York, NY

55 Employees

Year Founded: 2021

What We Do

Artificial intelligence is transforming the global financial services industry. Rogo is the first generative AI company built to help financial firms navigate this transformation. Our mission is simple: improve how firms work by deploying bespoke AI solutions.