Site Reliability Engineer

Sorry, this job was removed at 06:02 p.m. (CST) on Tuesday, Aug 12, 2025
Bogotá, Bogotá, D.C.
Hybrid
Fintech • News + Entertainment • Software • Database • Financial Services
Where Credit Becomes Clear
The Role

Octus

Octus is a leading global provider of credit intelligence, data, and analytics. Since 2013, tens of thousands of professionals across hedge fund, investment banking, management consulting, and law firm verticals have come to rely on Octus to make better, faster, and more confident decisions in pace with the fast-moving credit markets.
For more information, visit: https://octus.com/

Working at Octus

Octus hires growth-minded innovators and trailblazers across the globe to drive our business and culture. Our core values – Action Oriented, Customer First Mindset, Effective Team Players, and Driven to Excel – define an organizational ethos that’s as high-performing as it is human. Among other perks, Octus employees enjoy competitive health benefits, matched 401k and pension plans, PTO, generous parental leave, gym subsidies, educational reimbursements for career development, recognition programs, pet-friendly offices (US only), and much more. 
Role

We’re looking for Site Reliability Engineers who can help us build, operate, and maintain high-performance, scalable, and reliable services for our production infrastructure across our cloud environment. Site Reliability Engineers combine engineering experience and an innate drive to improve existing systems and processes with the creativity to develop novel solutions to evolving challenges. Our team strives to automate processes wherever possible, using whichever tools are best for the job. You’ll be the expert on the environments in which you operate infrastructure, helping partner teams build and configure their software to run reliably within them. We strongly believe in engineering teams being responsible for the operations of their services in production. In this role, you’ll work closely with engineers to advocate for and participate in sensible, scalable systems design, and share responsibility with them for diagnosing, resolving, and preventing production issues.

What you'll do:

  • Identify, assess, and mitigate risks associated with our systems, applications, and infrastructure.
  • Proactively recognize sources of instability in distributed systems and analyze how complex systems fail from a reliability and resilience perspective.
  • Improve our applications' availability, reliability, and observability, and reduce outages to a minimum.
  • Implement DR strategies, including backups and recovery techniques with minimal downtime for different applications.
  • Automate and codify our tooling, processes, and infrastructure to speed up development and make them repeatable and error-proof.
  • Deep dive into issues and outages to establish root causes and communicate them to your business partners.
  • Write and maintain thorough documentation to share with your teammates around the world, allowing them all to function as a cohesive unit.
  • Participate in a 24/7 weekly on-call rotation with members of your team to troubleshoot incidents in a complex distributed systems environment.
  • Create meaningful metrics and alerting for service health monitoring.

Skills and knowledge you should possess:

  • Bachelor's degree in Computer Science or a related field, or equivalent experience
  • 5+ years of experience in SRE, DevOps, or systems engineering
  • Proficient in command-line interface (CLI) operations, scripting (Python or Bash), and Linux system administration
  • Extensive experience working with infrastructure-as-code technologies, preferably Terraform
  • Extensive experience working with major cloud providers, preferably AWS
  • Significant experience working with observability and telemetry tools (Datadog, AWS CloudWatch, New Relic, Prometheus, Grafana, etc.)
  • Professional experience working with at least one general-purpose programming language (Python, PHP, Go, C#, etc.)
  • Experience building CI/CD workflows with tools like Jenkins, CircleCI, GitHub Actions, or AWS CodePipeline
  • Fundamental understanding of Internet networking protocols: TCP/IP, TLS, DNS, HTTP, SMTP

Bonus points (nice skills to have):

  • Database systems fundamentals (MySQL/Postgres) and experience administering databases at scale, including schema and query optimization
  • Familiarity with event-driven systems and messaging infrastructure (Kafka, RabbitMQ, AWS Kinesis, etc.)
  • Experience working with container and serverless platforms such as Docker, AWS ECS, Kubernetes, and AWS Lambda
  • Experience working with web servers such as Nginx, Apache, and Tomcat
  • Application security, infrastructure security, and SOC 2 compliance experience

Equal Employment Opportunity

Octus is committed to providing equal employment opportunities to all employees and applicants for employment without regard to race, colour, religion, sex, sexual orientation, gender identity, national origin, age, disability, genetic information, marital status, pregnancy, veteran status, or any other legally protected status. We strive to create an inclusive and diverse work environment where all individuals are valued, respected, and treated fairly. We believe that diversity enriches our workplace and enhances our ability to innovate and succeed.

The Company
HQ: New York, NY
808 Employees
Year Founded: 2013

What We Do

Octus was founded in 2013 with a simple conviction: credit decisions deserve clarity, not chaos. Markets were fragmented. Intelligence was gated. Data lived in silos. Professionals were forced to stitch together incomplete pictures while the clock kept running. We built Octus to change that.

Octus is the essential credit platform that tracks the entire credit lifecycle. From origination and underwriting to secondary trading, refinancing, distress and restructuring, we follow every development across leveraged loans, high-yield bonds, private credit and special situations. We structure millions of documents, surface risks, benchmark performance and deliver expert reporting and analysis in real time.

What began as a newsroom fused with legal and financial expertise has evolved into a unified ecosystem. Octus brings together proprietary data, expert-driven intelligence and integrated workflow tools so credit professionals can analyze situations, uncover opportunities, manage portfolios, execute trades and stay compliant without ever leaving the platform. Insight and action finally live in one place.

Today, nearly 40,000 professionals across the world’s leading banks, asset managers, CLO managers, law firms and advisors rely on Octus to move smarter and faster. They turn to us for breadth, depth and rigor forged over more than a decade, and for the speed, integration and precision that define the future of credit workflow.

Our mission is direct and ambitious: end fragmentation, collapse the distance between intelligence and execution, and transform insight into impact. Because in credit, clarity isn’t just an advantage. It’s everything.

Octus: Where credit becomes clear.

Why Work With Us

Octus is where real experts work together to cut through chaos and build clarity in credit. You feel your impact here. What you create shapes real decisions in the market. The problems matter. The people are sharp. If you want work that counts, this is the place.

Octus Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Reorg has adopted a hybrid working policy. For non-remote employees located within a reasonable commuting distance to one of our offices, the requirement is to work from the office at least 2 days per week.

Typical time on-site: 2 days a week
NYC Office (HQ)
Bucharest Office
El Segundo Office
London Office
Pune Office
Vilnius Office
Washington DC Office