Site Reliability Engineer II

Santiago de los Caballeros, DOM
In-Office
Senior level
Information Technology
The Role
Ensure production reliability through observability design, SLO/SLA definition, incident response and postmortems, DR validation and drills, capacity forecasting, autoscaling tuning, and cross-team collaboration to improve operational readiness and cost efficiency.
Summary Generated by Built In
InvestorFlow is the only company of its kind to deliver industry specialized CRM, built on Salesforce, and digital portals to help alternative asset firms find opportunities, create and manage relationships, and turn relationship insights into action with increased productivity and transparency.

We are looking for a Senior Site Reliability Engineer who ensures reliability through operational excellence, configuration-as-code adjustments, and close collaboration with Engineering and DevOps teams. This role focuses on participating in architectural design reviews, validating reliability standards, auditing production systems, and ensuring systems meet SRE production-readiness requirements. The SRE will not build infrastructure or IaC, but must understand Infrastructure-as-Code concepts well enough to assess and influence configurations (the ability to read Terraform/HCL is a nice-to-have).

You Will

  • Design and implement comprehensive monitoring strategies rather than owning observability platforms outright. 
  • Collaborate with DevOps and Engineering on shared observability platforms (Grafana, Prometheus/Loki, Azure Monitor/Application Insights). 
  • Define golden signals dashboards, measure SLOs/SLIs/error budgets, and help implement actionable alerting. 
  • Drive structured logging standards, distributed tracing patterns, and OpenTelemetry implementation standards for teams to deploy and SRE to validate. 
  • Monitor and audit production systems to ensure instrumentation is complete. 
  • Take ownership of production incident response, lead incident handling, and drive remediation. 
  • Conduct blameless post-incident reviews and ensure follow-through on action items. 
  • Continuously improve operational processes, reliability practices, and team readiness. 
  • Monitor system resource utilization and forecast future needs. 
  • Tune autoscaling configurations in partnership with Engineering teams. 
  • Evaluate capacity efficiency and support cost optimization strategies. 
  • Validate DR environments and test failover processes; building them is outside the scope of this role. 
  • Ensure DR capabilities are functioning as designed, with clear documentation. 
  • Define and lead regular DR drills in partnership with Engineering/Platform teams. 
  • Work with the Non-Functional Testing team on resilience and DR scenario simulations. 
  • Support chaos experiment planning and validation as a nice-to-have capability. 

You Have

  • 5+ years in Site Reliability Engineering, Production Engineering, or related operations roles. 
  • Strong knowledge of cloud-native systems, preferably Microsoft Azure. 
  • Experience with observability tooling (Grafana ecosystem, Prometheus/Loki, Azure Monitor, Application Insights). 
  • Understanding of DR concepts, failover validation, and operational readiness. 
  • Familiarity with chaos engineering practices (nice-to-have). 
  • Ability to read Terraform/HCL is a plus but not required. 
  • Strong grasp of SRE principles (SLOs/SLIs, error budgets, toil reduction, postmortems). 
  • Strong collaboration and communication skills.   

Mindset We Value

  • Treat observability as a foundational product feature, not an afterthought. 
  • Proactively break systems to strengthen them. 
  • Automate away repetitive pain and convert incidents into lasting defenses. 
  • Clearly articulate complex risks, trade-offs, and recovery approaches to both technical and non-technical stakeholders. 
  • Remain composed during incidents while staying relentlessly focused on prevention. 

InvestorFlow is an investor and deal engagement platform that prioritizes intelligent digital experiences, productivity, and engagement. Our cloud-native platform integrates deal flow management, fundraising, reporting, and investor services. We are proud to serve over 175 clients, including 25 of the top 50 alternative asset managers, managing more than $6 trillion in assets, 750 funds, and 90,000 LPs. Headquartered in San Francisco, California, we are committed to driving innovation and inclusivity in the financial industry. To learn more about our company, please visit www.investorflow.com. 

Top Skills

Application Insights
Azure
Azure Monitor
Grafana
HCL
Loki
OpenTelemetry
Prometheus
Terraform

The Company
HQ: Menlo Park, CA
85 Employees
Year Founded: 2015
