Job Brief:
We are hiring a hands-on Senior Site Reliability Engineer to own the reliability, observability, cost, and security of a live mobility platform operating at real-time scale. This role is not advisory and not support-only. You will own production end to end: application, infrastructure, pipelines, and signals.
VentureDive Overview:
Founded in 2012 by veteran technology entrepreneurs from MIT and Stanford, VentureDive is the fastest-growing technology company in the region that develops and invests in products and solutions that simplify and improve the lives of people worldwide. We aspire to create a technology organization and an entrepreneurial ecosystem in the region that is recognized as second to none in the world.
Key Responsibilities:
1. Production Reliability & Availability
- Own uptime, latency, error rates, and system stability
- Design and enforce SLOs, SLIs, and error budgets
- Ensure zero- or near-zero-downtime deployments
- Lead incident response, mitigation, and postmortems
- Expectation: Fear of deploying or operating prod is unacceptable.
2. Observability & Early Warning Systems
- Build and maintain metrics, logs, traces, and alerts
- Detect abnormal traffic patterns, latency spikes, and network anomalies
- Identify trends before incidents happen
- Eliminate noisy alerts; create actionable signals
- Expectation: The system must warn us before users do.
3. Infrastructure Ownership (AWS)
- Own AWS infrastructure (compute, networking, storage, IAM)
- Enforce Infrastructure as Code (Terraform / equivalent)
- Ensure scalability for ride/reservation spikes, peak hours, and geo traffic
- Improve network ingress/egress visibility and performance
- Expectation: No manual, undocumented production infrastructure.
4. CI/CD Performance & Delivery
- Own CI/CD pipelines during and after migration to Bitbucket Pipelines
- Reduce pipeline execution time
- Improve build caching, parallelism, and test efficiency
- Ensure safe, repeatable, and fast deployments
- Expectation: CI/CD is a productivity engine, not a bottleneck.
5. Cost Ownership & FinOps
- Monitor and optimize AWS and tooling costs
- Detect abnormal cost increases and notify proactively
- Right-size resources without impacting reliability
- Provide cost visibility to engineering leadership
- Expectation: Cost is a reliability signal, not an afterthought.
6. Security, Auditing & Incident Forensics
- Enforce secure configurations, secrets management, and least privilege
- Detect suspicious access, traffic, or behavior patterns
- Make investigations easier for Dev and QA teams
- Improve auditability and traceability across systems
- Expectation: Security incidents should be detectable, explainable, and recoverable.
7. Migration Awareness & System Change Detection
- Actively track behavior changes during platform migrations
- Detect regressions caused by architecture or traffic shifts
- Validate performance and reliability during transitions
- Expectation: Changes in system behavior should never go unnoticed.
Required Experience
- 5+ years in SRE / DevOps / Platform Engineering
- Operating production systems at scale
- Strong AWS experience (security, networking, IAM, compute)
- Strong observability background (metrics, logs, alerts, traces)
- New Relic and Elastic Stack
- AWS CloudWatch and CloudTrail
- PostgreSQL performance & reliability experience
- CI/CD pipeline optimization experience
- Incident response and postmortem leadership
Strongly Preferred
- Experience with real-time or high-traffic platforms (mobility is a plus)
- Hands-on experience with build systems, including compilation, dependency resolution, and artifact generation across multiple technology stacks
- Hands-on experience working with Ruby on Rails and/or NestJS (strong plus)
- Hands-on experience with application profiling and performance monitoring, including identifying, analyzing, and resolving performance bottlenecks in production systems
- Infrastructure migration experience
- Cost optimization (FinOps)
- Security incident investigation experience
What we look for beyond required skills
In order to thrive at VentureDive, you:
- are intellectually smart and curious
- have passion for, and take pride in, your work
- deeply believe in VentureDive’s mission, vision, and values
- have a no-frills attitude
- are a collaborative team player
- are ethical and honest
Are you ready to put your ideas into products and solutions that will be used by millions? You will find VentureDive to be a fast-paced, high-standards, fun, and rewarding place to work. Not only will your work reach millions of users worldwide, you will also be rewarded with a competitive salary and benefits. If you think you have what it takes to be a VenDian, come join us ... we're having a ball!
#LI-Hybrid
What We Do
VentureDive is an award-winning digital development company that builds cutting-edge technology solutions to improve lives globally. Since its inception in 2012, the firm has enabled two tech unicorns and successfully driven digital transformation initiatives for large enterprises. Led by co-founders Atif Azim and Shehzaad Nakhoda, VentureDive has a presence in Silicon Valley, London, Portugal, Dubai, and Pakistan. To learn more, visit https://www.venturedive.com.







