Software Engineer 2

Reposted 2 Days Ago
Bangalore, Bengaluru, Karnataka, IND
In-Office
Mid level
Security • Cybersecurity
The Role
The Software Engineer will own the observability stack, design developer tooling, manage platform features, and ensure system reliability through monitoring and automation.
Summary Generated by Built In
About the Role

Abnormal Security is looking for an experienced and driven Platform & Infra (PI) software engineer to join the PI team. Join us and help build the platforms that power Abnormal's growth.

  • Observability Platform - Own and evolve the monitoring, metrics, and alerting infrastructure that every engineering team at Abnormal depends on. You'll work across the Prometheus, Chronosphere, and Grafana stack to ensure engineers can see what their systems are doing in real time — building dashboards, managing metric pipelines at scale, operating the PagerDuty alerting pipeline, and driving cost-efficient observability across all production environments (US, EU, and GovCloud).

Your Impact

  • Own the observability stack (Prometheus, Chronosphere, Grafana, PagerDuty) that every team relies on to detect, diagnose, and resolve production issues — when you make it better, every engineer at Abnormal gets faster.
  • Design platforms and developer tooling that remove friction — reducing deployment times, simplifying pipeline authoring, and letting product teams focus on building rather than firefighting.
  • Drive SLAs and SLOs for critical shared infrastructure, ensuring the systems behind our products are resilient and cost-efficient.
  • Make architectural decisions on alerting pipelines and cross-environment deployments that define what products we can build and how quickly we deliver them to customers.
What You Will Do
  • Work with the Tech Lead, Engineering Manager, and Product Manager to design, develop, and deliver key platform features — from technical design docs through production rollout
  • Own features end-to-end: scoping, implementation, testing, deployment, and post-launch monitoring across multiple environments (US, EU, GovCloud)
  • Take ownership of 1-3 key services within Observability (Prometheus, Chronosphere, Grafana, PagerDuty pipeline) or Data Infra (Airflow, Spark) and be accountable for their reliability, performance, and evolution
  • Participate in on-call rotations — triage, diagnose, and resolve production issues independently, building deep operational knowledge of the systems you own
  • Improve system resilience by converting runbooks into automated solutions, refining SLAs/SLOs, and proactively identifying performance bottlenecks and failure modes
  • Assume ownership of the reliability of everything you build, including comprehensive unit tests, integration testing, and observability instrumentation
  • Build platforms, tooling, and APIs that make it easier for other engineering teams to ship — whether that's faster pipeline deployments, better dashboards, or simpler alerting configuration
  • Partner with internal customers (product and engineering teams) to understand their needs and translate them into scalable platform capabilities
  • Communicate effectively in an async-first, distributed environment — proactively providing updates, discussing challenges, and proposing solutions without prompting
  • Mentor junior engineers on the team, helping them ramp up on service operations and development practices
  • Raise the bar of engineering excellence through code reviews, knowledge sharing, design discussions, and contributing to team best practices

Must Haves 

Backend Engineering & Distributed Systems (4+ years)

  • 4+ years of hands-on backend engineering experience designing, building, and operating production-grade distributed systems
  • Strong proficiency in Python — the primary language for Airflow DAGs, platform services, and automation tooling
  • Working proficiency in Golang — used for high-performance infrastructure components, metric pipelines, and platform services
  • Experience building systems that process data at scale — whether metric ingestion pipelines, stream/batch processing, or high-throughput API services
  • Demonstrated experience owning a service or platform end-to-end — from technical design through production deployment, monitoring, and iteration
  • Comfortable balancing feature development with operational responsibilities: you've shipped features and kept them running reliably at scale
  • Experience writing technical design documents that articulate trade-offs, propose solutions, and get buy-in from peers and tech leads
  • Track record of breaking down ambiguous problems into concrete, deliverable milestones
  • Experience with fault tolerance patterns — retries, circuit breakers, graceful degradation, backpressure — and knowing when to apply each
  • Proven incident response capability: you've been on-call, diagnosed production issues under pressure, and driven them to resolution
  • Strong testing discipline — unit tests, integration tests, and an understanding of what to test and how to keep test suites maintainable
  • Ability to design systems with a forward-looking perspective — thinking about how your architecture handles 10x growth, multi-region deployment, and evolving requirements
  • Ability to contribute to and influence cross-team technical direction — you're not just implementing specs, you're shaping the solution
  • Async-first communication excellence — strong written communication skills for design docs, Slack discussions, PR reviews, and status updates across time zones
  • Proactive communicator — you surface blockers early, share con
  • Solid understanding of monitoring, alerting, and observability principles — you've instrumented services, set up dashboards, defined SLIs/SLOs, or triaged production incidents using metrics and logs
Nice to Have 
  • Hands-on experience with Prometheus — PromQL queries, recording rules, alerting rules, relabeling configs, and understanding metric cardinality challenges at scale
  • Experience with Grafana — building dashboards, templating, managing datasources, and creating meaningful visualizations for operational and business metrics
  • Familiarity with commercial observability platforms like Chronosphere, Datadog, New Relic, or Honeycomb — understanding trade-offs between self-hosted and managed solutions
  • Experience designing or operating an alerting pipeline — PagerDuty, OpsGenie, or similar — including alert routing, escalation policies, and reducing noise/alert fatigue

Cloud Infrastructure & Kubernetes

  • Familiarity with AWS services — EC2, ECS, EKS, S3, RDS, IAM, CloudWatch, Lambda, SQS/SNS — and understanding how to architect cost-effective, secure cloud infrastructure
  • Experience with Kubernetes (K8s) — deploying and operating workloads, understanding pods/services/deployments, Helm charts, and debugging cluster-level issues
  • Exposure to Infrastructure-as-Code tools — Terraform, Pulumi, or CloudFormation — and understanding the value of declarative infrastructure management
  • Experience with CI/CD pipelines — GitHub Actions, Jenkins, or similar — and optimizing build/deploy times for platform services

Programming & Framework

  • Experience with Django or similar Python web frameworks — building APIs, managing migrations, and understanding ORM performance characteristics
  • Familiarity with gRPC or protobuf for inter-service communication in a microservices architecture

Technical Leadership & Platform Thinking

  • Experience leading a small team (2-4 engineers) to build a feature or component from scratch — scoping, task breakdown, code reviews, and delivery management
  • Experience building internal developer platforms or tooling — CLIs, SDKs, self-service portals, or automation that improved developer productivity
  • Track record of reducing operational toil — automating runbooks

Abnormal AI is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, protected veteran status or other characteristics protected by law.

Skills Required

  • 4+ years of hands-on backend engineering experience designing, building, and operating production-grade distributed systems
  • Strong proficiency in Python
  • Working proficiency in Golang
  • Experience building systems that process data at scale
  • Experience writing technical design documents
  • Strong testing discipline
  • Solid understanding of monitoring, alerting, and observability principles
  • Familiarity with AWS services
  • Experience with Kubernetes
  • Experience with CI/CD pipelines

Abnormal Security Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Abnormal Security and has not been reviewed or approved by Abnormal Security.

  • Fair & Transparent Compensation: Pay is considered aggressively benchmarked to leading tech markets with annual reviews, and feedback suggests engineering and sales roles are compensated competitively with strong upside potential.
  • Healthcare Strength: Health coverage is portrayed as robust, including employer-paid premiums for employees in prior postings, One Medical access, and globally designed healthcare and parental leave.
  • Leave & Time Off Breadth: Time off provisions include flexible/unlimited PTO, company holidays, and paid parental leave that the company positions as globally available.

The Company
San Francisco, CA
175 Employees
Year Founded: 2018

What We Do

The Abnormal Security platform protects enterprises from targeted email attacks. Abnormal Behavior Technology (ABX) models the identity of both employees and external senders, profiles relationships, and analyzes email content to stop attacks that lead to account takeover, financial damage, and organizational mistrust. Through one-click, API-based Office 365 and G Suite integration, Abnormal sets up in minutes and does not disrupt email flow. Abnormal Security was founded in 2018 by CEO Evan Reiser, CTO Sanjay Jeyakumar, Head of Machine Learning Jeshua Bratman, and Founding Engineers Abhijit Bagri and Dmitry Chechik. The team previously built behavioral profiling and machine learning technologies at Twitter, Google, and Pinterest that are being applied to solve a problem that costs organizations $1 billion per year, according to the FBI. The Abnormal Security platform stops targeted phishing, business email compromise, and account takeover attacks that have never been seen before.
