Senior Platform Engineer

Posted 3 Days Ago
Hiring Remotely in India
Remote
Senior level
Artificial Intelligence • eCommerce • Software
The Role
Design, operate, and scale cloud-native, multi-tenant SaaS platform infrastructure with emphasis on AWS, Kubernetes, GPU-backed AI workloads, Terraform automation, CI/CD, observability, reliability (SLIs/SLOs), security, disaster recovery, and cost optimization.
Summary Generated by Built In
Senior Platform Engineer / Senior DevOps Engineer

Location: Remote | Full-Time

About Evolphin

Evolphin is building the next generation of AI-powered media workflows for enterprise media teams managing large image and video libraries at scale, including environments with tens of millions of video assets and extremely large metadata and embedding footprints. Its platform adds a conversational AI layer for extracting intelligence from media, enabling powerful search, conversational discovery, and automation of media workflows through AskAI.

Crop.photo extends that capability into eCommerce and retail, enabling smart cropping, image transformation, and image and video generation at scale for product detail page (PDP) and eCommerce catalog workflows.

Together, Evolphin and Crop.photo create a connected visual AI ecosystem where media can move across systems and be searched, understood, transformed, and prepared for downstream use at scale.

The Role

We are seeking a Senior Platform Engineer who can own infrastructure architecture, reliability, scalability, and platform operations across our cloud environments.

This is not a traditional "pipeline management" role. We are looking for someone who can make architectural decisions, evaluate tradeoffs, and build platforms that support large-scale SaaS applications and AI workloads.

You will work closely with Engineering, Product, and AI teams to design systems that are secure, resilient, scalable, and cost-efficient.

Key Responsibilities

Platform Architecture & System Design
  • Design and evolve cloud-native platform architecture supporting multi-tenant SaaS applications
  • Define infrastructure standards, deployment patterns, and platform best practices
  • Lead architecture reviews and evaluate technical tradeoffs across reliability, performance, security, and cost
  • Design highly available and fault-tolerant systems across multiple environments
AWS Infrastructure
  • Architect and manage large-scale AWS environments
  • Design networking architectures including VPCs, subnets, security groups, routing, load balancing, and connectivity patterns
  • Build secure deployment architectures aligned with security and compliance requirements
  • Implement disaster recovery, backup, and business continuity strategies
Kubernetes & Platform Operations
  • Design and operate production Kubernetes environments
  • Build scalable container orchestration strategies
  • Optimize cluster performance, networking, autoscaling, and workload scheduling
  • Improve developer experience through platform automation and self-service tooling
AI & GPU Infrastructure
  • Support AI and ML workloads running on AWS
  • Design infrastructure for model training and inference workloads
  • Manage GPU provisioning, utilization, scaling, and cost optimization
  • Collaborate with AI teams to improve deployment and operational efficiency
Reliability & Performance
  • Define and measure SLIs, SLOs, and operational metrics
  • Implement monitoring, observability, logging, alerting, and incident management practices
  • Drive performance optimization and capacity planning initiatives
  • Lead root cause analysis and reliability improvement efforts
Infrastructure Automation
  • Build Infrastructure-as-Code solutions using Terraform
  • Design and optimize CI/CD pipelines
  • Automate provisioning, deployments, scaling, and operational workflows
Cost Optimization
  • Continuously evaluate cloud spending
  • Develop capacity planning models
  • Balance performance, reliability, and infrastructure costs
Required Experience
  • 6-8 years of experience in DevOps, Platform Engineering, SRE, or Cloud Infrastructure roles.
  • Proven experience designing and operating production-scale SaaS platforms.
  • Strong expertise in AWS architecture, networking, security, and deployment strategies.
  • Deep hands-on experience with Kubernetes, container orchestration, cluster operations, autoscaling, and workload management.
  • Experience designing highly available, fault-tolerant, and scalable distributed systems.
  • Strong understanding of system design, architecture trade-offs, and platform scalability.
  • Hands-on experience with Infrastructure as Code (Terraform preferred).
  • Experience building and maintaining CI/CD pipelines and deployment automation frameworks.
  • Strong Linux, networking, and systems engineering fundamentals.
  • Experience implementing observability, monitoring, logging, and incident management practices.
  • Experience with disaster recovery planning, backup strategies, and business continuity design.
  • Experience with cloud cost optimization, capacity planning, and resource utilization management.
  • Hands-on experience supporting AI/ML workloads in production environments.
  • Experience designing, provisioning, and operating GPU-based infrastructure for model training and/or inference workloads.
  • Experience managing and optimizing Amazon Bedrock, OpenSearch, and DocumentDB, or equivalent platforms.
  • Strong scripting and automation skills using Python, Bash, or similar languages.


The Company
HQ: San Ramon, California
14 Employees

What We Do

Crop.photo is an AI-powered service for bulk image editing and retouching, offering tools for automating image cropping, resizing, background removal, and listing image analysis. The service is powered by AI algorithms that streamline the image processing workflow for businesses of all sizes. For more information, visit https://crop.photo/about-us.

Crop.photo is the brainchild of Evolphin Software, Inc., a Silicon Valley-based provider of digital and media asset management solutions. Evolphin has been serving creative operations teams across various industries for over a decade, helping them streamline and automate their digital workflows and optimize media asset management. Evolphin's expertise in digital asset management, combined with AI technology, has resulted in the development of Crop.photo, a cloud-based service that simplifies image retouching for businesses of all sizes.
