About this role
Team Overview:
Data is at the core of the Aladdin platform, and increasingly, our ability to consume, store, analyze, and gain insight from data is a key component of our competitive advantage. The Data Engineering team is responsible for the data ecosystem within BlackRock. We engineer high-performance data pipelines, provide a fabric to discover and consume data, and continually evolve our data storage capabilities. We believe in writing small, testable code with a focus on innovation. We are committed to open source, and we regularly contribute our work back to the community.
We are seeking top-tier Cloud Native DevOps Platform Engineers to augment our Enterprise Data Platform team. Our objective is to extend our data lifecycle management practices to include structured, semi-structured, and unstructured data. This role requires a breadth of individual technical capabilities and competencies; most important, though, is a willingness and openness to learn new things across multiple technology disciplines. This role is for practitioners, not researchers.
Position Summary:
As a Data Platform Cloud/DevOps Engineer in the Data Engineering team, you will design, build, and maintain the cloud-native infrastructure that powers Aladdin's Enterprise Data Platform. You will enable data engineers, AI engineers, and application developers by providing scalable, reliable, and cost-efficient infrastructure for data processing, AI/ML workloads, and analytics services.
Key Responsibilities:
Infrastructure and Cloud Engineering
- Design, deploy, and manage cloud-native infrastructure across AWS, Azure, and private clouds
- Implement Infrastructure as Code (IaC) using Terraform, Ansible, and CloudFormation for repeatable, auditable deployments
- Manage Kubernetes clusters for scalable, reliable, and secure application and data workloads
- Deploy and configure service mesh, HashiCorp Vault, cert-manager, and other Kubernetes-native frameworks
- Design and implement network architectures including VPCs, load balancers, and ingress/egress controls
- Deploy and configure LLM serving platforms and supporting components such as MCP/agent orchestrators, chatbots, vector embedding services, and secured API gateways for generative AI applications
CI/CD and Automation
- Build and maintain CI/CD pipelines using ArgoCD, Azure DevOps, Jenkins, and GitHub Actions
- Implement GitOps workflows for automated, auditable infrastructure and application deployments
- Automate repetitive operational tasks using Python and Bash to improve team efficiency and reduce manual errors
- Develop self-service infrastructure provisioning capabilities for engineering teams
- Maintain version control best practices and collaborative development workflows
- Build and maintain MLOps CI/CD pipelines for automated model deployment to production environments
Site Reliability Engineering (SRE)
- Implement monitoring, logging, and observability solutions using Prometheus, Grafana, ELK Stack, and Datadog
- Define and track Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets for data platform services
- Build automated alerting systems to proactively detect infrastructure issues and performance degradation
- Perform capacity planning and performance tuning for production infrastructure
- Conduct reliability analysis and implement preventive measures to improve system uptime
- Collaborate with operational teams on incident escalation and system reliability improvements
- Implement chaos engineering practices to test infrastructure resilience and fault tolerance
Cloud Cost Optimization and FinOps
- Monitor and optimize cloud infrastructure costs across AWS, Azure, and private cloud environments
- Right-size compute, storage, and networking resources based on utilization metrics and cost-performance analysis
- Develop cost dashboards and reports to provide visibility into infrastructure spending trends
- Collaborate with finance and engineering teams on cloud budget planning and forecasting
- Evaluate and recommend cost-effective architectural alternatives (e.g., spot instances, reserved capacity, serverless options)
Desired Skills
Cloud and Infrastructure
- Expert-level experience with AWS, Azure, or GCP cloud platforms and services
- Proficiency with Infrastructure as Code tools (Terraform, Ansible, CloudFormation)
- Templating with Helm, ArgoCD, Ansible, and Terraform
- Deep knowledge of Kubernetes (K8s) APIs, controllers, operators, and stateful workloads
- Understanding of the K8s Operator Pattern, with the comfort and courage to wade into (predominantly Go-based) operator implementation codebases
- Comfort building atop K8s-native frameworks including service mesh (Istio), secrets and certificate management (HashiCorp Vault, cert-manager), log management (Splunk), and observability (Prometheus, Grafana, Alertmanager)
CI/CD and Automation
- Hands-on experience with CI/CD platforms (ArgoCD, Azure DevOps, Jenkins, GitHub Actions)
- Proficiency in scripting languages (Python, Bash) for automation and infrastructure tooling
- Experience implementing GitOps principles and workflows
- Version control expertise (Git, branching strategies, collaborative development)
Site Reliability Engineering (SRE)
- Experience implementing monitoring and observability solutions (Prometheus, Grafana, ELK Stack, Datadog)
- Knowledge of SRE principles including SLOs, SLIs, error budgets, and reliability engineering
- Experience implementing and operating telemetry-based monitoring, alerting, and incident response systems
- Performance tuning and capacity planning experience for production systems
- Experience with chaos engineering and reliability testing
FinOps and Cost Management
- Experience with cloud cost optimization strategies and FinOps practices
- Ability to analyze infrastructure costs and identify optimization opportunities
- Knowledge of cost allocation, tagging strategies, and chargeback mechanisms
- Familiarity with cloud cost management tools (AWS Cost Explorer, Azure Cost Management, CloudHealth)
Nice to have skills
- Certifications: AWS Certified DevOps Engineer, CKA (Certified Kubernetes Administrator), HashiCorp Terraform Associate, FinOps Certified Practitioner
- Experience with AI/ML infrastructure platforms (Kubeflow, MLflow, Ray, model serving frameworks)
- Understanding of Natural/Large Language Models
- Experience with basic prompt engineering, LLM fine-tuning, and chatbot implementations using modern Python SDKs such as LangChain and/or Transformers
- Familiarity with policy-as-code tools (Open Policy Agent, Kyverno)
- Experience with multi-cloud and hybrid cloud architectures
We are looking for candidates with 5+ years of hands-on experience in Data Platform DevOps/Cloud or related engineering practices. If interested, please submit a copy of your most recent resume and a cover letter.
Our benefits
To help you stay energized, engaged and inspired, we offer a wide range of benefits including a strong retirement plan, tuition reimbursement, comprehensive healthcare, support for working parents and Flexible Time Off (FTO) so you can relax, recharge and be there for the people you care about.
Our hybrid work model
BlackRock’s hybrid work model is designed to enable a culture of collaboration and apprenticeship that enriches the experience of our employees, while supporting flexibility for all. Employees are currently required to work at least 4 days in the office per week, with the flexibility to work from home 1 day a week. Some business groups may require more time in the office due to their roles and responsibilities. We remain focused on increasing the impactful moments that arise when we work together in person – aligned with our commitment to performance and innovation. As a new joiner, you can count on this hybrid model to accelerate your learning and onboarding experience here at BlackRock.
About BlackRock
At BlackRock, we are all connected by one mission: to help more and more people experience financial well-being. Our clients, and the people they serve, are saving for retirement, paying for their children’s educations, buying homes and starting businesses. Their investments also help to strengthen the global economy: support businesses small and large; finance infrastructure projects that connect and power cities; and facilitate innovations that drive progress.
This mission would not be possible without our smartest investment – the one we make in our employees. It’s why we’re dedicated to creating an environment where our colleagues feel welcomed, valued and supported with networks, benefits and development opportunities to help them thrive.
For additional information on BlackRock, please visit @blackrock | Twitter: @blackrock | LinkedIn: www.linkedin.com/company/blackrock
BlackRock is proud to be an Equal Opportunity Employer. We evaluate qualified applicants without regard to age, disability, family status, gender identity, race, religion, sex, sexual orientation and other protected attributes at law.
What We Do
As the world’s largest asset manager, BlackRock partners with investors around the globe to help them (and those on whose behalf they invest) plan for life’s most important goals – like retirement, home ownership and their children’s education. Our clients range from governments, foundations and other large institutions to those investing on behalf of individuals, including firefighters, nurses, teachers and factory workers.
BlackRock was founded with the idea of creating a better asset management firm — one that was purpose-driven, focused on clients and risk management, and propelled by data and technology. Our breakthrough Aladdin® platform is BlackRock’s technological backbone, helping investors see and manage their whole portfolios in one place – from constructing investments to monitoring risk and executing trades. Used by hundreds of external institutions around the world, Aladdin combines powerful analytics and a common language to help investment teams make faster, more informed decisions across public and private markets. It’s a key part of our business and one of the reasons we’re trusted to manage more assets than any other investment manager today.
At BlackRock, we challenge conventions and raise the bar for what’s possible. We harness technology to unlock new solutions, simplify complexity, and deliver investment strategies that meet people where they are. Whether it’s retirement planning, wealth building or navigating market shifts, we’re here to help clients invest more easily, more affordably and with more choice as we chart a path toward financial well-being together.
Learn more: Careers.BlackRock.com
Why Work With Us
Without our people, technology is irrelevant. When we combine the power of people with the power of technology, we amplify our ability to create better outcomes for our employees, clients, shareholders and society alike.
Hybrid Workspace
Employees engage in a combination of remote and on-site work.
BlackRock has 25,000 employees across more than 100 offices in over 40 countries around the world.