Senior Specialist - Infra - Cloud - IN
The role is responsible for maintenance of Cloud Infrastructure Services for clients.
Key Responsibilities and Duties
- Acts as the ongoing interface between the client and the system or application.
- Has experience working with technologies such as Amazon Web Services, Google Cloud Platform, VMware Cloud administration, and Azure.
Educational Requirements
- University (Degree) Preferred
Work Experience
- 5+ Years Required; 7+ Years Preferred
Physical Requirements
- Physical Requirements: Sedentary Work
Career Level
8IC
Position Summary: Describe below the primary purpose and function of this job
We're seeking a Senior Platform Reliability Engineer, reporting to the AI COE lead, to maintain our enterprise-scale Generative AI platform on AWS. This role serves as a critical bridge between AI development teams and platform operations, ensuring reliable deployment and scaling of LLM services, vector databases, and associated infrastructure. The position focuses on maintaining high availability of AI services while optimizing cost and performance.
Key Duties & Responsibilities: List up to 5 key duties and responsibilities, management responsibilities and time spent (if applicable)
AI Platform Infrastructure (35%)
Primary Focus Areas:
- Maintain scalable Kubernetes clusters for LLM deployments
- Manage vector store infrastructure (Pinecone/Weaviate/Faiss)
- Optimize DynamoDB performance for high-throughput AI operations
- Configure and maintain S3 data lakes for model artifacts
- Implement efficient model serving architectures
Development Support & Integration (25%)
- Collaborate with ML engineers on model deployment pipelines
- Maintain APIs for model inference services
- Implement A/B testing infrastructure for model variants
- Create developer tooling for model deployment and monitoring
- Support integration of new LLM models into production
Observability & Performance (20%)
Monitoring & Metrics:
- Implement custom metrics for LLM performance
- Monitor vector store query latencies
- Track model inference costs
- Set up distributed tracing for API calls
Optimization:
- Tune Kubernetes resources for model serving
- Optimize vector store query performance
- Implement caching strategies for frequent queries
- Manage auto-scaling policies
Security & Compliance (20%)
- Implement IAM roles and security policies
- Manage API authentication and rate limiting
- Ensure data privacy compliance for AI operations
- Monitor and prevent token/cost abuse
- Implement model access controls
Management/Leadership Responsibility: Is management of people a primary focus of the role? If so, how many direct and indirect employees are managed? Do any of them manage a function or process?
NA
Budget Responsibility: Does the position have responsibility for Revenue, Operating (expense) Budget, etc.? If so, what is the scope?
N/A
Impact:
NA
Business or Industry Expertise: Describe the degree of knowledge and understanding required of TIAA’s business and industry, commercial environment and of competitors' products and services.
Interactions / Interpersonal Skills: Describe the nature and level of interactions this job has with others, both internally and externally. Explain any specific interpersonal skills necessary to successfully perform this role (i.e., negotiation skills, represents business at external events or to governmental bodies, etc.).
Job Requirements And Qualifications: Indicate the minimum and preferred education and experience for the job and any licenses and certifications required
Required Education:
Masters
Preferred Education:
Masters
Skills and Abilities:
- Must have 9-13 years of relevant experience.
- Team player with the ability to work in a global team environment.
- Ability to collaborate with business-driven teams, business development, and stakeholders.
Technical Requirements
Core Skills
- 5+ years of experience with AWS services
- Deep expertise in Kubernetes administration
- Strong Python programming skills
- Experience with Infrastructure as Code (Terraform)
- Understanding of ML/LLM deployment patterns
Required AWS Experience
Primary Services:
- EKS (Kubernetes)
- S3 & DynamoDB
- VPC & Networking
- IAM & Security
- CloudWatch & Monitoring
- API Gateway
AI Infrastructure Experience
- Vector database deployment
- LLM serving frameworks
- API development and gateway management
Required Licenses/Certifications:
- AWS Certified DevOps Engineer or AWS Certified Solutions Architect
- Certified Kubernetes Administrator (CKA)
Related Skills
Agile Methodology, Analytical Skills, Automation, Cloud Platforms, Configuration Management, Data Management, Infrastructure Deployment, Infrastructure Support, IT Infrastructure, Network Administration/Maintenance, Problem Solving, Programming, Project Management, Relationship Management, Technology Systems
_____________________________________________________________________________________________________
Company Overview
TIAA Global Capabilities was established in 2016 with a mission to tap into a vast pool of talent, reduce risk by insourcing key platforms and processes, as well as contribute to innovation with a focus on enhancing our technology stack. TIAA Global Capabilities is focused on building a scalable and sustainable organization, with a focus on technology, operations and expanding into the shared services business space.
Working closely with our U.S. colleagues and other partners, our goal is to reduce risk, improve the efficiency of our technology and processes and develop innovative ideas to increase throughput and productivity.
We are an Equal Opportunity Employer. TIAA does not discriminate against any candidate or employee on the basis of age, race, color, national origin, sex, religion, veteran status, disability, sexual orientation, gender identity, or any other legally protected status.
Accessibility Support
TIAA offers support for those who need assistance with our online application process to provide an equal employment opportunity to all job seekers, including individuals with disabilities.
If you are a U.S. applicant and desire a reasonable accommodation to complete a job application, please use one of the options below to contact our accessibility support team:
Phone: (800) 842-2755
Email: [email protected]
What We Do
Every worker deserves a secure retirement. For more than 100 years, we've delivered it for millions of people, and we're not done yet. Founded to help educators retire with dignity, today we're a market-leading retirement company fueled by world-class asset management.
But we're not just another legacy financial services firm. We're fighting harder than ever before for our clients and the many Americans who need us.
And we're hiring. When you work at TIAA, you're making a difference in the lives of our clients. We're always on the lookout for great people to become part of our coalition of champions and are committed to providing equal opportunity across all employment practices as we believe our employees have a right to a diverse and inclusive workplace. Join our team today in the fight to help more people to and through retirement.
Why Work With Us
TIAA provides financial security for millions and offers our employees opportunities to grow in a culture that embraces diversity, innovation, and high performance.






