Technical Consultant - Azure Platform, Monitoring & Observability, Identity Access Management, Cloud Infra
Scope:
The Monitoring & Observability Engineer is responsible for ensuring end-to-end visibility across the Azure platform by implementing and maintaining monitoring, alerting, and observability solutions. This role focuses on proactively monitoring platform health, performance, availability, and security, enabling rapid detection and resolution of issues while supporting reliable and efficient platform operations. The engineer plays a key role in fostering a proactive operational culture through continuous monitoring, analytics, and operational insights.
Our current technical environment:
Our technical environment includes modern cloud and DevOps tooling with IaC (Terraform, Ansible, ARM, Bicep), CI/CD (Azure DevOps, Jenkins, GitHub Actions), and container orchestration (Docker, Kubernetes, AKS). We leverage Azure and OCI platforms, automation frameworks, microservices architecture, and observability tools, while also adopting emerging technologies such as GenAI and AI/ML.
What you’ll do:
Monitor the health, availability, performance, and security of Azure platform services using Azure Monitor, Log Analytics, Application Insights, and Elastic.
Maintain and monitor dashboards, alerts, and key operational metrics across platform services including IAM, APIM, MongoDB, Stratosphere, Portal Shell, Portal Collaboration, and Event Framework.
Respond to monitoring alerts, perform initial triage, and escalate incidents to appropriate L2/L3 teams in accordance with defined procedures.
Monitor authentication services, token issuance processes, and access management operations within Azure AD / Entra ID to ensure service availability and compliance.
Track API gateway performance metrics, including latency, error rates, throttling events, and quota utilization, and report anomalies to support teams.
Review logs, traces, and monitoring data to identify operational issues, performance degradation, and potential service disruptions.
Execute synthetic monitoring checks and validate end-to-end user journeys to ensure platform functionality and availability.
Follow established runbooks and operational procedures to support incident resolution and routine maintenance activities.
Collaborate with engineering and operations teams to improve monitoring coverage, alert accuracy, and operational efficiency.
Participate in shift operations, monitoring reviews, and continuous improvement initiatives aimed at reducing incident response times and enhancing platform reliability.
What we are looking for:
2–4 years in Azure cloud operations or SRE roles.
Immediate joiners preferred.
BE/B.Tech/engineering degree required.
Relocation to Coimbatore preferred.
In-person interviews.
Strong grasp of Azure platform fundamentals.
Expertise in identity and access management tools.
Experience with monitoring tools and Kubernetes.
Hands-on experience with KQL, Log Analytics workspaces, and Azure Workbooks.
Demonstrated ability to design alert hierarchies and reduce alert fatigue.
Familiarity with APIM diagnostic settings and Event Hubs log forwarding.
Experience monitoring MongoDB Atlas or similar NoSQL databases.
Knowledge of OAuth 2.0 / OIDC flows for IAM health monitoring.
Exposure to event-driven architectures (Azure Event Grid, Service Bus, Event Hubs).
Strong communication skills, including the ability to translate metrics into business impact.
AZ-900 / AZ-104 / AZ-204 certifications preferred.
Strong exposure to cloud technologies.
Experience in application and production monitoring and support.
Our Values
If you want to know the heart of a company, take a look at its values. Ours unite us. They are what drive our success, and the success of our customers. Does your heart beat like ours? Find out here: Core Values
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.
Skills Required
- Bachelor's degree in IT/Computer Science/Electronics or equivalent diploma
- 2-5 years of relevant infrastructure support/administration experience
- Relevant certifications (Microsoft, Red Hat, VMware, CCNA, ITIL Intermediate)
- Strong knowledge of Windows/Linux server administration, virtualization
- Knowledge of TCP/IP, DNS, DHCP, firewall basics, load balancers
- Understanding of SAN/NAS concepts, backup tools, data recovery
- Basic administration of Azure, AWS, or GCP
- Hands-on experience with monitoring and ITSM tools
- Ability to analyze logs and perform RCA
- Good understanding of ITIL processes
- Strong communication and collaboration skills
Blue Yonder Compensation & Benefits Highlights
The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Blue Yonder and has not been reviewed or approved by Blue Yonder.
- Leave & Time Off Breadth: PTO is described as generous or “unlimited” in the U.S., alongside paid holidays, sick time, and two paid volunteer days. These policies are often highlighted as strengths that support work–life balance.
- Flexible Benefits: Remote-work options and flexible arrangements are emphasized as part of the package. This flexibility is valued alongside compensation and can help offset middling pay for some roles.
- Healthcare Strength: Medical, dental, and vision coverage are provided, with mental health/EAP support and HSA/FSA options referenced. These core coverages are portrayed as solid and comprehensive.
Blue Yonder Insights
What We Do
Blue Yonder is the world leader in digital supply chain and omni-channel commerce fulfillment. Our intelligent, end-to-end platform enables retailers, manufacturers and logistics providers to seamlessly predict, pivot and fulfill customer demand. With Blue Yonder, you can make more automated, profitable business decisions that deliver greater growth and re-imagined customer experiences.
Blue Yonder - Fulfill your Potential
Blue Yonder’s tagline “Fulfill Your Potential” reflects the company’s mission to empower every organization and person on the planet to fulfill their potential. Each day, our global teams of associates and business partners work together to accelerate global economic growth, increase sustainability and prosperity with a Sonoran Spirit.