Role Overview
The Infrastructure SME plays a critical role in ensuring that the underlying IT infrastructure fully supports Business Continuity and Disaster Recovery (BCDR) objectives, with a strong focus on DR Automation. This role bridges infrastructure and automation teams, ensuring resilience, scalability, and seamless failover/failback execution across all infrastructure layers.
Key Responsibilities
1. Infrastructure Architecture & Readiness
· Review and validate the end-to-end infrastructure architecture supporting automated DR failover and failback, including:
o Network, security, compute, storage, virtualization, containers, and data center components
· Ensure the design supports high availability, resiliency, and recoverability aligned with business requirements.
2. DR Automation Integration
· Act as the primary bridge between infrastructure teams and the DR Automation team, ensuring alignment and seamless collaboration.
· Review and validate automated failover/failback workflows across infrastructure components, including:
o Network, security, servers, storage, DNS, virtualization platforms, and container environments
· Collaborate on the development of pre-failover validation scripts to ensure readiness before execution.
3. Recovery Objectives & Capacity Planning
· Review and validate infrastructure-level Recovery Time Objectives (RTOs), ensuring alignment with application and business recovery requirements.
· Ensure sufficient capacity and performance within DR sites and automation platforms to support:
o Full failover scenarios
o Partial or phased failover scenarios
4. Technical Leadership & Engagement
· Lead and actively participate in technical discussions and workshops across:
o Discovery
o Validation
o Tabletop exercises
· Provide domain expertise and recommendations to ensure robust infrastructure design and DR strategy alignment.
5. Performance & Validation
· Oversee and validate infrastructure performance testing during and after DR failover/failback activities.
· Ensure that systems meet defined performance benchmarks and recovery objectives post-recovery.
6. Compliance & Audit Readiness
· Review and ensure adherence to audit and regulatory requirements, particularly around:
o Logging
o Monitoring
o Traceability of DR activities
· Support audit readiness by ensuring proper documentation and controls are in place.
7. Cross-Functional Collaboration
· Collaborate with Application, Network, Security, Database, and Business teams to ensure end-to-end alignment.
· Coordinate with stakeholders to ensure dependencies are properly managed across infrastructure and application layers.
8. Continuous Improvement & Optimization
· Identify opportunities to optimize infrastructure resilience, performance, and cost efficiency.
· Drive continuous improvement initiatives based on test results, incidents, and evolving business needs.
Requirements
· Strong expertise in enterprise infrastructure design and operations (network, compute, storage, virtualization, cloud).
· Hands-on experience with Disaster Recovery architectures and DR automation tools.
· Deep understanding of failover/failback mechanisms and infrastructure dependencies.
· Experience in capacity planning, performance testing, and high availability design.
· Knowledge of regulatory and compliance requirements related to DR and infrastructure.
· Strong stakeholder communication and cross-team coordination skills.
Preferred Qualifications
· Experience in large-scale BCDR and DR Automation programs.
· Certifications in infrastructure technologies, cloud platforms, or DR/BCDR frameworks.
What We Do
DeepSource stands as a trusted partner for businesses seeking cutting-edge AI services in computer vision, natural language processing, and predictive analytics. With a particular focus on Arabic NLP and ChatGPT bot development, DeepSource is dedicated to empowering companies with groundbreaking solutions that streamline operations, optimize workflows, and enhance user experiences. Our commitment to excellence is evident in our approach to addressing a wide range of AI needs, from hiring top talent and managing end-to-end AI projects to providing tailored consulting and comprehensive training programs. DeepSource's team of experts has extensive knowledge of and experience in various AI technologies, which enables it to develop and deploy advanced solutions across multiple industries. Our adaptive strategies and innovative methodologies allow businesses to stay competitive in today's rapidly evolving digital landscape.