Technical Program Manager

Posted 4 Hours Ago
Hiring Remotely in USA
Remote
150K-195K Annually
3-5 Years Experience
Artificial Intelligence • Cloud • Hardware • Machine Learning • Other • Software • Infrastructure as a Service (IaaS)
We build infrastructure for machine learning
The Role
The Technical Program Manager will oversee multiple data center deployment projects, manage stakeholder communications, allocate resources, develop mitigation strategies for risks, create schedules and budgets, ensure quality standards are met, maintain documentation, and collaborate with technical experts. Travel to project sites will be required.
Summary Generated by Built In

Voltage Park is on a mission to make machine learning infrastructure accessible to all, from large enterprises and research universities to seed-stage startups and nonprofits. We believe that seamless access to compute, with transparent pricing and inventory, is the future of GPU access. We are the only cloud provider offering a platform that shows all available GPUs with transparent, market-based pricing, in addition to long-term reserve contracts for our customers.

We are looking for a highly experienced and self-driven Technical Program Manager with expertise in data center deployment to join our team. The ideal candidate will have a deep understanding of data center infrastructure, including power, cooling, networking, and server technologies, combined with strong project management skills to ensure projects are delivered on time, within scope, and on budget. The role offers flexibility, as it can be remote, but requires travel to various project sites.

Responsibilities:

  • Project Planning and Execution: Oversee the entire lifecycle of multiple concurrent data center deployment projects, including design, construction, testing, commissioning, and handover to operations.

  • Stakeholder Management: Serve as the main point of contact for all stakeholders, including clients, vendors, contractors, and internal teams. Facilitate communication and collaboration, providing clear and consistent updates to vendors and on-site teams so that project deadlines are met.

  • Resource Management: Efficiently manage and allocate resources (teams, tools, and budget) to ensure each phase of the project is adequately staffed and equipped to meet tight schedules.

  • Risk Management: Identify potential risks, develop mitigation strategies, and proactively address issues to minimize disruptions and maintain timelines.

  • Scheduling & Budgeting: Create and manage project schedules, budgets, and resource plans, ensuring alignment with contractual agreements and project constraints.

  • Quality Assurance: Ensure all deployments meet technical and quality standards, adhering to industry best practices and local regulations.

  • Documentation: Maintain detailed project documentation, including design documents, and provide regular progress updates.

  • Technical Oversight and Collaboration: Work closely with technical experts to distill and communicate guidance on best practices for data center design and deployment, including considerations for power, cooling, networking, and physical security.

  • Continual Improvement: Identify and implement process improvements and innovative solutions to optimize data center operations.

  • On-Site Project Oversight: Travel up to 25% to various data center sites to monitor progress, resolve issues, and ensure smooth execution. Coordinate with local teams, contractors, and vendors during visits.

Qualifications:

  • A minimum of 3 years’ experience in data center deployment, infrastructure project management, or technical project management in data centers. Experience in high-performance computing and GPU technologies is advantageous.

  • Proven experience in managing complex technical projects from start to finish, with a solid understanding of project management tools and methodologies.

  • Extensive knowledge of data center infrastructure, including servers, storage, networking, power, and cooling systems, as well as familiarity with industry standards and best practices.

  • Strong leadership, communication, and team management skills, with the ability to effectively collaborate with diverse teams and stakeholders.

  • Demonstrated problem-solving skills, including the ability to assess technical issues, develop solutions, and make critical decisions under pressure.

  • Familiarity with industry standards such as TIA-942, Uptime Institute guidelines, ASHRAE, and other compliance, security, and risk management protocols.

  • Adaptability to work in a dynamic and fast-paced environment, prioritize tasks effectively, and adjust to evolving requirements.

  • Willingness and ability to travel to data center locations, sometimes on short notice.

  • Bachelor’s degree in computer science, information technology, or a related field, or equivalent experience. Relevant certifications (e.g. PMP, PRINCE2, ITIL, CDCP, CDCS) are a plus.

Voltage Park is an equal opportunity employer and makes employment decisions on the basis of merit. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected under federal, state, or local law. If you require an accommodation during the job application process, please notify your recruiter.

Compensation Range: $150K - $195K

The Company
HQ: Berkeley, CA
45 Employees
Remote Workplace
Year Founded: 2023

What We Do

The market for cutting-edge ML compute is broken. Startups, researchers and even big AI labs are scrambling to buy or rent access to the latest chips for ML training. But demand far outstrips supply, and what’s available is only accessible to the well-resourced, placing an artificial damper on innovation.

To solve this challenge, we’ve launched Voltage Park, and we’re on a mission to make machine learning infrastructure accessible to all, from large enterprises and research universities to seed-stage startups and nonprofits.

With around 24,000 NVIDIA H100 GPUs, the Voltage Park cloud is one of the most powerful collections of cutting-edge ML compute in the world. Our clusters consist of 80GB H100 SXM5 GPUs fully interconnected with 3.2T InfiniBand. We currently offer bare-metal access for large-scale users that need peak performance. As we spin up our infrastructure, we will soon add support for short-term leases and hourly billing, along with support for familiar tools like Slurm, Kubernetes, and Mosaic for easy integration into existing training frameworks.

Why Work With Us

You’ll play a pivotal role as a member of the founding team that will change the face of machine learning infrastructure. As an early hire, you’ll have outsize influence in defining the company’s culture and ensuring mission success.

Voltage Park Offices

Remote Workspace

Employees work remotely.

Voltage Park is a 100% remote company.

Typical time on-site: None
HQ: Berkeley, CA
