Principal Software Engineer, Architecture (AI/ML)

Posted 15 Days Ago
Seattle, WA
Hybrid
Senior level
Cloud • Enterprise Web • Software • Infrastructure as a Service (IaaS)
DigitalOcean is the cloud of choice for developers, startups, and SMBs around the world.
The Role
As a Principal Software Engineer focusing on AI/ML, you will architect and implement large-scale cloud services, develop and optimize AI/ML models, and lead technical strategy and mentorship efforts within the team. Your role involves driving innovative cloud solutions, managing AI/ML pipelines, and collaborating with cross-functional teams.
Summary Generated by Built In
Do you ever wonder what happens inside the cloud?

DigitalOcean (NYSE: DOCN) simplifies cloud computing so builders can spend more time creating software that changes the world. With our mission-critical infrastructure and fully managed offerings, DigitalOcean enables startups and small and medium-sized businesses (SMBs) to rapidly deploy and scale modern applications. As a remote-first organization, our employees, like our customers, are based around the world.

We want people who are passionate about staying on top of the latest cloud infrastructure and AI/ML trends, with an excellent aptitude for supporting internal employees and teams.

We are looking for a highly experienced, highly motivated Principal Software Engineer, Architecture (AI/ML) with a Computer Science, Engineering, or AI/ML background. You will be involved in the architecture, design, implementation, verification, and integration of the next generation of DigitalOcean Cloud Computing software with a strong emphasis on AI/ML-driven solutions.

What You’ll Be Doing:

  • Working at the forefront of cloud, distributed computing, and AI/ML technologies.
  • Serving as the architect driving the technical strategy and direction for our large-scale cloud services, including machine learning model deployment and orchestration.
  • Developing AI/ML models to optimize cloud infrastructure, improve system reliability, and enhance user experience.
  • Building and refining machine learning pipelines and frameworks to support scalable AI/ML solutions.
  • Owning the primary responsibility for establishing a pragmatic long-term technical direction for our software services, ensuring alignment with our customers, business goals, and internal teams.
  • Leading a team of highly passionate technical leads to evolve our service architecture, with alignment across several product technical roadmaps.
  • Leading by example through direct contribution and providing direction in establishing development and operational practices, with specific attention to AI/ML model lifecycle management.
  • Serving as the technical lead on our most demanding, cross-functional projects.
  • Actively mentoring individuals and the engineering community on advanced technical issues, including best practices in AI/ML.

What We’ll Expect From You:

Architect-level experience in the following domains:

    • Proven expertise in large-scale cloud and AI/ML services, and a deep understanding of cloud computing’s potential in enhancing AI/ML applications.
    • Demonstrated ability to lead and mentor large software and AI/ML teams.
    • Experience with web and cloud-native services is a must-have, with experience deploying scalable AI/ML solutions in production.
    • Adept at Systems Thinking with an ability to decompose complex problems into simple, straightforward solutions, including AI/ML-specific challenges like model drift and data dependency management.
    • Strong grasp of system interdependencies and limitations, and expertise in AI/ML optimization techniques for performance, scalability, and accuracy.

AI/ML Expertise:

    • Hands-on experience in AI/ML frameworks and libraries, such as TensorFlow, PyTorch, or Scikit-Learn, and model-serving frameworks such as TensorFlow Serving or ONNX.
    • Proven experience in developing and deploying models for performance-intensive applications at web-scale.
    • Understanding of the MLOps lifecycle, including data engineering, model training, validation, deployment, and monitoring.
    • Understanding of key HPC technologies, including RDMA, InfiniBand/RoCE, and GPUDirect, as well as storage technologies.
  • Knowledge in performance, scalability, enterprise system architecture, and engineering best practices with an emphasis on the integration of AI/ML.
  • Leverage knowledge of open-source, industry standards, and prior art in architecture decisions with AI/ML considerations.
  • Balance technical leadership and savvy with strong business judgment to make the right decisions about technology, demonstrating simplicity and creativity.
  • Master’s degree or higher preferred in Computer Science, AI/ML, or a related field.
  • 15+ years of professional experience in web-scale system software development.
  • 5+ years of experience with an established track record in Deep Learning and Machine Learning.
  • 3+ years of recent experience as an ML engineer, data science engineer, or similar.
  • In-depth experience in two or more of the following areas: Cloud Computing, Storage, Networking, Platform-as-a-Service, Infrastructure-as-a-Service, Software-as-a-Service.
  • Excellent communication skills at all levels.

Why You’ll Like Working for DigitalOcean:

  • We are proud to work here. You’ll be part of a cutting-edge technology company with an upward trajectory, one that is proud to simplify cloud computing so builders can spend more time creating software that changes the world. As a member of the team, you will be a Shark who thinks big, bold, and scrappy, like an owner with a bias for action and a powerful sense of responsibility for customers, products, employees, and decisions.
  • We prioritize career development. At DO, you’ll do the best work of your career. You will work with some of the smartest and most interesting people in the industry. We are a high-performance organization that will always challenge you to think big. Our organizational development team will provide you with resources to ensure you keep growing. We provide employees with reimbursement for relevant conferences, training, and education. All employees have access to LinkedIn Learning's 10,000+ courses to support their continued growth and development.
  • We care about your well-being. Regardless of your location, we will provide you with a competitive array of benefits to support your overall well-being, from a one-time work-from-home stipend to a wellness allowance to a flexible time-off policy, to name a few. While the philosophy around our benefits is the same worldwide, specific benefits may vary based on local regulations and preferences.
  • We reward our employees. The salary range for this position is between $225,000.00 and $338,000.00 based on market data, relevant years of experience, and skills. You may qualify for a bonus in addition to base salary; bonus amounts are determined based on company and individual performance. We also provide equity compensation to eligible employees, including equity grants upon hire and the option to participate in our Employee Stock Purchase Program.
  • We value diversity and inclusion. We are an equal-opportunity employer, and recognize that diversity of thought and background builds stronger teams and products to serve our customers. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.

*This is a remote role

Top Skills

PyTorch
Scikit-Learn
TensorFlow
The Company
HQ: New York, NY
900 Employees
Hybrid Workplace
Year Founded: 2012

Why Work With Us

Here you'll get to work with some of the smartest, most interesting people around, solving unique and complex technical challenges on a scale matched by few companies. If you get excited about stretching yourself in new ways and developing yourself to your fullest potential alongside amazingly supportive friends and colleagues, we want to talk to you!
