Principal Site Reliability Engineer

Posted 15 Hours Ago
San Francisco, CA
204K-323K Annually
Expert/Leader
Cloud
If you’re ready to build your future — and the future of technology — then you’re in the right place.
The Role
The role involves leading and enhancing the availability and resilience of software engineering solutions. You will work with teams to design and develop resilient applications, champion best practices, use infrastructure-as-code, integrate with APIs and microservices, and troubleshoot complex issues while maintaining service availability.
Summary Generated by Built In

To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

Job Category

Software Engineering

Job Details

About Salesforce

We’re Salesforce, the Customer Company, inspiring the future of business with AI + Data + CRM. Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way. And we empower you to be a Trailblazer, too: driving your performance and career growth, charting new paths, and improving the state of the world. If you believe in business as the greatest platform for change and in companies doing well and doing good, you’ve come to the right place.

(Lead/Principal/Architect) Software Engineer - Availability Engineering
Our Availability Engineering teams are responsible for driving best-in-class availability. You will work with delivery teams deploying customer-facing and supporting software across a multi-substrate engineering platform that collectively ships hundreds of features to production for tens of millions of users across all industries every day. Our users count on our applications and platforms to be highly reliable, lightning fast, supremely secure, and to preserve all of their customizations and integrations every time we ship. You will need deep experience with concurrency and large-scale systems, proficiency in solving real-world data management challenges, a strong understanding of how to craft highly available solutions, and a proven ability to design, develop, and optimize core back-end systems.


What you’ll be doing:

  • As part of a specialist unit focused on availability and resilience, you will embed with delivery teams, acting in a Lead capacity, creating bandwidth and prioritizing a focus on corrective and proactive availability measures.

  • You will contribute to designing, developing, debugging, and operating resilient applications and platforms deployed on distributed systems that run across thousands of compute nodes in multiple data centers.

  • You will champion resiliency best practices: observability tool integration, horizontal/vertical sizing and auto-scaling, release rollback and recovery workflows, and integration tests and validation procedures for applications running on self-hosted infrastructure as well as public cloud platforms such as AWS, GCP, Azure, and Alibaba.

  • Using and contributing to open-source technology (Spinnaker, ZooKeeper, etc.).

  • Developing / leveraging Infrastructure-as-Code using Terraform.

  • Building / integrating with APIs and microservices deployed on containerization frameworks such as Kubernetes, Docker, and Mesos.

  • Resolving complex technical issues and driving innovations that improve system availability, resilience, and performance.

  • Balancing live runtime management, feature delivery, and retirement of technical debt.

  • Participating in the team’s on-call rotation to address complex problems in real time and keep services operational and highly available.

Required Skills:

  • A related technical degree is required (master’s preferred)

  • 15+ years of hands-on software development experience

  • 5+ years in a Tech Lead, Principal, or Architect capacity

  • Ability to reverse-engineer solutions via independent code and architecture review, then envision, define, and contribute to the delivery of availability-improvement refactoring projects

  • Mastery of object-oriented delivery in one or more languages such as Java, Golang, APEX, or Python

  • Deep experience working with core web technologies: HTTP, JSON, REST, XML

  • Proficiency with databases including Oracle or other relational and/or NoSQL solutions

  • Experience owning and operating multiple instances of a critical service

  • Experience running critical infrastructure services: monitoring, alerting, logging, tracing, and reporting

  • Subject matter expertise in service ownership best practices, SLO/SLI/SLA definition, driving proactive operational awareness, and experience with incident and problem management

  • Thorough knowledge of Agile development methodology, with experience in both Test-Driven and Behavior-Driven Development practices

Accommodations

If you require assistance due to a disability when applying for open positions, please submit a request via this Accommodations Request Form.

Posting Statement

At Salesforce we believe that the business of business is to improve the state of our world. Each of us has a responsibility to drive Equality in our communities and workplaces. We are committed to creating a workforce that reflects society through inclusive programs and initiatives such as equal pay, employee resource groups, inclusive benefits, and more. Learn more about Equality at www.equality.com and explore our company benefits at www.salesforcebenefits.com.

Salesforce is an Equal Employment Opportunity and Affirmative Action Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status. Salesforce does not accept unsolicited headhunter and agency resumes. Salesforce will not pay any third-party agency or company that does not have a signed agreement with Salesforce.

Salesforce welcomes all.

Pursuant to the San Francisco Fair Chance Ordinance and the Los Angeles Fair Chance Initiative for Hiring, Salesforce will consider for employment qualified applicants with arrest and conviction records.

For Washington-based roles, the base salary hiring range for this position is $204,400 to $296,400.

For California-based roles, the base salary hiring range for this position is $223,000 to $323,400.

Compensation offered will be determined by factors such as location, level, job-related knowledge, skills, and experience. Certain roles may be eligible for incentive compensation, equity, and benefits. More details about our company benefits can be found at the following link: https://www.salesforcebenefits.com.

Top Skills

Java
Python
The Company
HQ: San Francisco, CA
72,000 Employees
Hybrid Workplace

What We Do

Salesforce is the #1 AI CRM, where Humans with agents drive customer success together. Through Agentforce, our groundbreaking suite of customizable agents and tools, Salesforce brings autonomous AI agents, unified data from any source, and best-in-class Customer 360 apps together on one integrated platform to help companies connect with customers in a whole new way.

Salesforce is democratizing AI agents for businesses of every size and industry so every company can embrace a workforce without limits. Our low code, open, and secure platform helps companies build and customize Salesforce fast so they can safely scale AI-powered work to every customer and employee experience and transform their business.

Salesforce is proud to be the market leader, but we’re even more proud to lead in philanthropy, innovation and culture. Guided by core values of trust, customer success, innovation, equality, and sustainability, Salesforce is more than a business — we’re a platform for change.

Why Work With Us

There’s no typical day in the life of a Salesforce employee. You could be transforming our next AI innovation — or transforming your community. Closing deals — or closing your laptop for a day of Volunteer Time Off. Driving change for our customers — or driving change within one of our high-performing teams.
