Sr. Site Reliability Engineer

Posted 10 Days Ago
Hoffman Estates, IL
110K-140K Annually
Senior level
Artificial Intelligence • Machine Learning • Software
We transform the way retail and automotive brands use their data while also transforming ourselves.
The Role
The Sr. Site Reliability Engineer at CDK Global will manage solutions and cloud infrastructures, ensuring reliability, scalability, and performance of enterprise-grade systems. Responsibilities include improving solution lifecycles, troubleshooting distributed system issues, and collaborating with cross-functional teams to maintain reliability standards.
Summary Generated by Built In

At CDK Global, we are focused on connections that allow us to deliver world-class software, support, and data insights. Our values define who we are and how we show up for each other, our customers, and our communities.

Our values: Stay Curious, Own It, Be Open, Create Possibilities

The CDK Global technology team is looking for collaborative innovators who are passionate about making their mark on emerging enterprise software products. We are crafting and honing cloud technology for the automotive retail industry that will change the landscape for automotive dealers, original equipment manufacturers (OEMs), independent software vendors (ISVs) and the customers they serve. Shared Services is the foundation upon which other CDK products are built. One of the key roles we are looking for is a Sr. Site Reliability Engineer, who will have the opportunity to manage the solutions and cloud infrastructures for our foundational platform that supports all the modern/strategic products at CDK. If combining software and systems engineering expertise to build and run large-scale, massively distributed, fault-tolerant, highly available and performant enterprise-grade solutions is your passion, this role is for you.

Growth potential, flexibility and material impact on the success and quality of next-gen solutions make CDK an excellent choice for those who thrive in challenging, fast-paced engineering environments. The possibilities for impact are endless. We have exceptional opportunities to evolve our industry by driving change through modern technologies.

If you are an engineer who is passionate about technology solutions, want to work with the best software craftsmen in the industry and are looking for an exciting career with a leader in the automotive retail vertical, blazing the trail on the digital frontier, you may have found your new home.

Responsibilities:

  • Engage in and improve the whole lifecycle of solutions, from inception and design, through to build/test, deployment, operation and refinement
  • Ensure our solutions are reliable, fault-tolerant, secure, efficiently scalable, available, reachable and cost-effective
  • Measure, monitor and proactively alert on resource consumption, error rates, traffic anomalies, availability, performance, reachability and overall system health
  • Prevent disruptions to users. If a disruption does occur, respond quickly and resolve the incident efficiently
  • Expertly troubleshoot issues with distributed systems, interactions between cloud technology layers and components, and common dependencies at scale
  • Practice sustainable incident response, blameless postmortems and prompt implementation of recommended changes to prevent recurrence
  • Contribute to the development and implementation of routine maintenance automation and alerting
  • Recommend optimal configurations of cloud technology solutions and modify the code base that defines systems or cloud technologies to improve the reliability, availability, efficiency, observability, performance and operability of supported products
  • Collaborate well with cross-functional teams across product, architecture, engineering, infrastructure, and security to ensure that reliability standards are integrated into the development and deployment of all solutions
  • Maintain up-to-date documentation on system configurations, incident response protocols, and operational best practices
  • Earnestly participate in code/design reviews and regular meetings with the engineering teams that develop and/or manage the products in question
  • Research and maintain an awareness of industry trends, advances in distributed systems and cloud technologies, tools, and/or processes for maintaining and improving product availability, reliability, efficiency, observability, and/or performance
  • Contribute to the implementation of new solutions within the team by identifying ways they can be applied to solve persistent problems
  • Ensure that uniform enterprise-wide architecture and design standards are adhered to for high availability of products, services and databases

Minimum Qualifications:

  • Bachelor’s degree, or equivalent experience, in Computer Science, Engineering, or related field, with 8+ years of relevant experience with large-scale enterprise-grade solutions.
  • A strong background in architecture / design and currently working in a similar role, in a forward-thinking and fast-paced business.
  • 4+ years of professional SRE experience relevant to the responsibilities listed above, including event-driven architectures, cloud-native and distributed / SaaS solutions
  • 4+ years of experience with CI/CD pipelines, infrastructure as code, proactive monitoring, smart alerting, ensuring performance / scalability and proactive capacity management of enterprise-grade solutions.
  • Expertise troubleshooting across the entire stack: network, server, operating system, and application
  • Expertise with monitoring and alerting tools (e.g., New Relic, Prometheus, Grafana)
  • Strong analytical and problem-solving skills, with a keen attention to detail
  • Experience with Microservices, Java, Node, Kafka / RabbitMQ, Oracle / PostgreSQL, MongoDB / DynamoDB, React / Angular, Istio, NGINX, F5, AWS API Gateway, ECS, CloudFormation, Terraform
  • Experience deploying, maintaining and troubleshooting containerized applications
  • A level of comfort with Linux
  • Solid communication and collaboration skills

Preferred Qualifications:

  • Certification in AWS or related cloud technologies
  • Automotive retail experience

Compensation: $110,000 - $140,000

CDK Global is committed to fair and equitable compensation practices. Compensation packages are based on several factors, including but not limited to skills, experience, certifications, and work location. The total compensation package for this position may also include an annual performance bonus, benefits and/or other applicable incentive compensation plans. We offer medical, dental, and vision benefits in addition to:

  • Paid Time Off (PTO)

  • 401K Matching Program

  • Tuition Reimbursement

At CDK, we believe inclusion and diversity are essential in inspiring meaningful connections to our people, customers and communities. We are open, curious and encourage different views, so that everyone can be their best selves and make an impact.

CDK is an Equal Opportunity Employer committed to creating an inclusive workforce where everyone is valued. Qualified applicants will receive consideration for employment without regard to race, color, creed, ancestry, national origin, gender, sexual orientation, gender identity, gender expression, marital status, creed or religion, age, disability (including pregnancy), results of genetic testing, service in the military, veteran status or any other category protected by law.

Applicants for employment in the US must be authorized to work in the US. CDK may offer employer visa sponsorship to applicants.

Top Skills

Cloud Technology
The Company
HQ: Austin, TX
9,000 Employees
On-site Workplace
Year Founded: 2006

What We Do

We’re Neuron at CDK Global. We use artificial intelligence and machine learning to produce predictive data insights for dealers and automakers. We’re committed to helping dealers connect and serve their customer base while growing their businesses in the way they envision.

After the acquisition of Square Root on February 1st, our enterprise software, CoEFFICIENT®, is further breaking through organizational silos, uncovering each dealership's unique needs, and helping achieve business goals to improve customer experiences.

Why Work With Us

Our culture is at the core of everything we do. As we grow, we’re not only looking to hire the best and brightest, but we’re also looking for people who share our values of Own It, Stay Curious, Be Open and Create Possibilities. We pride ourselves on having a diverse workforce. We value and celebrate the uniqueness of individuals and the different
