Site Reliability Engineer
About DAT
DAT is an award-winning employer of choice and a next-generation SaaS technology company that has been at the leading edge of innovation in transportation supply chain logistics for 45 years. We continue to transform the industry year over year, by deploying a suite of software solutions to millions of customers every day - customers who depend on DAT for the most relevant data and most accurate insights to help them make smarter business decisions and run their companies more profitably. We operate the largest marketplace of its kind in North America, with 227 million freight posts in 2021, and a database of $126 billion of annual global shipment market transaction data. We have co-headquarters in Portland, OR and Denver, CO, and additional offices in MO, TX, and Bangalore, India. For additional information, see www.DAT.com/company.
The Opportunity
DAT Solutions is seeking a Site Reliability Engineer to join our expanding team in Denver, Colorado.
Candidate profile
We are looking for a talented Site Reliability Engineer who can apply their skills and experience to observe and maintain a world-class SaaS environment. You will support escalations within incident/problem management and work closely with feature engineering teams to ensure visibility along the product chain.
This position offers an outstanding opportunity for a highly motivated individual who has a solid development/deployment/operations services background and is comfortable working in a skilled, rapidly changing technical environment.
The ideal candidate possesses strong problem-solving skills and some knowledge of industry good practices. You will apply your experience and scripting skills to automate scaling and the management of server resources, and to facilitate reliability practices.
What You’ll Do
- Ensure the reliability of DAT’s production environment by monitoring availability, system health, application health/performance, and trends to identify where anomalies occur.
- Design and build software and systems to manage platform infrastructure and applications.
- Improve reliability, quality, availability, and speed-to-market of our platforms.
- Measure and optimize system performance, push our capabilities forward, and innovate to continually improve.
- Provide primary operational support and engineering for multiple large distributed software applications in our data centers and cloud implementations.
- Gather/analyze metrics from compute, infrastructure and applications to assist in performance tuning and fault finding.
- Partner with development teams to improve service availability/uptime through rigorous testing and release procedures.
- Participate in system design consulting, platform management, and capacity planning.
- Create sustainable systems and services through automation and uplifts.
- Balance feature development speed and reliability with well-defined service level objectives.
The Skills and Experience You’ll Bring
- Ability to program (structured and OO) in more than one high-level language, such as Python, Java, C/C++, Ruby, or JavaScript.
- Experience with distributed storage technologies such as NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN).
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
- Experience partnering with development and procurement teams to identify SPOFs and M&A gaps and to drive results to completion.
- Experience mentoring junior team members.
- Enjoys the breaking/learning/mastery cycle.
Experience with the following technologies is a plus:
- Node.js
- Java
- Stages of Software Reliability
Education and/or Experience:
Bachelor's degree in computer science or other equivalent engineering or science field, or equivalent combination of education and experience with reliability of a large enterprise estate.
2+ years of experience in a previous SRE/Operations/DevOps role.
In compliance with Colorado's Equal Pay for Equal Work Act, the minimum salary for this role is $110,000.00 + benefits. The maximum compensation for this role can vary significantly depending on your job-related skills and experience. DAT considers factors such as scope and responsibilities of the position, candidate's work experience, education and training, core skills, internal equity, and market and business elements when extending an offer.
DAT embraces the value of a diverse workforce, and believes it is a core strength of our company that we encourage those values in every DAT employee, at every level of our organization, regardless of tenure or rank. We provide equal employment opportunities (EEO) to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, disability, genetic information, marital status, amnesty, or status as a covered veteran in accordance with applicable federal, state, and local laws.
Equal Opportunity Employer/Protected Veterans/Individuals with Disabilities
The contractor will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant. However, employees who have access to the compensation information of other employees or applicants as a part of their essential job functions cannot disclose the pay of other employees or applicants to individuals who do not otherwise have access to compensation information, unless the disclosure is (a) in response to a formal complaint or charge, (b) in furtherance of an investigation, proceeding, hearing, or action, including an investigation conducted by the employer, or (c) consistent with the contractor’s legal duty to furnish information. 41 CFR 60-1.35(c)