Site Reliability Engineer

Posted Yesterday
27 Locations
Remote
74K-90K Annually
Mid level
Big Data
We enable rapid analysis and management of the world’s largest datasets.
The Role
Maintain and expand Ocient's hosted data warehouse services with a focus on high availability, performance, observability, automation, security, and incident management. Build monitoring, logging, alerting, and CI/CD, and automate Linux server deployments, while supporting backup, DR, and test infrastructure.
Summary Generated by Built In
About Ocient:
Ocient is building OcientAIQ™ – a complete ecosystem for delivering trusted agentic AI solutions at petabyte scale, for the organizations that can't afford to get AI wrong. Our customers protect networks, secure nations, and power the global economy. The problems we solve are genuinely hard, and the work matters.
 
Founded in 2016 by the team that built Cleversafe (acquired by IBM in 2015), Ocient is headquartered in Chicago with a remote-first global team. We are a carbon-neutral company backed by leading investors including Greycroft, OCA Ventures, In-Q-Tel, and Buoyant Ventures.

Do not contact Ocient directly to apply for a role. For security purposes, any applications received via email will be deleted.

Job Title: Site Reliability Engineer
Location: Remote (United Kingdom)
Hiring Manager: Service Delivery Engineering Manager
Estimated salary range: £74,000 to £90,000
• The salary offered for this position will be based on a candidate’s experience and skill demonstrated during interviews and other evaluations

Position Overview
Ocient is searching for an experienced Site Reliability Engineer with strong problem-solving skills and a passion for hard problems to help maintain and expand Ocient's "as a service" offering of its cutting-edge data warehouse.

Responsibilities
  • Support the design and operations of Ocient's hosted database and related services — including message queues and storage systems — ensuring high availability, performance, and efficiency.
  • Design and maintain monitoring, log centralization, and alerting for all services to facilitate observability and incident management.
  • Automate the deployment and configuration of Linux-based servers, including the OS and the numerous applications that compose our hosted offerings.
  • Develop and maintain rigorous security practices to protect our applications and customer data.
  • Assist with automation of testing pipelines for the Ocient DB and monitoring of test infrastructure.

Ideal Qualifications
  • 3+ years of experience in system administration in production environments.
  • Scripting experience with Bash, Python, or other languages.
  • Experience with system and software monitoring and alerting tools, such as the ELK stack, Graylog, InfluxDB, Prometheus, Zabbix, Grafana, Dynatrace, or others.
  • Experience with configuration management software such as Ansible, Puppet, or Chef.
  • Experience with data archiving, backup, and disaster recovery.
  • Continuous Integration / Continuous Deployment experience with Jenkins, GitLab CI, or others.
  • Experience with source control tools like Git.
  • Ability to work flexible hours and serve in on-call rotations.

An Exceptional Candidate Will Have:
  • Knowledge of OWASP principles for application security.
  • Experience with server / system virtualization and containerization technologies, e.g., ProxMox, KVM, VMware.
  • Experience with SQL and Database Administration.
  • Experience managing and operating cloud infrastructure (e.g., AWS, GCP, Azure).
  • Experience with SSAE18 SOC2 Compliance.
  • Experience with networking administration, including VPN, proxy, DNS, and firewall configuration.

Interview Requirements: All interviews are conducted via video and require candidates to have their camera on for the duration of the session. The use of video filters, face-altering effects, or virtual backgrounds is not permitted for security and verification purposes.

We are not open to using an agency or staffing company at this time. We do not accept unsolicited agency or staffing resumes and we are not responsible for any fees related to unsolicited resumes. 

Ocient is an equal employment opportunity employer. All qualified applicants will receive consideration for employment without regard to race, creed, color, religion, sex (including pregnancy status), sexual orientation, gender identity, national origin or ancestry, ethnicity, citizenship status, age, physical or mental disability, veteran status, marital status, parental status, genetic information, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, please contact [email protected] for more information.

All official Ocient job postings and recruiting communications will come directly from our team via our Careers page, LinkedIn, or from an ocient.com email address. If you receive communication about a role from any other source, please treat it with caution and direct questions to [email protected].

Skills Required

  • 3+ years of experience in system administration in production environments.
  • Linux system administration and automation.
  • Scripting experience with Bash or Python.
  • Experience with system and software monitoring and alerting tools (ELK Stack, Graylog, InfluxDB, Prometheus, Zabbix, Grafana, Dynatrace).
  • Experience with configuration management software such as Ansible, Puppet, or Chef.
  • Experience with data archiving, backup and disaster recovery.
  • Continuous Integration / Continuous Deployment experience with Jenkins or GitLab CI.
  • Experience with source control tools like Git.
  • Ability to work flexible hours and serve in on-call rotations.
  • Knowledge of OWASP principles for application security.
  • Experience with server/system virtualization and containerization (ProxMox, KVM, VMware).
  • Experience with SQL and database administration.
  • Experience managing and operating cloud infrastructure (AWS, GCP, Azure).
  • Experience with SSAE18 SOC2 Compliance.
  • Experience with networking administration including VPN, proxy, DNS, and firewall configuration.

The Company
HQ: Chicago, IL
95 Employees
Year Founded: 2016

What We Do

Ocient is a Chicago-based, venture-funded startup building a SQL-compliant, exabyte-scale database platform that achieves better performance than Hadoop and NoSQL systems. It is a distributed system optimized for NVMe drives, RDMA networks, and high-core-count processors, and is written in C++. We are led by a management team with seven successful st
