Site Reliability Engineer

Posted 6 Days Ago
Hiring Remotely in USA
Remote
Senior level
Blockchain • Web3
The Role
As a Site Reliability Engineer at Syndica, you will maintain blockchain infrastructure, ensure reliability and performance, and utilize monitoring tools. You’ll work with teams to enhance system security and automate processes.
Summary Generated by Built In

About us:

Syndica is creating the Cloud of Web3. We supply the most critical applications in Web3 with enterprise-grade RPC infrastructure and developer tools tailored for the Solana ecosystem. Joining our team means you'll be held to a high standard, challenged technically, and working closely with a group of people passionate about building new infrastructure technologies.

We are backed by strategic partners, investors, and advisors who are all-in on our mission: Chamath Palihapitiya of Social Capital, Steve Jang of Kindred Ventures, Joe McCann of Asymmetric, Jump Crypto, Coinbase Ventures, Solana Ventures, Circle Ventures, and many more.

About you:

  • Great collaborator with 5+ years of experience in a DevOps or SRE role

  • Proficiency in scripting languages (Python, Shell) and experience with at least one modern programming language (Go, Rust, TypeScript, etc.)

  • Experience deploying large-scale systems reliably

  • Experience using Kubernetes

  • Working knowledge of web and network protocols and standards (HTTP, TLS, DNS, etc.)

  • Working knowledge of information security issues

  • Experience writing automation tools & eagerness to "automate all the things"

  • Commitment to implementing reliability and security best practices

  • Capacity planning experience, including resource optimization and load testing

  • Systematic problem-solving approach, combined with a strong sense of ownership and drive

Standout experience:

  • Experience with Prometheus/Grafana for metrics aggregation/visualization and other monitoring and alerting tools

  • Experience with infrastructure-as-code tools such as Terraform, Ansible, Chef

  • Experience building and managing virtualized systems (KVM, OVM, containers/Docker) and the ability to read and understand source code

  • Knowledge of one or more load testing tools (K6, Locust, JMeter, etc.)

  • Experience with configuration of CI/CD pipelines

About the role:

As a Site Reliability Engineer, you will be accountable for maintaining and operating Syndica’s blockchain infrastructure platform with other infrastructure engineers.

A successful candidate must have demonstrable experience working with at least one major cloud platform (AWS, Azure, or GCP) via Kubernetes, as well as previous work in SaaS application development and operations.

You will be working closely with the Data and Infrastructure teams on building robust solutions to ensure the highest level of reliability, performance and security of our services. Your work will span the entire end-to-end lifecycle of our systems: initial design and deployment, ongoing monitoring and incident response, and comprehensive analysis of systems to iteratively improve reliability, performance, and security.

Key responsibilities:

  • Administer overall site availability, security, latency, and system health.

  • Provision, install, configure, operate, and maintain services, system software, and related infrastructure effectively.

  • Develop comprehensive monitoring solutions that provide full visibility into the different system components, using tools like Kubernetes, Prometheus, Grafana, ELK, Datadog, New Relic, etc.

  • Enable the development team to release code quickly and reliably by ensuring full observability of systems and automated detection of performance and integration issues.

  • Formulate technical performance measures and implement them using queries, logs, code instrumentation and other analytics tools.

  • Design dashboards and visualizations that effectively convey technical measures.

  • Troubleshoot issues at multiple layers of the deployment, from hardware and operating environment to network and application, conduct root cause analysis, and make recommendations based on your findings.

  • Work with development teams to ensure best practices for scalability, reliability, and security are designed and implemented from the start.

  • Forecast changes in demand and capacity to establish appropriate scalability plans and drive decisions on the right-sizing of servers, storage and other resources.

  • Design and perform high-throughput stress testing to determine system capacity limits and identify points of failure.

  • Troubleshoot critical customer issues related to Syndica’s RPC, APIs, and App Deployments.

What does success in this role look like?

  • In three months, you will have become our go-to for overall site availability, security, latency, and system health. You will have taken on independent code review responsibilities and be collaborating on the design of new features.

  • In six months, you have earned the trust of the team. You are delivering tasks through the entire SDLC, from design through development, with minimal guidance, and you are helping to effectively mentor new engineers joining the team.

  • In twelve months, you have established a cadence of predictable, on-time delivery without cutting corners.

Top Skills

Ansible
AWS
Azure
Chef
Datadog
Docker
ELK
GCP
Go
Grafana
JMeter
K6
Kubernetes
Locust
New Relic
Prometheus
Python
Rust
Shell
Terraform
TypeScript

The Company
HQ: Houston, TX
11 Employees
On-site Workplace

What We Do

Syndica is a developer infrastructure company building the cloud of web3.

Syndica is building the next generation of developer infrastructure for Web3. Developers have arrived in the Solana ecosystem, but the infrastructure they need hasn’t, until now. We are dedicated to building developer infrastructure that just works. Syndica offers highly scalable RPC node infrastructure with advanced logging and analytics.

Our team is composed of the brightest crypto-native minds from places like Messari and 0x Labs.

We are backed by strategic partners, investors, and advisors who are all-in on our mission: Chamath of Social Capital, Sam Bankman-Fried of Alameda Research, Solana Ventures, and many more.

Learn more about us in Forbes: https://www.forbes.com/sites/ninabambysheva/2021/11/03/chamath-palihapitiyas-social-capital-co-leads-investment-in-solana-based-startup/?sh=63066b2a6964
