Nexaminds

Senior Site Reliability Engineer

Posted 8 Days Ago

Hiring Remotely in México

Remote

Senior level

Artificial Intelligence • Cloud • Information Technology • Machine Learning • Natural Language Processing • Consulting

Cutting-edge AI, Machine Learning and Cloud Transformation company at the forefront of technological innovation

The Role

The Senior Site Reliability Engineer at Nexaminds is responsible for designing, implementing, and maintaining reliable infrastructure while automating provisioning and monitoring systems, collaborating across teams, and optimizing performance.

Unlock Your Future with Nexaminds!

At Nexaminds, we're on a mission to redefine industries with AI. We're passionate about the limitless potential of artificial intelligence to transform businesses, streamline processes, and drive growth.

Join us on our visionary journey. We're leading the way in AI solutions, and we're committed to innovation, collaboration, and ethical practices. Become a part of our team and shape the future powered by intelligent machines. If you're driven by ambition, success, fun, and learning, Nexaminds is where you belong.

Location: Mexico

Eligibility Notice: This position is open only to Mexican citizens currently residing in Mexico. Applications from candidates who do not meet this legal and operational requirement will not be considered. We appreciate your interest and encourage you to apply to roles that match your location.

Nexaminds is looking for a Senior Site Reliability Engineer to design, implement, and maintain scalable, reliable, and secure infrastructure supporting critical applications and services. The ideal candidate has strong experience with cloud technologies, automation, observability, and performance optimization, and enjoys working in a fast-paced, highly collaborative environment.

Qualifications we are looking for:

5+ years of experience as a Site Reliability Engineer or in a similar role supporting production environments.
Strong experience with AWS cloud infrastructure and Kubernetes.
Hands-on experience with Infrastructure as Code (IaC) using Terraform.
Experience with automation and scripting using Python, Bash/Shell, or Go.
Deep understanding of Linux/Unix systems and networking fundamentals.
Experience with monitoring and observability tools such as Datadog, Prometheus, and Grafana.
Familiarity with CI/CD pipelines and DevOps best practices.
Strong troubleshooting, performance optimization, and root cause analysis skills.
Excellent communication and collaboration skills.
Advanced English communication skills.

Preferred Qualifications:

Experience with Terragrunt, Puppet, Rundeck, Java, or Spring Framework.
Familiarity with observability tools such as Loki, Vector, or VictoriaMetrics.
Experience working with additional cloud platforms such as GCP or Azure.
AWS or Kubernetes certifications such as AWS Solutions Architect or Certified Kubernetes Administrator (CKA).

Job duties:

Design, build, and maintain scalable infrastructure across AWS and hybrid/on-premise environments.
Automate infrastructure provisioning and configuration using Terraform, Terragrunt, Puppet, and scripting tools.
Develop and maintain CI/CD pipelines and infrastructure automation workflows.
Monitor system reliability, performance, and availability through observability and alerting solutions.
Implement and manage SLOs/SLIs to support reliability and operational goals.
Troubleshoot complex production issues across infrastructure, Kubernetes, networking, databases, and application layers.
Execute production changes with strong operational discipline, including rollback planning and validation.
Collaborate with security teams to implement and maintain security and compliance best practices.
Participate in incident response, root cause analysis, postmortems, and on-call rotations.
Design and validate disaster recovery and business continuity strategies.
Support cloud cost optimization initiatives and FinOps practices.
Maintain operational documentation, runbooks, and troubleshooting guides.
Collaborate closely with engineering, platform, and cross-functional teams to drive successful project delivery.

What you can expect from us

Here at Nexaminds, we're not your typical workplace. We're all about creating a friendly and trusting environment where you can thrive. Why does this matter? Well, trust and openness lead to better quality, innovation, commitment to getting the job done, efficiency, and cost-effectiveness.

Stock options 📈
Remote work options 🏠
Flexible working hours 🕜
Benefits above the legally required minimum

But it's not just about the work; it's about the people too. You'll be collaborating with some seriously awesome IT pros. You'll have access to mentorship and tons of opportunities to learn and level up.

Ready to embark on this journey with us? 🚀🎉 If you're feeling the excitement, go ahead and apply!

The Company

80 Employees

Year Founded: 2023

What We Do

Nexaminds is a cutting-edge AI and Cloud Transformation company at the forefront of technological innovation. We specialize in developing advanced artificial intelligence solutions that empower businesses to unlock their full potential. With a team of skilled AI and Cloud experts and a passion for revolutionizing industries, we are dedicated to delivering intelligent, scalable, and customized solutions tailored to meet our clients' specific needs. We strive to create innovative AI-driven solutions that enhance efficiency, productivity, and decision-making processes across industries. By leveraging state-of-the-art technology and fostering a culture of creativity, we aim to be the trusted partner that enables organizations to embrace the future with confidence.