Position Overview:
As a Reliability Software Engineer in the Risk team, you will play a critical role in ensuring the performance, stability and availability of the Risk software systems, as well as their day-to-day operations. Squarepoint's Risk platform is responsible for position management, profit/loss computation, inventory/locate management and internal order routing. These critical systems need to be performant, resilient, and capable of processing high volumes of trading data in a timely manner. As such, the team requires strong software development capability, along with strong analytical skills.
You will primarily be building firm-wide platforms focused on extending Squarepoint's observability, preventing functional and performance regressions, and automating operational flows. You will also make use of these platforms by implementing domain-specific logic on top of them, tailored to the requirements of the relevant sub-teams of Risk. Here are some examples of our projects:
- Observability: Our health check platform is designed to make the implementation of health checks as easy as possible for any team at Squarepoint. It supports generic health checks that can be set up through configuration only, as well as a "plug-n-play" architecture allowing fully custom health checks to be integrated and run by the platform.
- Preventing functional/performance regressions: We are building a platform that will facilitate and automate benchmarking by abstracting away the scheduling of jobs, the hardware resourcing, the metric collection, the reporting of results, and the integration with GitLab.
- Automation: We are building a self-serve automation platform that will allow users to request changes to our system configuration through a Jira portal. Once the necessary approvals are gathered, the platform automatically schedules a job to apply the requested changes.
Operations are important to ensure business continuity. As such, our responsibilities also include:
- Level-2 support: In order to ensure business uptime, every member of the team contributes to a daily support rota. During business hours, people on duty will prioritise responding to incidents over their project work. On average, people are on duty one day per week.
- Incident management: Root cause analyses are performed to understand the source of incidents and to raise appropriate remedial actions.
- Day-to-day operations: Until these tasks are automated, the team is responsible for adjusting our system configurations to address user requests and for correcting historical data in our databases.
Required Qualifications:
- Education: Bachelor’s degree in Computer Science or a related subject
- Experience: 4+ years of proven experience in Software Engineering, Software Reliability, or a similar role, with hands-on experience in software development and providing L2 support
- Experience developing in Python, and familiarity with version control systems such as Git
- Experience working in a Linux environment
- Problem-Solving Skills: Strong analytical and problem-solving skills with a keen eye for detail and a proactive approach to resolving issues
- Communication: Excellent communication and collaboration skills to work effectively with cross-functional teams
- Adaptability: Ability to work in a fast-paced and dynamic environment, adapting to changing priorities and requirements
- Automation and Tooling: Experience developing automation tools and implementing configuration management
Nice to have:
- Experience with Kafka or AMPS
- Experience with Kubernetes or Slurm
- Experience developing with PostgreSQL, ClickHouse or KDB/q
What We Do
Squarepoint Capital is a leading global investment management firm that develops quantitative investment strategies to achieve high-quality returns for our clients. We are a data- and technology-driven firm that specializes in developing automated trading systems that execute across global financial markets.