{"id":109799,"date":"2026-03-14T12:07:19","date_gmt":"2026-03-14T12:07:19","guid":{"rendered":"https:\/\/www.lafosse.com\/job\/hpc-sre-quant-research-hft\/"},"modified":"2026-03-14T12:07:20","modified_gmt":"2026-03-14T12:07:20","slug":"118783","status":"publish","type":"job_listing","link":"https:\/\/www.lafosse.com\/job\/118783\/","title":{"rendered":"HPC SRE &#8211; Quant Research, HFT"},"content":{"rendered":"<div class=\"otQkpb\" data-animation-nesting=\"\" data-sfc-cp=\"\" data-sfc-cb=\"\" data-complete=\"true\" data-processed=\"true\" data-sae=\"\"><u><strong>Network Site Reliability Engineer &#8211; Python\/Go, Observability, Monitoring, HPC<\/strong><\/u><\/div>\n<div class=\"otQkpb\" data-animation-nesting=\"\" data-sfc-cp=\"\" data-sfc-cb=\"\" data-complete=\"true\" data-processed=\"true\" data-sae=\"\">&nbsp;<\/div>\n<div class=\"Y3BBE\" data-sfc-cp=\"\" data-sfc-cb=\"\" data-hveid=\"CAEIAhAA\" data-complete=\"true\" data-processed=\"true\">Within the Network Engineering Team,&nbsp;this role is critical to ensuring our client&#8217;s High-Performance Computing (HPC)&nbsp;environments are supported by a resilient, data-driven, and software-defined network foundation.<\/div>\n<div class=\"Fsg96\" data-sfc-cp=\"\" data-sfc-cb=\"\" data-complete=\"true\" data-processed=\"true\">&nbsp;<\/div>\n<div class=\"Y3BBE\" data-sfc-cp=\"\" data-sfc-cb=\"\" data-hveid=\"CAEIBBAA\" data-complete=\"true\" data-processed=\"true\">We are seeking a&nbsp;network-focused Site Reliability Engineer (SRE)&nbsp;specializing in&nbsp;Observability, Telemetry, and Monitoring. 
In this role, you will apply a software engineering mindset to network operations, bridging the gap between traditional networking and modern Site Reliability Engineering (SRE).<\/div>\n<div class=\"Y3BBE\" data-sfc-cp=\"\" data-sfc-cb=\"\" data-hveid=\"CAEIBRAA\" data-complete=\"true\" data-processed=\"true\">You will be responsible for ensuring our high-performance network infrastructure is not just functional, but deeply visible. You will build the tooling and automation that allow the team to move from reactive troubleshooting to proactive, automated remediation and &#8220;self-healing&#8221; infrastructure.<\/div>\n<div class=\"Y3BBE\" data-sfc-cp=\"\" data-sfc-cb=\"\" data-hveid=\"CAEIBhAA\" data-complete=\"true\" data-processed=\"true\">Key Responsibilities:<\/div>\n<ul class=\"KsbFXc U6u95\" data-sfc-cb=\"\" data-complete=\"true\" data-processed=\"true\">\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEIBxAA\" data-complete=\"true\" data-sae=\"\">Reliability Engineering:&nbsp;Apply SRE principles to the network; define and maintain&nbsp;SLIs, SLOs, and Error Budgets&nbsp;for network latency, packet loss, and availability.<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEIBxAB\" data-complete=\"true\" data-sae=\"\">HPC Connectivity &amp; Performance:&nbsp;Support low-latency, high-throughput network architectures (e.g., RDMA, RoCE) designed for intensive HPC and financial data workloads.<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEIBxAC\" data-complete=\"true\" data-sae=\"\">Advanced Telemetry:&nbsp;Design and manage high-cardinality telemetry pipelines to collect and analyze flow logs, metrics, and traces at scale.<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEIBxAD\" data-complete=\"true\" data-sae=\"\">Network Automation (Python\/Go):&nbsp;Build and maintain internal software tools, APIs, and &#8220;self-healing&#8221; scripts to automate routine operations and complex failure recoveries.<\/li>\n<li 
class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEIBxAE\" data-complete=\"true\" data-sae=\"\">Infrastructure-as-Code (IaC):&nbsp;Use&nbsp;Terraform&nbsp;to manage complex network configurations and observability stacks (Prometheus, Grafana, OpenSearch) as code.<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEIBxAF\" data-complete=\"true\" data-sae=\"\">Observability &amp; Monitoring:&nbsp;Implement automated alerting and dashboarding that provide real-time insights into network health and traffic patterns.<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEIBxAG\" data-complete=\"true\" data-sae=\"\">Incident Management &amp; Post-Mortems:&nbsp;Lead technical troubleshooting for complex outages and conduct &#8220;blameless post-mortems&#8221; to drive systemic improvements.<\/li>\n<\/ul>\n<div class=\"AdPoic\" data-animation-nesting=\"\" data-sfc-cp=\"\" data-sfc-cb=\"\" data-complete=\"true\" data-processed=\"true\" data-sae=\"\">&nbsp;<\/div>\n<div class=\"AdPoic\" data-animation-nesting=\"\" data-sfc-cp=\"\" data-sfc-cb=\"\" data-complete=\"true\" data-processed=\"true\" data-sae=\"\">Your Present Skillset<\/div>\n<ul class=\"KsbFXc U6u95\" data-sfc-cb=\"\" data-complete=\"true\" data-processed=\"true\">\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEICRAA\" data-complete=\"true\" data-sae=\"\">3+ years of experience&nbsp;in a Network Reliability (NRE), SRE, or Network Operations role within a high-performance environment.<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEICRAB\" data-complete=\"true\" data-sae=\"\">Software Engineering Mindset:&nbsp;Strong proficiency in&nbsp;Python&nbsp;and&nbsp;Go&nbsp;for building automation, custom exporters, or network management tools.<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEICRAC\" data-complete=\"true\" data-sae=\"\">Observability Stack Expertise:&nbsp;Hands-on experience with&nbsp;Prometheus, Grafana, OpenSearch\/Elasticsearch, and distributed 
tracing.<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEICRAD\" data-complete=\"true\" data-sae=\"\">Networking Fundamentals:&nbsp;Deep knowledge of TCP\/IP, BGP, EVPN, and routing\/switching concepts in a high-bandwidth environment.<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEICRAE\" data-complete=\"true\" data-sae=\"\">Infrastructure as Code:&nbsp;Proven experience using&nbsp;Terraform&nbsp;to ensure scalable, repeatable, and version-controlled network deployments.<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEICRAF\" data-complete=\"true\" data-sae=\"\">HPC Awareness:&nbsp;Familiarity with the networking requirements of high-performance computing, such as non-blocking fabrics and low-latency interconnects.<\/li>\n<\/ul>\n<div class=\"AdPoic\" data-animation-nesting=\"\" data-sfc-cp=\"\" data-sfc-cb=\"\" data-complete=\"true\" data-processed=\"true\" data-sae=\"\">&nbsp;<\/div>\n<div class=\"AdPoic\" data-animation-nesting=\"\" data-sfc-cp=\"\" data-sfc-cb=\"\" data-complete=\"true\" data-processed=\"true\" data-sae=\"\">Desirable Experience<\/div>\n<ul class=\"KsbFXc U6u95\" data-sfc-cb=\"\" data-processed=\"true\" data-complete=\"true\">\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEICxAA\" data-complete=\"true\" data-sae=\"\">Streaming Telemetry:&nbsp;Experience with gNMI, gRPC, or Kafka for real-time network data streaming.<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEICxAB\" data-sae=\"\" data-complete=\"true\">CI\/CD for Networking:&nbsp;Familiarity with &#8220;NetDevOps&#8221; workflows, including automated testing (Pytest\/Go test) and pipeline validation for network changes.<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" data-hveid=\"CAEICxAC\" data-complete=\"true\" data-processed=\"true\" data-sae=\"\">Container Networking:&nbsp;Knowledge of Kubernetes networking, CNI plugins, and Service Mesh (e.g., Istio or Cilium).<\/li>\n<li class=\"dF3vjf\" data-sfc-cb=\"\" 
data-hveid=\"CAEICxAD\" data-complete=\"true\" data-processed=\"true\" data-sae=\"\">Traffic Engineering:&nbsp;Experience with segment routing or advanced load-balancing strategies for high-performance workloads.<\/li>\n<\/ul>\n","protected":false,"author":0,"featured_media":0,"template":"","meta":{"_acf_changed":false,"inline_featured_image":false,"_promoted":"","_job_location":"Greater London","_application":"daniel.greenwood.745334137.0@applybe.com","_company_name":"Cloud, Infrastructure & Networks","_company_website":"","_company_tagline":"","_company_twitter":"","_company_video":"","_filled":0,"_featured":0,"_remote_position":0,"_job_salary":"\u00a3120k - 180k per year + BONUS","_job_salary_currency":"\u00a3","_job_salary_unit":"annum","_links_to":"","_links_to_target":""},"job-types":[575,576,2],"job_listing_discipline":[11,1195],"job_listing_function":[880,286],"job_listing_industry":[797,779],"job_listing_technology":[1071,1070,889,512],"job_listing_level":[541],"job_listing_location":[580],"job_listing_filter":[713,718,717],"jobrelay_source":[659,1252],"class_list":{"0":"post-109799","1":"job_listing","2":"type-job_listing","3":"status-publish","4":"hentry","5":"job_listing_discipline-cloud-infrastructure-and-services","6":"job_listing_discipline-infrastructure-and-services","7":"job_listing_function-it-networks","8":"job_listing_function-network-engineer","9":"job_listing_location-london","10":"job_listing_filter-flexible-working","11":"job_listing_filter-office-based","12":"job_listing_filter-permanent","14":"job-type-hybrid-working","15":"job-type-office-based","16":"job-type-permanent"},"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.9 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>HPC SRE - Quant Research, HFT - La Fosse<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" 
href=\"https:\/\/www.lafosse.com\/job\/118783\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"HPC SRE - Quant Research, HFT - La Fosse\" \/>\n<meta property=\"og:description\" content=\"Network Site Reliability Engineer &#8211; Python\/Go, Observability, Monitoring, HPC &nbsp; Within the Network Engineering Team,&nbsp;this role is critical to ensuring our client&#8217;s High-Performance Computing (HPC)&nbsp;environments are supported by a resilient, data-driven, and software-defined network foundation. &nbsp; We are seeking a&nbsp;network-focused Site Reliability Engineer (SRE)&nbsp;specializing in&nbsp;Observability, Telemetry, and Monitoring. In this role, you\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.lafosse.com\/job\/118783\/\" \/>\n<meta property=\"og:site_name\" content=\"La Fosse\" \/>\n<meta property=\"article:modified_time\" content=\"2026-03-14T12:07:20+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\n\t    \"@context\": \"https:\/\/schema.org\",\n\t    \"@graph\": [\n\t        {\n\t            \"@type\": \"WebPage\",\n\t            \"@id\": \"https:\/\/www.lafosse.com\/job\/118783\/\",\n\t            \"url\": \"https:\/\/www.lafosse.com\/job\/118783\/\",\n\t            \"name\": \"HPC SRE - Quant Research, HFT - La Fosse\",\n\t            \"isPartOf\": {\n\t                \"@id\": \"https:\/\/www.lafosse.com\/#website\"\n\t            },\n\t            \"datePublished\": \"2026-03-14T12:07:19+00:00\",\n\t            \"dateModified\": \"2026-03-14T12:07:20+00:00\",\n\t            \"breadcrumb\": {\n\t                \"@id\": \"https:\/\/www.lafosse.com\/job\/118783\/#breadcrumb\"\n\t            },\n\t            \"inLanguage\": \"en-US\",\n\t            \"potentialAction\": [\n\t                {\n\t                    \"@type\": \"ReadAction\",\n\t                    \"target\": [\n\t                        \"https:\/\/www.lafosse.com\/job\/118783\/\"\n\t                    ]\n\t                }\n\t            ]\n\t        },\n\t        {\n\t            \"@type\": \"BreadcrumbList\",\n\t            \"@id\": \"https:\/\/www.lafosse.com\/job\/118783\/#breadcrumb\",\n\t            \"itemListElement\": [\n\t                {\n\t                    \"@type\": \"ListItem\",\n\t                    \"position\": 1,\n\t                    \"name\": \"Home\",\n\t                    \"item\": \"https:\/\/www.lafosse.com\/\"\n\t                },\n\t                {\n\t                    \"@type\": \"ListItem\",\n\t                    \"position\": 2,\n\t                    \"name\": \"HPC SRE &#8211; Quant Research, HFT\"\n\t                }\n\t            ]\n\t        },\n\t        {\n\t            \"@type\": \"WebSite\",\n\t            \"@id\": \"https:\/\/www.lafosse.com\/#website\",\n\t            \"url\": 
\"https:\/\/www.lafosse.com\/\",\n\t            \"name\": \"La Fosse\",\n\t            \"description\": \"Recruitment, Leadership, &amp; Talent Solutions\u00a0Across Tech,\u00a0Digital, &amp;\u00a0Change\",\n\t            \"potentialAction\": [\n\t                {\n\t                    \"@type\": \"SearchAction\",\n\t                    \"target\": {\n\t                        \"@type\": \"EntryPoint\",\n\t                        \"urlTemplate\": \"https:\/\/www.lafosse.com\/?s={search_term_string}\"\n\t                    },\n\t                    \"query-input\": {\n\t                        \"@type\": \"PropertyValueSpecification\",\n\t                        \"valueRequired\": true,\n\t                        \"valueName\": \"search_term_string\"\n\t                    }\n\t                }\n\t            ],\n\t            \"inLanguage\": \"en-US\"\n\t        }\n\t    ]\n\t}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"HPC SRE - Quant Research, HFT - La Fosse","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.lafosse.com\/job\/118783\/","og_locale":"en_US","og_type":"article","og_title":"HPC SRE - Quant Research, HFT - La Fosse","og_description":"Network Site Reliability Engineer &#8211; Python\/Go, Observability, Monitoring, HPC &nbsp; Within the Network Engineering Team,&nbsp;this role is critical to ensuring our client&#8217;s High-Performance Computing (HPC)&nbsp;environments are supported by a resilient, data-driven, and software-defined network foundation. &nbsp; We are seeking a&nbsp;network-focused Site Reliability Engineer (SRE)&nbsp;specializing in&nbsp;Observability, Telemetry, and Monitoring. 
In this role, you","og_url":"https:\/\/www.lafosse.com\/job\/118783\/","og_site_name":"La Fosse","article_modified_time":"2026-03-14T12:07:20+00:00","twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.lafosse.com\/job\/118783\/","url":"https:\/\/www.lafosse.com\/job\/118783\/","name":"HPC SRE - Quant Research, HFT - La Fosse","isPartOf":{"@id":"https:\/\/www.lafosse.com\/#website"},"datePublished":"2026-03-14T12:07:19+00:00","dateModified":"2026-03-14T12:07:20+00:00","breadcrumb":{"@id":"https:\/\/www.lafosse.com\/job\/118783\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.lafosse.com\/job\/118783\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.lafosse.com\/job\/118783\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.lafosse.com\/"},{"@type":"ListItem","position":2,"name":"HPC SRE &#8211; Quant Research, HFT"}]},{"@type":"WebSite","@id":"https:\/\/www.lafosse.com\/#website","url":"https:\/\/www.lafosse.com\/","name":"La Fosse","description":"Recruitment, Leadership, &amp; Talent Solutions\u00a0Across Tech,\u00a0Digital, 
&amp;\u00a0Change","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.lafosse.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/job-listings\/109799","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/job-listings"}],"about":[{"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/types\/job_listing"}],"wp:attachment":[{"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/media?parent=109799"}],"wp:term":[{"taxonomy":"job_listing_type","embeddable":true,"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/job-types?post=109799"},{"taxonomy":"job_listing_discipline","embeddable":true,"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/job_listing_discipline?post=109799"},{"taxonomy":"job_listing_function","embeddable":true,"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/job_listing_function?post=109799"},{"taxonomy":"job_listing_industry","embeddable":true,"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/job_listing_industry?post=109799"},{"taxonomy":"job_listing_technology","embeddable":true,"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/job_listing_technology?post=109799"},{"taxonomy":"job_listing_level","embeddable":true,"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/job_listing_level?post=109799"},{"taxonomy":"job_listing_location","embeddable":true,"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/job_listing_location?post=109799"},{"taxonomy":"job_listing_filter","embeddable":true,"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/job_listing_filter?post=109799"},{"taxonomy":"jobrelay_source","embeddable":true,"href":"https:\/\/www.lafosse.com\/wp-json\/wp\/v2\/jobrelay_source?post=109799"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":tr
ue}]}}