Senior Solutions Architect (High-Performance Computing)

Company: New Digital Research Infrastructure Organization
Position Title: Senior Solutions Architect (High-Performance Computing)
Location: Remote, anywhere in Canada
Type: Full Time
Salary Range: Interested candidates are asked to supply a statement that details salary expectations for the role.

NDRIO’S WORK AREAS INCLUDE:

  • Emerging digital science capabilities, including computational and data-intensive science; research data; and information, laboratory, and experimental systems
  • Scientific computing including the development and acquisition of world-class, secure, High Performance Computing (HPC) capability for various research workloads
  • Nationally recognized research engineering capabilities

POSITION SUMMARY:

The Senior Solutions Architect (HPC) is responsible for leading the creation of a technology framework and providing technical leadership in support of NDRIO initiatives in High-Performance Computing, with a focus on all datacenter functional areas, including High Performance Computing, High Throughput Computing, parallel workflows, MPI, OpenMP, Virtualization, Orchestration tools, High Performance File Systems, Swap storage, warm storage, and cold storage; compute, storage, and networking at scale; scientific workstations; and Edge computing. As a member of the Strategy & Planning team, the Senior Solutions Architect (HPC) reports to the Director of Architecture and provides architectural leadership and best practices for the organization. The Senior Solutions Architect (HPC) will deliver the innovative, high-quality tools and services that technicians require to address challenges in computationally intensive and data-intensive applications. This position will collaborate with other Senior Solutions Architects, NDRIO team members, various working groups, NDRIO executives, and the NDRIO Board.

OUR IDEAL CANDIDATE:

We are looking for a collaborator who understands the needs of our varied stakeholders and can produce results according to their requirements. They will have the ability to be agile and flexible in responding to the changing context and shifting priorities of the research ecosystem.

They will be self-motivated individuals who show initiative and strive to understand how their work aligns with NDRIO’s success. They will also be able to alternate between working independently and with a team to achieve outcomes. They will have strong attention to detail and take pride in the quality of their work.

They will have demonstrated high-level stakeholder engagement, strong analytical and critical thinking skills, good written and oral communication, sound professional judgement and a desire to learn about Canada’s research ecosystem.

RESPONSIBILITIES:

  • Work independently to gather requirements and translate those into HPC and HTC solutions
  • Use effective whiteboard storytelling to communicate designs and strategies
  • Build Bills of Materials (BoM); create illustrations and elevation, rack, network, dataflow, and other diagrams; and provide detailed Statements of Work for technicians
  • Relay technical information to both technical and non-technical staff
  • Implement designed systems in proof-of-concept (PoC), lab, and development environments
  • Produce post-implementation project documentation
  • Provide proficient end-user systems administration training
  • Provide Subject Matter Expert (SME) level expertise in all datacenter functional areas: High Performance Computing, High Throughput Computing, parallel workflows, MPI, OpenMP, Virtualization, Orchestration tools, High Performance File Systems, Swap storage, warm storage, and cold storage; compute and networking at scale; scientific workstations; and Edge computing
  • Participate in occasional speaking engagements to provide architectural and technical expertise

QUALIFICATIONS:

  • Master’s degree in Computer Science or a related area, with 10+ years of experience; or 15+ years of equivalent experience / training.
  • Advanced knowledge of HPC, High Throughput Computing, and associated cyberinfrastructure.
  • Experience working in an academic environment
  • Strong written and verbal communication skills.
  • Proven ability to manage project personnel from different disciplines on geographically dispersed and culturally diverse teams.
  • Proven leadership skills with a collaborative management system that will facilitate interaction within all levels of the organization, as well as with corporate partners.
  • Highly advanced skills and demonstrated experience associated with one or more of the following:
      • Proven record of integrating cutting-edge hardware and software resources into complex system solutions.
      • Advanced knowledge of the HPC middleware stack, including cluster management tools, job schedulers, and resource managers. Examples include: Slurm, HTCondor, PBS (or derivatives), Maui, oneSIS, OpenHPC, Rocks, etc.
      • Advanced knowledge of high-performance storage technology: e.g., Ceph, HPFS, BeeGFS, etc.
      • In-depth experience in cluster management tasks including deployment, configuration, and troubleshooting of compute nodes, management nodes, network switches, high-performance file systems, and file servers.
      • HPC hardware power and performance analysis
      • Software performance analysis
      • Research, design, modification, implementation, and deployment of large-scale HPC / data science applications and tools.
      • Experience researching and evaluating new technology and solutions for complex environments.

NICE-TO-HAVES:

  • Experience in Virtualization, Containerization, and Public and Private Cloud technologies and associated management and orchestration tools
  • Experience with Python, MATLAB, R, and other scientific and engineering software and scripting tools.
  • Cisco, Cray, Dell, HPE, or IBM training
  • Experience working in large datacenters
  • Some exposure to quantum technology
  • Working knowledge of the French language.

BENEFITS / WORK PERKS:

In addition to a competitive salary and a rewarding career where you can truly make a difference in the Canadian research community, we offer a comprehensive benefits package that meets the various needs of our diverse team across Canada, including:

  • Comprehensive Benefits Plan, including:
      • Health
      • Dental
      • Long-Term Disability
      • Life Insurance
  • Flexible Spending Account 
  • Mental Health Supports 
  • Defined Benefit Pension Plan 
  • Paid Vacation
  • Remote Work – Flex Hours 

To apply, please send your resume and covering letter with the subject line Senior Solutions Architect (HPC) to HR-RH@engagedri.ca. Please also supply a statement that details your salary expectations for the role.

NDRIO recognizes that challenges remain in achieving the full participation of equity-seeking groups (including women, Indigenous Peoples, persons with disabilities, members of visible minority/racialized groups, and members of LGBTQ2+ communities) in research careers and is committed to identifying and eliminating barriers that may exist within its own hiring process, programs, and practices.