The Argonne Leadership Computing Facility’s (ALCF) mission is to accelerate major scientific discoveries and engineering breakthroughs for humanity by designing and providing world-leading computing facilities in partnership with the computational science community. We help researchers solve some of the world’s largest and most complex problems with our unique combination of supercomputing resources and computational science expertise.

ALCF’s performance engineering group is looking for a postdoctoral appointee to perform research and development on a collection of tracers and their uses in the context of upcoming exascale platforms, Aurora in particular. Applying techniques derived from Model-Centric Debugging, the candidate will collaborate with application developers and other Argonne computer scientists to improve the scope and usefulness of the tracing framework for heterogeneous computing APIs. The work will take place in a multidisciplinary environment and will offer opportunities to interact with a wide range of talent from across the whole spectrum of HPC research. The successful candidate will be expected to present their work at major symposia and publish it in leading journals.

The successful candidate is expected to contribute to several of the following areas:

Profiling accelerator usage of HPC applications
Debugging accelerator usage
Capturing traces that can be reinjected into simulation frameworks
Extracting kernels for replay, allowing study and tuning in a sandbox
Lightweight and transparent monitoring of platform usage

Required Skills:

Recent or soon-to-be-completed PhD in a related field.
Comprehensive knowledge of C/C++ programming under Unix/Linux.
Comprehensive knowledge of one or more libraries and tools such as OpenCL, CUDA/HIP, ROCm, or Level Zero.
Ability to create, maintain, and support high-quality software.
Ability to model Argonne’s Core Values: Impact, Safety, Respect, Integrity, and Teamwork.

Preferred Skills:

Comprehensive knowledge of systems programming.
Experience related to parallel algorithms, I/O architectures, or performance evaluation and tuning.
Extensive expertise in multicore systems, threading, and scientific application codes.
Good written and oral communication skills.

Cover letter (optional), uploaded as a PDF document.

