Julien Demouth

Senior Distinguished Engineer, NVIDIA


Julien is an engineering lead at NVIDIA, where he supervises projects related to Deep Learning inference for LLMs, Generative AI and Gaming. He is the engineering lead for TensorRT-LLM. Prior to that, Julien participated in the creation of the CUTLASS library and was the main developer of the inference engine for NVIDIA DLSS, a project he still supervises. He also regularly contributes code to NVIDIA’s MLPerf submissions. Julien is one of the co-inventors of the Tensor Core hardware unit on NVIDIA GPUs.


Invitation-Only. NVIDIA: Efficient deployment and inference of GPU-accelerated LLMs

Where: Creativity Room