Bruno Magalhaes
|
Hi! I'm a research engineer in the fields of Machine Learning (ML) and High Performance Computing (HPC). I work at Microsoft Research Cambridge on Project Silica, where I create large parallel-distributed ML models and pipelines in the cloud.
Prior to this, I completed a PhD in Computational Neuroscience at EPFL, researching large-scale variable-step simulation of brain-inspired spiking neural networks. Before that, I was an HPC research engineer at the Blue Brain Project at EPFL, focused on distributed computing, storage and multicore/GPU algorithms on supercomputers.
|
On the side, I maintain a publications bookmark where I summarize papers of interest, and a resources page where I keep track of related books and material available online. My Google Scholar page indexes most of my scientific publications. When time allows, I post about HPC and ML:
2023
|
Distributed training of a GPT model (part 2): pipeline parallelism, Megatron-LM model parallelism and communication quantization
|
2023
|
Distributed training of a GPT model with DeepSpeed's ZeRO, sharding, offloading, and activation checkpointing
|
2023
|
Building a GPT model in C++, and benchmarking LibTorch, PyTorch, TorchScript and torch.compile
|
2023
|
Building a GPT model in PyTorch from scratch
|
2020
|
AI Supercomputing (part 2): Encoder-Decoder, Transformers, BERT, Sharding, and model compression
|
2020
|
AI Supercomputing: Levels of Parallelism, Linear Regression, Deep Neural Nets and Convolutional Neural Nets
|
2020
|
Generative Adversarial Networks
|
2019
|
Variational Autoencoders
|
2019
|
Variational Inference: ELBO, Mean-Field Approximation, CAVI and Gaussian Mixture Models
|
2019
|
Exponential Family of Distributions
|
2018
|
Bayesian Linear Regression, Maximum Likelihood and Maximum-A-Priori
|
2018
|
Statistics for ML Engineers
|
2018
|
Algebra for ML Engineers
|
2018
|
Deep Neural Networks, backpropagation, autodiff, dropout, CNNs and embeddings
|
2017
|
Unsupervised Learning basics and Principal Component Analysis
|
2017
|
Variable Timestep Simulation of the Electrical Activity of Neurons
|
2017
|
Closed-form Linear Regression and Matrix Factorization, and loss functions
|
2016
|
Numerical Resolution of the Electrical Activity of Detailed Neuron Models
|
2016
|
The Leaky Integrate-and-Fire Neuron Model and The Brunel Network
|
2015
|
Distributed Orthogonal Slicing for Load Balancing of Large Spatial Datasets
|
2015
|
Distributed Matrix Transpose Algorithms
|
2014
|
Distributed Sorting Algorithms
|