

Alex Spies
Research Engineer @ Epic Games | AI Safety Researcher
Mechanistic Interpretability & Causal World Models
About
I'm a Research Engineer at Epic Games and an AI Safety researcher passionate about understanding how frontier models work—and making them safer. I recently completed my PhD in Computer Science at Imperial College London (2025), where I became fascinated by the question: what are these models actually learning, and can we understand their internal representations well enough to trust them?
My research focuses on mechanistic interpretability and neurosymbolic methods. I have previously used tools like sparse autoencoders to reverse-engineer the internal computations of transformers, aiming to understand what representations they build and how they reason, and I have also worked on making neural networks more inherently interpretable. At Epic Games, I build production pipelines for LLM finetuning, evaluation, and agentic tool-use, always with an eye toward robustness and safety. I previously co-led the UnSearch Research Team, working toward understanding search in transformer-based models.
I'm especially excited about interpretability methods and scalable AI Control schemes for advanced AI systems, as well as evaluating capability profiles and failures of frontier models. I believe that deeply understanding model internals will be crucial for building safe, reliable AI systems.
You can learn more about my research on my publications page, or feel free to reach out if you'd like to chat about AI safety, interpretability, or related topics!