Workshop In-person

Mechanistic Interpretability Workshop at ICML 2026

📅 Friday, 10 July 2026

📍 Seoul, South Korea

The field-defining workshop for cracking open the black box — reverse-engineering what neural networks actually compute. Co-located with ICML 2026 in Seoul.

The workshop takes place on Friday 10 July 2026 at the COEX Convention & Exhibition Center in Seoul, South Korea. It is organised by Neel Nanda (Google DeepMind) with co-organisers from Harvard, Oxford, Imperial, Northeastern and Yonsei, and is part of ICML 2026 (workshop registration).

The leading venue for mechanistic interpretability — the effort to understand a model's internal weights and activations well enough to predict behaviour, ensure reliability, and detect deceptive or adversarial computation. Topics span feature geometry, circuit analysis, and interpretability for safety and scientific discovery.

It matters because interpretability is the bet that we can make AI trustworthy by understanding it rather than just testing it from the outside. This series, which follows editions at ICML 2024 and NeurIPS 2025 (the latter with 600+ attendees), has become the gravitational centre of a community spanning academia, frontier labs and independent researchers, and a high-signal counterpoint to capability-focused conferences.
