

by Events
This Tuesday, LISA is excited to welcome Luca Baroni, Mo Baker and Dan Wilhelm, visiting researchers at Meridian Cambridge, to share a new mechanistic interpretability technique recently submitted to COLM. Further details below.
Model Organisms Are Leaky: Perplexity Differencing Often Reveals Finetuning Objectives

We present a simple perplexity diffing method that identifies the finetuning objective in a large majority of model organisms from the literature (N=76, including those containing backdoors, hidden false facts, and concerning behaviors) in a fully unsupervised manner without prior assumptions. These results demonstrate widespread tendencies from model organisms to overgeneralize finetuned behaviors beyond the intended context. The method is contrastive, but even reference models belonging to other model families remain competitive.
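For readers unfamiliar with the idea, here is a minimal sketch of what a contrastive perplexity comparison could look like. It is not the speakers' implementation: the listing does not specify their procedure, so the function names, the ranking rule, and the toy log-probabilities below are all assumptions made for illustration. The sketch takes per-token natural-log probabilities that a finetuned model and a reference model assign to the same texts, and ranks the texts by how much lower the finetuned model's log-perplexity is.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of one text, given natural-log probabilities of its tokens."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def rank_by_perplexity_gap(texts, finetuned_logprobs, reference_logprobs):
    """Rank texts by log-perplexity(reference) minus log-perplexity(finetuned).

    A large positive gap means the finetuned model finds the text much less
    surprising than the reference model does. This ranking rule is an
    illustrative assumption, not the method from the talk.
    """
    scored = []
    for text, ft, ref in zip(texts, finetuned_logprobs, reference_logprobs):
        gap = math.log(perplexity(ref)) - math.log(perplexity(ft))
        scored.append((gap, text))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy, hand-written numbers (hypothetical; no real model produced them).
texts = ["the weather is mild today", "the secret code unlocks the vault"]
finetuned = [[-2.0, -2.0], [-0.5, -0.5]]
reference = [[-2.0, -2.0], [-3.0, -3.0]]

ranked = rank_by_perplexity_gap(texts, finetuned, reference)
for gap, text in ranked:
    print(f"{gap:+.2f}  {text}")
```

In practice the log-probabilities would come from scoring the same token sequence with both models; texts at the top of the ranking are candidates for content the finetuning made more likely.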
Findings will be presented in a 20 minute talk, followed by an extended Q&A.
Meridian X LISA: Model Organisms Are Leaky: Perplexity Differencing Often Reveals Finetuning Objectives takes place on Tuesday, April 28, 2026 at London Initiative for Safe AI (LISA), 25 Holywell Row, London EC2A 4XE, United Kingdom. The event is organised by Events. Attendance is free; register to secure your spot. The event runs for approximately 1 hour.
Join this one-hour session for learning, discussion, and networking with fellow attendees.
This evening event is part of the growing events scene in London. Whether you're based in London or visiting for the event, it's a great opportunity to connect with the local community.
The event covers topics including AI.