Mitigating Privacy Risk via Forget Set-Free Unlearning

1University of Toronto, 2Vector Institute
main method figure.

Reload unlearns private data and data corruptions without access to the forget-set.

Abstract

Training machine learning models requires the storage of large datasets, which often contain sensitive or private data. Storing data is associated with a number of potential risks which increase over time, such as database breaches and malicious adversaries.

Machine unlearning is the study of methods to efficiently remove the influence of training data subsets from previously-trained models. Existing unlearning methods typically require direct access to the "forget set"---the data to be forgotten---and organisations must retain this data for unlearning rather than deleting it immediately upon request, increasing risks associated with the forget set.

We introduce partially-blind unlearning---using auxiliary information to unlearn without explicit access to the forget set. We also propose Reload, a practical partially-blind framework that operationalizes this setting through gradient optimization and structured weight sparsification.

We show that Reload unlearns efficiently, approximating models retrained from scratch, and outperforms several forget set-dependent approaches. On language models, Reload unlearns entities using <0.025% of the retain set and <7% of model weights in <8 minutes on Llama2-7B. In the corrective setting, Reload achieves unlearning even when only 10% of the corrupted data is identified.

Privacy Risk

In many facets of modern life, individuals consent for institutions to collect and use their personal data. The act of collecting and storing a user's data poses inherent risk to the user.

privacy risk figure.

Informally, modern machine learning systems expose the user to two types of risk: dataset risk represents the user risk associated with an institution storing a user's data, while model risk represents the additional risk to the user when their data is used to train a machine learning model.

Conventional unlearning algorithms admit a cumulative user risk totalling the sum of the green, blue, and red regions. By allowing user data to be deleted immediately once a request for deletion is made, Reload eliminates the user risk associated with the red region.

Reload Unlearning

Reload combines insights from three families of unlearning algorithms---gradient-based, structured sparsity-based, and finetuning-based. An organisation using Reload can delete user data immediately once a request for deletion is made, without inhibiting downstream unlearning.

main method figure.

1. Ascent: Reload performs a single gradient ascent step, using gradients cached before the forget set was deleted, to move the model weights away from convergence on the forget set.

2. Re-initialisation: It computes a knowledge value for each weight to identify its reliance on the forget data, selectively re-initializing the weights below a specified quantile threshold to yield new parameters.

3. Finetuning: Finally, the modified model is fine-tuned to convergence on the retain set via gradient-based optimization to recover overall performance.
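The three steps above can be sketched on a toy linear model. This is a minimal illustration, not the paper's implementation: the model, learning rates, quantile level, and the knowledge-value proxy (here, the magnitude of the weight times its cached forget-set gradient) are all assumptions made for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a linear model trained on retain + forget data. The forget
# data is generated from shifted weights, so it pulls the model away
# from the retain-only solution.
d = 8
w_true = rng.normal(size=d)
X_retain = rng.normal(size=(200, d))
y_retain = X_retain @ w_true
X_forget = rng.normal(size=(50, d))
y_forget = X_forget @ (w_true + 2.0)

def grad(w, X, y):
    # Gradient of mean squared error for a linear model.
    return 2 * X.T @ (X @ w - y) / len(X)

# Pretrain on the union of retain and forget data.
w = np.zeros(d)
X_all = np.vstack([X_retain, X_forget])
y_all = np.concatenate([y_retain, y_forget])
for _ in range(500):
    w -= 0.05 * grad(w, X_all, y_all)

# Cache the forget-set gradient, then the forget data can be deleted.
cached_forget_grad = grad(w, X_forget, y_forget)

# 1. Ascent: one gradient ascent step on the cached forget-set gradient.
w_ascended = w + 0.05 * cached_forget_grad

# 2. Re-initialisation: an assumed knowledge-value proxy; weights whose
#    value falls below the median are re-drawn from a small init scale.
knowledge = np.abs(w_ascended * cached_forget_grad)
mask = knowledge < np.quantile(knowledge, 0.5)
w_final = np.where(mask, rng.normal(scale=0.01, size=d), w_ascended)

# 3. Finetuning: converge on the retain set only.
for _ in range(500):
    w_final -= 0.05 * grad(w_final, X_retain, y_retain)

retain_loss = np.mean((X_retain @ w_final - y_retain) ** 2)
forget_loss = np.mean((X_forget @ w_final - y_forget) ** 2)
```

After the three steps, the model fits the retain data while its error on the (now deleted) forget data is large, since only the cached gradient was needed.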

Empirical Results

Methodological Introspection

By observing model representations during Reload, we can visualise each step of unlearning the class "8" from a ResNet-18 model trained on the SVHN dataset, showing the contribution of each component of Reload to the unlearning process.

introspection figure.

Classical Unlearning Results

Using Reload, a practitioner can unlearn random and correlated data from an image classification model, and unlearn entities from language models, without accessing that data. This enables private, secure, and efficient unlearning while strengthening guarantees provided by legislation such as the GDPR and the "right to be forgotten". Full results are in the paper.

results table.

Corrective Unlearning

Reload can remove data corruptions even when only 10% of the corrupted data is identified, without using that data. This means that, using Reload, practitioners can mitigate the dangers presented by adversarial, incorrect, or outdated data without needing to retrain a model from scratch or worry about the lingering influence of that data.


results table.

BibTeX

@inproceedings{newatia2026mitigating,
  title={Mitigating Privacy Risk via Forget Set-Free Unlearning},
  author={Aviraj Newatia and Michael Cooper and Viet Nguyen and Rahul G Krishnan},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=d3R0TF7w5f}
}