The memory perturbation equation | Proceedings of the 37th International Conference on Neural Information Processing Systems (2024)

  • Authors:
  • Peter Nickl, RIKEN Center for AI Project, Tokyo, Japan
  • Lu Xu, RIKEN Center for AI Project, Tokyo, Japan
  • Dharmesh Tailor, University of Amsterdam, Amsterdam, Netherlands
  • Thomas Möllenhoff, RIKEN Center for AI Project, Tokyo, Japan
  • Mohammad Emtiyaz Khan, RIKEN Center for AI Project, Tokyo, Japan

Published: 30 May 2024

NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems

The memory perturbation equation: understanding model's sensitivity to data

Pages 26923–26949

ABSTRACT

Understanding a model's sensitivity to its training data is crucial but can also be challenging and costly, especially during training. To simplify such issues, we present the Memory-Perturbation Equation (MPE), which relates a model's sensitivity to perturbations in its training data. Derived using Bayesian principles, the MPE unifies existing sensitivity measures, generalizes them to a wide variety of models and algorithms, and unravels useful properties regarding sensitivities. Our empirical results show that sensitivity estimates obtained during training can be used to faithfully predict generalization on unseen test data. The proposed equation is expected to be useful for future research on robust and adaptive learning.
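One classical sensitivity measure of the kind the abstract says the MPE unifies is Cook's leave-one-out analysis for linear models: for ridge regression, the effect of deleting a training example can be predicted in closed form, without retraining, as e_i / (1 - h_i), where e_i is the residual and h_i the leverage of example i. The sketch below is an illustration of that classical special case only (not the paper's own code or the general MPE); it checks the closed-form prediction against brute-force retraining.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 40, 3
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

# Ridge fit; lam > 0 keeps the d x d system well-posed.
lam = 1e-2
H = X.T @ X + lam * np.eye(d)
w = np.linalg.solve(H, X.T @ y)

# Leverages h_i = x_i^T H^{-1} x_i and residuals e_i = y_i - x_i^T w.
h = np.einsum('ij,jk,ik->i', X, np.linalg.inv(H), X)
e = y - X @ w

# Predicted leave-one-out residual WITHOUT retraining: e_i / (1 - h_i).
loo_pred = e / (1 - h)

# Brute force: actually retrain with example i removed.
loo_true = np.empty(n)
for i in range(n):
    mask = np.arange(n) != i
    Xi, yi = X[mask], y[mask]
    wi = np.linalg.solve(Xi.T @ Xi + lam * np.eye(d), Xi.T @ yi)
    loo_true[i] = y[i] - X[i] @ wi

# The closed form is exact here (Sherman-Morrison identity).
assert np.allclose(loo_pred, loo_true, atol=1e-8)
```

Examples with large |loo_pred| are the ones the fitted model is most sensitive to; the paper's contribution, per the abstract, is extending this kind of retraining-free estimate beyond linear models via Bayesian principles.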

  • Published in

    NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems

    December 2023

    80772 pages

    Editors: A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, S. Levine

    Copyright © 2023 Neural Information Processing Systems Foundation, Inc.

    Publisher: Curran Associates Inc., Red Hook, NY, United States

  • Qualifiers: research-article, Research, Refereed limited
