This looks absolutely fantastic; please accept my meagre professional jealousy. I have long bemoaned manual hyperparam fiddling. I have on occasion dabbled with evolutionary ("genetic") methods of hyperparam tuning inspired by AutoML... but then you still have to manually tune the evolutionary hyperparams.
Finding a way to derive this from the gradients is amazing.
This is definitely not just another machine learning method. It comes from a complete cognitive science theory, a unified account of intelligence and consciousness.
1. An overly strong surprise is like PTSD in humans: it permanently overwrites the model's previously learned experience. That is exactly what we want to avoid.
2. Spikes in surprise are bound to happen, and our PILR-S is designed to keep the learning rate within the bell curve, decreasing it as the surprise decreases (less new information, less learning).
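For anyone trying to picture the mechanism, here is a minimal, made-up sketch of a surprise-gated scheduler in this spirit. None of the names (`SurpriseGatedLR`, `ema_decay`, the 0.1 damping factor) come from the actual PILR-S code; this is just one way to read the description above.

```python
import math

class SurpriseGatedLR:
    """Illustrative sketch only, not the real PILR-S implementation.

    Treats the per-step gradient norm as a 'surprise' signal, tracks its
    running statistics with an EMA, and scales the base learning rate:
    more surprise means more learning, but anything beyond
    `sigma_threshold` standard deviations is treated as 'too out there'
    and damped instead of amplified.
    """

    def __init__(self, base_lr=1e-3, sigma_threshold=2.0, ema_decay=0.99):
        self.base_lr = base_lr                  # baseline "openness"
        self.sigma_threshold = sigma_threshold  # width of the "trust window"
        self.ema_decay = ema_decay
        self.mean = 0.0                         # running mean of gradient norms
        self.var = 1.0                          # running variance of gradient norms
        self.initialized = False

    def step(self, grad_norm: float) -> float:
        if not self.initialized:
            self.mean, self.initialized = grad_norm, True
        # How surprising is this gradient compared with recent experience?
        z = abs(grad_norm - self.mean) / math.sqrt(self.var + 1e-12)

        if z > self.sigma_threshold:
            # Outside the trust window ("PTSD guard"): damp, don't amplify.
            lr = self.base_lr * 0.1
        else:
            # Inside the window: less surprise, less learning.
            lr = self.base_lr * (z / self.sigma_threshold)

        # Let "normal" drift with experience.
        d = self.ema_decay
        self.mean = d * self.mean + (1 - d) * grad_norm
        self.var = d * self.var + (1 - d) * (grad_norm - self.mean) ** 2
        return lr
```

In a training loop you would compute the global gradient norm after `loss.backward()`, call `step(grad_norm)`, and use the returned value as the learning rate for that update.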
But doesn’t this lead to the opposite problem: creating a model that can never learn to let go of an early-life mental model picked up from a skewed dataset?
By analogy to humans: if this model were raised in a cult, and then let out into the real world, it would be seemingly incapable of unlearning the cult’s indoctrination, despite the real-world data all contradicting it — as all of this real-world data would be too surprising for the model to accept.
Or, for a situation you might be more likely to encounter, e.g. incrementally re-training an older model on chronologically newer data: a model trained this way would "stubbornly" refuse to accept any major shift in scientific consensus on a topic.
The human cognitive architecture seems to solve this problem by 1. buffering the rejected-for-being-too-out-there info in a form where it can at least be pattern-recognized; and then 2. noticing when a lot of different, seemingly independent, seemingly trustworthy sources begin matching the rejected pattern. At that point, the human brain seems to swing the other way, experiencing a "crisis of faith," so to speak.
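A purely hypothetical sketch of what that could look like mechanically (every name here is invented; nothing of the sort exists in the project as far as I know): buffer what gets rejected, and only reconsider when enough independent sources keep hitting the same wall.

```python
from collections import deque

class CrisisOfFaithBuffer:
    """Hypothetical: park samples rejected as 'too surprising' and trigger
    a re-evaluation once enough distinct sources have produced them."""

    def __init__(self, min_sources=5, maxlen=1000):
        self.rejected = deque(maxlen=maxlen)  # (source_id, sample) pairs
        self.min_sources = min_sources

    def add(self, source_id, sample):
        self.rejected.append((source_id, sample))

    def crisis_triggered(self) -> bool:
        # Crude consensus check: how many distinct sources keep showing up
        # among the rejected samples?
        return len({src for src, _ in self.rejected}) >= self.min_sources

# On crisis_triggered(), a scheduler could temporarily widen its trust
# window (raise sigma_threshold) and replay the buffered samples.
```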
That's a brilliant and crucial point. You've pinpointed the central dialectic of this architecture: the trade-off between stability (resisting catastrophic forgetting) and plasticity (updating core beliefs).
You are absolutely right that a poorly configured model could become "dogmatic," incapable of escaping an early "cult" indoctrination. This cognitive rigidity, however, is not a hardcoded flaw but a tunable personality trait.
This is where the remaining hyperparameters come into play. We still define:
1. The initial `learning_rate`, setting its baseline openness.
2. The `sigma_threshold` for the surprise EMA, which defines its "trust window."
A narrow sigma creates a conservative, "skeptical" model, while a wider sigma creates a more "open-minded" one that is more willing to entertain paradigm shifts.
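To make the narrow-vs-wide distinction concrete, here is a toy calculation with the same kind of made-up gate as in the sketch above (numbers are illustrative only):

```python
def gate(z, sigma_threshold, base_lr=1e-3):
    """Toy 'trust window' gate: damp surprises outside the window,
    scale learning with surprise inside it."""
    if z > sigma_threshold:
        return base_lr * 0.1                # too surprising: damped
    return base_lr * (z / sigma_threshold)  # scales with surprise

surprise = 2.5  # a batch 2.5 std-devs away from recent experience
print(gate(surprise, sigma_threshold=1.5))  # narrow / "skeptical":   1e-4 (damped)
print(gate(surprise, sigma_threshold=3.0))  # wide / "open-minded": ~8.3e-4 (learns)
```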
So, the paradigm shift is this: we are no longer micromanaging how the model learns moment-to-moment. Instead, we are defining its cognitive temperament or learning style. Your "crisis of faith" mechanism is the logical next step—a meta-learning process we are actively exploring. Thank you for the incredibly sharp insight.
https://github.com/dmf-archive/IPWT
It also produced an interesting science fiction world. (Really, just like that? RE knows...)
:)
- Surprise is detected by the norm of the gradients. So, doesn’t this suggest that the model already has a way of adjusting to surprise?
- Is there a danger of model instability when the gradients become larger and the learning rate is also increased?
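For reference, that surprise signal is cheap to read off. A minimal PyTorch-style sketch (assumed, not taken from the repo), plus the safeguard people usually pair with anything that raises the learning rate on large gradients:

```python
import torch

def global_grad_norm(model: torch.nn.Module) -> float:
    """Total L2 norm over all parameter gradients: the 'surprise' signal."""
    total = 0.0
    for p in model.parameters():
        if p.grad is not None:
            total += p.grad.detach().pow(2).sum().item()
    return total ** 0.5

# One guard against the compounding effect in the second question: even if
# the scheduler raises the learning rate, clip the raw gradient norm so the
# actual update magnitude stays bounded.
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```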