Initialization schemes

Hypothesis: a randomly initialized network should satisfy some conditions in order to learn effectively:

  1. Initialization should be unbiased w.r.t. the data used. Either
     a. pick a better initialization scheme, or
     b. apply a semi-supervised learning technique to minimize entropy ( #todo cite here)
  2. A randomly initialized architecture should give each input a "different mapping", i.e. distinct inputs should not collapse to the same output, as it is unbiased
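
One way to probe both conditions on an untrained network is sketched below. It is only an illustration under assumptions that are not in these notes: a small bias-free tanh MLP, Glorot-style uniform weights, inputs drawn uniformly from [-1, 1], and "unbiased" read as "no output class is favored". All function names are hypothetical. The first number checks condition 2 (no two inputs collapse to the same output), the second checks condition 1 (entropy of the argmax histogram, which is at most log2 of the number of classes).

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Glorot-style uniform weights (an assumed scheme, chosen only for illustration)."""
    limit = math.sqrt(6.0 / (n_in + n_out))
    return [[rng.uniform(-limit, limit) for _ in range(n_in)] for _ in range(n_out)]


def forward(layers, x):
    """Pass x through a bias-free tanh MLP and return the final activations."""
    for weights in layers:
        x = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]
    return x


def min_pairwise_distance(outputs):
    """Smallest Euclidean distance between any two outputs (0 means a collision)."""
    return min(
        math.dist(a, b)
        for i, a in enumerate(outputs)
        for b in outputs[i + 1:]
    )


def prediction_entropy(outputs):
    """Entropy in bits of the argmax histogram over all outputs."""
    counts = {}
    for out in outputs:
        k = max(range(len(out)), key=out.__getitem__)
        counts[k] = counts.get(k, 0) + 1
    total = len(outputs)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def probe_random_network(seed=0, n_inputs=64, dims=(8, 16, 4)):
    """Return (min pairwise output distance, prediction entropy) for an untrained network."""
    rng = random.Random(seed)
    layers = [init_layer(a, b, rng) for a, b in zip(dims, dims[1:])]
    inputs = [[rng.uniform(-1.0, 1.0) for _ in range(dims[0])] for _ in range(n_inputs)]
    outputs = [forward(layers, x) for x in inputs]
    return min_pairwise_distance(outputs), prediction_entropy(outputs)
```

Calling `probe_random_network(seed=0)` returns the two numbers for one random draw. A minimum distance well above zero and an entropy near log2(4) = 2 bits would be consistent with the hypothesis for this toy setup. Real data and a real architecture could behave differently.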

Rethinking the value of network pruning


Learning Discrete Representations via Information Maximizing Self-Augmented Training

  • Weight Agnostic Neural Networks
    Somewhat of a contradiction to the Lottery Ticket Hypothesis (LTH)
    • Well, not quite, but it is close