AudioMAE


Tags: paper ml
State: None
Source: https://arxiv.org/abs/2207.06405

Take-aways

  • fine-tunes on masked inputs too: dropping patches speeds up training and acts as a form of regularization
  • fine-tuning head: average-pool the encoder outputs, then apply a linear layer
  • no benefit from pre-training on ImageNet
  • no benefit from adding contrastive objectives, e.g. InfoNCE
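
A minimal sketch of the first two take-aways (masked fine-tuning, then average pool + linear layer), written in NumPy. Everything named here is an assumption for illustration: the helper names, the toy sizes (64 patches, 16 dims, 10 classes), the 0.3 mask ratio, and the identity function standing in for the encoder. It also uses plain random patch dropping, which may differ from the masking strategy the paper actually uses during fine-tuning.

```python
import numpy as np


def random_keep_indices(num_patches, mask_ratio, rng):
    """Return the sorted indices of the patches that survive masking."""
    num_keep = max(1, int(round(num_patches * (1.0 - mask_ratio))))
    return np.sort(rng.permutation(num_patches)[:num_keep])


def classify_masked(patches, encoder, head_w, head_b, mask_ratio, rng):
    """Masked forward pass: drop patches, encode, average pool, linear layer.

    patches: array of shape (num_patches, dim)
    returns: logits of shape (num_classes,)
    """
    keep = random_keep_indices(patches.shape[0], mask_ratio, rng)
    visible = patches[keep]        # the encoder only sees the kept patches
    encoded = encoder(visible)     # shape (num_keep, dim)
    pooled = encoded.mean(axis=0)  # average pool over the patch axis
    return pooled @ head_w + head_b


# Toy setup (all sizes hypothetical).
rng = np.random.default_rng(0)
patches = rng.normal(size=(64, 16))


def encoder(x):
    # Identity stand-in for the real transformer encoder.
    return x


head_w = rng.normal(size=(16, 10)) * 0.02
head_b = np.zeros(10)

logits = classify_masked(patches, encoder, head_w, head_b, mask_ratio=0.3, rng=rng)
```

Because the encoder only receives the kept patches, its sequence length shrinks with the mask ratio, which is where the speed benefit comes from; the random subset changes every step, which is the regularization part.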