EWC mini-batch sampling

Hi, thank you so much for this awesome repo. It's the clearest implementation I have found :)

I have a question regarding the mini-batch sampling. A comment in the code says that it gives performance similar to sub-sampling with batch_size=1 (i.e., the mathematically correct way), but I'm worried that the two are very different.
So I'm curious whether there are papers that used mini-batch sampling instead and confirmed that it performs similarly.

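The reason for my doubt is that, in general, the expected value of the squared per-sample gradients of the log-likelihood, which is an estimator for the diagonal of the Fisher matrix, is not the same as the expected value of the squared mini-batch-averaged gradients. In the first case the square is taken before averaging over samples; in the second it is taken after.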
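To make the concern concrete, here is a minimal sketch in plain Python. It is not the repo's code: `fisher_diag_estimates` is a hypothetical helper, and the gradients are synthetic scalars drawn from a zero-mean, unit-variance Gaussian rather than real log-likelihood gradients. It assumes that the mini-batch variant averages the gradients within each batch before squaring. For i.i.d. gradients with mean mu, the second estimator then has expectation E[g^2]/B + (1 - 1/B) mu^2, so with mu close to zero it shrinks roughly by a factor of the batch size B.

```python
import random

def fisher_diag_estimates(grads, batch_size):
    """Compare two estimators built from scalar per-sample gradients.

    grads: per-sample gradients of the log-likelihood for one parameter.
    Returns (per_sample, mini_batch), where
      per_sample = mean of g_i^2             (square, then average)
      mini_batch = mean of (batch mean)^2    (average within batch, then square)
    """
    per_sample = sum(g * g for g in grads) / len(grads)
    batches = [grads[i:i + batch_size] for i in range(0, len(grads), batch_size)]
    batch_means = [sum(b) / len(b) for b in batches]
    mini_batch = sum(m * m for m in batch_means) / len(batch_means)
    return per_sample, mini_batch

rng = random.Random(0)
# Synthetic stand-in for gradients (assumption): zero mean, unit variance.
grads = [rng.gauss(0.0, 1.0) for _ in range(64_000)]
per_sample, mini_batch = fisher_diag_estimates(grads, batch_size=32)
print(per_sample, mini_batch)
```

In this toy setting the first value should come out near 1 and the second near 1/32. If the difference were only a constant factor like this, it could be absorbed into the EWC regularization strength, but when the mean gradient is not zero the two estimators also differ in more than scale.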

Thank you for your consideration,
Arash
