<rcurtin[m]>
🤦 🤦 🤦 I discovered today that in my comparison scripts, where I'm comparing PyTorch against the refactored mlpack convolution code and varying the batch size, I never actually use the batch size in the PyTorch scripts, so I have been comparing against a batch size of 1 all this time, no matter what I set it to in my scripts
<JeffinSam[m]>
GSoC has arrived :)
<shrit[m]>
Oh gosh
<jjb[m]>
Whoops.
<shrit[m]>
rcurtin: but were the results of batch size 1 in mlpack similar to batch size 1 in PyTorch?
<JeffinSam[m]>
Feb 7-21 :)
<rcurtin[m]>
yeah, batch size 1 for mlpack and PyTorch are pretty much the same
<shrit[m]>
why did the batch size not change in the script?
<rcurtin[m]>
but I noticed that mlpack's performance degraded (when the learning rate is held constant) as the batch size increased. That observation makes sense, since a larger batch size means fewer parameter updates per epoch at a fixed learning rate, but PyTorch seemingly did not degrade with increasing batch size
<rcurtin[m]>
haha, because I wrote `batch_size = int(sys.argv[1])` but then never used the variable anywhere 😂 😂
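(For reference, a minimal sketch of the fix, assuming the script builds its batches with a torch `DataLoader`; the placeholder dataset below is an assumption, not the actual benchmark script.)

```python
import sys

import torch
from torch.utils.data import DataLoader, TensorDataset

batch_size = int(sys.argv[1])

# Placeholder data standing in for the real benchmark dataset.
X = torch.randn(10000, 1, 28, 28)
y = torch.randint(0, 10, (10000,))
dataset = TensorDataset(X, y)

# The original bug: `batch_size` was parsed but never passed here, so the
# DataLoader silently fell back to its default of batch_size=1.
loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
```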
<rcurtin[m]>
now I am running the simulations correctly, and it seems like results are pretty much the same between pytorch and mlpack. I need to finish them before I am sure, but I think everything is working right here
<rcurtin[m]>
so, I can finally move on to the next thing 😄
<zoq[m]1>
Do you start with the same initial weights?
<rcurtin[m]>
they aren't the exact same; my guess is that's probably what's different here
<rcurtin[m]>
but mostly my goal here is just to make sure something isn't horribly wrong, and that seems to be true
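(One way to control for the initialization difference on the PyTorch side, as a hedged sketch; matching mlpack exactly would also require using the same initialization scheme there. The seed, layer sizes, and model below are placeholders, not the benchmark network.)

```python
import torch
import torch.nn as nn

torch.manual_seed(42)  # make the PyTorch run reproducible

# Placeholder network; the real benchmark model is a convolutional network.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 26 * 26, 10),
)

# Explicitly re-initialize weights with a known scheme so the starting point
# is at least controlled, even if it is not identical to mlpack's.
for m in model.modules():
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)
```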
<rcurtin[m]>
my training was also for only 1 epoch, and I used the same learning rate for both libraries. I suspect that if I ran until convergence and tuned the learning rate for each library, I would be able to produce the same performance
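(A common heuristic for that tuning is to scale the learning rate roughly linearly with the batch size; a sketch below, with the base values picked purely for illustration.)

```python
import torch

base_lr = 0.01        # learning rate tuned at the reference batch size (illustrative)
base_batch_size = 32  # reference batch size (illustrative)
batch_size = 256      # the batch size actually being benchmarked

# Linear scaling rule: larger batches average away more gradient noise but
# give fewer updates per epoch, so the step size is scaled up in proportion.
lr = base_lr * batch_size / base_batch_size

model = torch.nn.Linear(10, 10)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=lr)
```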
<zoq[m]1>
Agreed, I just wanted to make sure I was interpreting the results correctly, because the results start to differ quite a bit at higher batch sizes.