ChanServ changed the topic of #mlpack to: "mlpack: a fast, flexible machine learning library :: We don't always respond instantly, but we will respond; please be patient :: Logs at
mrohit[m] has quit [Remote host closed the connection]
mrohit[m] has joined #mlpack
vivekp has quit [Ping timeout: 250 seconds]
vivekp has joined #mlpack
Megatron3 has joined #mlpack
Megatron3 has quit [Client Quit]
< davida> rcurtin: The tests I have run so far on the Windows 10 setup with the latest pull of MLPACK (3.0.4 + any recent changes on the master) are showing no problems.
< davida> My exercises are all converging correctly on Windows.
< davida> Now really hoping that zoq can release the change to RNN to allow different sized samples.
pd09041999 has joined #mlpack
zoq_ has joined #mlpack
gtank____ has joined #mlpack
gtank____ is now known as gtank___
adm64 has joined #mlpack
< adm64> is there a way to perform evaluation (see r-squared) with mlpack? i'm using decision forest
zoq_ is now known as zoq
< davida> adm64: there is an MSE class in MLPACK with Evaluate() function. Can this help you?
< adm64> thanks @davida - anything in the command line? that's what we currently use
< ShikharJ> davida: That's good to hear!
pd09041999 has quit [Ping timeout: 268 seconds]
akhandait has joined #mlpack
< akhandait> zoq: Do we have some functionality for loading a dataset partially and continuously as we train a model? Or is loading the entire dataset at once before we begin training the only option? That leads to memory shortage and limits the size of models and datasets.
< akhandait> I am talking about something like the Dataloader functionality in pytorch
< davida> akhandait: Could you batch train in a loop (around model.Train()) and modify the dataset in each loop with the new input data?
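The loop davida suggests could be sketched roughly as below. This is only an illustration in standard C++, not mlpack code: the CSV layout (one sample per line), `Sample`, `ParseCsvLine`, and `TrainInChunks` are all hypothetical names, and the `trainStep` callback stands in for wherever a call such as model.Train() would go.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical layout: one sample per line, features separated by commas.
using Sample = std::vector<double>;

// Parse one comma-separated line into a vector of doubles.
Sample ParseCsvLine(const std::string& line)
{
  Sample values;
  std::stringstream ss(line);
  std::string cell;
  while (std::getline(ss, cell, ','))
    values.push_back(std::stod(cell));
  return values;
}

// Read at most chunkSize samples at a time, so only one chunk is held in
// memory, and hand each chunk to trainStep. Returns the number of samples seen.
std::size_t TrainInChunks(
    std::istream& in,
    std::size_t chunkSize,
    const std::function<void(const std::vector<Sample>&)>& trainStep)
{
  assert(chunkSize > 0);

  std::vector<Sample> chunk;
  chunk.reserve(chunkSize);
  std::size_t total = 0;
  std::string line;

  while (std::getline(in, line))
  {
    if (line.empty())
      continue; // Skip blank lines.

    chunk.push_back(ParseCsvLine(line));
    if (chunk.size() == chunkSize)
    {
      trainStep(chunk);
      total += chunk.size();
      chunk.clear();
    }
  }

  // Flush the final, possibly smaller, chunk.
  if (!chunk.empty())
  {
    trainStep(chunk);
    total += chunk.size();
  }

  return total;
}
```

Whether repeated training calls continue from the previous parameters or start over depends on the model being trained; the sketch does not check that.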
< davida> adm64: Sorry, I am not familiar with the command line tools. I am using the MLPACK C++ library in my own code.
adm64 has quit [Quit: Page closed]
< akhandait> Yeah, but that will lead to a lot of overhead in loading the data each time. Also, we would need to break the dataset beforehand.
< akhandait> I don’t think that’s a tidy approach
< akhandait> We also can’t shuffle the data after each epoch then
< akhandait> davida:
< davida> akhandait: ... but isn't that what PyTorch Dataloader is basically doing under the hood anyway?
< davida> You can shuffle the data by setting the Shuffle = True in the optimizer flag.
< davida> optimizer parameter.
< davida> If I recall the code well, each loop thru' the total dataset will be shuffled.
< akhandait> davida: I really doubt that’s what the dataloader does, but I am not sure, I will check their source.
< davida> So if your dataset is 1000 and your batchsize is 100 and your maxIterations is 1000, each batch would get shuffled 100 times
< akhandait> But we will still need to break the dataset according to our batch size every time we want to train
pd09041999 has joined #mlpack
< davida> ... as it takes 10 batches of 100 to loop thru' your dataset. Excuse me, the maxIterations would need to be larger since, if I recall well, maxIterations will be referenced against the batch size as well. So for 100 shuffles maxIterations would need to be 100000.
< akhandait> davida: I think that will work, but that will only shuffle the data in a batch and not the entire dataset before we make batches. So in every epoch, the batches will have the same data.
< davida> akhandait: I am not clear on what you mean by batch here. The code in the optimizer shuffles the entire dataset once per loop thru' the dataset. If you mean when you add more data to the dataset, then you could pre-shuffle the new dataset in the Armadillo matrix with some pretty simple code.
< davida> Use the arma::shuffle function
< akhandait> That’s not quite what I was saying, but it’s okay. Thanks. I will try these things. :)
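The two points in the exchange above (reshuffling the whole dataset once per pass so the batches differ between epochs, and davida's count of 100 passes over 1000 points) can be illustrated with a small sketch. It is written in standard C++ and uses std::shuffle on an index vector instead of arma::shuffle; `EpochStats` and `RunEpochs` are hypothetical names, and it does not reproduce the mlpack optimizer's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Bookkeeping returned by the sketch (hypothetical type).
struct EpochStats
{
  std::size_t batchesPerEpoch = 0;
  std::size_t pointsVisited = 0;
};

// Visit every point `epochs` times in mini-batches. The visiting order is
// reshuffled once per pass over the data, before the batches are cut, so the
// contents of each batch change from epoch to epoch.
template<typename BatchFn>
EpochStats RunEpochs(std::size_t numPoints,
                     std::size_t batchSize,
                     std::size_t epochs,
                     std::mt19937& rng,
                     BatchFn&& onBatch)
{
  assert(batchSize > 0);

  // Shuffle indices rather than the data itself, so the same order can be
  // applied to both the samples and their labels.
  std::vector<std::size_t> order(numPoints);
  std::iota(order.begin(), order.end(), 0);

  EpochStats stats;
  stats.batchesPerEpoch = (numPoints + batchSize - 1) / batchSize;

  for (std::size_t epoch = 0; epoch < epochs; ++epoch)
  {
    std::shuffle(order.begin(), order.end(), rng);

    for (std::size_t start = 0; start < numPoints; start += batchSize)
    {
      const std::size_t end = std::min(start + batchSize, numPoints);
      const std::vector<std::size_t> batch(order.begin() + start,
                                           order.begin() + end);
      onBatch(epoch, batch);
      stats.pointsVisited += end - start;
    }
  }

  return stats;
}
```

With 1000 points, a batch size of 100, and 100 epochs, this gives 10 batches per epoch and 100 × 1000 = 100000 point visits, which matches davida's figure if maxIterations counts individual points as he recalls.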
pd09041999 has quit [Ping timeout: 246 seconds]
pd09041999 has joined #mlpack
pd09041999 has quit [Ping timeout: 244 seconds]
< jenkins-mlpack2> Project docker mlpack nightly build build #135: STILL UNSTABLE in 8 hr 31 min:
saurabh has joined #mlpack
saurabh has quit [Client Quit]
saurabh has joined #mlpack
saurabh97 has joined #mlpack
pd09041999 has joined #mlpack
pd09041999 has quit [Max SendQ exceeded]
saurabh has quit [Quit: Leaving]
saurabh97 has quit [Quit: Leaving]
akhandait has quit [Quit: Connection closed for inactivity]
< davida> zoq: Did you manage to make any progress on the RNN sequence inputs we talked about last week?
vivekp has quit [Ping timeout: 250 seconds]
vivekp has joined #mlpack