verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
vivekp has joined #mlpack
sulan_ has joined #mlpack
sulan_ has quit [Read error: Connection reset by peer]
< manish7294>
zoq: Thanks for solving the issue and these good suggestions. The changing batch size has made the batch precalculation of LMNN redundant :)
< manish7294>
Either way, it was not making much of a difference.
< manish7294>
AMSGrad also works great :)
< manish7294>
rcurtin: As per the findings, BigBatchSGD (both adaptive search and line search) or AMSGrad are good options to replace SGD.
__sulan__ has quit [Quit: Leaving]
< rcurtin>
manish7294: great to hear the different optimizers worked better; do you have benchmarking results for them?
< rcurtin>
I saw your comments on the LMNN PR also; I haven't had a chance to dig in deeply, but did calling Impostors() only once every 100 iterations help?
< manish7294>
rcurtin: I have mostly tested them on iris, vc2, and a 5k-point covertype dataset; looking at the results, I would say they are quite similar, but these optimizers help in avoiding divergence.
< manish7294>
Calling Impostors() only every 100 iterations is leading to errors.
< manish7294>
Let me verify the 100 iteration idea once again
witness_ has quit [Quit: Connection closed for inactivity]
< manish7294>
With SGD on the 5k covertype data I am getting "[WARN ] SGD: converged to -nan; terminating with failure. Try a smaller step size?" within a second of starting, and with BigBatchSGD it does not seem to converge.
< manish7294>
With BigBatchSGD, the coordinate values seem to keep oscillating between a few values.
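For context, a minimal sketch of what swapping SGD for AMSGrad could look like, assuming the ensmallen-style optimizer interface (in the mlpack version under discussion the same classes may live under mlpack::optimization instead); the FunctionType object stands in for the actual LMNN objective discussed above, and the hyperparameters are illustrative defaults only.

#include <ensmallen.hpp>

// Optimize a separable objective (e.g. the LMNN objective above) with AMSGrad
// instead of SGD; 'coordinates' holds the learned transformation matrix.
template<typename FunctionType>
void OptimizeWithAMSGrad(FunctionType& function, arma::mat& coordinates)
{
  // Step size, batch size, beta1, beta2, epsilon, max iterations, tolerance.
  ens::AMSGrad optimizer(0.01, 32, 0.9, 0.999, 1e-8, 100000, 1e-5);
  optimizer.Optimize(function, coordinates);
}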
sumedhghaisas has joined #mlpack
vivekp has quit [Read error: Connection reset by peer]
< sumedhghaisas>
Atharva: Hi Atharva
< sumedhghaisas>
How's it going?
< Atharva>
I am just about to post to the blog.
< sumedhghaisas>
Maybe we can speed up the mail thread with IRC :)
< Atharva>
It's done.
< sumedhghaisas>
Nice! I will take a look at it later
< Atharva>
The tasks for this week
< sumedhghaisas>
umm... Have you updated the PR?
vivekp has joined #mlpack
< Atharva>
No, I am just trying to debug the failing Jacobian test, but I am not quite sure what that test does.
< Atharva>
The gradient check is passing with the KL loss added to the total loss
< sumedhghaisas>
The Jacobian test is failing?
< sumedhghaisas>
Huh... well, push away and let's see why that test is not happy.
< Atharva>
Okay
< sumedhghaisas>
Also about the VAE class, which aspect of VAE do you think cannot be emulated by FFN?
vivekp has quit [Read error: Connection reset by peer]
< Atharva>
For example, the Encode function, GenerateRandom, GenerateSpecific, SampleOutput. Also, I had to make the Evaluate and Backward functions loop over all the layers, collecting extra loss, which is 0 almost all the time.
< Atharva>
Even in a VAE network, for one layer, it's too much work.
< sumedhghaisas>
The loop is mostly static... which shouldn't cause any delay.
< sumedhghaisas>
The extra loss functionality is not just for VAE
< sumedhghaisas>
it extends the FFN functionality to produce L1 and L2 regularized layers, which is a huge improvement over the current framework
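A simplified sketch of the extra-loss accumulation being described here, not mlpack's actual visitor-based implementation: Evaluate() adds each layer's extra loss term to the objective, which is how a repar layer's KL divergence or an L1/L2 regularization penalty folds in. LayerType and its Loss() method are assumed interfaces for illustration.

#include <vector>

// Sum the extra loss contributed by each layer; most layers return 0, while a
// repar layer would return its KL term and a regularizer layer its penalty.
template<typename LayerType>
double ExtraLoss(const std::vector<LayerType*>& network)
{
  double loss = 0.0;
  for (LayerType* layer : network)
    loss += layer->Loss();
  return loss;
}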
< Atharva>
Yes, I understand, but the generate and encode functionalities will have to forward pass through some layers of the network, with custom inputs
< Atharva>
With multiple repar layers, it will prove tougher.
< sumedhghaisas>
The Encode function is nothing but a forward pass of a parametric model (feedforward, CNN, or RNN), thus we do not need to make any extra effort for it.
< Atharva>
Yes but partial
< Atharva>
Yeah
< sumedhghaisas>
If we look at VAE as a special model, we restrict the user's ability to improve upon it
vivekp has joined #mlpack
< sumedhghaisas>
we will restrict them to the functionality given by us
< Atharva>
That makes sense
< sumedhghaisas>
if we indeed look at it as a specific case of FFN and make sure the current architecture supports it
< sumedhghaisas>
we not only make sure VAE can be implemented, but the user can also use the extra FFN features to improve upon it
< sumedhghaisas>
For example, if you implement VAE class
< Atharva>
Yeah, I never thought of it that way
< sumedhghaisas>
you have to make sure you support hierarchical VAE, beta VAE, regularized VAE
< sumedhghaisas>
although with FFN, multiple repar layers would achieve the hierarchical aspect
< sumedhghaisas>
a specialized repar layer with beta will achieve beta-VAE, and so on
< sumedhghaisas>
minimal changes
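For reference, the beta-VAE objective that such a specialized repar layer would implement simply scales the KL term of the usual ELBO; beta = 1 recovers the standard VAE:

\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - \beta \, D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)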
< sumedhghaisas>
Although I am still not 100 percent sure we can emulate it :)
< sumedhghaisas>
So some thinking is required there
< Atharva>
So, let's go ahead and start making some models with the FFN class, and if some functions prove too complex, then we can give a thought to VAE class.
< Atharva>
If not, then we are good.
< sumedhghaisas>
that would be risky, as a new class shift is not a simple one
< sumedhghaisas>
Let's look at the aspects of VAE that we cannot satisfy right now
< sumedhghaisas>
1) Generation
< sumedhghaisas>
what else?
< sumedhghaisas>
hmmm
< sumedhghaisas>
Okay, how do we implement generation in FFN?
< Atharva>
The generation can be random or controlled
< Atharva>
We need to think about both cases
< sumedhghaisas>
indeed
< sumedhghaisas>
Okay, give it some thought; let's try involving Ryan and Marcus as well and see if they have some thoughts on it
< Atharva>
Yeah, can you explain how you said we would implement Encode?
< Atharva>
Can we do a partial forward pass with the FFN class?
< sumedhghaisas>
Encode is not a direct feature of VAE, but generation is
< sumedhghaisas>
Encode happens as a part of Forward
< sumedhghaisas>
ahh partial pass
< sumedhghaisas>
that's what I was thinking
< Atharva>
Yes, but we should be able to have just the encodings if we want to.
< Atharva>
From those encodings, we should be able to operate the Generate functions independently
< sumedhghaisas>
I agree, we should; that could be achieved with a partial pass
< Atharva>
Yeah
< sumedhghaisas>
If we do the partial pass and access the layer's output parameter
< sumedhghaisas>
we will get the encoding
< Atharva>
and then Generate either randomly, or with a sample of our choice
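A hypothetical sketch of what that partial pass could look like: run only the encoder layers and read off the latent encoding. The Forward(input, output, begin, end) overload and the reparLayerIndex argument are assumptions for illustration, not an existing mlpack API at the time of this discussion.

#include <mlpack/methods/ann/ffn.hpp>

// Partial forward pass through layers 0 .. reparLayerIndex only, returning the
// latent encoding instead of the full network output (hypothetical API).
arma::mat Encode(mlpack::ann::FFN<>& vae,
                 const arma::mat& input,
                 const size_t reparLayerIndex)
{
  arma::mat latent;
  vae.Forward(input, latent, 0, reparLayerIndex);
  return latent;
}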
< sumedhghaisas>
If we define the final layer as a distribution layer, the current architecture should produce conditional samples
< sumedhghaisas>
For example, the current architecture's Predict outputs the last layer's output
< Atharva>
Yes, but a VAE outputs a distribution
< Atharva>
parameters to a distribution
< sumedhghaisas>
if the last layer outputs a distribution, we sample from it to generate conditional samples
< Atharva>
We should be able to then sample from that
< Atharva>
Exactly
< sumedhghaisas>
yes, but that's only conditional
< sumedhghaisas>
How do we produce unconditional samples?
< Atharva>
Sorry, what exactly do you mean by unconditional samples?
< sumedhghaisas>
for that we need to start the forward propagation from the repar layer
< sumedhghaisas>
Oh, conditional samples are samples from P(Z | X), whereas unconditional ones are from P(Z)
< sumedhghaisas>
basically, conditional samples come from the posterior over the latents and unconditional samples from the latent prior
< Atharva>
Yeah
< Atharva>
We need to start from repar layer for that
< sumedhghaisas>
Yes. Now that's the puzzler.
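Under the same hypothetical partial-pass assumption, unconditional generation would amount to drawing z from the prior N(0, I) and forwarding it through the decoder layers only; GenerateRandom and the layer-index arguments below are illustrative names, not an existing mlpack API.

#include <mlpack/methods/ann/ffn.hpp>

// Sample z ~ N(0, I) and decode it by forwarding through layers
// reparLayerIndex + 1 .. lastLayerIndex (hypothetical partial-pass API).
arma::mat GenerateRandom(mlpack::ann::FFN<>& vae,
                         const size_t latentSize,
                         const size_t reparLayerIndex,
                         const size_t lastLayerIndex)
{
  arma::mat z = arma::randn<arma::mat>(latentSize, 1);
  arma::mat sample;
  vae.Forward(z, sample, reparLayerIndex + 1, lastLayerIndex);
  return sample;
}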
< Atharva>
I just pushed the latest changes
< sumedhghaisas>
Okay. Let's keep thinking about this and complete this week's work first. Let's hope we find some solution by then.
< sumedhghaisas>
I will take a look at it tonight :)
< Atharva>
Sure!
< Atharva>
sumedhghaisas: You there?
manish7294 has quit [Ping timeout: 260 seconds]
< rcurtin>
manish7294: I think we need to debug the idea a little bit more; recalculating impostors only once every 100 iterations should work just fine
< rcurtin>
if you like, you could try recalculating only every other iteration
< rcurtin>
just for debugging
< rcurtin>
but it should be no problem, since all we are calculating in Impostors() is the indices of the impostors
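A simplified sketch of the caching idea described here: the impostor indices are recomputed only every recalcEvery iterations and reused in between. ConstraintType and its Impostors() signature are stand-ins for the LMNN constraint class in the PR, not a confirmed mlpack API.

#include <armadillo>

// Recompute the cached impostor indices only every 'recalcEvery' iterations;
// on all other iterations the previously computed indices are reused.
template<typename ConstraintType>
void UpdateImpostors(ConstraintType& constraint,
                     arma::Mat<size_t>& impostorIndices,
                     const arma::mat& transformedDataset,
                     const size_t iteration,
                     const size_t recalcEvery = 100)
{
  if (iteration % recalcEvery == 0)
    constraint.Impostors(impostorIndices, transformedDataset);
}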
ImQ009 has quit [Quit: Leaving]
witness_ has joined #mlpack
witness_ has quit [Quit: Connection closed for inactivity]