verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
cjlcarvalho has joined #mlpack
vivekp has joined #mlpack
cjlcarvalho has quit [Ping timeout: 256 seconds]
cjlcarvalho has joined #mlpack
cjlcarvalho has quit [Quit: Konversation terminated!]
cjlcarvalho has joined #mlpack
cjlcarvalho has quit [Ping timeout: 256 seconds]
cjlcarvalho has joined #mlpack
cjlcarvalho has quit [Ping timeout: 268 seconds]
cjlcarvalho has joined #mlpack
cjlcarvalho has quit [Ping timeout: 256 seconds]
cjlcarvalho has joined #mlpack
< jenkins-mlpack2> Project docker mlpack weekly build build #1: UNSTABLE in 8 hr 53 min: http://ci.mlpack.org/job/docker%20mlpack%20weekly%20build/1/
cjlcarvalho has quit [Ping timeout: 256 seconds]
caiojcarvalho has joined #mlpack
witness_ has quit [Quit: Connection closed for inactivity]
caiojcarvalho has quit [Ping timeout: 264 seconds]
caiojcarvalho has joined #mlpack
ImQ009 has joined #mlpack
ImQ009 has quit [Quit: Leaving]
ImQ009 has joined #mlpack
manish7294 has joined #mlpack
< manish7294> rcurtin: Has our GSoC evaluation been completed?
< rcurtin> yeah, they are all done, I am not sure when Google returns them to you
< rcurtin> I thought today maybe?
< ShikharJ> rcurtin: Yeah, ideally they should be returning us the feedback.
< manish7294> rcurtin: Yeah, it's about time and the status has changed. But this time it's a little weird
< rcurtin> what's weird about it? I filled it out as usual, nothing different from my end
< manish7294> firstly, they haven't shown us the feedback yet, and the status is just 'Complete' in a big red box, which got me worried
< manish7294> :)
< rcurtin> no need to worry, I am sure everything will be resolved shortly
< manish7294> Hopefully it will.
< rcurtin> if there continues to be some problem into tomorrow or something, let me know and I will send an email to the GSoC admins
< rcurtin> (I guess you could do the same too if you wanted)
< manish7294> Sure, I will let you know as well if there's any problem. Thanks!
< manish7294> Now it's changed to 'passed', so I think that was just a short delay :)
< manish7294> Thanks again! :)
< rcurtin> sure, glad it all worked out. I don't think you had any reason to worry :)
< manish7294> I just got a bit anxious by looking at the RED color :)
< rcurtin> yeah, I can understand the feeling
< rcurtin> I'm going to step out for lunch, I'll be back in a little while
< rcurtin> also, next week my timezone will be UTC+0 (so four hours ahead of where I am now); my new company is having an internal conference for a week in Iceland
< rcurtin> so it will seem like I am waking up really early :)
< manish7294> like, for the whole week?
< manish7294> zoq: Can there be multiple Evaluate() calls before a Gradient() call, or vice versa, while using BigBatchSGD?
haritha1313 has joined #mlpack
< manish7294> meaning it does not necessarily follow a continuous E -> G -> E -> G routine?
< zoq> manish7294: Yes, it could be G -> E -> E -> E -> G -> ...
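A minimal sketch may help fix the interface under discussion. mlpack's batch optimizers of this era drive a decomposable function that exposes separate Evaluate() and Gradient() methods over a batch; the class and the exact signatures below are illustrative assumptions, not mlpack's verbatim API. The point zoq makes is that the optimizer is free to interleave the two calls, so neither method can rely on state left behind by the other:

    #include <armadillo>

    // Sketch of a decomposable objective (linear regression) with separate
    // batch-wise Evaluate() and Gradient(), as a batch optimizer would call
    // them.  Signatures are assumptions for illustration.  Since the call
    // order may be G -> E -> E -> ... in general, each method recomputes its
    // residual from scratch instead of reading a cross-call cache.
    class LinearRegressionFunction
    {
     public:
      LinearRegressionFunction(const arma::mat& data,
                               const arma::rowvec& responses) :
          data(data), responses(responses) { }

      // Squared error over points [begin, begin + batchSize).
      double Evaluate(const arma::mat& coordinates,
                      const size_t begin,
                      const size_t batchSize) const
      {
        const arma::mat batch = data.cols(begin, begin + batchSize - 1);
        const arma::rowvec diff = coordinates.t() * batch -
            responses.subvec(begin, begin + batchSize - 1);
        return arma::dot(diff, diff);
      }

      // Gradient over the same range, recomputed from scratch on every call.
      void Gradient(const arma::mat& coordinates,
                    const size_t begin,
                    arma::mat& gradient,
                    const size_t batchSize) const
      {
        const arma::mat batch = data.cols(begin, begin + batchSize - 1);
        const arma::rowvec diff = coordinates.t() * batch -
            responses.subvec(begin, begin + batchSize - 1);
        gradient = 2.0 * batch * diff.t();
      }

      size_t NumFunctions() const { return data.n_cols; }

     private:
      const arma::mat& data;
      const arma::rowvec& responses;
    };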
< manish7294> zoq: Thanks!
< zoq> manish7294: Does that cause any issues?
< manish7294> zoq: I was planning to use a cache from Evaluate to Gradient, which I think is not possible now?
< manish7294> It would need to be a continuous E -> G sequence to work, as the cache is updated on each Evaluate() call
< manish7294> rcurtin: I think this non-continuous nature of the Evaluate() and Gradient() calls is what's causing the deviations when using evalOld in Gradient().
< zoq> manish7294: Cache means something like reusing the output from the Evaluate step for the Gradient step?
< manish7294> zoq: Yes
caiojcarvalho has quit [Ping timeout: 256 seconds]
cjlcarvalho has joined #mlpack
< zoq> manish7294: I see, that's challenging. What you could do is implement EvaluateWithGradient(), which is a combination of Evaluate + Gradient, with the idea of reusing the results from the Evaluate step (the cache). So if an optimizer does use EvaluateWithGradient(), you can expect E -> G -> E -> G
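Continuing the hypothetical sketch above, an EvaluateWithGradient() member (signature again assumed) shows why the combined call sidesteps the caching problem: the shared work lives entirely inside one call, so no state has to survive between an Evaluate() and a later Gradient():

    // Combined objective + gradient for the sketch class above.  The shared
    // residual `diff` is computed once and used for both outputs, so nothing
    // needs to be cached across optimizer calls.
    double EvaluateWithGradient(const arma::mat& coordinates,
                                const size_t begin,
                                arma::mat& gradient,
                                const size_t batchSize) const
    {
      const arma::mat batch = data.cols(begin, begin + batchSize - 1);
      const arma::rowvec diff = coordinates.t() * batch -
          responses.subvec(begin, begin + batchSize - 1);
      gradient = 2.0 * batch * diff.t();  // Gradient from the shared residual.
      return arma::dot(diff, diff);       // Objective from the same residual.
    }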
< zoq> I can check whether I could modify the optimizer to use EvaluateWithGradient() for at least some steps.
< manish7294> zoq: I think we currently have all the variations of Evaluate + Gradient; the issue is just BigBatchSGD-specific :)
< zoq> manish7294: I think I missed some details, but if you only implement caching for EvaluateWithGradient, and BigBatchSGD is only using Evaluate + Gradient, wouldn't that solve the issue?
< manish7294> We have caching for EvaluateWithGradient() (but just SGD, L-BFGS, and AMSGrad make use of this, and it works as expected), but we also have BBSGD in the list, which makes use of separate Evaluate() and Gradient() calls, and that is the cause of the problem.
< manish7294> We want to do the same for BBSGD as well.
< zoq> I see, that's challenging.
< manish7294> Currently, what I can think of is separate caching for Evaluate() and Gradient()
< manish7294> But that's a lot of matrices and a lot of tracking work. I'd like to know Ryan's views as well, on whether to do this or maybe something else.
< zoq> I wonder if you could get similar results with SGD if you use a really big batch size.
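To make this suggestion concrete without assuming any particular optimizer constructor, here is a hand-rolled fixed-batch SGD loop (hypothetical, not mlpack's SGD class) over a function like the sketch above. Because the batch size never changes, a per-batch cache indexed by the batch's starting column would remain valid, unlike under BigBatchSGD's growing batches:

    #include <armadillo>

    // Toy fixed-large-batch SGD: with batchSize near f.NumFunctions(), one
    // step resembles a BigBatchSGD step whose batch has already grown large.
    template<typename FunctionType>
    void LargeBatchSGD(FunctionType& f,
                       arma::mat& coordinates,
                       const size_t batchSize,  // e.g. close to f.NumFunctions()
                       const double stepSize,
                       const size_t maxEpochs)
    {
      arma::mat gradient;
      for (size_t epoch = 0; epoch < maxEpochs; ++epoch)
      {
        // Fixed-size batches; any leftover tail is skipped for simplicity.
        for (size_t begin = 0; begin + batchSize <= f.NumFunctions();
             begin += batchSize)
        {
          f.Gradient(coordinates, begin, gradient, batchSize);
          coordinates -= stepSize * gradient;
        }
      }
    }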
< manish7294> zoq: It works normally.
< manish7294> zoq: If I am right, out of AMSGrad, SGD, L-BFGS, and BBSGD, only BBSGD has a variable batch size?
< zoq> right
< manish7294> hmm, so BBSGD is the only one putting up a lot of hurdles :)
< zoq> Right, not sure if it's worth the effort.
witness_ has joined #mlpack
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
< manish7294> rcurtin: If we skip the evalBounds optimization for BBSGD, we can easily and efficiently store oldTransformationMatrices for the other batch optimizers (we would just have to store dataset.n_cols / batchSize matrices). Does that sound reasonable?
manish7294 has quit [Ping timeout: 252 seconds]
< rcurtin> manish7294: let's not do that because batches can be shuffled, and some optimizers may use different batch sizes at different times during the optimization
< rcurtin> I think the idea I posted will work and it will be roughly as fast
< rcurtin> unless you've found an error with it or anything
< rcurtin> and yeah, I will be there the whole week
< rcurtin> manish7294: reading the whole discussion that you and zoq had just now (I only saw the most recent message, then I scrolled up...)
< rcurtin> I think that we can't cache the bounds in general for Evaluate() or Gradient(), because we can't assume that Gradient() comes right after Evaluate() (if that were true, the optimizer could just use EvaluateWithGradient())
< rcurtin> so the bounds may become looser more quickly than expected, but it will still work, so I think it will be okay
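One way to cache safely under this arbitrary call order, sketched here with entirely hypothetical names, is to validate the cache rather than trust the call sequence: tag the cached bound with the coordinates it was computed for, and reuse it only when they still match. When the check fails the caller simply recomputes, which is exactly the "bounds become looser than expected, but it still works" behavior described above:

    #include <armadillo>

    // Hypothetical bound cache that never assumes Gradient() follows
    // Evaluate(): a lookup succeeds only if the stored coordinates match.
    class BoundCache
    {
     public:
      // Returns true and fills `bound` if the cache matches `coordinates`.
      bool TryGet(const arma::mat& coordinates, double& bound) const
      {
        if (!valid ||
            coordinates.n_rows != cachedCoordinates.n_rows ||
            coordinates.n_cols != cachedCoordinates.n_cols ||
            arma::any(arma::vectorise(coordinates != cachedCoordinates)))
          return false;
        bound = cachedBound;
        return true;
      }

      // Store a freshly computed bound for these coordinates.
      void Put(const arma::mat& coordinates, const double bound)
      {
        cachedCoordinates = coordinates;
        cachedBound = bound;
        valid = true;
      }

     private:
      arma::mat cachedCoordinates;
      double cachedBound = 0.0;
      bool valid = false;
    };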
caiojcarvalho has joined #mlpack
cjlcarvalho has quit [Read error: Connection reset by peer]
haritha1313 has left #mlpack []
vivekp has quit [Ping timeout: 240 seconds]
caiojcarvalho has quit [Ping timeout: 256 seconds]
caiojcarvalho has joined #mlpack
ImQ009 has quit [Quit: Leaving]
caiojcarvalho has quit [Read error: Connection reset by peer]
caiojcarvalho has joined #mlpack
caiojcarvalho has quit [Ping timeout: 256 seconds]
cjlcarvalho has joined #mlpack
cjlcarvalho has quit [Ping timeout: 256 seconds]
manish7294 has joined #mlpack
< manish7294> rcurtin: Sorry, I don't think I understand your last message very well. Do you mean that we can leave out the optimization for the Evaluate() and Gradient() functions, or something else?