verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
robertohueso has left #mlpack []
vivekp has quit [Ping timeout: 248 seconds]
vivekp has joined #mlpack
caiojcarvalho has joined #mlpack
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#2 (evalBounds - 457980e : Manish): The build passed.
travis-ci has left #mlpack []
cjlcarvalho has joined #mlpack
caiojcarvalho has quit [Ping timeout: 276 seconds]
caiojcarvalho has joined #mlpack
cjlcarvalho has quit [Ping timeout: 260 seconds]
vivekp has quit [Ping timeout: 240 seconds]
vivekp has joined #mlpack
lozhnikov has quit [Ping timeout: 240 seconds]
lozhnikov has joined #mlpack
lozhnikov has quit [Ping timeout: 276 seconds]
cjlcarvalho has joined #mlpack
caiojcarvalho has quit [Ping timeout: 256 seconds]
lozhnikov has joined #mlpack
cjlcarvalho has quit [Ping timeout: 240 seconds]
cjlcarvalho has joined #mlpack
lozhnikov has quit [Ping timeout: 260 seconds]
vivekp has quit [Ping timeout: 244 seconds]
manish7294 has joined #mlpack
vivekp has joined #mlpack
manish7294 has quit [Client Quit]
manish7294 has joined #mlpack
< manish7294> rcurtin: Got a copy of your mail from the mailing list, looks like you're already up. I want to discuss the structure of BoostMetric. Do you think it is a good time for that?
< rcurtin> manish7294: I am in talks all day, so I think maybe it would be best to use email for that
< manish7294> sure :)
< rcurtin> I won't really be able to devote much time at the moment, only quick responses, etc.
< manish7294> no problem
< rcurtin> I'm waking up roughly 7 UTC this week so our awake times overlap much more than usual :)
< manish7294> Just if you have a few seconds to spare: do you think we would be able to use the existing optimizers for BoostMetric? https://arxiv.org/pdf/0910.2279.pdf
< manish7294> Looking at the algorithm, I think we have to build it from scratch.
< rcurtin> I'll take a look when I have a second and respond then, thanks for the direct PDF link :)
< manish7294> sure
< rcurtin> do you mean for Algorithm 2?
manish72942 has joined #mlpack
< manish72942> Right, sorry for the delay --- lost the connection
manish7294 has quit [Ping timeout: 252 seconds]
< rcurtin> this will take me a little time to think about
< rcurtin> I will try and have an answer later today
< manish72942> no need to hurry :)
< rcurtin> manish72942: I haven't had time to really give it a good look, but my instinct is, the BoostMetric paper claims both speedup and accuracy boost over LMNN
< rcurtin> so, if you want to devote time now (if it will not take too long), you could implement it by itself and we could see if both of those claims are true
< rcurtin> I am not sure if speedup will still be obtained given our optimized implementation (which still has some further optimization to go)
< manish72942> like a rough implementation as given in the algorithm, without any optimizer or anything, right?
< manish72942> If that's so, I am on my way.
< rcurtin> right, I think that is reasonable, but the more important thing here will be trying to reproduce the results of their paper
< rcurtin> I want to make sure that in the time we have left, we get something interesting
< rcurtin> so really the situation I want to try to avoid is an incompletely optimized LMNN and then we find out that BoostMetric does not consistently give speedup or improved accuracy over LMNN
< rcurtin> fully optimized LMNN is interesting by itself, and also interesting is fast BoostMetric with some of the LMNN optimizations
< rcurtin> I need to read the BoostMetric paper in full (I have not had a chance to do that, sorry; I have been focused on LMNN)
< rcurtin> manish72942: more about the runtime results in BoostMetric:
< rcurtin> (1) the paper claims that for each iteration of LMNN, a projection of M back onto the PSD cone is needed, which costs an O(d^3) eigendecomposition
< rcurtin> however, in our implementation, since we are optimizing L (where M = L^T L) directly, M is always guaranteed to be PSD so we do not ever need to take that step
< rcurtin> (2) the paper points out that their implementation was in MATLAB, and that further speedup could be seen in C/C++
< rcurtin> to me, this almost guarantees they used the MATLAB implementation of LMNN, which we already know to be inefficient since it computes the full distance matrix
< rcurtin> so, an "efficient" implementation of BoostMetric may behave entirely differently than their results (with respect to speed at least)
< manish72942> Ya, they have even referenced it in their implementation
< rcurtin> so I don't mean to say BoostMetric is bad or anything, of course---I just mean that we can't be sure of exactly what we will encounter with respect to speed
< rcurtin> do you think you would rather implement BoostMetric or keep working on the LMNN optimizations? (or perhaps you feel that you can do both in parallel?)
< manish72942> from your comments above it looks like we have already done all these optimizations, so we shouldn't expect much from this. But still, let's give it a shot; maybe I will try to work it out today itself.
< rcurtin> I think it's still completely possible that all of our LMNN optimizations will apply to BoostMetric
< rcurtin> and if it doesn't fit exactly into the optimizer API, that's ok---after all, our existing AdaBoost implementation does not either
< rcurtin> but as far as any paper goes, we can say, e.g., "we have provided order-of-magnitude+ speedups to LMNN and expect that these would be applicable to LMNN derivatives such as BoostMetric, PSMetric, etc."
< manish72942> I will try to make a rough implementation today and will see whether it's worth continuing to work on BoostMetric.
< manish72942> agreed, we are at least in a position to claim that
< rcurtin> but I don't think we can persuasively say, e.g., "we got a little bit of speedup for LMNN and also implemented BoostMetric but roughly only see the same results as the BoostMetric paper" :)
< rcurtin> anyway, yeah, that sounds good, let's see what the rough implementation does
< manish72942> :)
lozhnikov has joined #mlpack
lozhnikov has quit [Ping timeout: 264 seconds]
lozhnikov has joined #mlpack
lozhnikov_ has joined #mlpack
lozhnikov has quit [Ping timeout: 240 seconds]
lozhnikov_ has quit [Client Quit]
< ShikharJ> lozhnikov: zoq : I have tried debugging the RBM PR to an extent, but I'm unable to get the test accuracy up. Could you guys please review the code?
lozhnikov has joined #mlpack
lozhnikov has quit [Ping timeout: 256 seconds]
< zoq> ShikharJ: Can you narrow down the issue to some part of the code? I'll take a look at the code later today, but I think it would be helpful to get some additional information, maybe you can tell us what you already tried?
cjlcarvalho has quit [Ping timeout: 248 seconds]
ImQ009 has joined #mlpack
< sumedhghaisas> Atharva: Hi Atharva
< sumedhghaisas> Hows it going?
< sumedhghaisas> Did the model work with MeanSquaredError?
< ShikharJ> zoq: I have added support for mini-batches, but I'm doubtful of the usefulness of the design (I'd refactor the entire PR to use SFINAE + enable_if<>). Plus I'm not sure where the FreeEnergy function of SSRBM originates from. That, and there are a number of issues while working with mini-batch inputs, since most of the code was designed with single inputs in mind, but the tests make use of mini-batches.
< ShikharJ> zoq: Most of the rest of the code is correct, but these issues are likely to be the cause of trouble. More specifically, the ones relating to updating the gradients.
cjlcarvalho has joined #mlpack
< zoq> ShikharJ: I guess it would make sense to switch back to the single-input case, if that might cause some issues; I don't think training over mini-batches is that important, at least at this point.
< zoq> Also, it sounds like we should start with the free energy function.
< ShikharJ> zoq: I tried augmenting the test-cases for single inputs, but even there the accuracy is not good, so there's probably a problem with our Evaluate-Gradient routines.
< zoq> ShikharJ: Okay, we should check the gradients for some steps, perhaps we see some strange values (inf, zeros).
caiojcarvalho has joined #mlpack
cjlcarvalho has quit [Ping timeout: 268 seconds]
jenkins-mlpack has quit [Ping timeout: 256 seconds]
manish7294 has joined #mlpack
< manish7294> rcurtin: Here's a rough implementation, but it seems the binary search part takes an indefinite amount of time (probably something is wrong); if you get some time please have a look at it - https://gist.github.com/manish7294/3d97be37919658b96bba0125f2f3de84
< manish7294> hmm, it looks like the terminating condition for bisection given in the paper and in the implementation differ by an extra condition (abs(lhs) < EPS); after adding that condition, it seems BoostMetric is superfast :)
< manish7294> The main reason I could think of is --- it doesn't recalculate impostors at every iteration.
vivekp has quit [Ping timeout: 264 seconds]
vivekp has joined #mlpack
xa0 has quit [Ping timeout: 256 seconds]
manish7294 has quit [Ping timeout: 252 seconds]
manish72942 has quit [Ping timeout: 265 seconds]
xa0 has joined #mlpack
xa0 has quit [Ping timeout: 244 seconds]
robertohueso has joined #mlpack
xa0 has joined #mlpack
< rcurtin> manish7294: great to hear it's fast, can you get some timings/accuracy reports on different datasets?
< rcurtin> if the issue is that it's not calculating impostors, we could also have a variant of LMNN where we don't recalculate impostors, and see what the performance there is
< rcurtin> it may also be implicit in their paper that impostors need to be recalculated, so maybe their implementation recalculated impostors but the paper didn't make it clear that needed to be done
manish7294 has joined #mlpack
< manish7294> rcurtin: Here's the original implementation https://gist.github.com/manish7294/123598515035fe5a37f0a049143e06ac , and I don't think they have ever recalculated impostors (they just have done it once for calculating knn_triplets)
< rcurtin> I don't really have time to look into the implementation, I am just offering possibilities for the speedup
< rcurtin> it will be interesting to see the accuracy results, and then we should also compare with LMNN where we never recalculate impostors
< manish7294> no worries, I will post them soon :)
< rcurtin> sure, sounds good
< manish7294> rcurtin: Here are some simulations : https://gist.github.com/manish7294/2388267666b1159ce261ce7b95dc923c
manish7294 has quit [Quit: Page closed]
< Atharva> zoq: You there?
< zoq> Atharva: I'm here now.
< Atharva> zoq: I have realised that serialising the parameters of the Sequential layer does not work. As the Sequential layer is just a container, its parameter object is empty.
< Atharva> Instead, I propose a different solution to access the encoder and decoder of a network separately, which I also think might be useful in other cases.
< Atharva> What do you think about a ForwardPartial() function in the FFN class which takes in input and output matrices, and the starting and ending indices of the layers within the network to forward pass through?
< Atharva> I implemented it locally and it saved a lot of effort when working with the decoder and encoder separately
< zoq> Atharva: You are right, but using 'ar & BOOST_SERIALIZATION_NVP(network);' should call the serialize function of each layer; at the end we still have to implement the reset function though.
< zoq> Atharva: hm, that is an interesting idea, do you think we could provide another Forward function that does the same?
< Atharva> zoq: Okay, so do you want this function to be called just Forward instead of ForwardPartial ?
< zoq> Atharva: If you think that is reasonable, I think it looks cleaner.
< Atharva> zoq: Yes, it will call the serialize function of each layer, but then no layer serializes its parameters in the serialize function. So the trained parameters never get saved individually.
< Atharva> zoq: Okay! I will create a new PR then.
< zoq> Atharva: Right, you still have to collect the parameters in the reset function; your solution sounds much simpler.
ImQ009 has quit [Quit: Leaving]
< zoq> ShikharJ: Do you use 'RBMNetworkTest/ssRBMClassificationTest' for testing?
lozhnikov has joined #mlpack
< Atharva> zoq: I just posted a blog post, but the website isn't getting updated.
< zoq> Atharva: hm, I wonder if changing the date from 2018-07-10 to 2018-07-17 will fix the issue.
< Atharva> zoq: Oh sorry, I didn't change the date when I copied it.
< zoq> okay, that doesn't fix the issue, I'll look into it tomorrow.
< ShikharJ> zoq: Yes.
< zoq> ShikharJ: I get the following error:
< zoq> error: as_scalar(): expression doesn't evaluate to exactly one element
< zoq> unknown location:0: fatal error: in "RBMNetworkTest/ssRBMClassificationTest": std::logic_error: as_scalar(): expression doesn't evaluate to exactly one element
< ShikharJ> zoq: Ah, the ssRBM needs to be changed a little for batch support I guess.
< ShikharJ> zoq: I'll push in a few changes in an hour or so, probably that would fix this as well.
< zoq> ShikharJ: Okay, thanks!
< ShikharJ> zoq: The bigger issue lies with BinaryRBM code.
< ShikharJ> zoq: In my system, SSRBM is still giving about 74% accuracy, while BinaryRBM is just a notch above 65%.
< zoq> For the binary test I get: error: addition: incompatible matrix dimensions: 100x10 and 100x1, sounds like some batch size issue.
< ShikharJ> zoq: Have you pulled in the latest code from the branch?
< ShikharJ> zoq: Because these issues were there in previous versions of the code, which at least builds and runs fine for now? What configuration of CMake are you using?
< zoq> cmake -DBUILD_CLI_EXECUTABLES=OFF -DDEBUG=ON -DBUILD_PYTHON_BINDINGS=OFF ..
< zoq> last commit is 98b5fc04d, which I think is the latest version
< zoq> travis ends up with the same error
< ShikharJ> zoq: Ok, the commit seems to be fine, I'll look into this as well. Thanks for letting me know.
lozhnikov has quit [Ping timeout: 240 seconds]