verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< jenkins-mlpack> Project docker mlpack weekly build build #47: STILL UNSTABLE in 3 hr 41 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20weekly%20build/47/
< jenkins-mlpack> * wenhao.huang.work: add cf no_normalization
< jenkins-mlpack> * wenhao.huang.work: normalizationType constructor
< jenkins-mlpack> * wenhao.huang.work: small bugfix
< jenkins-mlpack> * wenhao.huang.work: add normalization cmakelist
< jenkins-mlpack> * wenhao.huang.work: modify cf files
< jenkins-mlpack> * wenhao.huang.work: change CF to CF<>
< jenkins-mlpack> * wenhao.huang.work: add normalization to cmakelist
< jenkins-mlpack> * wenhao.huang.work: update comments
< jenkins-mlpack> * wenhao.huang.work: style fix
< jenkins-mlpack> * wenhao.huang.work: bug fix
< jenkins-mlpack> * wenhao.huang.work: add overall mean normalization
< jenkins-mlpack> * wenhao.huang.work: add user/item normalization
< jenkins-mlpack> * wenhao.huang.work: bugfix
< jenkins-mlpack> * wenhao.huang.work: add z-score normalization
< jenkins-mlpack> * wenhao.huang.work: add combined normalization
< jenkins-mlpack> * wenhao.huang.work: update comments
< jenkins-mlpack> * wenhao.huang.work: very small style fix
< jenkins-mlpack> * wenhao.huang.work: update comments & debug
< jenkins-mlpack> * wenhao.huang.work: remove if(cleanData) block
< jenkins-mlpack> * haritha1313: advanced ctor
< jenkins-mlpack> * wenhao.huang.work: use complete sentences for examples
< jenkins-mlpack> * wenhao.huang.work: change method names to Mean() and return const refenrence
< jenkins-mlpack> * wenhao.huang.work: change param specification
< jenkins-mlpack> * wenhao.huang.work: new line for brace
< jenkins-mlpack> * wenhao.huang.work: templatize Normalize() functions in some classes
< jenkins-mlpack> * haritha1313: tests edit
< jenkins-mlpack> * haritha1313: style edits
< jenkins-mlpack> * wenhao.huang.work: initialize members in constructor
< jenkins-mlpack> * wenhao.huang.work: style
< jenkins-mlpack> * wenhao.huang.work: style fix
< jenkins-mlpack> * wenhao.huang.work: Denormalize(users(i), ...)
< jenkins-mlpack> * wenhao.huang.work: change from arma::vec userMean to arma::rowvec userMean
< jenkins-mlpack> * Marcus Edel: Minor style fixes and add serialization.
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#31 (lmnn - 1b571f2 : Manish): The build has errored.
travis-ci has left #mlpack []
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#32 (lmnn - 1f08582 : Manish): The build has errored.
travis-ci has left #mlpack []
< Atharva> zoq: rcurtin: The NormalDistribution class has become very specific to the ann module, so Sumedh and I were thinking about moving it into the ann module under a new dists folder. Is that okay? Or should we keep it in the core/dists folder?
< zoq> Atharva: In this case I would put the class into the ann folder.
< Atharva> zoq: Okay, so should I create a new dists folder under the ann folder?
< zoq> Atharva: I think that is a good idea.
< ShikharJ> zoq: Do you have any further comments for the DCGAN API? That PR is required for WGAN.
< Atharva> zoq: Okay, I will go ahead with it.
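For reference, the move being discussed would shift the header out of the core distributions directory and into the ann module; the exact paths below are assumptions until the change lands:

    // Current location (in core/dists):
    #include <mlpack/core/dists/normal_distribution.hpp>

    // Proposed location, under a new dists folder in the ann module:
    #include <mlpack/methods/ann/dists/normal_distribution.hpp>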
< zoq> ShikharJ: No, let me hit the merge button.
< ShikharJ> zoq: Sorry for bothering you again and again regarding this.
< zoq> ShikharJ: No worries, just wanted to get the test time down :)
< ShikharJ> zoq: For the next two weeks, I'll be focusing on the WGAN PR, weight clipping methods for WGAN, and completing the Dual Optimizer PR. Please let me know if that's fine?
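For context, the weight clipping mentioned here is the WGAN trick of clamping every critic weight into a small interval after each update; a minimal Armadillo sketch (the helper name and the clip value 0.01 are illustrative):

    #include <armadillo>

    // Clamp every element of the critic's weight matrix into [-c, c].
    void ClipWeights(arma::mat& parameters, const double c = 0.01)
    {
      parameters = arma::clamp(parameters, -c, c);
    }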
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#5130 (master - 6fd5e52 : Marcus Edel): The build was broken.
travis-ci has left #mlpack []
< jenkins-mlpack> Project docker mlpack nightly build build #357: STILL UNSTABLE in 2 hr 53 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20nightly%20build/357/
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#33 (lmnn - 104fb2a : Manish): The build has errored.
travis-ci has left #mlpack []
wenhao has joined #mlpack
< wenhao> zoq, rcurtin: I'm sorry for the late reply. I'm a bit stuck programming with OpenCL for my term project this week :(
< wenhao> zoq: Since different search policies use different metrics for choosing neighbors and calculating distances, I think the resulting neighbors and similarities will differ
< wenhao> zoq: And by "accumulate the results over multiple runs", did you mean running the algorithm with different seeds?
< wenhao> rcurtin: Yes, that will be useful. I didn't know that LSH is an alternative search method in mlpack. Thanks for the advice :)
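For anyone following along, approximate nearest neighbor search via LSH is available in mlpack; a minimal usage sketch (the data and the projection/table counts are placeholders):

    #include <mlpack/methods/lsh/lsh_search.hpp>

    using namespace mlpack::neighbor;

    int main()
    {
      // One point per column, as usual in mlpack.
      arma::mat referenceData(10, 1000, arma::fill::randu);
      arma::mat queryData(10, 100, arma::fill::randu);

      // 10 projections per table and 30 tables are illustrative values.
      LSHSearch<> lsh(referenceData, 10, 30);

      arma::Mat<size_t> neighbors;
      arma::mat distances;
      lsh.Search(queryData, 3 /* k */, neighbors, distances);
    }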
< wenhao> lozhnikov: Hi Mikhail. I am thinking about how to implement BiasSVD and SVD++. One issue is that they are not based on matrix factorization in the form of V = W * H, so I might have to refactor the CFType<...> class template to allow for the implementation of BiasSVD and SVD++ models.
< wenhao> One of my ideas is to rename the current `CFType<>` to something like `CFMatrixDecompositionModel`, which would be used specifically for CF algorithms based on matrix factorization. And then we can add a wrapper class CFType<ModelType> with interfaces including
< wenhao> `Predict`, `GetRecommendations`, etc.
< wenhao> In this way we can easily add models based on methods other than matrix factorization.
< wenhao> I'm not sure whether it's the best way to implement the new models. Any ideas or suggestions would be helpful!
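A rough sketch of the wrapper wenhao proposes; every name here is hypothetical and this is not existing mlpack API:

    #include <mlpack/core.hpp>
    #include <utility>

    // Wraps any decomposition model behind a uniform CF interface, so that
    // models not of the V = W * H form (BiasSVD, SVD++) fit in as well.
    template<typename ModelType>
    class CFType
    {
     public:
      explicit CFType(ModelType model) : model(std::move(model)) { }

      // Delegate rating prediction to the underlying model.
      double Predict(const size_t user, const size_t item) const
      { return model.Predict(user, item); }

      // Delegate top-N recommendation to the underlying model.
      void GetRecommendations(const size_t numRecs,
                              arma::Mat<size_t>& recommendations)
      { model.GetRecommendations(numRecs, recommendations); }

     private:
      ModelType model;
    };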
wenhao has quit [Ping timeout: 260 seconds]
< ShikharJ> rcurtin: Are you there?
< zoq> ShikharJ: Sounds like a good plan to me.
< ShikharJ> If we're able to pull that off, we'll be done with our main goals by Phase II. Then we can focus on the other pending PR by Kris on RBMs.
< zoq> Right, plenty of time for some cool experiments too.
wenhao has joined #mlpack
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#34 (lmnn - 58b17e2 : Manish): The build has errored.
travis-ci has left #mlpack []
ImQ009 has joined #mlpack
< rcurtin> ShikharJ: I'm here now, sorry---I slept a little late today
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#35 (lmnn - 4321cb7 : Manish Kumar): The build passed.
travis-ci has left #mlpack []
manish7294 has joined #mlpack
< manish7294> rcurtin: I tried changing constraints to a class member but somehow things broke. If you could have a quick look over the commit "update documentation", I think that would help a lot.
< ShikharJ> rcurtin: No worries, I overcame the issue I was facing.
wenhao has quit [Ping timeout: 260 seconds]
< rcurtin> ShikharJ: ok, sounds good
< rcurtin> manish7294: ok, let me take a look...
< ShikharJ> rcurtin: Though I was wondering how you came up with the name ratml for your personal domain (more specifically Rage Against The Machine Learning)?
< rcurtin> ShikharJ: it was a joke based on 'rage against the machine' by a friend... so I can't claim credit myself
< rcurtin> but since he did not work in machine learning I stole it :)
< rcurtin> manish7294: I see in the commit that you kept `dataset` as a member of the Constraints class; I was thinking that maybe you could remove that and take `dataset` as a parameter to Impostors() or TargetNeighbors()
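A rough sketch of the refactoring rcurtin suggests; the class shape and signatures are hypothetical and only illustrate passing the dataset by const reference instead of storing a copy:

    #include <mlpack/core.hpp>

    // Sketch only: the real Constraints class in the LMNN code differs.
    template<typename MetricType = mlpack::metric::SquaredEuclideanDistance>
    class Constraints
    {
     public:
      // The dataset is no longer a member; each call receives it (and the
      // labels) by const reference, so no copy lives inside the class.
      void Impostors(arma::Mat<size_t>& outputMatrix,
                     const arma::mat& dataset,
                     const arma::Row<size_t>& labels);

      void TargetNeighbors(arma::Mat<size_t>& outputMatrix,
                           const arma::mat& dataset,
                           const arma::Row<size_t>& labels);
    };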
< ShikharJ> rcurtin: The rock band 'Rage Against The Machine'?
< rcurtin> yeah
< ShikharJ> haha
< manish7294> rcurtin: Okay will try that, thanks!
< rcurtin> right, I see that the commit you sent failed the test. I'm not sure why, but if you want to try the approach I suggested, we can try and find a bug once it's implemented (if there is one)
< manish7294> Ya, on doing that the optimization process became unbalanced and the results were extremely poor
< rcurtin> right, now ideally it should not have changed the results at all
manish72942 has joined #mlpack
< manish72942> Ya, that seems strange
manish7294 has quit [Ping timeout: 245 seconds]
< manish72942> maybe I missed something while making that change
< rcurtin> yeah, it's possible that 'dataset' was being used somewhere where 'transformedDataset' should have been used
< rcurtin> that would be my first guess
< rcurtin> but let's see what happens with the refactoring and then dig in if we need
< zoq> wenhao: Ideally, we would use the same conditions for each, but I'm not sure that's easy enough at this point, so using different seeds might be a first test to see drift in the results.
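A minimal sketch of the seed-based comparison zoq suggests (the run itself is a placeholder):

    #include <mlpack/core.hpp>

    void CompareAcrossSeeds()
    {
      // Accumulate results over several runs with different seeds and
      // compare the drift in the returned neighbors/similarities.
      for (size_t seed = 1; seed <= 10; ++seed)
      {
        mlpack::math::RandomSeed(seed);
        // ... run the search policy and record neighbors/similarities ...
      }
    }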
< zoq> wenhao: I'm not sure I see the reason for another class, can you clarify that point?
< manish72942> rcurtin: Looks like I found the issue. If you see the shuffle() in lmnn_function_impl.hpp, the labels are shuffled too, leaving the precalculated part of Constraints invalid and hence the poor results. So we may have to call Precalculate() again on each shuffle() call.
< manish72942> I will make the required changes by tomorrow and then we can have it merged.
< rcurtin> manish72942: right, I guess maybe we have to pass the shuffled labels also then
< manish72942> Maybe now we can use the Dataset() and a Label() and just make a call to Precalculate() during shuffle(); this way we can avoid changing much of the structure. Does that sound reasonable?
< rcurtin> manish72942: the code you sent earlier ended up making a copy of the dataset each time Dataset() was called
< rcurtin> so even if it is a little more work I think it's better to refactor so that the dataset and labels are being passed as reference-type parameters
< manish72942> sure, thanks for explaining :)
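A minimal sketch of the fix being discussed, assuming a Precalculate() that takes the data and labels by reference (the member names here are illustrative):

    // Inside a hypothetical LMNNFunction member function:
    void Shuffle()
    {
      // Shuffle the points and labels together so they stay aligned.
      mlpack::math::ShuffleData(dataset, labels, dataset, labels);

      // The target neighbors and impostors precalculated by Constraints are
      // now stale, so recompute them from the shuffled data.
      constraints.Precalculate(dataset, labels);
    }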
< ShikharJ> zoq: Sorry for reaching out a little late. Are you there?
< zoq> ShikharJ: I'm here.
< ShikharJ> zoq: I was looking to make the DualOptimizer API as close as possible to the current optimizer API, but it seems I'll have to pass some additional parameters to make it work.
< ShikharJ> For example, with a single optimizer, the optimize step runs the routine over the entire set of parameters.
< ShikharJ> But in the case of GANs, we'll have to train the generator separately up to the genWeights parameters and the discriminator from there on till the end.
< ShikharJ> I mean the indices of the submatrices would be from 0 to genWeights - 1 and from genWeights to parameters.n_elem - 1, or something like that.
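As an illustration of that boundary split, using Armadillo's alias constructor so neither half is copied (all variable names here are hypothetical):

    // parameters stores the generator weights first, the discriminator's after.
    // arma::mat(ptr, n_rows, n_cols, copy_aux_mem, strict) creates an alias.
    arma::mat genParams(parameters.memptr(), genWeights, 1, false, false);
    arma::mat discParams(parameters.memptr() + genWeights,
                         parameters.n_elem - genWeights, 1, false, false);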
< ShikharJ> zoq: Do you think we should add the genWeights parameter in the Optimize() function in the dual Optimizer class, or should we instead pass this in the constructor itself?
< zoq> ShikharJ: hm, how we update the parameters is function-specific (GAN, logistic regression, etc.), so in the GAN function we could select the correct index.
< zoq> The issue I see is that we allocate unnecessary/unused memory for the optimization process.
< zoq> So what you propose is to pass the weights for the two functions right?
< ShikharJ> zoq: Just the indices of the weight boundaries, but I'm not sure whether this should happen inside the Optimize() function, which has a set number of arguments for most optimizers, or whether it should be done in the constructor.
manish72942 has quit [Ping timeout: 240 seconds]
< zoq> I don't see any issues with the constructor or the optimizer method; in either case we should set a default parameter which uses the full set for both, so that it can be used with the existing methods.
< ShikharJ> zoq: Consequently, I also believe that we'll need two separate Gradient and Evaluate functions for the two networks, right?
< ShikharJ> Because they have two separate optimizers?
< zoq> You are talking about the GAN class right?
< ShikharJ> Yes
< zoq> Because the input for one network is generated by the other, I think you are right.
< zoq> What about passing a single function (the GAN class) that handles the specifics inside the class itself?
< zoq> I guess that is basically what we have right now.
< zoq> Just that we have this dual optimizer class with additional bounds information.
< zoq> If we implement a specific Evaluate/Gradient function, the optimizer is very GAN-specific.
< ShikharJ> What I have trouble visualizing is: since we have only a single Evaluate/Gradient function inside the GAN class, when we shift to two of each, how would the individual optimizers know which ones to refer to?
< zoq> ohh, you are right, I missed that point
< zoq> I guess in this case there is no easy way around the dual function approach
< zoq> What do you think if we specialize the DualOptimizer class itself for the GAN class? That way we could overwrite the Optimize function and signal once the other network is trained.
< ShikharJ> zoq: Can we try exploring a template-enabled dual function implementation? Maybe we define two Evaluate functions, but use enable_if to check for the required function (it might not be as easy as I'm guessing, but we can explore this).
< ShikharJ> Or maybe we can also try what you mentioned above.
< zoq> line 149
< zoq> We could also see if we are able to use some template functions like enable_if as you suggested.
< ShikharJ> zoq: Ah I see, that can certainly be explored. I'll let you know what I find. I'll explore these two options for now.
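A toy sketch of the enable_if idea being floated (purely illustrative; not proposed mlpack API):

    #include <armadillo>
    #include <type_traits>

    // Tags selecting which network a GANFunction instance drives.
    struct GeneratorPolicy { };
    struct DiscriminatorPolicy { };

    template<typename Policy>
    class GANFunction
    {
     public:
      // Enabled only when this instance drives the generator's optimizer.
      template<typename P = Policy>
      typename std::enable_if<std::is_same<P, GeneratorPolicy>::value,
                              double>::type
      Evaluate(const arma::mat& /* parameters */)
      { /* evaluate the generator objective */ return 0.0; }

      // Enabled only when it drives the discriminator's optimizer.
      template<typename P = Policy>
      typename std::enable_if<std::is_same<P, DiscriminatorPolicy>::value,
                              double>::type
      Evaluate(const arma::mat& /* parameters */)
      { /* evaluate the discriminator objective */ return 0.0; }
    };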
< zoq> ShikharJ: I'll see if I can think of anything else.
ImQ009 has quit [Quit: Leaving]