verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< Atharva>
zoq: rcurtin: The NormalDistribution class has become very specific to the ann module, so Sumedh and I were thinking about moving it into the ann module under a new folder, dists. Is that okay? Or should we keep it in the core/dists folder?
< zoq>
Atharva: In this case I would put the class into the ann folder.
< Atharva>
zoq: Okay, so should I create a new dists folder under the ann folder?
< zoq>
Atharva: I think that is a good idea.
< ShikharJ>
zoq: Do you have any further comments for the DCGAN API? That PR is required for WGAN.
< Atharva>
zoq: Okay, I will go ahead with it.
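A rough sketch of what the layout might look like after the move; the exact path is an assumption based on the discussion above:

```cpp
// Assumed new location of the ANN-specific class:
//   src/mlpack/methods/ann/dists/normal_distribution.hpp
#include <mlpack/methods/ann/dists/normal_distribution.hpp>

using namespace mlpack::ann;

// Default-constructed normal distribution for use inside ann layers.
NormalDistribution<> dist;
```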
< zoq>
ShikharJ: No, let me hit the merge button.
< ShikharJ>
zoq: Sorry for bothering you again and again regarding this.
< zoq>
ShikharJ: No worries, just wanted to get the test time down :)
< ShikharJ>
zoq: For the next two weeks, I'll be focusing on the WGAN PR, the weight clipping methods for WGAN, and completing the Dual Optimizer PR. Please let me know if that's fine.
travis-ci has joined #mlpack
< travis-ci>
mlpack/mlpack#5130 (master - 6fd5e52 : Marcus Edel): The build was broken.
< wenhao>
zoq, rcurtin: I'm sorry for the late reply. I am a bit stuck programming with OpenCL for my term project this week :(
< wenhao>
zoq: For different search policies, since they use different metrics for choosing neighbors and calculating distances, I think the resulting neighbors and similarities are different
< wenhao>
zoq: And by "accumulate the results over multiple runs", did you mean running the algorithm with different seeds ?
< wenhao>
rcurtin: Yes, that will be useful. I didn't know that LSH is an alternative search method in mlpack. Thanks for the advice :)
< wenhao>
lozhnikov: Hi Mikhail. I am thinking about how to implement BiasSVD and SVD++. One issue is that they are not based on matrix factorization in the form V = W * H, so I might have to refactor the CFType<...> class template to allow for the implementation of BiasSVD and SVD++ models.
< wenhao>
One of my ideas is to rename the current `CFType<>` to something like `CFMatrixDecompositionModel`, which would be used specifically for CF algorithms based on matrix factorization. Then we can add a wrapper class CFType<ModelType> with interfaces including
< wenhao>
`Predict`, `GetRecommendations`, etc.
< wenhao>
In this way we can easily add models based on methods other than matrix factorization.
< wenhao>
I'm not sure whether it's the best way to implement the new models. Any ideas or suggestions would be helpful!
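A minimal sketch of the wrapper idea, assuming hypothetical method signatures (only `CFMatrixDecompositionModel` is a name suggested above; everything else is illustrative):

```cpp
#include <mlpack/core.hpp>
#include <utility>

// Hypothetical wrapper: CFType<ModelType> forwards to whatever model it
// holds, so models that are not of the form V = W * H can be added later.
template<typename ModelType>
class CFType
{
 public:
  CFType(ModelType model) : model(std::move(model)) { }

  // Predicted rating of `item` by `user`.
  double Predict(const size_t user, const size_t item) const
  {
    return model.Predict(user, item);
  }

  // Top-k recommendations for each user in `users`.
  void GetRecommendations(const size_t k,
                          arma::Mat<size_t>& recommendations,
                          const arma::Col<size_t>& users)
  {
    model.GetRecommendations(k, recommendations, users);
  }

 private:
  // e.g. CFMatrixDecompositionModel, or a future BiasSVD/SVD++ model.
  ModelType model;
};
```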
wenhao has quit [Ping timeout: 260 seconds]
< ShikharJ>
rcurtin: Are you there?
< zoq>
ShikharJ: Sounds like a good plan to me.
< ShikharJ>
If we're able to pull that off, we'll be done with our main goals by Phase II. Then we can focus on the other pending PR by Kris on RBMs.
< zoq>
Right, plenty of time for some cool experiments too.
wenhao has joined #mlpack
travis-ci has joined #mlpack
< travis-ci>
manish7294/mlpack#34 (lmnn - 58b17e2 : Manish): The build has errored.
< manish7294>
rcurtin: I tried changing constraints to a class member but somehow things broke. If you could have a quick look over the commit "update documentation", I think that could help a lot.
< ShikharJ>
rcurtin: No worries, I overcame the issue I was facing.
wenhao has quit [Ping timeout: 260 seconds]
< rcurtin>
ShikharJ: ok, sounds good
< rcurtin>
manish7294: ok, let me take a look...
< ShikharJ>
rcurtin: Though I was wondering how you came up with the name ratml for your personal domain (more specifically, Rage Against The Machine Learning)?
< rcurtin>
ShikharJ: it was a joke based on 'rage against the machine' that a friend made... so I can't claim credit myself
< rcurtin>
but since he did not work in machine learning I stole it :)
< rcurtin>
manish7294: I see in the commit that you kept `dataset` as a member of the Constraints class; I was thinking that maybe you could remove that and take `dataset` as a parameter to Impostors() or TargetNeighbors()
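A sketch of what that refactoring could look like; the exact argument lists of Impostors() and TargetNeighbors() are assumptions, not the real LMNN code:

```cpp
#include <mlpack/core.hpp>

// Constraints no longer stores the dataset; callers pass the (possibly
// transformed) dataset and labels by const reference on every call.
template<typename MetricType = mlpack::metric::SquaredEuclideanDistance>
class Constraints
{
 public:
  Constraints(const size_t k) : k(k) { }

  // Compute impostors using the dataset passed in, not a stored copy.
  void Impostors(arma::Mat<size_t>& outputMatrix,
                 const arma::mat& dataset,
                 const arma::Row<size_t>& labels);

  // Compute target neighbors in the same way.
  void TargetNeighbors(arma::Mat<size_t>& outputMatrix,
                       const arma::mat& dataset,
                       const arma::Row<size_t>& labels);

 private:
  size_t k;  // Number of target neighbors / impostors to find.
};
```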
< ShikharJ>
rcurtin: The rock band 'Rage Against The Machine'?
< rcurtin>
yeah
< ShikharJ>
haha
< manish7294>
rcurtin: Okay will try that, thanks!
< rcurtin>
right, I see that the commit you sent failed the test. I'm not sure why, but if you want to try the approach I suggested, we can try and find a bug once it's implemented (if there is one)
< manish7294>
Ya, after doing that the optimization process was unbalanced and the results were extremely poor
< rcurtin>
right, though ideally it should not have changed the results at all
manish72942 has joined #mlpack
< manish72942>
Ya, that seems strange
manish7294 has quit [Ping timeout: 245 seconds]
< manish72942>
maybe I missed something while making that change
< rcurtin>
yeah, it's possible that 'dataset' was being used somewhere where 'transformedDataset' should have been used
< rcurtin>
that would be my first guess
< rcurtin>
but let's see what happens with the refactoring and then dig in if we need
< zoq>
wenhao: Ideally, we'd use the same conditions for each, but I'm not sure that's easy at this point, so using different seeds might be a first test to see drifts in the results.
< zoq>
wenhao: I'm not sure I see the reason for another class; can you clarify that point?
< manish72942>
rcurtin: Looks like I found the issue. If you look at shuffle() in lmnn_function_impl.hpp, the labels are shuffled too, which invalidates the precalculated part of Constraints and hence the poor results. So we may have to call Precalculate() again on each shuffle() call.
< manish72942>
I will make the required changes by tomorrow and then we can have it merged.
< rcurtin>
manish72942: right, I guess maybe we have to pass the shuffled labels also then
< manish72942>
Maybe now we can use Dataset() and Label() and just make a call to Precalculate() during shuffle(); this way we can avoid changing much of the structure. Does that sound reasonable?
< rcurtin>
manish72942: the code you sent earlier ended up making a copy of the dataset each time Dataset() was called
< rcurtin>
so even if it is a little more work, I think it's better to refactor so that the dataset and labels are passed as reference-type parameters
< manish72942>
sure, thanks for explaining :)
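As a rough illustration of the fix being discussed (the member and function names loosely follow lmnn_function_impl.hpp but are assumptions here):

```cpp
// Hypothetical Shuffle(): permute the dataset and the labels with the same
// ordering, then rebuild the precalculated constraints so they stay
// consistent with the new order.
template<typename MetricType>
void LMNNFunction<MetricType>::Shuffle()
{
  // One random permutation applied to both the dataset columns and labels.
  arma::uvec ordering = arma::shuffle(
      arma::linspace<arma::uvec>(0, dataset.n_cols - 1, dataset.n_cols));

  // Materialize into temporaries first to avoid aliasing issues.
  arma::mat newDataset = dataset.cols(ordering);
  arma::Row<size_t> newLabels = labels.cols(ordering);
  dataset = std::move(newDataset);
  labels = std::move(newLabels);

  // Re-run the precalculation, passing the shuffled data and labels by
  // reference so no copies are made.
  constraint.Precalculate(dataset, labels);
}
```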
< ShikharJ>
zoq: Sorry for reaching out a little late. Are you there?
< zoq>
ShikharJ: I'm here.
< ShikharJ>
zoq: I was hoping to keep the DualOptimizer API as close as possible to the current optimizer API, but it seems I'll have to pass some additional parameters to make it work.
< ShikharJ>
For example, in a single optimizer, in the optimize step, we run the routine over the entire set of parameters.
< ShikharJ>
But in the case of GANs, we'll have to train the generator separately up to the genWeights parameter, and the discriminator from there on until the end.
< ShikharJ>
I mean the indices of the submatrices would be from 0 to genWeights - 1 and from genWeights to parameters.n_elem - 1, or something like that.
< ShikharJ>
zoq: Do you think we should add the genWeights parameter to the Optimize() function in the dual optimizer class, or should we instead pass it in the constructor itself?
< zoq>
ShikharJ: hm, how we update the parameters is function-specific (GAN, logistic regression, etc.), so in the GAN function we could select the correct index.
< zoq>
The issue I see is that we allocate unnecessary/unused memory for the optimization process.
< zoq>
So what you propose is to pass the weights for the two functions, right?
< ShikharJ>
zoq: Just the indices of the weight boundaries, but I'm not sure whether this should happen inside the Optimize() function, which has a set number of arguments for most optimizers, or in the constructor.
< zoq>
I don't see any issues with the constructor or the Optimize() method; in either case we should set a default parameter which uses the full set for both. That way, it can be used with the existing methods.
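A rough sketch of the boundary-index idea under discussion, using a hypothetical DualOptimizer; by default the full parameter set is used for both optimizers, so existing function types keep working:

```cpp
#include <mlpack/core.hpp>
#include <utility>

// Hypothetical dual optimizer: opt1 updates parameters [0, genWeights),
// opt2 updates [genWeights, n_elem). `parameters` is assumed to be a
// flattened column vector, as in the GAN class.
template<typename Optimizer1Type, typename Optimizer2Type>
class DualOptimizer
{
 public:
  // genWeights == 0 means "no split": both optimizers see all parameters,
  // so existing FunctionTypes can be used unchanged.
  DualOptimizer(Optimizer1Type opt1,
                Optimizer2Type opt2,
                const size_t genWeights = 0) :
      opt1(std::move(opt1)),
      opt2(std::move(opt2)),
      genWeights(genWeights) { }

  template<typename FunctionType>
  double Optimize(FunctionType& function, arma::mat& parameters)
  {
    if (genWeights == 0)
    {
      // No boundary given: optimize the full parameter set with both.
      return opt1.Optimize(function, parameters) +
             opt2.Optimize(function, parameters);
    }

    // Non-owning views over the generator and discriminator parts; writes
    // go straight through to `parameters`.
    arma::mat genParams(parameters.memptr(), genWeights, 1, false, true);
    arma::mat discParams(parameters.memptr() + genWeights,
                         parameters.n_elem - genWeights, 1, false, true);

    // How FunctionType interprets a partial parameter matrix is exactly
    // the open design question discussed above.
    return opt1.Optimize(function, genParams) +
           opt2.Optimize(function, discParams);
  }

 private:
  Optimizer1Type opt1;
  Optimizer2Type opt2;
  size_t genWeights;  // Boundary between generator and discriminator weights.
};
```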
< ShikharJ>
zoq: Consequently, I also believe that we'll need to have two separate Gradient and Evaluate functions for the two networks, right?
< ShikharJ>
Because they have two separate optimizers?
< zoq>
You are talking about the GAN class right?
< ShikharJ>
Yes
< zoq>
Because the input for one network is generated by the other network, I think you are right.
< zoq>
What about passing a single function (the GAN class) that handles the specifics inside the class itself?
< zoq>
I guess that is basically what we have right now.
< zoq>
Just that we'd have this dual optimizer class with additional bounds information.
< zoq>
If we implement a specific Evaluate/Gradient function, the optimizer becomes very GAN-specific.
< ShikharJ>
What I have trouble visualizing is that, since we only have a single Evaluate/Gradient function inside the GAN class, when we shift to two of each, how would the individual optimizers know which ones to refer to?
< zoq>
ohh, you are right, I missed that point
< zoq>
I guess in this case there is no easy way around the dual function approach
< zoq>
What do you think about specializing the DualOptimizer class itself for the GAN class? That way we could overwrite the Optimize() function and signal once the other network is trained.
< ShikharJ>
zoq: Can we try exploring a template-enabled dual-function implementation? Maybe we define two Evaluate functions, but use enable_if to check for the required function (it might not be as easy as I'm guessing it to be, but we can explore this).
< ShikharJ>
Or maybe we can also try what you mentioned above.
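A sketch of how the enable_if dispatch could look; EvaluateGenerator() is a hypothetical member name and the signatures are simplified, so this only illustrates the mechanism:

```cpp
#include <mlpack/core.hpp>
#include <type_traits>
#include <utility>

// C++11-style detection of a member EvaluateGenerator(const arma::mat&).
template<typename T>
class HasEvaluateGenerator
{
 private:
  template<typename U>
  static auto Check(U* u) -> decltype(
      u->EvaluateGenerator(std::declval<const arma::mat&>()),
      std::true_type());

  template<typename>
  static std::false_type Check(...);

 public:
  static const bool value = decltype(Check<T>(nullptr))::value;
};

// If the function type exposes a generator-specific objective, call it...
template<typename FunctionType>
typename std::enable_if<HasEvaluateGenerator<FunctionType>::value, double>::type
GeneratorObjective(FunctionType& function, const arma::mat& parameters)
{
  return function.EvaluateGenerator(parameters);
}

// ...otherwise fall back to the ordinary Evaluate().
template<typename FunctionType>
typename std::enable_if<!HasEvaluateGenerator<FunctionType>::value, double>::type
GeneratorObjective(FunctionType& function, const arma::mat& parameters)
{
  return function.Evaluate(parameters);
}
```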