verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
wenhao has quit [Ping timeout: 260 seconds]
killer_bee[m] has joined #mlpack
gmanlan has joined #mlpack
< gmanlan> rcurtin: where do you think would be a good place to add a C++ VS sample app referred to by one of the new /doc/guide tutorials?
< rcurtin> gmanlan: hmmm, we could make a doc/examples directory
< rcurtin> if you do that, I will also put some other example programs in there, I think it could be a good idea
< gmanlan> great, will do that
vivekp has quit [Ping timeout: 268 seconds]
vivekp has joined #mlpack
gmanlan has quit [Quit: Page closed]
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
manish7294 has joined #mlpack
< manish7294> rcurtin: Here is an accuracy curve over the number of passes on the vc2 dataset with k = 5, step size = 0.01, batch size = 50, optimizer = amsgrad, https://pasteboard.co/HpE35mm.png
< manish7294> Will this work?
< ShikharJ> zoq: Are you there?
< manish7294> rcurtin: I have posted some resultant graphs on PR, please have a look at them https://github.com/mlpack/mlpack/pull/1407#issuecomment-396837003
vivekp has quit [Ping timeout: 240 seconds]
vivekp has joined #mlpack
< jenkins-mlpack> Project docker mlpack nightly build build #348: UNSTABLE in 3 hr 32 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20nightly%20build/348/
witness_ has quit [Quit: Connection closed for inactivity]
vivekp has quit [Ping timeout: 268 seconds]
vivekp has joined #mlpack
< zoq> ShikharJ: yes
< ShikharJ> zoq: In FFNs, we take each input column as a data point and each row as a data dimension, correct? Then what does each slice represent?
< sumedhghaisas> Atharva: Hi Atharva
< sumedhghaisas> Sorry, got a little busy yesterday
< sumedhghaisas> Maybe I have a solution for the generation problem :)
< sumedhghaisas> rcurtin: Hi Ryan, it's super early in Georgia, but maybe you are awake? :)
< Atharva> sumedhghaisas: Hi Sumedh, don't worry about it.
< Atharva> Awesome! How?
< zoq> ShikharJ: That's correct. In case of the RNN class each time slice is a slice of a cube. If I remember right we don't use slices/cubes inside the FFN class.
< ShikharJ> zoq: I'm asking this because for the CelebA, I need to have 3 channels for each input image. I think we can take each slice for a different channel? What do you think?
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#25 (lmnn - 3415f20 : Manish): The build has errored.
travis-ci has left #mlpack []
< sumedhghaisas> Atharva: Okay. We first support adding an entire FFN as a layer. Then we create a decoder FFN and an encoder FFN, and we merge them together along with a repar layer in another FFN
< zoq> ShikharJ: Yes, I think we can do this, another option is to vectorize the input, which is necessary at some point anyway if the layer expects a matrix/vector.
< sumedhghaisas> So for conditional generation, the user can use the full FFN, and for unconditional generation they can just pass the Gaussian sample directly into the decoder
< zoq> sumedhghais Atharva: Perhaps the sequential layer is useful here, which is basically the FFN class but without the Train/Evaluate function.
< zoq> ShikharJ: But I guess it makes sense to provide an interface that accepts arma::cube as input.
< sumedhghaisas> zoq: ahh yes. That's better. Wait... but we need to Evaluate the decoder for generation
< ShikharJ> zoq: By vectorizing the input, I guess we'll have to introduce the least amount of changes in the codebase, but the work of preparing the dataset will fall on the user.
< zoq> sumedhghais: You can still use the Forward function, which is what Evaluate calls anyway.
< sumedhghaisas> zoq: yeah... that just slipped my mind
< zoq> ShikharJ: That depends on how we load the dataset right?
< sumedhghaisas> zoq: I also wanted to talk to you about the generative layers... by generative layers i mean layers which define distribution over the input. For example the output layer of VAE. Do you think we need to support that?
< zoq> ShikharJ: But I think using arma::cube is more intuitive, so perhaps a combination of both is the best way?
< sumedhghaisas> For example, in VAE my tensorflow implementation involves defining a distribution as the last layer and reconstruction loss in basically the log_prob of the input in that distribution
< ShikharJ> zoq: What I mean to say is that if we expect the input to be a vector irrespective of the number of channels, we'll practically have to do no additional work, as inside the convolution layer, we alias an input point as an arma::Cube(input, rows, cols, slices ...);
< sumedhghaisas> This generalizes to any distribution and input
< Atharva> sumedhghaisas: The Evaluate function calls the private function Forward and not the public member
< zoq> sumedhghais: I see that this could be helpful, but I guess if it's not necessary at this point, it could be delayed?
< ShikharJ> zoq: But when we take the input point as a cube of dimensions (Rows x 1 x Channels), we'll have to reshape the input data.
< sumedhghaisas> Atharva: I think Marcus is right. We could use a sequential layer for the decoder and use the Forward function to generate unconditional samples
< zoq> ShikharJ: What I'd like to avoid is adding support for arma::cube to each layer; creating an alias should be negligible.
< Atharva> sumedhghaisas: Yes, just checked it out
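A minimal sketch of the idea being discussed, assuming mlpack's Sequential layer and its Forward(input, output) interface; the layer sizes and the latent dimension (latentSize) are hypothetical:

    // Sketch only: the decoder built as a Sequential layer. After the full
    // VAE has been trained, its Forward() can be called directly to generate
    // unconditional samples, without going through Train()/Evaluate().
    Sequential<> decoder;
    decoder.Add<Linear<> >(latentSize, 256);
    decoder.Add<ReLULayer<> >();
    decoder.Add<Linear<> >(256, 784);
    decoder.Add<SigmoidLayer<> >();

    // Unconditional generation: pass a standard Gaussian sample straight in.
    arma::mat z = arma::randn<arma::mat>(latentSize, 1);
    arma::mat sample;
    decoder.Forward(std::move(z), std::move(sample));

For conditional generation the same decoder simply stays inside the full FFN, as described above.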
< sumedhghaisas> zoq: It's not super necessary right now. We could create a loss layer specific to binary images such as MNIST and use it in VAE, but that is as much work as defining the distribution layer. My question is, can the current framework accommodate layer output other than arma::mat?
< ShikharJ> zoq: I see, then we can just expect the input as a vector point itself.
< zoq> Atharva: Here is an example that creates two networks and merges the output: https://github.com/mlpack/mlpack/pull/1427
< sumedhghaisas> zoq: I think it can, as the OutputParameter type is templated
< zoq> ShikharJ: What we could do is provide a GAN interface class that takes arma::cube as input, and inside that class we can do something like arma::Mat(slice(.).memptr(), ...)
< zoq> sumedhghais: I don't mind modifying the output layer infrastructure; adding another template parameter should be easy.
< sumedhghaisas> zoq: I was thinking the same thing, but I am not able to judge how difficult that is. If it's relatively easy then I would push for a templatized distribution layer, which will make things very easy for VAE and other generative frameworks
< sumedhghaisas> as we already have some distributions defined, the framework will by default support various datasets, rather than us defining a loss for each specific one
< ShikharJ> zoq: I think, rather than creating an interface, we can provide a dataset conversion routine for converting the individual Cube inputs into vectorised columns.
< zoq> sumedhghais: I can help with the implementation.
< zoq> ShikharJ: That is a good idea. I guess at some point it makes sense to integrate that into the load function, which I think doesn't support arma::cube.
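A rough sketch of the conversion routine being discussed, assuming the images arrive as an arma::cube with one slice per channel; the function name and slice layout are assumptions, not an existing mlpack API:

    // Flatten a cube of images (n_rows x n_cols x (channels * nImages)) into
    // a matrix in which each column is one vectorised data point, the format
    // the FFN/GAN classes already expect.
    arma::mat CubeToColumns(const arma::cube& images, const size_t channels)
    {
      const size_t nImages = images.n_slices / channels;
      const size_t sliceDim = images.n_rows * images.n_cols;
      arma::mat data(sliceDim * channels, nImages);

      for (size_t i = 0; i < nImages; ++i)
      {
        for (size_t c = 0; c < channels; ++c)
        {
          // Stack each channel of image i into one long column.
          data.col(i).subvec(c * sliceDim, (c + 1) * sliceDim - 1) =
              arma::vectorise(images.slice(i * channels + c));
        }
      }

      return data;
    }

Going the other way, a layer can view a column as a cube without copying, e.g. arma::cube(data.colptr(i), rows, cols, channels, false, false), which matches the aliasing described above inside the convolution layer.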
< sumedhghaisas> zoq: amazing! I think it will also help in the GAN framework?
< zoq> sumedhghais: Yes, this could be a nice addition.
< sumedhghaisas> zoq: I think with the distribution layer and sequential layer, do we actually require a separate GAN class? We could do it by having the generator and discriminator as sequential layers, with the generator's output layer as a distribution layer. What do you think? This way GANs could use variational inference with a Repar layer attached to the generator
< sumedhghaisas> I am no good with GANs, but I saw a couple of papers using variational inference in GANs
< sumedhghaisas> zoq: I will add this implementation to next week's agenda. :)
manish7294 has quit [Ping timeout: 260 seconds]
< Atharva> sumedhghaisas: So, what exactly are we going to do with the output layer?
< zoq> sumedhghais: Like the idea to combine both :)
< sumedhghaisas> Atharva: Okay so the plan is this. Currently each layer stores its output in output parameter
< sumedhghaisas> The new layer we are going to define will output a distribution, core::dist object rather than a matrix
< sumedhghaisas> so we make sure this kind of layer can exist in the framework
< Atharva> Okay, yes
< sumedhghaisas> the layer in question is a very simple one, it just takes an input and defines a templatized distribution over it
< Atharva> So, will that replace the repar layer as well
< Atharva> Or will that be different
< sumedhghaisas> It will be the output layer for the VAE; that way the loss function we define, the reconstruction loss, becomes super easy: we just check the log_prob in the output distribution
< sumedhghaisas> Ohh, the repar layer will stay the same, these changes will affect the decoder output
< Atharva> Okay, so we are talking about the OutputLayer object, the final layer of networks
< Atharva> Understood
< sumedhghaisas> If we implement log_prob and its backward in the distribution, we could play around with any dataset
< sumedhghaisas> Okay this is the work for later. How is Jacobian test holding up? :)
< sumedhghaisas> Lets get the Repar layer completed first.
< Atharva> Sorry I had some other things yesterday, I will push a commit soon with Jacobian fixed. What do you think remains after that in Repar layer?
< sumedhghaisas> Atharva: and I will also recommend testing the 2 booleans passed to the layer. Super easy and short tests, but they will make sure that some future user cannot mess up the implementation
< Atharva> Okay, about the Jacobian, I have added the extra loss from visitor to the Backward function in FFN as well. Not just the Evaluate
< Atharva> Sorry, I still can't figure out why exactly it's failing
< sumedhghaisas> So we added Gradient test, Jacobian test, Forward and Backward and later the booleans, that should cover it :)
< sumedhghaisas> ohh is it still failing after the changes?
< sumedhghaisas> or do you mean you didn't understand why it was failing in the first place?
< Atharva> I haven't added the boolean yet, but it's failing after adding the extra loss to the Backward function.
< Atharva> Yes, I haven't exactly understood
< sumedhghaisas> Ahh okay lets walk you through the Jacobian test
< sumedhghaisas> Okay so we have two Jacobians: jacobianA, which is the approximate Jacobian, and jacobianB, which is the real one based on the gradients
< sumedhghaisas> okay wait, have you studied Jacobians?
< Atharva> I think I have but can't remember anything
< Atharva> Maybe it's better if I first look it up online and then we discuss it
< sumedhghaisas> Wikipedia should be enough for basic understanding :) its geometrical interpretation is too complex though
< sumedhghaisas> yeah :) take a look at what a Jacobian is and compare it with the computation of jacobianA and jacobianB in our code.
< Atharva> Okay sure
< sumedhghaisas> I have a little test to see if you understand it correctly or not. Try to answer this: what extra term do we have to add in the Forward of the layer to make the Jacobian test work with klBackward included in the gradients?
< sumedhghaisas> is the question clear enough?
< Atharva> question is clear, I will get back on this
< sumedhghaisas> great! Good luck!
< rcurtin> sumedhghaisas: I am awake now :)
< sumedhghaisas> rcurtin: ahh wanted to talk about the distribution layer issue :) BTW the extra loss collector visitor is added as part of Reparametrization layer PR. :)
< rcurtin> ok, do you mean the multivariate Gaussian distribution issue?
< sumedhghaisas> okay so the question regarding the distribution layer is: currently all layers output a matrix; can a layer output a dist object?
< sumedhghaisas> I mean can our current framework support it?
< rcurtin> I don't think that would make sense, since the next layer would expect a matrix type as input
< sumedhghaisas> ahh yes, the distribution will go inside the layer, any, not just gaussian
< rcurtin> what you could do, is output means and covariances as a matrix, and then the next layer could use those directly
< rcurtin> but I think in that case, you would require that the layer after the one that outputs means and covariances has a specific type
< rcurtin> i.e. if you put the means and covariances into a linear layer, it probably wouldn't make sense
< sumedhghaisas> rcurtin: we could make the next layer accept a dist as input. But now I am wondering whether we could bypass this by templatizing the next layer with the distribution and outputting dist parameters as you mentioned
< sumedhghaisas> So let me see if I can explain the issue in more detail
< Atharva> sumedhghaisas: From what I could figure out, jacobianA is approximate and B is true. This test is a lot like the gradient check, just that we are taking all the dimensions of the input into account at once
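For reference, a hedged sketch of that idea (not the exact mlpack test code): the approximate Jacobian perturbs every input dimension with central differences and records how every output dimension moves, while the real Jacobian is assembled from the layer's Backward() gradients, and the test checks that the two agree.

    #include <armadillo>
    #include <functional>

    // Approximate Jacobian of f: R^n -> R^m via central differences; row i
    // holds the sensitivity of every output element to input element i.
    arma::mat ApproxJacobian(const std::function<arma::vec(const arma::vec&)>& f,
                             arma::vec x,
                             const double eps = 1e-6)
    {
      const arma::vec y = f(x);
      arma::mat jacobian(x.n_elem, y.n_elem);

      for (size_t i = 0; i < x.n_elem; ++i)
      {
        x(i) += eps;
        const arma::vec yPlus = f(x);
        x(i) -= 2 * eps;
        const arma::vec yMinus = f(x);
        x(i) += eps;  // Restore the original value.

        jacobian.row(i) = ((yPlus - yMinus) / (2 * eps)).t();
      }

      return jacobian;
    }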
< rcurtin> I'm not sure I understand the situation fully. personally, I don't have a problem with either way, but if a layer outputs a non-matrix object, then we have to be very careful to ensure that a user can't add a subsequent layer that accepts a matrix object
< rcurtin> I wonder if it might be better to make a "combined" layer so that there is no need to output the distributions, you could just use them internally in the "combined" layer
< rcurtin> but, I don't know VAEs well, so like I said I don't fully understand the problem. you and Atharva know better, I am just proposing ideas based on what I think the problem is :)
< sumedhghaisas> A VAE's output is a distribution, and the reconstruction error is basically the log_prob of the input under the output distribution. Now if we have such a distribution layer, the reconstruction loss implementation becomes easy and very generic. It also helps us in generation: as the output is an actual dist, the Predict function will output the same and the user can sample from this distribution
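Concretely, for a Gaussian output with mean \mu and standard deviation \sigma, the reconstruction loss being described is the negative log-probability of the input x, summed over its elements (a standard form, not taken from the mlpack code):

    -\log p(x \mid \mu, \sigma) = \tfrac{1}{2}\log(2\pi\sigma^2) + \frac{(x - \mu)^2}{2\sigma^2}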
< rcurtin> oh, I see, so you are not passing the distribution between layers, the distribution is the output of the autoencoder
< sumedhghaisas> yes... but the distribution layer can be in the middle, for example in GANs
< sumedhghaisas> in GANs the output of the generator is a distribution, a sample of which is passed to the discriminator
< rcurtin> hmm, maybe it is worth looking at how Shikhar implemented this then?
< sumedhghaisas> in that case, we will have a distribution layer and the next layer will sample from it.
< sumedhghaisas> rcurtin: yeah I need to look into the GAN class in more detail. I wanted to make sure the user can use variational inference in any model they desire.
< sumedhghaisas> So the generic design becomes: any component is a Sequential layer as Marcus suggested; a component can be generative, outputting a distribution, or deterministic, and all these components connect together in an FFN
< sumedhghaisas> Now after the model is trained we could use the generative components separately, as they output a distribution
< rcurtin> I think that would be reasonable
< sumedhghaisas> very useful in VAEs and GANs; in a VAE the decoder is generative whereas the encoder is deterministic, and for sampling we need the decoder separately; similar for GANs with the generator and discriminator
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#27 (lmnn - 8a6709f : Manish): The build has errored.
travis-ci has left #mlpack []
< sumedhghaisas> the only issue is: can a layer output something other than a matrix in our framework? The problem with the next layer's input could be solved with templatization, and some new layers could be defined which accept dist input
< sumedhghaisas> Atharva: ahh sorry I forgot to reply to you :)
< sumedhghaisas> Atharva: You are right, jacobianA is the approximate one and jacobianB is the real one
< sumedhghaisas> although jacobianB also takes into account the error signal from KL, as we add it in Backward
< sumedhghaisas> but jacobianA is computed on the Forward function, which does not add KL to the output
< sumedhghaisas> thus, these 2 Jacobians differ
< sumedhghaisas> if we added KL to the Forward they would become the same
< sumedhghaisas> but KL is a part of the loss
< sumedhghaisas> so we add it separately in the loss
< Atharva> Yes, in the Repar Backward function I have added the KL error as well, but I also add the KL loss value (the KL Forward function) to the total loss in the FFN Backward function
< Atharva> Oh!
< Atharva> I see, but the JacobianA has no idea of the KL loss
< Atharva> Okay, I got a little confused because the Forward() function of the FFN class doesn't actually evaluate it
< Atharva> I will put in the second boolean
< Atharva> and test the booleans
wenhao has joined #mlpack
< rcurtin> sumedhghaisas: I think that it would be hard for a layer to output something that's not a matrix. so the suggestion from my end would be, just output a matrix that holds the means and covariances
< rcurtin> and when the next layer expects distributions as input, it can just use the passed input matrix that has means and covariances
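A minimal sketch of that convention; the packing scheme (standard deviations rather than full covariances) and the names are only an illustration, not mlpack's actual layout:

    // The layer outputs one matrix per batch: the first latentSize rows hold
    // the means, the next latentSize rows hold the standard deviations.
    arma::mat packed = arma::join_cols(means, stdDevs);

    // The next layer (e.g. the reparametrization layer) splits them back out:
    arma::mat mu    = packed.rows(0, latentSize - 1);
    arma::mat sigma = packed.rows(latentSize, 2 * latentSize - 1);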
manish7294 has joined #mlpack
< manish7294> rcurtin: Did you see the graphs?
wenhao has quit [Quit: Page closed]
< rcurtin> manish7294: I saw them, but I have not had a chance to respond
< rcurtin> it seems to me like there would not be an easy setting to choose; the datasets don't always converge quickly
< rcurtin> how are we doing with the overall runtime? has it been significantly reduced further?
< manish7294> rcurtin: I ran the datasets from 1 to 150 passes and the maximum total time was on the diabetes dataset (768 points), at 1 hr 21 mins
< manish7294> and on iris it was less than 15 mins
< rcurtin> on iris we really need to be shooting for more like 10 seconds or less
< rcurtin> where are the bottlenecks still?
< manish7294> for passes from 1 to 150
< rcurtin> I have a long flight on Friday and during that time I will be able to try out the pruning idea I have been talking about, but there will not be time for me to do that until then
< rcurtin> for now, the idea of only recomputing impostors every 100 iterations or so (or whatever number) is ok, I think
< rcurtin> possibly that could be increased even further
< manish7294> Ya, it gave some good speed up
< manish7294> I have pushed it to master
< rcurtin> the thing is, LMNN is kind of like a preprocessing step for some algorithms, not a learning algorithm itself. so nobody is going to want to wait many hours for a ~1-2% increase in kNN accuracy
< manish7294> iris 100 passes - 3.6secs
< manish7294> computing_neighbors: 2.113525s
< rcurtin> ok, sorry, maybe I misunderstood? I thought you said it took less than 15 minutes (I assumed that to mean it took nearly 15 minutes)
< rcurtin> on the iris dataset that is
< manish7294> recomputing after every 10 iterations, total_runtime = 1.7 secs
< manish7294> the 15 mins was for 150 LMNN runs, with passes running from 1 to 150
< rcurtin> ok, I see
< manish7294> the every-N-iterations idea is giving some good speedups
< rcurtin> I think we should develop a 'standard benchmark set' of datasets and optimizer configurations so that we can better track the progress
< rcurtin> do you think that would be worthwhile to do now, and then revisit LMNN to accelerate it?
< rcurtin> it feels to me a little like we are trying lots of things, but I don't feel like I have the best grasp of what has helped, what has hurt, and how far away we are from where we want to be (with respect to speed)
< manish7294> Sure, no problem
< rcurtin> I think that was set up for next week in your proposal, but maybe it is better to get the benchmarking scripts set up now, then use the benchmarking system to test the speed of the code as we improve it
< rcurtin> if you'd prefer not to do that, that's okay also, but I think it could be helpful
< manish7294> But will it work without LMNN being merged, since the benchmark system is in a different repo?
< manish7294> I will post some runtimes later today on the PR, so you can have a reference.
ImQ009 has joined #mlpack
< rcurtin> it's not too hard to modify the benchmarking system code to work with a custom mlpack, I can show you how
< rcurtin> you'd need to do that anyway to test any changes
< rcurtin> anyway, yeah, if you can post some runtimes on the PR, that would be great
< rcurtin> I'm sorry about the slowness from my end. the paper took more time than I expected, and technically I am taking a vacation this week so my work for LMNN has been reduced :)
< rcurtin> but I have time set aside on Friday, since I have a long flight. it will be perfect
< rcurtin> (for getting some code written that is)
< manish7294> rcurtin: no worries! Everything is good till now :)
< rcurtin> :)
< manish7294> rcurtin: I have added some runtimes over iris, I hope they at least help a bit https://github.com/mlpack/mlpack/pull/1407#issuecomment-396971811
manish7294_ has joined #mlpack
< rcurtin> manish7294: thanks, any chance you could also do the same for another dataset like vc2 or covertype-5k? (or maybe the full covertype? maybe that takes too long though)
< manish7294_> sure, I will do it for vc2; it will be a bit faster and more convenient, if that is okay? :)
manish7294 has quit [Ping timeout: 260 seconds]
manish7294_ has quit [Quit: Page closed]
< rcurtin> yeah, that's fine
ImQ009 has quit [Read error: Connection reset by peer]
manish7294 has joined #mlpack
< manish7294> rcurtin: Done! Added a comment regarding vc2 benchmarks on PR.
manish7294 has quit [Client Quit]
ImQ009 has joined #mlpack
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
< Atharva> sumedhghaisas: I updated the PR. Also, after Ryan's remarks, what should we decide to do finally, so that I can get started?
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
vpal has joined #mlpack
vivekp has quit [Ping timeout: 264 seconds]
vpal is now known as vivekp
yaswagner has joined #mlpack
< sumedhghaisas> Atharva: Sorry, got a little busy
< sumedhghaisas> So what we can do is, for now, create a reconstruction loss layer which accepts the distribution as a template parameter
< sumedhghaisas> The loss layer will take the VAE output, define a distribution over it, and compute the loss
< sumedhghaisas> For this, 2 functions need to be defined in the distributions: log_prob and its backward
< sumedhghaisas> For testing, we will make the VAE output a single-variable Gaussian distribution so we can use the currently defined distribution
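A rough sketch of what such a loss layer could look like; the class and method names here are hypothetical, not the final mlpack API:

    // A loss layer templatized on the output distribution. Forward()
    // interprets the network output as the distribution's parameters and
    // returns -log p(target); Backward() returns the gradient of that loss
    // with respect to the network output, as provided by the distribution.
    template<typename DistributionType>
    class ReconstructionLoss
    {
     public:
      double Forward(const arma::mat& networkOutput, const arma::mat& target)
      {
        // Assumed layout: first half of the rows are means, second half are
        // standard deviations.
        const size_t k = networkOutput.n_rows / 2;
        dist = DistributionType(networkOutput.rows(0, k - 1),
                                networkOutput.rows(k, 2 * k - 1));
        return -dist.LogProb(target);
      }

      void Backward(const arma::mat& target, arma::mat& gradient)
      {
        dist.LogProbBackward(target, gradient);
        gradient *= -1.0;  // The loss is -log p, so flip the sign.
      }

     private:
      DistributionType dist;
    };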
< sumedhghaisas> ahh but the batch dimension will still create a problem
< sumedhghaisas> okay. Let's create NormalDistribution in dist first, that's the next task
< Atharva> with mean and std, right
< ShikharJ> wife
< sumedhghaisas> Sorry if I am confusing you too much :) I will send out a mail detailing the task
< sumedhghaisas> Atharva: ahh you are right
< ShikharJ> Ah sorry, some random autofill typing :(
< sumedhghaisas> So are you clear on the NormalDistribution task?
< Atharva> Yeah, a mail will be nice :)
< ShikharJ> zoq: Are you there?
< sumedhghaisas> Atharva: Let's clear up as much as we can here and then summarize it in a mail
< Atharva> about the distribution, for the reconstruction loss, how can we calculate it with just the distribution?
< Atharva> Yes
< sumedhghaisas> Okay thats the next task :) We can also discuss that right now, but first lets be clear on the upcoming task, then I will try to explain this
< Atharva> I mean, using the data, we would take the mean squared error or the negative log likelihood
< Atharva> yeah right
< Atharva> we will discuss it later
< Atharva> Just to confirm, what will be the private members we will have? Because the GaussianDistribution class has mean, covariance, covLower, invCov, logDetCov
< Atharva> I also think I should open a different PR for this
< sumedhghaisas> Atharva: We have similar private functions
< sumedhghaisas> mean, variance, log_prob, log_prob backward
< sumedhghaisas> For now this should achieve the result
< sumedhghaisas> The distribution will accept a matrix of means, matrix of stddev to create a distribution
< sumedhghaisas> shape checking must be done in the constructor
< sumedhghaisas> log_prob should accept a matrix of the same shape and return the log probability of that matrix under the distribution
< sumedhghaisas> log_prob_backward should accept the same matrix and return the gradient of log_prob given that matrix
< sumedhghaisas> And yes, definitely separate PR for this :)
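Following that spec, a first sketch of the NormalDistribution class (an element-wise Gaussian; the method names LogProb / LogProbBackward and the stacked gradient layout are only suggestions, not settled API):

    #include <armadillo>
    #include <cmath>
    #include <stdexcept>

    // An element-wise normal distribution over a matrix, parameterized by a
    // matrix of means and a matrix of standard deviations of the same shape.
    class NormalDistribution
    {
     public:
      NormalDistribution() { }

      NormalDistribution(const arma::mat& mean, const arma::mat& stdDev) :
          mean(mean), stdDev(stdDev)
      {
        // Shape checking happens in the constructor, as discussed.
        if (mean.n_rows != stdDev.n_rows || mean.n_cols != stdDev.n_cols)
          throw std::invalid_argument("mean and stdDev must have the same shape");
      }

      // Sum over all elements of log N(x | mean, stdDev^2).
      double LogProb(const arma::mat& x) const
      {
        return arma::accu(-0.5 * std::log(2.0 * arma::datum::pi)
            - arma::log(stdDev)
            - arma::square(x - mean) / (2.0 * arma::square(stdDev)));
      }

      // Gradient of LogProb with respect to the parameters, stacked as
      // [d/d mean; d/d stdDev], each block the same shape as mean.
      void LogProbBackward(const arma::mat& x, arma::mat& gradient) const
      {
        const arma::mat dMean = (x - mean) / arma::square(stdDev);
        const arma::mat dStdDev =
            arma::square(x - mean) / arma::pow(stdDev, 3) - 1.0 / stdDev;
        gradient = arma::join_cols(dMean, dStdDev);
      }

     private:
      arma::mat mean, stdDev;
    };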
< Atharva> Okay, everything understood, I will get on this.
< sumedhghaisas> Great!
< Atharva> Also, whenever you are free, do review the sampling PR
< Atharva> I think it's done and further work should go in new PRs
< sumedhghaisas> Atharva: I am taking a look at it now :) But mostly everything looks good... If I don't find anything I will merge it in tomorrow :)
< sumedhghaisas> and yes, All future work should go in new PRs
< Atharva> I don't know why the AppVeyor build keeps failing; I will have to check what it says, it has failed for all the commits till now.
< Atharva> It builds without problems on my PC
< sumedhghaisas> Have to run outside for some time. Try to fix the AppVeyor build, if it doesn't work I will take a look tomorrow.
< Atharva> Don't worry about it, I will figure it out.
< yaswagner> Hi guys! I'm trying to build bindings for Go, and to do so I first need to make a C API. I'm working on trying to bind the CLI, so I'm working with the mlpack/core/util/ directory right now. I'm having trouble compiling my code. Is there a way for me to compile my files using cmake without having to recompile the whole library?
< rcurtin> yaswagner: cmake should only recompile the files that are needed, so if you've modified a core file it may need to recompile a lot
< rcurtin> if you just need the library libmlpack.so, you could do 'make mlpack' and this could save some time
< yaswagner> ok perfect thank you!
< rcurtin> also 'make -jN mlpack' will use N cores, which can help
< rcurtin> (substitute N with however many cores you want to use of course)
< yaswagner> Will do. I am not modifying a core file, I'm adding .h header files, so I think just using libmlpack.so should work!
< rcurtin> if nothing is actually being compiled, you can also do 'make mlpack_headers' which just moves all the source files from src/ into the build directory
< rcurtin> hope that helps, let me know if I can help with anything else :)
< rcurtin> I am going to step out to change some brake lines on my car now... I'll be back in a little while
< rcurtin> need somebody to press down on the brake pedal but I suspect nobody in this channel can help with that :)
< yaswagner> Perfect! will let you know if im still stuck
witness_ has joined #mlpack
ImQ009 has quit [Read error: Connection reset by peer]
< zoq> rcurtin: I could help you out if you wait like 24 hours?
ImQ009 has joined #mlpack
< rcurtin> zoq: ;)
< rcurtin> I think Emily will come home in the next few hours and I will ask her to do it :)
< zoq> probably faster and at least for me cheaper :)
ImQ009 has quit [Quit: Leaving]