verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
vivekp has quit [Ping timeout: 248 seconds]
vivekp has joined #mlpack
yaswagner has quit [Ping timeout: 260 seconds]
travis-ci has joined #mlpack
< travis-ci>
manish7294/mlpack#28 (master - 0128ef7 : Marcus Edel): The build passed.
< ShikharJ>
zoq: I'm sorry I couldn't reach out at that time, could you tell me what timezone you are in so that I can plan accordingly?
< ShikharJ>
zoq: Pardon me if I'm misunderstanding, but isn't the inSize parameter intended to serve as the number of channels in an input image?
< zoq>
ShikharJ: No worries, I got distracted so I couldn't take a look at the issue at the time you posted it, I'm in UTC + 2.
< zoq>
ShikharJ: About inSize you are right, and that would affect the number of weights, so we have to modify the conv layer; take a look at the inputTemp parameter, currently it only accounts for the first sample.
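(A rough illustration of the point above, not the actual mlpack code: if the input matrix holds one flattened sample per column, inputTemp has to alias inSize * batchSize slices instead of only the first sample's inSize slices. The image size, channel count, and batch size below are made-up values.)

#include <armadillo>
#include <iostream>

// Sketch: alias a flattened batch as a cube so that every sample's channels
// get a slice (inSize * batchSize slices), not just the first sample's inSize.
int main()
{
  const arma::uword inputWidth = 28, inputHeight = 28; // assumed image size
  const arma::uword inSize = 3, batchSize = 4;         // channels, batch size

  // One flattened sample per column, as the ann layers pass data around.
  arma::mat input(inputWidth * inputHeight * inSize, batchSize,
      arma::fill::randu);

  // Covers the whole batch; using only inSize slices here would cover just
  // the first sample.
  arma::cube inputTemp(input.memptr(), inputWidth, inputHeight,
      inSize * batchSize, false, false);

  std::cout << "slices: " << inputTemp.n_slices << std::endl;
  return 0;
}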
< ShikharJ>
zoq: I'll let you know what I find by today evening.
< zoq>
ShikharJ: Okay, hopefully the changes are minor.
< manish7294>
zoq: How do we make some new dataset additions to mlpack.org/datasets?
< manish7294>
And, can we add categorical datasets, mainly having categorical labels?
< zoq>
You can open an issue with the link to the dataset; I'll upload the dataset afterwards or just add the dataset to the list; or you can post the link here.
< zoq>
manish7294: That depends on the lib; in the case of mlpack all we do is forward the dataset (filename + path).
< zoq>
manish7294: I guess if you'd like to benchmark against matlab you might have to adjust the benchmark script since it's probably using dlmread.
< zoq>
manish7294: If mlpack can't handle the dataset right away we can write a simple preprocess step in python and pass the modified dataset.
< manish7294>
zoq: That sounds about right. Can we modify the method too? I remember Haritha doing something like that for the decision tree to support categorical data, though I am not sure.
< zoq>
manish7294: Should be straightforward, if you follow the PR.
< manish7294>
zoq: I will follow it then :)
< manish7294>
rcurtin: Here are some results on the letters dataset (20000 instances, 16 attributes) - lbfgs, k = 3, total time - 3 mins 49.5 secs, initial accuracy - 96.285, final - 96.905; amsgrad, total time - 4 mins 53.6 secs, final - 97.335;
< Atharva>
sumedhghaisas: You there?
< sumedhghaisas>
Atharva: Hi Atharva
< sumedhghaisas>
here now :)
< Atharva>
Will you be available in 2 hours?
ImQ009 has joined #mlpack
< manish7294>
zoq: rcurtin: Currently I am able to successfully execute LMNN benchmarks on my local system by copying mlpack_lmnn to libraries/bin/mlpack_lmnn. Can you please guide me on how I can run the same on the benchmarks system using my lmnn branch, as lmnn is not yet merged?
< rcurtin>
manish7294: hey there... I slept in a little this morning
< rcurtin>
so, there's a script in libraries/ called 'mlpack_install.sh'
< manish7294>
rcurtin: :)
< rcurtin>
basically, all that does is build mlpack in the libraries/mlpack/ directory
< sumedhghaisas>
Atharva: Ahh sorry missed your msg
< rcurtin>
so, you could, after setting up all the other libraries, manually build your branch of mlpack in libraries/mlpack/
< Atharva>
sumedhghaisas: 1 hour 40 minutes from now
< Atharva>
9:45 IST
< manish7294>
So all I have to do is the same: copy the lmnn branch's bin folder to libraries/bin?
< rcurtin>
it might also be a good idea to, in your local benchmarks repo, comment out the mlpack_install.sh call from the install_all.sh script, in case you accidentally type 'make setup'
< rcurtin>
manish7294: no, I don't think that will work, because that may depend on parts of libmlpack.so that aren't there
< rcurtin>
so I think it's better to build the whole library in libraries/mlpack/ with the CMake configuration above, then 'make install' to put it correctly in libraries/bin and libraries/lib
< manish7294>
Okay, I will take care of that
< manish7294>
And should I do this on slake itself?
< rcurtin>
yeah, I think it's fine to do it on slake
< rcurtin>
if you do all the 'make run' runs there, they'll be comparable to each other
< manish7294>
I will be taking covtype as the limiting (in terms of maximum size) dataset
< rcurtin>
right, that sounds good
< manish7294>
rcurtin: And should I create a pull request on benchmarks repo or should I wait till lmnn merge?
< rcurtin>
if it is taking like 13 hours to run, you might want to start with a 5k or 50k subset
< rcurtin>
you can open the PR now if you like, but we should wait to merge it until LMNN is merged
< rcurtin>
(which should be fairly soon I think)
< manish7294>
rcurtin: Are you randomly selecting 5k points of covertype to make the 5k covertype,
< manish7294>
or are they the first 5k points?
< rcurtin>
I took them randomly
< manish7294>
Can you share it with me, if it is an independent file? Otherwise I will make a more substantial one for myself, as I was earlier using a subset of covertype-small, which was missing some classes.
< sumedhghaisas>
Atharva: Sure. I think I will be free.
< manish7294>
rcurtin: Hoping I am not ruining your vacation (it's a once-in-a-while chance) :)
< rcurtin>
no, it's no problem at all, it is a vacation from Symantec, not from everything :)
< rcurtin>
I am just staying at home this week anyway
< rcurtin>
soon I will go see if my new brakes work well (hopefully they do so I will come back)
< rcurtin>
which I guess means it is important that I get you the datasets now, it could be the last chance :)
< rcurtin>
ok, I'll be back later. in the worst case I might have to use the emergency brake but I think everything will be fine :)
< manish7294>
Thanks!
sulan_ has joined #mlpack
< sumedhghaisas>
Atharva: Hi Atharva, I have to sync up with someone at work at 17:00 BST. Could we discuss at 18:00 BST?
< rcurtin>
perfect, brakes work great :)
< manish7294>
rcurtin: That was a quick test drive. So, the emergency brake didn't get a chance, haha :)
manish72942 has joined #mlpack
manish7294 has quit [Ping timeout: 260 seconds]
< Atharva>
sumedhghaisas: Yes sure!
< Atharva>
sumedhghaisas: You there?
manish72942 has quit [Ping timeout: 255 seconds]
vivekp has quit [Ping timeout: 255 seconds]
< sumedhghaisas>
Atharva: So sorry. The meeting ran long.
< sumedhghaisas>
I am here now :)
< Atharva>
sumedhghaisas: Meetings always do :)
< sumedhghaisas>
Atharva: So true :)
< sumedhghaisas>
so what's up?
< Atharva>
I had some questions about the normaldist class
< Atharva>
Do we need the Train functions which the GaussianDistribution class has?
< sumedhghaisas>
Ohh... ummm not really. Let me think.
< sumedhghaisas>
In any case we can implement it later
< Atharva>
Yeah, so not now.
< sumedhghaisas>
Sure.
< Atharva>
Another question is whether we should allow vectors and cubes or just matrices.
< Atharva>
I think we should think about RNN support as well, so cubes should be allowed I guess
< sumedhghaisas>
ohh cubes are necessary as the output might be conv
< Atharva>
Yup
< sumedhghaisas>
so I would say vectors, matrices and cubes
< Atharva>
So, I will define multiple constructors
< sumedhghaisas>
or maybe just templatize it?
< Atharva>
Yes, but still, we need to take a different number of parameters for each data type
< Atharva>
in the constructor
< Atharva>
for the size
< sumedhghaisas>
I am not sure I understand that
< sumedhghaisas>
why exactly do you need the size?
< sumedhghaisas>
also you could infer size from the matrix itself
< Atharva>
Hmm, in the constructor NormalDistribution(), when we are making a new set of distributions, if it's a matrix, then we need something like NormalDistribution(n_rows, n_cols)
< Atharva>
Because GaussianDistribution only supports vectors and is multivariate, so it takes GaussianDistribution(dimension)
< sumedhghaisas>
What about NormalDistribution(arma::mat mean, arma::mat variance)?
< Atharva>
This creates standard normals
< sumedhghaisas>
ahh I mean templatize it properly
< Atharva>
That will be there, this one is for standard normal of a given size
< Atharva>
We don't need it, but it's good to have that I think
< sumedhghaisas>
hmm... I am contemplating whether we should define these distributions inside the ANN framework or not
< sumedhghaisas>
although I would say let's not have the size constructor
< Atharva>
Okayy
< sumedhghaisas>
if the user wants, he can create a dist by generating a constant matrix
< sumedhghaisas>
of required size
< Atharva>
Yes that's easy
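(As a sketch of where the discussion above seems to be heading, and not the final API: a templatized, element-wise normal distribution whose mean and variance have the same shape as the data, so the size is inferred and no size constructor is needed. The member and function names here are assumptions.)

#include <armadillo>
#include <iostream>

// Sketch of a templatized element-wise normal distribution: mean and variance
// have the same shape as the data (vector, matrix, or cube).
template<typename DataType = arma::mat>
class NormalDistribution
{
 public:
  NormalDistribution(const DataType& mean, const DataType& variance) :
      mean(mean), variance(variance) { }

  // Draw one sample per element.
  DataType Sample() const
  {
    return mean + arma::sqrt(variance) %
        arma::randn<DataType>(arma::size(mean));
  }

  // Log-probability of an observation, summed over all elements.
  double LogProbability(const DataType& observation) const
  {
    const DataType diff = observation - mean;
    return arma::accu(-0.5 * arma::log(2.0 * arma::datum::pi * variance)
        - 0.5 * (diff % diff) / variance);
  }

 private:
  DataType mean;
  DataType variance;
};

int main()
{
  arma::mat mean(3, 2, arma::fill::zeros);
  arma::mat variance(3, 2, arma::fill::ones);
  NormalDistribution<arma::mat> dist(mean, variance);

  arma::mat sample = dist.Sample();
  std::cout << "log prob: " << dist.LogProbability(sample) << std::endl;
  return 0;
}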
< Atharva>
So, as Ryan said, we shouldn't make layers output distributions if layers after them expect matrices
< Atharva>
So, this is just for the final layer, right?
< sumedhghaisas>
Ahh no, so as Ryan suggested, what we do instead is accept the distribution in the layer where it's being used rather than have the layer output it
< sumedhghaisas>
So the VAE would output a matrix, and the loss layer will define a dist over it and use it
< Atharva>
Got it, so the network isn't affected, everything happens in one layer
< sumedhghaisas>
everything happened in one layer? sorry, didn't get that
< Atharva>
Sorry, I meant that the dist objects are just used in the layer which needs them as input; the rest of the network operates normally on matrices
< sumedhghaisas>
ahh yes. :)
< Atharva>
You said that the logprob will give us the reconstruction loss, but what if the user wants to use some other loss for reconstruction?
< sumedhghaisas>
Usually in VAEs, reconstruction loss is defined over some distribution only
< sumedhghaisas>
in any case, if the user wants some other loss, he could replace the ReconstructionLoss layer and use his own
< Atharva>
Okay, so after the distribution, the next task is to implement a ReconstructionLoss layer, right?
< sumedhghaisas>
yes
< sumedhghaisas>
ReconstructionLoss will take a distribution to define the loss
< Atharva>
So, you are saying that in VAEs we shouldn't sample from the output distribution, say an image, and then use some simple loss such as mean squared error between the output image and the training image?
< Atharva>
Will the loss always be taken with the output distribution?
travis-ci has joined #mlpack
< travis-ci>
mlpack/mlpack#5079 (master - e08e761 : Marcus Edel): The build has errored.
< Atharva>
Also, I think we should discuss implementation details of the ReconstructionLoss layer now, because the dist won't take long now. I will try to complete the layer and test it before the week ends
< sumedhghaisas>
Atharva: got distracted with something
< sumedhghaisas>
Sure.
< sumedhghaisas>
I still need to merge the Repar layer
< Atharva>
Can you please read my message before the last one?
< sumedhghaisas>
Sorry for being pedantic, but could you rebase the PR on master rather than merging? Merging just creates a complicated history
< sumedhghaisas>
Mean squared error loss is basically a log_prob loss with a normal distribution
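(For reference, the identity behind this remark is standard Gaussian algebra, not something specific to the chat: for a normal distribution the negative log-probability of a point is

-\log \mathcal{N}(x \mid \mu, \sigma^2) = \frac{(x - \mu)^2}{2\sigma^2} + \frac{1}{2}\log\left(2\pi\sigma^2\right),

so with \sigma fixed, say \sigma = 1, minimising the negative log-prob over \mu is the same as minimising the squared error (x - \mu)^2 up to an additive constant.)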
< Atharva>
Okay, if that makes things easier
< sumedhghaisas>
and that's what we will provide as the default, although with binary MNIST we will have to use the bernoulli dist log_prob, thus distributions will make this easier
< sumedhghaisas>
Now for sampling, what the user can do is define a distribution over the output of the FFN.Predict function
< sumedhghaisas>
and sample from it
< Atharva>
Yeah
< Atharva>
The dists will help, we just need to make some adjustments in FFN class
< sumedhghaisas>
Not really; now that we no longer output a dist, the current framework will work, we just need to implement ReconstructionLoss, that's it
< Atharva>
Okay, but we do need to change the Predict function for dists, right?
< sumedhghaisas>
umm... No. Predict will output a matrix.
< sumedhghaisas>
While sampling, we will define a Dist over the Predict output
< Atharva>
I think we will need to make one detailed tutorial because we will be doing a lot of things externally and not in some class
< sumedhghaisas>
A tutorial and a simple MNIST model in models :)
< Atharva>
Yes, after that I hope we get some time for RNNs :)
< sumedhghaisas>
Although this style of sampling is common across other frameworks as well
< Atharva>
According to the planned timeline, the next week was for testing the VAE class, but we don't have that now, so by then everything else should be ready so that we can play with some VAE models after that
< sumedhghaisas>
For RNNs we will need to find a dataset as well
< Atharva>
Yeah, what do you say about a music dataset?
< sumedhghaisas>
for generation?
< Atharva>
Yeah, in RNNs
< sumedhghaisas>
That's a very hard task for VAEs
< sumedhghaisas>
to be honest
< Atharva>
Oh, okay, maybe something else then, we will decide later
< sumedhghaisas>
yeah, maybe we could play around with the Reber grammar that exists currently
< sumedhghaisas>
Let's see if we can generate a correct grammar with VAEs
< sumedhghaisas>
that's a very interesting experiment...
< Atharva>
That's interesting, working with models is going to be so much fun!
< sumedhghaisas>
Although it shouldn't be difficult
< Atharva>
We also have to reproduce results from the papers
< sumedhghaisas>
That would be the first task as soon as we get MNIST working
< sumedhghaisas>
does the paper mention MNIST or Binary MNIST?
< Atharva>
Yeah, about the ReconstructionLoss layer, do you have something specific that I should keep in mind while implementing it?
< Atharva>
I will check
< sumedhghaisas>
Not really. Have you understood the role that it plays?
< Atharva>
Yes I have
< Atharva>
Its Forward function will return a double just like the other loss layers that we have
< Atharva>
It will take in a matrix and then use the dist to define a distribution over it
< Atharva>
The dist object will then have logprob and logprob backward functions, which the layer will use for its Forward and Backward functions
< sumedhghaisas>
Yup, yup and yup
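(A minimal sketch of a layer with that shape, assuming a fixed unit-variance normal distribution for concreteness; the real layer would delegate to the dist object's log-prob functions, and none of the names below are the final API.)

#include <armadillo>
#include <cmath>
#include <iostream>

// Sketch of a ReconstructionLoss-style layer with a unit-variance normal
// distribution baked in: Forward returns the negative log-probability of the
// target under N(prediction, I); Backward returns its gradient with respect
// to the prediction (the distribution mean).
class ReconstructionLossSketch
{
 public:
  double Forward(const arma::mat& prediction, const arma::mat& target) const
  {
    const arma::mat diff = target - prediction;
    // -log N(target | prediction, I), summed over all elements.
    return arma::accu(0.5 * (diff % diff)
        + 0.5 * std::log(2.0 * arma::datum::pi));
  }

  // d(-log p)/d(prediction) = prediction - target for unit variance.
  void Backward(const arma::mat& prediction, const arma::mat& target,
                arma::mat& gradient) const
  {
    gradient = prediction - target;
  }
};

int main()
{
  arma::mat prediction(4, 2, arma::fill::randu);
  arma::mat target(4, 2, arma::fill::randu);

  ReconstructionLossSketch loss;
  arma::mat gradient;
  std::cout << "loss: " << loss.Forward(prediction, target) << std::endl;
  loss.Backward(prediction, target, gradient);
  gradient.print("gradient:");
  return 0;
}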
< Atharva>
We also need support for Bernoulli dist
< sumedhghaisas>
that's for later :)
< Atharva>
Okay
< Atharva>
I will get on this then
< sumedhghaisas>
Let's go with pure MNIST now
< Atharva>
Yeah
ImQ009 has quit [Quit: Leaving]
< ShikharJ>
zoq: Are you there?
< zoq>
ShikharJ: yes
sulan_ has quit [Quit: Leaving]
< ShikharJ>
zoq: I had a theoretical doubt. Let's say we have 4 3x3 input points, so the shape becomes 3x3x4; let that convolve with a 3x3x1 kernel, so that the output now is 1x1x4, for the 4 inputs.
< ShikharJ>
zoq: Now when I'm computing the gradients for the above operation in the Gradient method, should I compute 3x3x4 gradients pertaining to the 4 inputs, or does something else have to be done?
< ShikharJ>
zoq: Because our kernel size is just 3x3x1?
< zoq>
The gradient has to be calculated for each input separately and you take the sum over the gradients at the end, so theoretically you could just write a for loop around everything and take the sum at the end; but since this is slow we should see if we can vectorize the operation.
< ShikharJ>
zoq: Ah, so I calculate the 3x3x4 sized gradients (for 4 inputs) and then reduce them to 3x3x1 by summing, is that right?
< zoq>
correct
< ShikharJ>
zoq: Summing along slices?
< zoq>
yeah, if we use the cube representation
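(A small Armadillo sketch of that reduction, with made-up sizes: one 3x3 gradient slice per input sample, summed into a single 3x3 kernel gradient.)

#include <armadillo>

int main()
{
  // One 3x3 gradient slice per input sample in the batch.
  const arma::uword batchSize = 4;
  arma::cube perSampleGradient(3, 3, batchSize, arma::fill::randu);

  // Reduce to a single 3x3 kernel gradient by summing over the slices.
  arma::mat kernelGradient(3, 3, arma::fill::zeros);
  for (arma::uword s = 0; s < perSampleGradient.n_slices; ++s)
    kernelGradient += perSampleGradient.slice(s);

  kernelGradient.print("summed kernel gradient:");
  return 0;
}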
< ShikharJ>
zoq: Ah, that cleared up a lot about how convolutions actually work for me. Thanks for the help!
< zoq>
ShikharJ: Here to help.
< zoq>
A simple test could check if the output is the same for two separate runs (two inputs) and a single run with the two inputs combined.
< ShikharJ>
zoq: Ah, yes, we can try that out as well.
< ShikharJ>
zoq: The implementation for Batch Support on Convolutional Layers is nearly complete, we can test after I push the code.
< zoq>
ShikharJ: awesome
witness_ has quit [Quit: Connection closed for inactivity]
< ShikharJ>
zoq: I think I also found a bug in the Gradient method of convolution_impl.cpp, though I'll need you to review it; pushing the code for now.
< ShikharJ>
zoq: I also posted the results for DCGAN MNIST test on the full dataset on the PR!
< zoq>
ShikharJ: Are you going to open another PR or do we use the DCGAN PR?
< ShikharJ>
zoq: For the bug, I'll push to the BatchSupport PR. DCGAN should be good to go for MNIST, but I still need to get it running for CelebA, which I think should benefit from the BatchSupport PR improvements.
< zoq>
ShikharJ: Okay, sounds fine to me.
< ShikharJ>
zoq: Pushed in the changes.
< zoq>
ShikharJ: Nice, does this one incorporate the fix for the gradient function? Not sure I see the issue.