ChanServ changed the topic of #mlpack to: "mlpack: a fast, flexible machine learning library :: We don't always respond instantly, but we will respond; please be patient :: Logs at http://www.mlpack.org/irc/"
vidhan has joined #mlpack
Poulami101 has joined #mlpack
Poulami101 has quit [Client Quit]
vidhan has quit [Ping timeout: 256 seconds]
KimSangYeon-DGU has quit [Quit: Page closed]
MystikNinja has joined #mlpack
< MystikNinja>
Hey all, I'm applying to mlpack for GSoC 2019. I'm interested in implementing a deep learning module along with associated tests and docs. I'd like some advice on preparing my application beyond what is mentioned in the Ideas page.
< MystikNinja>
1. What are you looking for in a successful application?
< MystikNinja>
2. What level of knowledge of the literature and background material do you expect? I'm sure everyone will be willing to learn, but I don't know if I will have time to learn enough if the gap is too large.
< MystikNinja>
3. You ask us to "provide some comments/ideas/tradeoffs/considerations about your decision process". Could you elaborate on what kind of things you expect us to reason about? Really, the only criterion I can think of for selecting a model to implement is whether I feel I can implement it in time. Are there particular models you all would prefer implemented over others?
< MystikNinja>
4. What do you expect someone to know (or learn) if they want to work on this idea, i.e., implementing a deep learning module?
MystikNinja has quit [Quit: Page closed]
tnsahr2580 has joined #mlpack
tnsahr2580 is now known as Soonmok
< Soonmok>
Hi! I'm trying to implement a GAN application on MNIST data using the mlpack library.
< Soonmok>
but I don't understand what the noise function in the GAN class is.
< Soonmok>
what should I pass in as the noise function?
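For reference, mlpack's GAN takes the noise function as a plain callable that returns one random scalar; the GAN fills its noise input by calling it once per element. A minimal sketch, assuming the std::function<double()> interface used in mlpack's own GAN tests of this era:

    #include <mlpack/core.hpp>
    #include <functional>

    // The GAN calls this once per element of the noise vector, so any callable
    // returning a random scalar works; standard normal noise is a common choice.
    std::function<double()> noiseFunction = [] ()
    {
      return mlpack::math::RandNormal();
    };

    // `noiseFunction` (with std::function<double()> as the Noise template
    // argument) is then passed to the GAN constructor alongside the generator,
    // discriminator, and weight initializer.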
KumarRIshabh has joined #mlpack
< KumarRIshabh>
Hello everyone, I am Rishabh Kumar, a student majoring in math from India. I am quite interested in the 'QGMM' project and have previously worked with GMMs. I have not worked with mlpack prior to this. Are there any issues or previous builds of QGMM or GMM available in mlpack?
Suryo has joined #mlpack
< Suryo>
hello zoq!
< Suryo>
I've submitted a pull request to include the PSO code. Sorry for the delay - I've had some pretty tough coursework this semester.
< Suryo>
However, I've not included any parallelization so far. I'll do that once the current request has been approved.
< Suryo>
Thank you.
Suryo has quit [Quit: Page closed]
< ayesdie>
rcurtin: I was about to get started on making bindings for `LinearSVM` and then benchmarking it. Should I wait for the PR to be approved?
KumarRIshabh has quit [Quit: Page closed]
KumarRishabh has joined #mlpack
< rcurtin>
ayesdie: nah, no need to wait---but let's open the bindings and benchmarks in a different PR if that's ok
< ayesdie>
alright, I'll get into it and make a PR when it's ready for an initial review.
< rcurtin>
sounds good, thanks. I hope to be able to review the linear SVM PR in full in the next few days
sk1499 has joined #mlpack
sk1499 has quit [Client Quit]
KumarRishabh has quit [Ping timeout: 256 seconds]
kinshuk has joined #mlpack
KimSangYeon-DGU has joined #mlpack
kinshuk has left #mlpack []
KimSangYeon-DGU has quit [Quit: Page closed]
Bellalau_ has joined #mlpack
cjlcarvalho has joined #mlpack
Bellalau_ has quit [Ping timeout: 256 seconds]
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
Suryo has joined #mlpack
< Suryo>
zoq, rcurtin: I have a question. I submitted a pull request for a PSO module and the Travis CI check failed. However, it failed on spsa_test, which is something I did not touch.
< Suryo>
Two tests that I wrote for PSO are passing
< zoq>
Suryo: I'll have to adapt the threshold for the SPSA test, will do that later today.
riaash04 has joined #mlpack
gopal has joined #mlpack
< gopal>
help
< riaash04>
Hi, I am working on implementing a manifold learning algorithm, Isomap. For this I forked the mlpack repository, cloned it locally, and built it with the CMake commands, and it succeeded. But now, after adding new files, I tried to build it again and it can't find some Boost libraries (program options, unit test framework, serialization).
< riaash04>
So I ran apt-get again for all the dependencies, but it still can't find them.
< riaash04>
Please help. I just want to test the code I am implementing.
< riaash04>
Even the CMake configuration is not completing.
gopal has quit [Ping timeout: 256 seconds]
cjlcarvalho has quit [Ping timeout: 268 seconds]
sonu628 has joined #mlpack
yanyan has joined #mlpack
sonu628 has quit [Ping timeout: 256 seconds]
sonu628 has joined #mlpack
yanyan has quit [Quit: Page closed]
yanyan has joined #mlpack
sonu628 has quit [Ping timeout: 256 seconds]
< riaash04>
So after deleting the build folder, the CMake configuration is working again.
yanyan_ has joined #mlpack
yanyan_ has left #mlpack []
yanyan has quit [Ping timeout: 256 seconds]
< riaash04>
I am doing the following to set up the development environment (I'm very new to open source development): 1) forked and cloned the mlpack repository; 2) used cmake ../ to configure and then built mlpack. Now, after adding a file to the methods directory, do I need to rebuild all of mlpack every time to check the code? I know a part of mlpack can be built separately, but how do I build the new files I am adding separately?
Suryo has joined #mlpack
< Suryo>
Zoq: thanks!! I also saw that adeel has submitted a pull request with some of his earlier code refactored. I think that's for global-best PSO. Mine is local-best, so I guess it'll be good to get both implementations in together.
< Suryo>
What do you think?
Suryo has quit [Client Quit]
riaash04 has quit [Quit: Page closed]
yanyan has joined #mlpack
yanyan has quit [Ping timeout: 256 seconds]
blank has joined #mlpack
blank has quit [Client Quit]
aman_p has joined #mlpack
sumedhghaisas has quit [Ping timeout: 256 seconds]
pd09041999 has joined #mlpack
< ShikharJ>
rcurtin: I think quite a few PRs got closed recently. Is that automation for closing PRs still on?
< rcurtin>
yeah, it is, they get closed if the 'keep-open' label is not set
< rcurtin>
you could see the ones that got closed with a search like is:pr is:closed label:"s: stale" or similar
< rcurtin>
and if it closed some that should have stayed open, feel free to reopen and mark as 'keep open' :)
< ShikharJ>
Cool, thanks :)
< rcurtin>
sure
< rcurtin>
let me know when you're happy with the test wiki page (or if you wanted me to add something to it? I can't remember) and I can update mlpack-bot's text
riaash04 has joined #mlpack
ironmaniiith has joined #mlpack
KRONOS has joined #mlpack
yanyan_ has joined #mlpack
soham has joined #mlpack
ironmaniiith has quit [Quit: Page closed]
ironmaniiith has joined #mlpack
aman_p has quit [Ping timeout: 246 seconds]
< ShikharJ>
I wanted to add some stuff, but probably after my exams (which get over in a couple of days).
KRONOS has quit [Ping timeout: 256 seconds]
< rcurtin>
sure, no hurry
soham has quit [Ping timeout: 256 seconds]
aman_p has joined #mlpack
aditya has joined #mlpack
aditya has quit [Client Quit]
yogesh01 has joined #mlpack
< riaash04>
Hi, how can I make CMake build just the folder that I specify (like a new folder that I add to the methods folder)? Also, can anyone help me understand how I can run specific code that I write, in order to debug it? I have added a folder with some hpp and cpp files to the methods folder. How can I just run the contents of that folder? I'm very new to open source development. Thanks for the help.
< riaash04>
I have cloned and built the mlpack repository on Ubuntu
< rcurtin>
Hi riaash04, have you done any reading about how CMake works? If you've added a new folder, you would need to add it to the relevant CMakeLists.txt files
< rcurtin>
I'd suggest you take some time and learn a little about CMake, then read through how we have the project configured in order to understand how to do what you want to do
< riaash04>
Yes, I added the folder to CMakeLists.txt and it did get built. I still have to get more familiar with CMake though, I will do that. Thanks.
< rcurtin>
sounds like you have gotten it worked out then; that's good to hear. if I can clarify things about mlpack's specific CMake configuration do let me know
rob has joined #mlpack
rob is now known as Guest79291
< Guest79291>
How do I compile with nvblas? I have already installed CUDA and everything, but linking with -lnvblas, even with -DARMA_DONT_USE_WRAPPER, doesn't make it any faster...
< Guest79291>
and arma::config tells me that it's not using BLAS when I check
< rcurtin>
Guest79291: nvblas isn't guaranteed to give speedup for every operation... it depends on the workload and the algorithm and the data, etc.
< rcurtin>
maybe you are not doing any operations with mlpack that make use of blas functionality?
< Guest79291>
I'm doing a ton of subvec()s and resizes on a matrix I loaded in
< rcurtin>
not sure that nvblas would help with that
< Guest79291>
Wouldn't it print that it was using blas? I'm doing if(cfg.blas){...}
< rcurtin>
I'm not familiar with arma::config unfortunately
< Guest79291>
alright, well, thank you
< rcurtin>
yeah, sorry that I can't be more helpful...
< rcurtin>
but anyway the way nvblas works is that it looks at the size of the matrix and the operation
< rcurtin>
and estimates whether it would be faster to transfer the matrix to the GPU, perform the operation, and transfer back
< Guest79291>
oh, I see
< rcurtin>
but it sounds like what you're doing is something like matrix copy/extract operations? in which case that wouldn't give any speedup, I don't think
< rcurtin>
but if instead you are doing something more like A*B.t() (or those types of operations), for very large matrices (and if you have a good GPU), nvblas will move the computation to the GPU and it will give some speedup
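For reference, a sketch of the kind of BLAS-heavy operation being described here (the sizes are made up for illustration): a dense A * B.t() lowers to a single dgemm call, which is exactly what nvblas can intercept and, for large enough matrices, offload to the GPU.

    #include <armadillo>

    int main()
    {
      // Large dense matrices; small ones aren't worth the host-to-GPU
      // transfer cost, so nvblas would leave them on the CPU.
      arma::mat A(4000, 4000, arma::fill::randu);
      arma::mat B(4000, 4000, arma::fill::randu);

      // This expression maps to one dgemm call. With Armadillo built with
      // -DARMA_DONT_USE_WRAPPER and the binary linked against nvblas ahead of
      // the regular BLAS, nvblas may decide to run it on the GPU.
      arma::mat C = A * B.t();

      return (C(0, 0) > 0.0) ? 0 : 1;
    }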
< rcurtin>
neither of these things is as good as having the matrix always stored on the GPU
< rcurtin>
the bandicoot project will probably be what you are looking for when it is ready, then---it's basically Armadillo with matrices stored on the GPU
< Guest79291>
Ah. You're exactly right
< rcurtin>
however there is still a lot of implementation work to be done there
< rcurtin>
so it is not quite ready yet unfortunately :(
< Guest79291>
Well, good luck :)
< Guest79291>
I realize now what you're saying; I hadn't actually used the data in mlpack in any way.
< Guest79291>
I'm assuming once I feed it into my NN it will speed up
< rcurtin>
yeah, nvblas may give acceleration for the NN, but I'm not totally sure... I haven't tried it myself
< rcurtin>
I'd imagine larger batch sizes would be helpful with that
< rcurtin>
like I said before, it also depends on the GPU... nvblas does its own estimation of what will be faster and what won't, so it takes the GPU model into account (I assume)
sohamt09 has joined #mlpack
< sohamt09>
hi!!
pavan has joined #mlpack
pavan has quit [Client Quit]
< zoq>
sohamt09: Hello there!
< zoq>
Suryo: definitely
< sohamt09>
I have a few questions to ask
yogesh01 has quit [Ping timeout: 256 seconds]
Guest79291 has quit [Quit: Page closed]
KimSangYeon-DGU has joined #mlpack
riaash04 has quit [Quit: Page closed]
ironmaniiith has quit [Ping timeout: 256 seconds]
deepak_ has joined #mlpack
< deepak_>
HELP
deepak_ has quit [Quit: Page closed]
vivekp has quit [Read error: Connection reset by peer]
robb6 has joined #mlpack
vivekp has joined #mlpack
< robb6>
are there any plans to add methods for genetic algorithms? I know one was done in a fork a while ago
< zoq>
robb6: If you count NEAT, CNE, DE as well, yes.
< rcurtin>
(we took all the optimizer code out of src/mlpack/core/optimizers/ and put it into its own separate library, ensmallen, because we thought it would be more widely usable outside of mlpack)
< robb6>
Ah
< robb6>
thank you
sohamt09 has quit []
< robb6>
is the optimizer tutorial page still up to date?
< robb6>
Great. And all I have to do is specify the optimizer during training?
< zoq>
Yes and no, not every optimizer will work with every method. But ensmallen will warn you if you do something that shouldn't work in the first place.
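For reference, a rough sketch of what specifying the optimizer during training looks like, assuming the mlpack 3.x ANN API; the layer sizes, hyperparameters, and data names are placeholders.

    #include <mlpack/core.hpp>
    #include <mlpack/methods/ann/ffn.hpp>
    #include <mlpack/methods/ann/layer/layer.hpp>
    #include <mlpack/methods/ann/loss_functions/mean_squared_error.hpp>
    #include <ensmallen.hpp>

    using namespace mlpack::ann;

    void TrainWithAdam(const arma::mat& trainX, const arma::mat& trainY)
    {
      FFN<MeanSquaredError<>> model;
      model.Add<Linear<>>(trainX.n_rows, 64);
      model.Add<ReLULayer<>>();
      model.Add<Linear<>>(64, trainY.n_rows);

      // Any ensmallen optimizer that handles differentiable separable
      // functions can be passed to Train(); Adam is just one example.
      ens::Adam opt(0.001 /* step size */, 32 /* batch size */);
      model.Train(trainX, trainY, opt);
    }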
< robb6>
What optimizers would be able to work on an ANN? For example, optimizing the number of hidden layers/neurons?
< robb6>
:( I'm getting an out of bounds error for Mat::operator() when I train my model
< robb6>
It's just a normal FFNN with three layers
< rcurtin>
but I don't know details of the reporter's system
< rcurtin>
I know we've seen speedup with nvblas in the past though; I don't remember if it was for the NN code (actually I think the NN code didn't exist at that time?)
< robb8>
I guess I wrongly assumed that it would switch over
< robb8>
even at 100% CPU, maybe it's not worth it
< rcurtin>
yeah, I'm not sure of the algorithms they use internally
< rcurtin>
I feel like something like PCA, where it's an eigendecomposition or something, might ship the work off to the GPU because it's more computationally intensive work
< rcurtin>
assuming the data is large enough
< robb8>
got it
< rcurtin>
I wish I could say "try bandicoot!"... give us a few months (or several?) and then maybe we can :)
< robb8>
:)
< rcurtin>
it's close to next in my priority list once I finish the website and handle a few other mlpack-related things
< robb8>
also, is there a way to use one of the genetic algorithms in ensmallen (like CNE) to optimize the number of neurons? or would I have to implement that myself? what exactly does it change when generating a new random network?
< robb8>
does it just use CNE on the weights/biases?
< rcurtin>
hmmm, so that would be tricky but possible
< rcurtin>
imagine implementing a non-differentiable function, so it has Evaluate()
< rcurtin>
and Evaluate() takes in an arma::mat of parameters
< rcurtin>
maybe each element in that arma::mat represents the number of neurons at each layer
< rcurtin>
and then inside of Evaluate(), you convert each element of the arma::mat to a size_t, then build the network accordingly, train, and test
< rcurtin>
and return the MSE (or whatever measure)
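To make the recipe above concrete, here is a rough sketch, assuming the mlpack 3.x ANN API and ensmallen's arbitrary-function interface (only Evaluate() is needed for a gradient-free optimizer like CNE); the class name and data names are hypothetical.

    #include <mlpack/core.hpp>
    #include <mlpack/methods/ann/ffn.hpp>
    #include <mlpack/methods/ann/layer/layer.hpp>
    #include <mlpack/methods/ann/loss_functions/mean_squared_error.hpp>
    #include <algorithm>
    #include <cmath>

    using namespace mlpack::ann;

    class LayerSizeObjective
    {
     public:
      LayerSizeObjective(const arma::mat& trainX, const arma::mat& trainY,
                         const arma::mat& testX, const arma::mat& testY) :
          trainX(trainX), trainY(trainY), testX(testX), testY(testY) { }

      // Each element of `x` encodes the number of neurons in one hidden layer.
      double Evaluate(const arma::mat& x)
      {
        FFN<MeanSquaredError<>> model;
        size_t inSize = trainX.n_rows;
        for (size_t i = 0; i < x.n_elem; ++i)
        {
          // Clamp to at least one neuron before converting to a layer size.
          const size_t neurons = (size_t) std::max(1.0, std::round(x[i]));
          model.Add<Linear<>>(inSize, neurons);
          model.Add<ReLULayer<>>();
          inSize = neurons;
        }
        model.Add<Linear<>>(inSize, trainY.n_rows);

        // Train on the training split, then score on the held-out split and
        // return the MSE, which the optimizer will minimize.
        model.Train(trainX, trainY);
        arma::mat preds;
        model.Predict(testX, preds);
        return arma::accu(arma::square(preds - testY)) / testY.n_elem;
      }

     private:
      const arma::mat& trainX;
      const arma::mat& trainY;
      const arma::mat& testX;
      const arma::mat& testY;
    };

    // Usage sketch: run a gradient-free ensmallen optimizer on an arma::mat of
    // initial layer sizes, e.g.
    //   arma::mat sizes("64; 32");  // start with two hidden layers
    //   LayerSizeObjective f(trainX, trainY, testX, testY);
    //   ens::CNE cne;
    //   cne.Optimize(f, sizes);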
< robb8>
got it, thanks! :)
< rcurtin>
it's also possible you could come up with some way to approximate a Gradient() function using finite-differences or something like this
< rcurtin>
but... with neural network hyperparameters, there's no guarantee the loss function would even be smooth
< rcurtin>
so using a gradient-based optimizer may not work very well
< robb8>
hmm
< robb8>
I guess it'll probably be easier to tune it myself
< robb8>
I have 256 input neurons, 1 output neuron, doing time series
< robb8>
and like 2 hidden layers
< rcurtin>
it may be easier to just try a handful of possibilities and see which is best
< rcurtin>
but there are lots of possibilities :)
< robb8>
;)
kinshuk has quit [Remote host closed the connection]
< robb8>
i guess I have to figure out a working range
kinshuk has joined #mlpack
< rcurtin>
I've found with RNNs that adding extra layers of memory (like two layers of LSTMs or something like this) can be really helpful
< rcurtin>
of course it makes it take way longer to train...
< robb8>
I tried using an RNN (I'm using an FFNN right now), but it took me a while to get the data into cubes
< robb8>
also, I had no idea what I was doing in terms of activation functions, so I applied Sigmoid to it, even though I wasn't classifying anything
< robb8>
so I just added way more inputs to an FFNN
< rcurtin>
ah, ok; same thing though with FFNNs, if you add more layers of depth it takes longer to train but can probably fit better
< robb8>
do you think an RNN would be better for time series?
< rcurtin>
depends on the task, but generally I'd use RNNs for time series if I could
< rcurtin>
but again RNNs can take way longer
< rcurtin>
some people have shown really nice results with convolutional FFNNs, and that they can approximate the results of RNNs (and train much more quickly)
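For reference, a rough sketch of the stacked-LSTM setup described above, assuming the mlpack 3.x RNN API; rho, the layer sizes, and the data names are placeholders, and the output layer is a plain Linear layer (not a sigmoid) since this is regression rather than classification.

    #include <mlpack/core.hpp>
    #include <mlpack/methods/ann/rnn.hpp>
    #include <mlpack/methods/ann/layer/layer.hpp>
    #include <mlpack/methods/ann/loss_functions/mean_squared_error.hpp>

    using namespace mlpack::ann;

    void TrainTimeSeriesRNN(const arma::cube& trainX, const arma::cube& trainY)
    {
      const size_t rho = 10;  // number of time steps to backpropagate through

      RNN<MeanSquaredError<>> model(rho);
      model.Add<LSTM<>>(trainX.n_rows, 32, rho);  // first layer of memory
      model.Add<LSTM<>>(32, 16, rho);             // second LSTM layer (slower to train)
      model.Add<Linear<>>(16, trainY.n_rows);     // linear output for regression

      model.Train(trainX, trainY);
    }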
kinshuk has quit [Ping timeout: 250 seconds]
< rcurtin>
anyway, my rule of thumb for deep learning is "just try a lot of things" because it's really hard to predict how things will perform :)
ac-optimus has quit [Ping timeout: 256 seconds]
< rcurtin>
other people may have different rules of thumb... I am not the world's best data scientist anyway :)
< robb8>
haha :) thank you
< robb8>
also, is there a way to print the MSE or do I calculate that myself?
< rcurtin>
you'd have to compute it yourself for now, but one of the other things on my list is callbacks for ensmallen that print this automatically during optimization
< robb8>
I'm assuming I do that myself because predict() only takes the input data
< robb8>
gotcha
< robb8>
thanks
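For reference, computing the MSE by hand from Predict() output is only a couple of lines; the model and data names here are placeholders.

    #include <mlpack/core.hpp>

    // `model` is a trained FFN (or similar), `testX`/`testY` the held-out data.
    template<typename ModelType>
    double ComputeMSE(ModelType& model, const arma::mat& testX,
                      const arma::mat& testY)
    {
      arma::mat preds;
      model.Predict(testX, preds);
      return arma::accu(arma::square(preds - testY)) / testY.n_elem;
    }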
< rcurtin>
ah sorry, you are just building the network, not using ensmallen directly; in that case, there is a PR to the git master branch that makes Train() return the last value of the loss function
< rcurtin>
that was merged recently, so if you are using the git master branch you should be able to do that
< robb8>
oh! yeah I just compiled from source
< robb8>
thanks :))))
< rcurtin>
sure :)
< robb8>
hey, I get a suuuuper long error when I try to use mlpack::data::Save()... it uses Boost, right? I linked -lboost_system
< robb8>
does it use something like -lboost_serialization?