verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
robertohueso has left #mlpack []
ajaivgeorge has quit [Quit: ajaivgeorge]
sumedhghaisas2 has joined #mlpack
sumedhghaisas has quit [Read error: Connection reset by peer]
rajeshdm9 has joined #mlpack
rajeshdm9 has quit [Ping timeout: 260 seconds]
wangxin has joined #mlpack
govg has joined #mlpack
Atharva has joined #mlpack
< Atharva> sumedhghaisas: Hi Sumedh, I wanted to ask if prior experience with generative models is necessary for the VAE project? Will it suffice if I make an effort to understand them and play around with them before the application deadline?
< Atharva> I do have experience with neural networks in general and have also written some from scratch.
< Atharva> So, I meant to say that I am quite comfortable with the internal workings and mathematics of neural networks.
Atharva has quit [Quit: Page closed]
ainch has joined #mlpack
ainch has quit [Client Quit]
ainch has joined #mlpack
wangxin has quit [Ping timeout: 260 seconds]
wangxin has joined #mlpack
prakhar_code[m] has quit [Ping timeout: 256 seconds]
killer_bee[m] has quit [Ping timeout: 256 seconds]
sumedhghaisas2 has quit [Read error: Connection reset by peer]
sumedhghaisas2 has joined #mlpack
sumedhghaisas has quit [Ping timeout: 248 seconds]
sumedhghaisas2 has quit [Ping timeout: 240 seconds]
sumedhghaisas has joined #mlpack
sumedhghaisas2 has joined #mlpack
sumedhghaisas has quit [Read error: Connection reset by peer]
dk97 has joined #mlpack
dk97[m] has joined #mlpack
< dk97[m]> hi there
< dk97[m]> I am Dakshit, an undergraduate at IIT Roorkee
< dk97[m]> I saw the org's list of ideas, and am interested in the project of building a VAE
< dk97[m]> Is it alright to pursue this for the summer? I have read a few papers related to it.
< dk97[m]> zoq_: @rc
< dk97[m]> rcurtin:
< dk97[m]> @sumedhghaisas2
dk97 has quit [Ping timeout: 260 seconds]
caladrius has joined #mlpack
caladrius has quit [Client Quit]
wangxin has quit [Ping timeout: 260 seconds]
vivekp has quit [Ping timeout: 268 seconds]
vpal has joined #mlpack
vpal is now known as vivekp
Trion has joined #mlpack
sumedhghaisas has joined #mlpack
sumedhghaisas2 has quit [Read error: Connection reset by peer]
< dk97[m]> hey there sumedhghaisas
< sumedhghaisas> @dk97[m] Hey Dakshit
< dk97[m]> I saw the org's project list has the VAE project in it, but there are no issues created for it, and also no files for it yet. The project list says that the VAE is to be built from scratch.
< dk97[m]> Any guidance on how to proceed?
< dk97[m]> I have read the paper mentioned in the project details
< dk97[m]> Before the proposals, should I build a basic C++ implementation?
< sumedhghaisas> @rcurtin: Sorry about the username change :) My IRC client is really bad... is there a good one for Android?
< sumedhghaisas> @dk97[m] There are no issues related to this as of yet.
< dk97[m]> sumedhghaisas:
< sumedhghaisas> not from scratch, but the framework has to be built.
< sumedhghaisas> the current ANN architecture can be used, although we will need to create a new loss for VAE networks
< dk97[m]> my bad
< sumedhghaisas> VAEs use encoders and decoders... those can be built from the currently available layers
< dk97[m]> okay, I will have a look at that module
< sumedhghaisas> the major task would be to build the re-parametrization layer, which comes between the encoder and the decoder
< dk97[m]> yeah
< sumedhghaisas> this way, we can use the current FFN and RNN classes to build a VAE
< dk97[m]> Okay
< dk97[m]> thanks for the info, I will look at the corresponding code
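For context, the re-parametrization trick discussed above writes the latent sample as z = mu + sigma .* eps with eps ~ N(0, I), so the randomness is isolated in eps and gradients can flow through the encoder's outputs; the new VAE loss then adds a KL-divergence term to the reconstruction error. A minimal sketch of that sampling step using Armadillo (the function name is hypothetical; no mlpack implementation existed at this point):

    #include <armadillo>

    // Re-parametrization trick: the encoder outputs a mean vector and a
    // log-standard-deviation vector; the latent sample is
    // z = mean + stddev % noise, with noise drawn from N(0, I).
    // Isolating the randomness in `noise` keeps mean and stddev
    // differentiable, which training the VAE loss requires.
    arma::vec Reparametrize(const arma::vec& mean, const arma::vec& logStdDev)
    {
      arma::vec noise = arma::randn<arma::vec>(mean.n_elem);
      return mean + arma::exp(logStdDev) % noise;
    }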
sumedhghaisas2 has joined #mlpack
< sumedhghaisas2> dk97[m]: I got disconnected again. :) Did you send me something in the meantime?
sumedhghaisas has quit [Ping timeout: 256 seconds]
< dk97[m]> Nope
< dk97[m]> Thanks for the info
< dk97[m]> I will have a look
< sumedhghaisas2> sure :) let me know if you have any more questions regarding the implementation
sumedhghaisas has joined #mlpack
sumedhghaisas2 has quit [Read error: Connection reset by peer]
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#4068 (master - f6eadd8 : Ryan Curtin): The build has errored.
< sumedhghaisas> @rcurtin: haha... also, right now while debugging the NTM error, I removed all the other files from mlpack_test except the one that I want
< luffy1996> my command is g++ <test name> -std=c++11 `pkg-config --libs mlpack`
< sumedhghaisas> @luffy1996 ahh, the main function is written in mlpack_test.cpp
< rcurtin> luffy1996: the structure of that directory makes things a little bit difficult. I think it *might* work if you add mlpack_test.cpp to the list of files you are compiling, but I am not sure
< luffy1996> so what should the command line look like?
< luffy1996> I am presently trying to add a test case for one of my PRs
< luffy1996> I just want to get a taste of how testing works in mlpack
< rcurtin> luffy1996: if you want to add a new test, you can just create a new <something>_test.cpp file and add it to the CMakeLists.txt
< rcurtin> you'll use BOOST_AUTO_TEST_CASE() for each individual test case
< rcurtin> so you can look at the other tests to get an idea of the code you need
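As a concrete illustration of the pattern described above, here is a hypothetical minimal test file (say, example_test.cpp, registered in src/mlpack/tests/CMakeLists.txt); the suite and test names are made up for the example:

    // example_test.cpp: compiled into mlpack_test once it is added to
    // CMakeLists.txt (mlpack_test.cpp supplies the Boost.Test main()).
    #include <mlpack/core.hpp>
    #include <boost/test/unit_test.hpp>

    BOOST_AUTO_TEST_SUITE(ExampleTest);

    // One BOOST_AUTO_TEST_CASE() per individual test case.
    BOOST_AUTO_TEST_CASE(SimpleCheck)
    {
      arma::mat x = arma::ones<arma::mat>(3, 3);
      BOOST_REQUIRE_CLOSE(arma::accu(x), 9.0, 1e-5);
    }

    BOOST_AUTO_TEST_SUITE_END();

With that in place, bin/mlpack_test -t ExampleTest would run just this suite.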
< luffy1996> I have done that part already
< luffy1996> I need to see how to run the code
< rcurtin> ah, if you do 'make mlpack_test' it will build a program 'bin/mlpack_test' that you can use
< rcurtin> you can run individual test suites, e.g., mlpack_test -t MathTest
< luffy1996> MathTest is an executable file, right?
sumedhghaisas has quit [Read error: Connection reset by peer]
< rcurtin> no, it is not an executable file; it is the name of a test suite
< rcurtin> the actual tests are in math_test.cpp, but they get compiled into the mlpack_test program
< luffy1996> so for this it is QLearningTest, right?
< luffy1996> like mlpack_test -t QLearningTest
< rcurtin> yes, if you run mlpack_test -t QLearningTest it will run each BOOST_AUTO_TEST_CASE() in q_learning_test.cpp
< rcurtin> in fact, you can even run individual tests, like mlpack_test -t QLearningTest/NameOfTestCase
< luffy1996> Now I get the intuition. Let me go ahead and check my test files.
< rcurtin> sounds good
< luffy1996> I need to add the test to CMakeLists.txt?
< rcurtin> if you wrote a new test file, yes; if you want it to be compiled into mlpack_test you need to add it to CMakeLists.txt
< luffy1996> Thanks @ryan. I will follow up soon :)
daivik has joined #mlpack
< rcurtin> sure, happy to help
< luffy1996> one more thing
< luffy1996> In case I just need to make my file
< luffy1996> can I go ahead with something?
< rcurtin> I'm not sure I understand what you mean
< rcurtin> if you want to build only one test, you could comment out all of the other test files in CMakeLists.txt; if you are not interested in those other tests, it can speed up compile time
< daivik> rcurtin: Sorry, I've been really busy at my internship recently and could not get back to you on the mlpack_hmm_train tests PR. I did see your comments on there; I'll post the output I get - but I guess the random seeds PR has also been approved, so that should solve the issue with the failing tests?
< rcurtin> daivik: I don't think that's the issue; I can't reproduce the failure at all, even with different random seeds
< daivik> Hm, any idea what could be causing the failing tests then?
< daivik> I do see a test that failed on Travis
< rcurtin> did you try digging into it and seeing what is wrong?
< rcurtin> the first step is to reproduce it
< daivik> When I run the tests a large number of times (bash script for loop), all of them pass for most of the runs. However, sometimes the HMMTrainNoLabelsReuseModelTest fails. The fact that it doesn't always fail tells me that it has something to do with random initializations or something else that uses randomness. I'll try digging into it a bit more -
< daivik> but since I'm not able to reproduce the issue on every single run, I don't know where to start looking
< daivik> Also, on the Travis build (https://travis-ci.org/mlpack/mlpack/jobs/343916205) -- a different test (HMMTrainRetrainTest1) failed. Again, since this was not failing on my machine, I can only attribute it to random initializations
< rcurtin> in this case you will have to find a specific random seed that causes the problem, and then use a debugger to get to the bottom of it
< rcurtin> unfortunately I don't think I have time to dig too deep into this one
< rcurtin> but the way you can do this is by taking the fix from #1264 and dropping it into place in your branch
< rcurtin> then, in the test that you think is failing, you can add this code at the top:
< rcurtin> size_t seed = std::time(NULL);
< rcurtin> std::cout << "seed: " << seed << "\n";
< rcurtin> math::RandomSeed(seed);
< rcurtin> then, you can run the test until it fails, and when it does fail it will print the random seed that caused the failure
< rcurtin> you can then hard-code that random seed into the math::RandomSeed() call, and this should make it easier to debug
< rcurtin> I'll merge #1264 tomorrow, so if you don't want to go through the effort of manually patching your branch, you can just wait and then sync to master
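To put rcurtin's three lines in context, the idea is something like the following at the top of the failing test case (a sketch only; the surrounding test body and its includes stay as they are in the existing test file):

    BOOST_AUTO_TEST_CASE(HMMTrainNoLabelsReuseModelTest)
    {
      // Seed with the current time and log the seed; once the test fails,
      // hard-code the printed seed into math::RandomSeed() to get a
      // reproducible failure that can be stepped through in a debugger.
      size_t seed = std::time(NULL);
      std::cout << "seed: " << seed << "\n";
      math::RandomSeed(seed);

      // ... the existing test body, unchanged ...
    }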
manish7294 has quit [Ping timeout: 256 seconds]
< daivik> Sorry if this sounds a little stupid, but I'm not sure what we're trying to do here. We're trying to find a setting for a random seed that causes the tests to fail. Having found that value of the random seed, what do we want to do with it? Do we want to modify the tests so that they don't fail for any value of the random seed?
ImQ009 has joined #mlpack
< rcurtin> daivik: correct, the tests should not fail for any random seed; the fact that they do implies a problem with either the test or the code
< rcurtin> it is a bit tedious to dive into the code at this level, but unfortunately I think it's what needs to be done in this case
Sayan98 has joined #mlpack
< daivik> okay, got it. Thanks a lot. I'll look into it.
Sayan98 has quit [Client Quit]
< rcurtin> when I look at the failure in HMMTrainRetrainTest1, I notice that the Travis job compiled with debugging symbols
< rcurtin> did you also try that? sometimes errors appear only when compiled with debugging symbols (since some assertions are skipped otherwise)
< rcurtin> that might be an easy fix if so
< daivik> Yeah, I'm running that build as we speak
< rcurtin> I'm happy to try and provide quick guidance here and there if you have issues
< rcurtin> but with so many PRs to review (and my own mlpack work to do :)) it would be a long time before I am able to try and dig to the bottom of it
< daivik> Right, no worries. After the serialization fix, I feel like the Usain Bolt of debugging. Thanks again for the help
caladrius has joined #mlpack
Trion has quit [Remote host closed the connection]
< luffy1996> Here the OutputLayerType is the error function?
< luffy1996> By error function I mean mean squared error, logistic regression, etc.?
< rcurtin> I don't understand your question
< rcurtin> I would suggest that you take a look at the tests in, e.g., feedforward_network_test.cpp to get a better idea of how it is used (or consult other documentation)
< luffy1996> what is OutputLayerType in FFN?
< luffy1996> I think this might answer my question
Trion has quit []
< rcurtin> from ffn.hpp: @tparam OutputLayerType The output layer type used to evaluate the network.
< rcurtin> but it could be helpful if there were a list of example output layer types that could be used
< rcurtin> MSE and negative log likelihood are two examples; those can be found in the layer/ directory, I believe
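For reference, a small sketch of how that template parameter is used, based on the style of feedforward_network_test.cpp; the layer sizes here are arbitrary:

    #include <mlpack/core.hpp>
    #include <mlpack/methods/ann/ffn.hpp>
    #include <mlpack/methods/ann/layer/layer.hpp>

    using namespace mlpack::ann;

    int main()
    {
      // OutputLayerType (here NegativeLogLikelihood<>) is the error function
      // used to evaluate the network's output during training.
      FFN<NegativeLogLikelihood<>> model;
      model.Add<Linear<>>(10, 5);    // 10 inputs, 5 hidden units.
      model.Add<SigmoidLayer<>>();
      model.Add<Linear<>>(5, 3);     // 3 output classes.
      model.Add<LogSoftMax<>>();     // Pairs with NegativeLogLikelihood.
    }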
< zoq> rcurtin: Hopefully we don't have to wait another week ...
sumedhghaisas has quit [Read error: Connection reset by peer]
< rcurtin> I had a call where I told them how disappointed I was, so they "escalated" the firewall issue and asked for it to be done today
< rcurtin> I'm not sure why that escalation didn't happen earlier, but we'll see if they do it today
sumedhghaisas has joined #mlpack
< zoq> fingers crossed
< rcurtin> in their timezone it is 10:45am, so there are still some hours left in the work day
< rcurtin> I tried to be harsh and let them know how much of a strain it has been to have 14 days of downtime when we expected 3, and that it's imperative that we make sure it doesn't become 17 days of downtime
< rcurtin> but I am not sure how good I am at actually being harsh
daivik has joined #mlpack
< zoq> hm, if we have another downtime for whatever reason, let's install Jenkins on mlpack.org
< rcurtin> I definitely agree with that
sumedhghaisas2 has joined #mlpack
kaushik_ has joined #mlpack
sumedhghaisas has quit [Ping timeout: 240 seconds]
wiking has quit [Ping timeout: 260 seconds]
wiking has joined #mlpack
< caladrius> Hi! I am Aarush. I am really interested in working on a sequence-to-sequence based encoder-decoder network under the GSoC program. Could it be done?
sumedhghaisas2 has quit [Ping timeout: 256 seconds]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Ping timeout: 240 seconds]
sumedhghaisas has joined #mlpack
caladrius has quit [Quit: Page closed]
moksh has joined #mlpack
< moksh> @zoq, I was able to get the gym socket API working on Python 3 by setting the ERLPORT_PYTHON environment variable and using this: https://github.com/hdima/erlport/issues/42. Just thought I'd post it here so everyone can take a look. I will start working on an agent now, and will get back to you in case of any doubts.
sumedhghaisas2 has joined #mlpack
sumedhghaisas has quit [Ping timeout: 256 seconds]
< zoq> moksh: Thanks for sharing.
deepL has joined #mlpack
deepL has quit [Client Quit]
sumedhghaisas2 has quit [Ping timeout: 248 seconds]
moksh has quit [Ping timeout: 260 seconds]
jenkins-mlpack has joined #mlpack
< zoq> no way
< rcurtin> yeah
< rcurtin> we have outbound
< rcurtin> but not inbound
< rcurtin> so Jenkins is running, but we can't access it
< zoq> great ...
< rcurtin> I had a little trouble getting masterblaster to boot (it wasn't giving any serial output, so I had to change the grub options and then it booted just fine)
< rcurtin> the current status is that the firewall team responded that the ticket was only open for getting the IPs, not for opening ports
< rcurtin> so now a new ticket is open to do the actual firewall changes to allow inbound SSH and HTTP/HTTPS
sumedhghaisas has joined #mlpack
< zoq> okay, priority urgent!
< rcurtin> I suspect that what will happen is that they will open SSH only and forget about HTTP, but I am trying to find the actual person working on it
< rcurtin> if I can find the right person to talk to, this literally takes no more than 5 minutes
< rcurtin> unless the process for firewall changes is really intense or something...
< zoq> I don't think it is
< rcurtin> I also submitted a firewall change request to Georgia Tech so that the new masterblaster IP can connect to dealgood
< rcurtin> that will probably be done on Monday, so until then we won't have MATLAB working properly
< zoq> fine for me
< rcurtin> GT is much better to work with than Symantec IT in general...
< rcurtin> at least it didn't take a year this time
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
< rcurtin> also, in trying to get the most speed out of logistic regression for benchmarking, I've written a utility that automatically converts a dense matrix to a sparse matrix if the matrix is sparse enough, and I have a prototype integrated with the mlpack_logistic_regression program
< rcurtin> lots of things need to be cleaned up, but the basic idea is that in Python you pass dense or sparse however you like, and from the command line you may append :dense or :sparse to the filename if you want to force a conversion
< rcurtin> i.e., --training_file sparse_file.csv will do auto-selection (as will --training_file sparse_file.csv:auto), but you can override with --training_file sparse_file.csv:dense or --training_file sparse_file.csv:sparse
< rcurtin> for sparse datasets like reuters (and many others) this can be a huge speedup
< rcurtin> yeah... just for training on reuters_train.csv, it's 1.06s for sparse (plus 0.15s for the sparse conversion) vs. 7.61s for dense
< rcurtin> I don't think it will be applicable to every algorithm, but definitely to some of them
< rcurtin> and it is generic enough that eventually we can expand to gpu_mat, or an mmap'ed matrix for out-of-core learning, etc. (whether anyone will have time to do that, who knows)
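The prototype itself isn't public at this point, but the core auto-selection idea can be sketched directly with Armadillo; the function names and the density threshold below are placeholders, not the actual utility:

    #include <mlpack/core.hpp>

    // Decide whether a dense matrix is sparse enough to be worth converting;
    // the 10% density threshold is an arbitrary placeholder.
    bool ShouldUseSparse(const arma::mat& m, const double maxDensity = 0.1)
    {
      const double density = (double) arma::accu(m != 0) / m.n_elem;
      return density < maxDensity;
    }

    // Armadillo supports direct dense-to-sparse conversion.
    arma::sp_mat ToSparse(const arma::mat& m) { return arma::sp_mat(m); }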
< zoq> wow, that is really cool; sounds like you're going the last mile and squeezing the last bit out of the lr code
< rcurtin> it is the only way to compete with scikit and others... scikit is quite fast
< rcurtin> mlpack is like 15% faster for L-BFGS, but nobody will switch for only a 15% speedup
< zoq> Does scikit do something special, besides Trust Region?
< rcurtin> scikit's use of liblinear uses sparse matrices though, so it can be ~50% faster than mlpack's L-BFGS for dense matrices; however, I think mlpack's L-BFGS with sparse matrices will be a few times faster than that
< rcurtin> nah, and actually I am not sure the TRN optimizer via liblinear is doing that great of a job; it seems like many times L-BFGS can do just as well, but TRN is faster for sparse datasets because it is using a sparse representation
< rcurtin> I'd have to look closer to confirm whether or not what I said is fully correct
< rcurtin> but I'm hoping to just re-run with the automatic sparse/dense selection, and hoping that the benchmarks will look a lot better then
< rcurtin> if they do, I will finally move on to another algorithm...
< rcurtin> still, a lot of what I have been working on is useful for many algorithms, so that is good at least :)