verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< rcurtin> zoq: I was trying to run the Elki benchmark job, but it seems like there was an issue with the SQL database:
< rcurtin> "_mysql_exceptions.OperationalError: (1136, "Column count doesn't match value count at row 1")"
< rcurtin> do you think I should simply remove (after backup) the database? or do you know of an easy way to fix it?
< rcurtin> I have no problem running all the benchmarks again
< rcurtin> it will just take a little while...
< rcurtin> oh, I see what it is... the sql database doesn't have the right columns for any sweeps
< rcurtin> in this case I'll go ahead and back up the DB and then remove it so that the benchmark job will just create a new one
vivekp has quit [Ping timeout: 245 seconds]
vivekp has joined #mlpack
< jenkins-mlpack> Project docker mlpack nightly build build #369: UNSTABLE in 2 hr 45 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20nightly%20build/369/
< ShikharJ> zoq: Using EvaluateWithGradients improves runtime by almost 13%, which saves us about 45 minutes of training time. This puts mlpack (mostly single-core) at 6.25 hours and tensorflow (multi-threaded) at 4.5 hours (11 hours aggregated over cores). Using OpenBLAS didn't turn out to be of much benefit: multi-threading was only active for an aggregate of 5 minutes, so considering the overhead,
< ShikharJ> I'm not sure whether this would be beneficial or harmful.
< ShikharJ> zoq: I also timed the individual Evaluate function at ~19s per call and Gradient at ~52s per call. EvaluateWithGradient is steady at ~62s per call. So the next step should be to look into the Gradient function and see if there's a chance of improvement (though it looks pretty tight to me).
< ShikharJ> zoq: If we can somehow provide multi-threaded support for our FFN architecture (with a performance bump of ~30%, though I'm not sure whether that's realistic), we can beat tensorflow's multi-threaded time as well. But at least for now we have the edge on single core.
vivekp has quit [Ping timeout: 240 seconds]
vivekp has joined #mlpack
witness_ has quit [Quit: Connection closed for inactivity]
manish7294 has joined #mlpack
< manish7294> rcurtin: zoq: I think setting the seed using math::RandomSeed(const size_t) and then calling arma::randu() always initializes the same matrix.
< manish7294> I replaced RandomSeed() with arma_rng::set_seed_random() and this time the initialization was different, as expected.
manish7294 has quit [Ping timeout: 252 seconds]
ImQ009 has joined #mlpack
< rcurtin> manish7294: that is strange, RandomSeed() should be calling arma_rng::set_seed_random() internally
< rcurtin> the only case where it doesn't do that is when BINDING_TYPE is BINDING_TYPE_TEST, is that true in your case?
< rcurtin> that should only be in any code that's in the src/mlpack/tests/main_tests/ directory
manish7294 has joined #mlpack
< manish7294> rcurtin: math::RandomSeed((size_t) CLI::GetParam<int>("seed")); is what lmnn_main.cpp uses to do this, but somehow it's not working.
< rcurtin> right, so either you'll need to set --seed differently in each of your calls to the program, or you'll need to do what the other programs do, which is
< rcurtin> if (CLI::GetParam<int>("seed") == 0)
< rcurtin> math::RandomSeed(std::time(NULL));
< rcurtin> else
< rcurtin> math::RandomSeed((size_t) CLI::GetParam<int>("seed"));
< manish7294> I am doing the same
< manish7294> if (CLI::GetParam<int>("seed") != 0)
< manish7294> math::RandomSeed((size_t) CLI::GetParam<int>("seed"));
< manish7294> else
< manish7294> math::RandomSeed((size_t) std::time(NULL));
< rcurtin> and does seed have a default value of 0?
< manish7294> yes
< rcurtin> it sounds like you should trace what is going on and figure out why it isn't setting the random seed
< manish7294> the most strange part is that putting arma_rng::seed here solves this
< manish7294> *arma_rng::set_seed_random()
< rcurtin> this does not make sense, so you should debug it and find out what is going on
< manish7294> ya, sure
< manish7294> And shall we merge the LMNN code, so that PRs for the related issues can be opened?
< manish7294> I think all the comments on the PR are resolved now.
< rcurtin> manish7294: thanks, I'm glad to have it merged :)
< rcurtin> ShikharJ: I read your results from yesterday, and it looks good so far! I agree that some improvement could still be made, but I definitely think we're in a good starting place
< rcurtin> I like to use profilers like gprof or perf to try and identify what's taking a long time
< rcurtin> and actually using mlpack's Timer::Start() and Timer::Stop() can be good for high-level benchmarking
< rcurtin> so long as what you're timing takes a non-negligible amount of time (say, at least 0.0001 seconds between the calls to Start() and Stop()), I think it's reasonably accurate
< rcurtin> of course I don't know if that is what you are planning to do next, and if not, no worries, but if so, I thought it would be helpful :)
< rcurtin> ShikharJ: I will also bring savannah back online for jenkins, so let me know if you want me to bring it offline for simulations again
manish7294 has quit [Ping timeout: 252 seconds]
< ShikharJ> rcurtin: Ah, I used std::chrono for timing the builds and the calls.
manish7294 has joined #mlpack
< manish7294> rcurtin: there?
< ShikharJ> rcurtin: Sure, you can bring Savannah online now. At least we now have a baseline score to beat. I'll keep digging in the code to find places we can improve upon, and test out the implementations on the benchmark systems for as long as they're online.
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#5225 (master - fd59d03 : Ryan Curtin): The build passed.
travis-ci has left #mlpack []
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
< zoq> ShikharJ: Thanks for the timings, glad implementing EvaluateWithGradients worked out.
< zoq> ShikharJ: I guess now it's time to use gprof or something similar to find some bottlenecks.
< zoq> ShikharJ: We can definitely revisit the conv operations.
< ShikharJ> zoq: I was thinking of not shifting focus from the RBM PR, since I guess the first priority should be to get as many modules as possible available within mlpack. Optimizing them further should be a task for later, in my opinion. I wish to finish at least Kris' share of remaining PRs from last year. Then maybe we can focus on this?
< zoq> Sounds reasonable, I'll see if I can do some pre-profiling.
< manish7294> rcurtin: As per what I found, the error is in the line #if (BINDING_TYPE != BINDING_TYPE_TEST) — it doesn't handle the case where the macros are undefined. The preprocessor replaces an undefined identifier in a #if expression with 0, so when BINDING_TYPE is not there we end up evaluating (0 != 0), which is false. If that sounds reasonable, I will open a PR fixing it.
< Atharva> zoq:
< Atharva> I think there is a mistake in the `Gradient()` function of `Sequential` layer.
< Atharva> The error to the `Sequential` layer is passed to the first layer within the `Sequential` layer. It should be passed to the last layer.
< Atharva> Also, when using a single `Sequential` layer in a FFN class, `Gradient(arma::mat&& input)` calls `network[1]`, which does not exist, and hence it causes a segmentation fault.
< zoq> Atharva: Nice catch, you are right. About the single-layer issue: ideally we would check whether the network size > 1 and either modify the update process or add an identity layer. I guess adding a note to the class itself that it should only be used with more than one layer is enough; I don't really see a reason to use the seq layer with a single layer.
< Atharva> Yeah, there isn't any reason to use the class with only a single layer. I was just trying to debug something, so I did it.
< Atharva> What do you think about the first issue, in the `Gradient()` function of the layer? My network has three layers: a sequential encoder, reparametrization, and a sequential decoder. In the FFN class, the sequential layer is just one object, so the error from the reparametrization layer is passed to it. The sequential layer ends up passing it to its first layer, when it should actually pass it to its last layer.
< zoq> right, we should use network.back()
< Atharva> Yes, if it's okay I will make the changes in one of my PRs.
< zoq> Atharva: Great, thanks :)
< zoq> Atharva: Really nice catch!
< Atharva> zoq: Thanks, I had to solve it because my network was failing because of it.
< ShikharJ> zoq: I'm not sure whether the newer constructor initialization is correct in the WGAN PR. Could you take a look? I have left a comment at the relevant place.
manish7294 has quit [Ping timeout: 252 seconds]
cjlcarvalho has joined #mlpack
< rcurtin> manish7294: you're right, that is a big bug! if you can submit a PR that would be great
< rcurtin> that means random seeds are not working at all right now
cjlcarvalho has quit [Ping timeout: 240 seconds]
manish7294 has joined #mlpack
< manish7294> rcurtin: I have opened #1462 dealing with RandomSeed() issue. I guess it was accidentally missed when BINDING_TYPE_TEST support was added.
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#50 (RandomSeed - fc0edcd : Manish): The build has errored.
travis-ci has left #mlpack []
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#51 (RandomSeed - 46f3695 : Manish): The build has errored.
travis-ci has left #mlpack []
manish7294 has quit [Ping timeout: 252 seconds]
ImQ009 has quit [Quit: Leaving]
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#52 (RandomSeed - 800f52f : Manish): The build passed.
travis-ci has left #mlpack []
vivekp has quit [Ping timeout: 240 seconds]
vivekp has joined #mlpack