verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< JaraxussTong>
I think there is something wrong in range_search_impl.hpp. In one constructor the queryTree pointer isn't initialized, so when singleMode is true and naive is false, the destructor will fail.
< naywhayare>
JaraxussTong: I think you're right; want to submit a PR?
< JaraxussTong>
Yes, I will commit it as soon as possible.
< naywhayare>
thanks!
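For context, a minimal sketch of the kind of fix being discussed here -- initialize the query tree pointer in every constructor so the destructor can delete it unconditionally. The class and member types below are illustrative stand-ins, not the actual mlpack source:

    #include <cstddef>

    class RangeSearchSketch
    {
     public:
      // Constructor for single or naive mode: no query tree is ever built,
      // so the pointer must still be given a well-defined value.
      RangeSearchSketch(const bool singleMode, const bool naive) :
          queryTree(NULL), // Without this, the destructor deletes garbage.
          singleMode(singleMode),
          naive(naive)
      { }

      ~RangeSearchSketch()
      {
        // Deleting a null pointer is a no-op, so this is safe in every mode.
        delete queryTree;
      }

     private:
      int* queryTree; // Stands in for the real tree type.
      bool singleMode;
      bool naive;
    };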
yingryic has quit [Ping timeout: 264 seconds]
yingryic has joined #mlpack
govg has joined #mlpack
KTL has joined #mlpack
govg has quit [Remote host closed the connection]
KTL has quit [Remote host closed the connection]
govg has joined #mlpack
prafiny has joined #mlpack
< naywhayare>
prafiny: I actually still haven't received my email on how to get icc... maybe I should try again
< prafiny>
Yes, why not! For me it didn't even take a minute to receive it :/
< naywhayare>
hm, then I will have to try again L:)
< naywhayare>
(oops, unintended L)
< prafiny>
Hm still some trouble even with icc
< naywhayare>
if you want to paste the output, I can take a look and maybe point you in the right direction
< naywhayare>
JaraxussTong: I'm going to refactor RangeSearch a bit, and that should make it easier to use in MeanShift
< naywhayare>
basically, you'll pass the query set when calling Search(), instead of when calling the constructor
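A sketch of what that change to the interface could look like (hypothetical signatures, not the real mlpack declarations):

    #include <vector>
    #include <armadillo>

    // Before: the query set is bound when the object is constructed.
    class RangeSearchOld
    {
     public:
      RangeSearchOld(const arma::mat& referenceSet, const arma::mat& querySet);
      void Search(const double range,
                  std::vector<std::vector<size_t>>& neighbors,
                  std::vector<std::vector<double>>& distances);
    };

    // After: only the reference set is bound at construction; the query set
    // is passed to Search(), so one object can serve many different query
    // sets -- which is what MeanShift needs.
    class RangeSearchNew
    {
     public:
      explicit RangeSearchNew(const arma::mat& referenceSet);
      void Search(const arma::mat& querySet,
                  const double range,
                  std::vector<std::vector<size_t>>& neighbors,
                  std::vector<std::vector<double>>& distances);
    };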
travis-ci has joined #mlpack
< travis-ci>
mlpack/mlpack#41 (master - 0c14444 : Ryan Curtin): The build passed.
< prafiny>
In fact I picked a prebuilt Boost for Visual Studio and it doesn't seem to work well with mlpack :/
< naywhayare>
if the boost version is compiled with MSVC, I'm not sure it's guaranteed to work with the intel compiler, so you may have to compile Boost by hand...
< prafiny>
Hm… there must be some prebuilt packages somewhere, mustn't there?
< naywhayare>
I don't think there can be prebuilt packages, because I don't think you're allowed to distribute anything you build with a free version of icc
govg has joined #mlpack
< prafiny>
Hm, I see; and using a MinGW build wouldn't help?
< naywhayare>
using MinGW to build Boost? I think you might have the same issue as with the version built with MSVC. But I'm not certain -- I haven't tried it
< KTL>
I used Euclidean distance without normalizing
< naywhayare>
okay, so I think maybe I misunderstand the problem you are trying to solve then?
< KTL>
Yes and no; I am building a GMM implementation, but within a larger framework that has less common needs
< KTL>
and without a strong background in math and stats :D
< naywhayare>
yeah, but I still don't get the goal of your proposals, so I can't think about which is the best idea :)
< naywhayare>
could you explain more of what you're trying to do, and what you might like to contribute or see changed upstream?
< KTL>
when I said that, I thought it would help to make the random() function more numerically stable ... BUT ... in the meantime I did find a bug in my implementation ...
< KTL>
mmmm, but ... that bug ... won't have caused the instability, I think ...
< naywhayare>
is Random() unstable? if your Gaussian has positive definite covariance, it should just be drawing randomly from that distribution
< KTL>
I had inf values on the diagonal of my covlower
< KTL>
and the numerical value of logdetcov could be seriously different depending on how it was calculated, when the determinant of the covariance matrix was near zero
< KTL>
one method is to calculate the determinant (with option "std") and take the log
< KTL>
the other method was to take the sum of the logs of the diagonal of covlower and multiply by 2
< KTL>
anyway, just ignore my rambling; if you ever get numerical issues -- a probability that is not between 0 and 1, for example -- then look at it :D
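To make the two calculations concrete, a small self-contained comparison (the matrix values are made up purely to force a near-zero determinant):

    #include <iostream>
    #include <cmath>
    #include <armadillo>

    int main()
    {
      // A nearly singular 2x2 covariance matrix, so det(cov) is close to zero.
      arma::mat cov(2, 2);
      cov(0, 0) = 1.0;       cov(0, 1) = 0.9999999;
      cov(1, 0) = 0.9999999; cov(1, 1) = 1.0;

      // Method 1: compute the determinant directly (the "std" option mentioned
      // above forces the standard computation) and take its log. As det(cov)
      // underflows toward zero, this loses precision or becomes -inf.
      const double logDet1 = std::log(arma::det(cov));

      // Method 2: take the lower Cholesky factor L (cov = L * L.t()) and use
      // log(det(cov)) = 2 * sum(log(diag(L))), which stays accurate far longer.
      const arma::mat covLower = arma::chol(cov).t(); // chol() gives the upper factor.
      const arma::vec logDiag = arma::log(arma::vec(covLower.diag()));
      const double logDet2 = 2.0 * arma::accu(logDiag);

      std::cout << "log(det(cov)) directly:       " << logDet1 << std::endl;
      std::cout << "2 * sum(log(diag(covLower))): " << logDet2 << std::endl;

      return 0;
    }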
< stephentu>
naywhayare: I was working on LDA on the plane and I ran into this issue about randomness
< stephentu>
is there any reason we prefer this global random object
< stephentu>
versus threading randomness through the objects
< stephentu>
I implemented LDA doing the latter
< stephentu>
it seems cleaner imo
< naywhayare>
stephentu: no particular reason it was done the way it was originally; can you show me an example of what you mean?
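One possible shape for "threading randomness through the objects", with an invented SamplerSketch class rather than anything from mlpack: the method takes whatever engine the caller provides, so each object or thread can use its own engine and seed.

    #include <cstddef>
    #include <random>
    #include <vector>

    // Invented class: instead of reading a global RNG, the method takes
    // whatever engine the caller provides.
    class SamplerSketch
    {
     public:
      template<typename RNG>
      std::vector<double> SampleUniform(const std::size_t n, RNG& rng) const
      {
        std::uniform_real_distribution<double> dist(0.0, 1.0);
        std::vector<double> samples(n);
        for (std::size_t i = 0; i < n; ++i)
          samples[i] = dist(rng);
        return samples;
      }
    };

    int main()
    {
      std::mt19937 rng(42); // The caller chooses both the engine and the seed.
      SamplerSketch sampler;
      sampler.SampleUniform(10, rng);
      return 0;
    }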
< naywhayare>
KTL: okay, I think I understand
< naywhayare>
I'm fine with forcing "std" for det(), since det() only uses something else if the matrix is 4x4 or smaller
< stephentu>
but you can have the option of specifying your own
< stephentu>
downside is that everything that takes randomness must be a template
< naywhayare>
one advantage of a global random object is that I can set the seed easily during a test to reproduce a particular bug
< naywhayare>
to me it seems less clear how I might do that with a templated RNG, especially if anything is not using the default RNG
< naywhayare>
a disadvantage of the templated approach is that the API gets more complicated, which I'm not a huge fan of (it's already pretty complex), and also sizeof(Type) is going to get larger by a reference (if you're storing the RNG internally in the class)
< naywhayare>
but I think maybe I am not thinking exactly the same way you are about how this gets implemented
< naywhayare>
and I feel like there are some advantages here I'm overlooking
< stephentu>
naywhayare: 1) sizeof(Type) won't increase, since you don't store the reference, but just pass it into the methods that need it
< stephentu>
the main advantage
< stephentu>
is parallelization
< stephentu>
like if I want to run 20 LDAs in parallel
< stephentu>
I definitely don't want them sharing the same RNG
< stephentu>
and I definitely want to run multiple chains in parallel
< stephentu>
that's standard practice for MCMC stuff
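A rough illustration of that parallelization argument, with a stand-in RunChain() function instead of a real LDA or MCMC sampler: each thread gets its own engine with its own seed, so no RNG state is shared.

    #include <cstddef>
    #include <random>
    #include <thread>
    #include <vector>

    // RunChain() stands in for one chain; it only accumulates draws so the
    // example stays short.
    void RunChain(std::mt19937 rng, std::vector<double>& results,
                  const std::size_t index)
    {
      std::normal_distribution<double> dist(0.0, 1.0);
      double sum = 0.0;
      for (std::size_t i = 0; i < 1000; ++i)
        sum += dist(rng);
      results[index] = sum;
    }

    int main()
    {
      const std::size_t chains = 20;
      std::vector<double> results(chains);
      std::vector<std::thread> threads;

      // Each chain owns its engine; nothing is shared between threads.
      for (std::size_t i = 0; i < chains; ++i)
        threads.emplace_back(RunChain, std::mt19937(1000 + i),
                             std::ref(results), i);

      for (std::thread& t : threads)
        t.join();

      return 0;
    }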
< naywhayare>
hang on, I have to step out... I'll be back shortly
< stephentu>
one way to combat API complexity
< stephentu>
is we agree beforehand that all random objects in mlpack
< stephentu>
will be say std::default_random_engine
< stephentu>
then we typedef std::default_random_engine rng_t;
< stephentu>
and we use rng_t everywhere
< stephentu>
instead of the template
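A sketch of that convention (the function name is invented): fix the engine type once with a typedef, and functions that need randomness can then take an rng_t& without becoming templates.

    #include <random>

    // Illustrative only: one agreed-upon engine type for the whole library.
    typedef std::default_random_engine rng_t;

    // Functions that need randomness take an rng_t& instead of a template
    // parameter, which keeps the API small.
    inline double UniformSample(rng_t& rng)
    {
      std::uniform_real_distribution<double> dist(0.0, 1.0);
      return dist(rng);
    }

    int main()
    {
      rng_t rng(42);
      UniformSample(rng);
      return 0;
    }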
< naywhayare>
stephentu: sorry for the delay... I think this is a good idea, because parallelism is a nightmare otherwise
< naywhayare>
I think the extra API bloat of having template parameters for functions that use RNGs is unfortunate, but somewhat unavoidable
< naywhayare>
providing default arguments and good documentation is probably a decent solution here
< naywhayare>
so, we should figure out how to make the doxygen output look a bit nicer for functions that have a crapload of parameters and template arguments (maybe, optionally hide all the parameters that have default arguments or something)
< naywhayare>
but that can happen another day
< naywhayare>
the typedef rng_t solution has problems (if you are trying to let the user override it) if rng_t is ever used in a .cpp file somewhere, and the user selects something other than default_random_engine
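To make that last point concrete (the file layout and function name here are hypothetical):

    #include <random>

    // Hypothetical library header: the engine type is fixed by a typedef and
    // a non-template function is defined in a separate .cpp file.
    typedef std::default_random_engine rng_t;
    double UniformSample(rng_t& rng); // imagine this is defined in a prebuilt .cpp

    // If a user later redefines rng_t as std::mt19937 and rebuilds only their
    // own code, their calls now expect
    //   double UniformSample(std::mt19937&);
    // while the already-compiled object file still exports
    //   double UniformSample(std::default_random_engine&);
    // so the symbols no longer match and the program fails to link. A template
    // parameter avoids this because each caller instantiates the code it needs.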