ChanServ changed the topic of #mlpack to: "mlpack: a fast, flexible machine learning library :: We don't always respond instantly, but we will respond; please be patient :: Logs at http://www.mlpack.org/irc/"
< rcurtin>
sreenik: same for me, I use vim + gnu screen. it's a bit spartan but for me it does all I need :)
Suryo has joined #mlpack
< Suryo>
zoq: I've fixed the code styling on ensmallen PR#117 and I followed the design guidelines. I think that it is okay now. Let me know what you think.
Suryo has quit [Quit: Page closed]
< rcurtin>
robertohueso: since you said that the simple example does approximate successfully with SummarizeMC(), maybe the next idea to try would be this:
< rcurtin>
take a look at the equation above equation 2 in the paper (that is, z_{\beta / 2} * (\sigma_s / \sqrt{m}) \le \epsilon * \Phi(q, R) / |R|)
< rcurtin>
you can take the estimate at the end of the Score() implementation and see if it satisfies that condition
< rcurtin>
i.e., you can compute Phi(q, R) exactly to check that condition
< rcurtin>
now, the approximation done by SummarizeMC() is probabilistic, so it's not guaranteed to work *every* time, but you can explore with different \beta and see if it appears to behave correctly as \beta decreases
< rcurtin>
perhaps even just counting the number of failed SummarizeMC()s vs. non-failed SummarizeMC()s could suffice for getting an idea of whether things are working right
< rcurtin>
for these tests, it may be best to comment out the existing KDE prunes, so that SummarizeMC() is always used
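[Editor's sketch] The check rcurtin describes above can be written down roughly as follows. This is a minimal sketch under stated assumptions, not the mlpack implementation: `SatisfiesBound` and `BoundTally` are hypothetical names, the caller is assumed to supply z (standing in for z_{\beta / 2}, for example about 1.96 for \beta = 0.05), the sample standard deviation \sigma_s, the sample count m, and an exactly computed \Phi(q, R).

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper, not part of mlpack. Checks the condition quoted above:
//   z * sigma_s / sqrt(m) <= epsilon * Phi(q, R) / |R|
// where z stands for z_{beta / 2} and is supplied by the caller.
bool SatisfiesBound(const double z,
                    const double sampleStdDev,
                    const std::size_t m,
                    const double epsilon,
                    const double exactPhi,
                    const std::size_t refSize)
{
  // With no samples or an empty reference node the bound is undefined.
  if (m == 0 || refSize == 0)
    return false;

  const double lhs = z * sampleStdDev / std::sqrt(static_cast<double>(m));
  const double rhs = epsilon * exactPhi / static_cast<double>(refSize);
  return lhs <= rhs;
}

// Hypothetical tally of how often the bound held across many calls.
struct BoundTally
{
  std::size_t passed = 0;
  std::size_t failed = 0;

  // Record the outcome of one check.
  void Record(const bool ok)
  {
    if (ok)
      ++passed;
    else
      ++failed;
  }

  // Fraction of recorded checks that failed (0 if nothing was recorded).
  double FailureRate() const
  {
    const std::size_t total = passed + failed;
    return total == 0 ? 0.0 : static_cast<double>(failed) / total;
  }
};
```

Recording one result per SummarizeMC() call and comparing FailureRate() across several values of \beta would give the rough picture suggested above; since the approximation is probabilistic, a nonzero failure rate is expected.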
< rcurtin>
my other concern is that I am not seeing, in this algorithm, how we avoid calling SummarizeMC() at the very first level of the tree!
< rcurtin>
it seems to me that, since CanSummarizeMC() depends *only* on the number of points in the reference node (specifically, the number of points in the reference node being *greater* than a certain value),
< rcurtin>
then there is no way the algorithm will ever even recurse, because it will just call CanSummarizeMC() at the first level
AndroUser has joined #mlpack
favre49 has joined #mlpack
< favre49>
sreenik: I've been using sublime text myself.
< favre49>
zoq: I'm fine now, so I should be able to progress much faster. I've been working on the Genome and Phenome classes; I'll probably commit them both at once, since the structure of the Genome class will depend on how I implement the Phenome classes.
AndroUser has quit [Remote host closed the connection]
< favre49>
rcurtin: zoq: I'm not sure why, but NEAT isn't being built when I run make.
< rcurtin>
favre49: is it added to the CMake configuration?
< rcurtin>
and if so, is the specific target you are trying to build being built?
< favre49>
I've created a CMakeLists.txt in the neat folder in methods, and included the name of the folder in the CMakeLists.txt in mlpack/methods.
< rcurtin>
ok, are there any .cpp files or specific things that should be compiled?
< favre49>
Nope, only .h files
< rcurtin>
i.e. did you have any add_cli_executable() or add_library() or add_executable(), etc., in the NEAT directory?
< rcurtin>
ok
< rcurtin>
in this case, there will be nothing to build---a header file doesn't need to be compiled
< rcurtin>
do you mean that the NEAT header files you are writing aren't being copied to build/include/ when you do 'make mlpack'?
< favre49>
oh right they are
< favre49>
I wasn't aware header files wouldn't be built unless included
< favre49>
thanks :)
< favre49>
I should probably read about how the whole thing works some time, I've just never had to work with it.
< rcurtin>
no worries :)
< rcurtin>
now, those header files should be compiled into some test in src/mlpack/tests/
< rcurtin>
this highlights the importance of testing---the compiler won't actually try to compile code in headers unless it's being used somewhere
< rcurtin>
so if we don't test out template classes, it's possible they won't even compile and a user would find that out after a release (which is not what we want to happen :))
< favre49>
Yup, makes total sense
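[Editor's sketch] The point about untested template code can be illustrated with a small example. This is only a sketch with a hypothetical `Wrapper` type, unrelated to the NEAT code: a member function of a class template is compiled for a given type only when something actually uses it, so a problem can stay hidden until a test (or a user) calls it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class template, only for illustration.
template<typename T>
struct Wrapper
{
  T value;

  // Valid for any copyable T.
  T Get() const { return value; }

  // Only valid when T has a size() member. For T = int this body would be
  // an error, but the compiler only checks it when Length() is actually
  // called for that T.
  std::size_t Length() const { return value.size(); }
};

// Compiles even though int has no size(): Length() is never used for int.
inline int UseIntWrapper()
{
  Wrapper<int> w{42};
  return w.Get();
}

// Here Length() is instantiated for std::string, which does have size().
inline std::size_t UseStringWrapper()
{
  Wrapper<std::string> w{"neat"};
  return w.Length();
}
```

Adding a call to `Wrapper<int>::Length()` in a test would turn the hidden problem into a compile error right away, which is the role the tests in src/mlpack/tests/ play for header-only code.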
< robertohueso>
rcurtin: About the equation above equation 2, it doesn't satisfy the restriction, so I guess this means there's something wrong with the rule. I will check it again.
KimSangYeon-DGU has joined #mlpack
< rcurtin>
robertohueso: when you say it doesn't satisfy the restriction, what do you mean?
< robertohueso>
I meant condition (the one above equation 2), sorry
< rcurtin>
ah, ok
< rcurtin>
I'm more concerned about the lack of termination... the algorithm should recurse if the variance of the reference node is too great so that it can't approximate
< rcurtin>
are you also finding that the way the algorithm is written, it will always try to approximate the root node and never recurse? or perhaps did I misread something?
< robertohueso>
I agree with you; unless I also misread, CanSummarizeMC() will prevent any recursion
< robertohueso>
Basically the condition (the equation above equation 2) is satisfied when a larger number of points is involved, not with the 22-point test.
< favre49>
zoq: I've basically finished the implementation of the Genome and the Phenome for acyclic networks. If we decide on what internal class representations to use (I left a comment on the PR), I could commit it tomorrow and we could make changes from there.
favre49 has quit [Quit: Page closed]
sreenik has joined #mlpack
< sreenik>
zoq, rcurtin: I think I'll also have to switch to vim/neovim because of the insane amount of memory vscode is taking up. favre49: is sublime text as lightweight as vim?
< zoq>
favre49: Sounds good, will take a look at the comment.
< rcurtin>
I've heard good things about sublime text, but I am too old-fashioned to use it :)
< rcurtin>
robertohueso: do you mean that the number of points that need to be sampled is greater than the number of points in the node? (for that small 22-point dataset test)
< rcurtin>
I wonder if this is the right condition to recurse on---if during any iteration, m > |R|, then recurse, since the desired approximation may not be achievable
< rcurtin>
well, I am not sure that is right. I need to think about it a little bit more, but definitely there must be some condition where the number of points to be sampled is too great, and recursion should happen
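[Editor's sketch] The tentative "recurse when m > |R|" idea can be sketched as below. This is an illustration under assumptions, not a settled rule (the discussion above leaves it open) and not mlpack code: `RequiredSamples` and `ShouldRecurse` are hypothetical names, and the sample count is obtained by solving the condition z_{\beta / 2} * (\sigma_s / \sqrt{m}) \le \epsilon * \Phi(q, R) / |R| for m, using whatever estimate of \Phi(q, R) is available.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical helper: smallest m satisfying
//   z * sigma_s / sqrt(m) <= epsilon * Phi / |R|,
// i.e. m >= (z * sigma_s * |R| / (epsilon * Phi))^2.
// phiEstimate is an assumption: in practice Phi(q, R) is not known exactly.
std::size_t RequiredSamples(const double z,
                            const double sampleStdDev,
                            const double epsilon,
                            const double phiEstimate,
                            const std::size_t refSize)
{
  const std::size_t maxSize = std::numeric_limits<std::size_t>::max();

  // A non-positive tolerance or estimate cannot be met by any finite m.
  if (epsilon <= 0.0 || phiEstimate <= 0.0)
    return maxSize;

  const double root = z * sampleStdDev * static_cast<double>(refSize) /
      (epsilon * phiEstimate);
  const double m = std::ceil(root * root);

  // Clamp so the conversion to std::size_t stays well defined.
  if (!(m < static_cast<double>(maxSize)))
    return maxSize;
  return static_cast<std::size_t>(m);
}

// Tentative rule from the discussion: if more samples are needed than the
// node holds, the approximation may not be achievable, so recurse.
bool ShouldRecurse(const std::size_t requiredSamples,
                   const std::size_t refSize)
{
  return requiredSamples > refSize;
}
```

With a rule of this shape, a high-variance node near the root would demand more samples than it contains and force recursion, while a low-variance node deeper in the tree could be summarized directly.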