naywhayare changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
govg has quit [Ping timeout: 272 seconds]
govg has joined #mlpack
govg has quit [Read error: Connection reset by peer]
< andrewmw94>
never mind it was me. I saw the test failed in something else and thought that's where the problem would be. Duh.
govg has joined #mlpack
govg has quit [Changing host]
govg has joined #mlpack
< jenkins-mlpack>
Starting build #1979 for job mlpack - svn checkin test (previous build: FAILURE -- last SUCCESS #1977 3 hr 58 min ago)
< naywhayare>
andrewmw94: yeah, just a missing file
< naywhayare>
I just page-down in the angry Jenkins email until I find where in the build log it failed; maybe that's quicker than surfing around on the semi-terrible Jenkins interface
< andrewmw94>
I'm confused on neighbor_search_impl.hpp:262
< andrewmw94>
I used to try make clean in a futile hope that I hadn't actually made the mistake. This time I figured it wasn't worth the effort.
< naywhayare>
sometimes when you modify header files, and rebuild an executable or something, CMake still considers the mlpack target up to date and doesn't copy the updated headers
< naywhayare>
and as a result may not actually rebuild your target even though the code may be different
< andrewmw94>
ahh. Good to know
< naywhayare>
I'm not sure that #322 is exactly what happened to you, but maybe something similar like that
< naywhayare>
if, in the future, you come across a situation where you made a modification to your code but CMake doesn't actually recompile anything, please let me know or file a bug or something
< naywhayare>
little CMake issues can be incredibly frustrating and annoying...
< andrewmw94>
will do.
< andrewmw94>
Also, I wanted to ask you about my solution to the sorting of nodes. I added an extra struct that's like four lines long
< andrewmw94>
and then put it in the single_tree_traverser file
< andrewmw94>
is that ok.
< andrewmw94>
I though the struct was the cleanest way to use their sort and avoid computing the distances from points to HRects more than once
< naywhayare>
yeah, the cover tree has to do something like that
< naywhayare>
I think I called it the CoverTreeNodeFrame or something like that
< andrewmw94>
ok, just wanted to make sure
< andrewmw94>
thanks
< naywhayare>
"DualCoverTreeMapEntry"
< naywhayare>
just rolls right off the tongue
< andrewmw94>
ok, so further modifications to rectangle_tree/single_tree_traverser_impl.hpp don't seem to recompile properly. Should I file a bug or just tell you here?
< andrewmw94>
but I seem to have other problems with my builds in general now. Do you have the same issue? I may just try rebooting.
< naywhayare>
yeah, let's just figure them out here
< naywhayare>
what's going on? is it just not recompiling things at all?
< andrewmw94>
It isn't recompiling that file unless I do make clean
< naywhayare>
okay
< naywhayare>
let me see if I can reproduce
< andrewmw94>
but I'm also having issues with copying files every other build now:
< andrewmw94>
Error copying file "/home/awells/Development/mlpack/mlpack/trunk/src/mlpack/core/tree/rectangle_tree/rectangle_tree_traverser.hpp" to "/home/awells/Development/mlpack/mlpack/trunk/include/mlpack/core/tree/rectangle_tree/rectangle_tree_traverser.hpp".
< naywhayare>
my quick tests seemed to indicate that now the headers are copied whenever the mlpack target is built
< naywhayare>
but the issue andrewmw94 is having is a bit more complex
< naywhayare>
the files that the allknn target depends on are only those specified in the SOURCES variable in src/mlpack/methods/neighbor_search/CMakeLists.txt
< naywhayare>
but those sources don't include any of the header files of the library itself, such as single_tree_traverser_impl.hpp
< naywhayare>
sorry, I misspoke... the files that the allknn target depends on are only those specified in the add_executable(allknn ...) command
< sumedh_>
ohh so only those dependent headers are getting copied now??
< naywhayare>
no, this doesn't have to do with the files being copied
< sumedh_>
okay then I don't understand the problem at all :(
< naywhayare>
the fix I linked you to (changeset 16741) dealt with the file copying problem
< naywhayare>
but this other problem is actually quite different
< sumedh_>
okay... but now we should gather header files from MLPACK_INCLUDES... cause if new header file is added it won't be copied with glob...
< jenkins-mlpack>
Starting build #1980 for job mlpack - svn checkin test (previous build: FIXED)
< naywhayare>
if a new header file is added, cmake needs to be re-run anyway
< naywhayare>
a new header file means the cmake configuration needs to be changed
< naywhayare>
and when cmake is re-run, it will catch the new header file with the glob command
< sumedh_>
yes... user has to add that file to MLPACK_SOURCES... so actually there is no need for glob to check... we can grab all the files from sources itself...
< naywhayare>
the glob makes it easier to get only the .h and .hpp files
< sumedh_>
yes thats true... that can even be done with regular expression checking... but yeah right now... glob is no problem...
< sumedh_>
naywhayare: yeaah... SVD is running faster now...
< jenkins-mlpack>
* Ryan Curtin: Move LaplaceDistribution to mlpack::distribution and expand upon it, but then
< jenkins-mlpack>
remove MoveDistributionType template parameter from SA class because really, the
< jenkins-mlpack>
MoveControl() function is written specifically to work with the Laplace
< jenkins-mlpack>
distribution. I was incorrect to think it could be templatized so easily. The
< jenkins-mlpack>
implementation of the Laplace distribution has been inlined, too, which may help
< jenkins-mlpack>
it be a little faster.
< jenkins-mlpack>
* andrewmw94: commit to ease debugging
< naywhayare>
jenkins-mlpack: sorry! it is fixed now
< jenkins-mlpack>
naywhayare did you mean me? Unknown command 'sorry!'
< jenkins-mlpack>
Use '!jenkins help' to get help!
< naywhayare>
bot is incapable of accepting an apology, I guess
< andrewmw94>
I should try doing that to people
< sumedh_>
hehe....
< andrewmw94>
"use --help for more options"
< naywhayare>
haha
< sumedh_>
naywhayare: how come your validation RMSE is still running ... for me its terminating at 1.23 ...
< naywhayare>
no idea, maybe we are using different tolerances or different versions of the code?
< naywhayare>
I'm using the one you sent me in an email
< naywhayare>
like I said, whenever it finally terminates, I'll send you an exact copy of what I was using
< sumedh_>
okay ... strange... there is one little problem with the abstraction... we are currently changing matrix V... so in the end it should fixed again... So should I add deinitialize function to all the termination policies??
< naywhayare>
that situation is specific to the validation RMSE policy
< sumedh_>
yes...
< naywhayare>
and I think the best thing to do is force the user to create the ValidationRMSETermination object on their own, which will modify the matrix V and create the validation set
< jenkins-mlpack>
Starting build #1981 for job mlpack - svn checkin test (previous build: FAILURE -- last SUCCESS #1979 1 hr 31 min ago)
< naywhayare>
and we should document that the matrix V will have elements removed to create the validation set
< naywhayare>
and it should be up to the user to make a clean copy of their data, if they need it afterwards
< naywhayare>
that's my opinion, at least. it means we can make AMF::Apply() take a const matrix again, which is much better in my opinion
< sumedh_>
ohh so matrix modification to be shifted to the constructor then??
< naywhayare>
yeah, so what the user has to write for validation RMSE termination looks like this:
< naywhayare>
arma::sp_mat V; // the user loads this on their own
< sumedh_>
that will also solve another problem....
< naywhayare>
ValidationRMSETermination termination(V, /* other parameters for building validation set */);
< naywhayare>
AMF amf(termination, /* other parameters */);
< naywhayare>
if they wanted the original V afterwards, they could make a copy before calling ValidationRMSETermination
< naywhayare>
the other choice is to copy V inside of ValidationRMSETermination and split it into two matrices
< naywhayare>
the training set and the validation set
< naywhayare>
actually, that idea doesn't work... nevermind
< sumedh_>
right now the constructor for validation RMSE accepts the number of testing points...
< sumedh_>
there cannot be a default value for that...
< naywhayare>
I thought that it took the number of testing points to use from each row?
< sumedh_>
cause we don't know how many points are going to be there in the dataset...
< naywhayare>
i.e. 3
< naywhayare>
or is it something else?
< sumedh_>
yes... I thought of that but in the case where there are many users this takes tremendous time...
< sumedh_>
like in the case of movie lens...
< sumedh_>
but in stead if we go with number like 1000 or 2000... we get equivalent result with lot of speed gain...
< sumedh_>
so I think we should be able to provide both options...
< sumedh_>
wait... but amf::Apply() calls a default constructor... okay.. but we can make it private .... thats no problem...
< naywhayare>
hm, I see what you mean
< naywhayare>
so if a user passes in a number like 2000, how will you sample from V?
< naywhayare>
divide by the number of users, and sample that many points from each user, or something else?
< sumedh_>
random 2000 points... completely random...
< sumedh_>
will that work??
< naywhayare>
that won't be the same as their idea of sampling 3 from each user, but I think that's okay
< naywhayare>
since we just need to sample 3 from each user to check that we are getting the same results
< naywhayare>
then we can write some tests using your idea of sampling 2000 random points to write tests for mlpack_test
< naywhayare>
but Movielens1m is way too large to test on
< sumedh_>
so how to give both options to the user?? either provide total points or provide number of points per user??
< naywhayare>
let's just go with "provide total points"
< naywhayare>
but for testing we can just make a simple modification to do 3 points for user specifically for the movielens dataset, and make sure the validation RMSE gets to the same point
< naywhayare>
(0.8, I think?)
< sumedh_>
and are you running the code which does not require termination policy??
< naywhayare>
at the moment, yes; it's the version you mailed me
< sumedh_>
yes... that is not terminating on my machine either...
< naywhayare>
but is it getting better RMSE?
< naywhayare>
it is in my case
< naywhayare>
very slowly decreasing
< naywhayare>
andrewmw94: okay... I think I understand a little better what is going on to cause your build problem where allknn doesn't actually rebuild things
< naywhayare>
CMake is supposed to automatically detect the header file dependencies of a given .cpp file
< sumedh_>
yes... but machine is heating up too much...
< naywhayare>
you can see the dependencies of, say, allknn_main.cpp by opening the file build/src/mlpack/methods/neighbor_search/CMakeFIles/allknn.dir/depend.make
< naywhayare>
sumedh_: oops, maybe ctrl-z the process and let the computer cool down for a while :)
< sumedh_>
I can actually make an omlet on it... :)
< naywhayare>
anyway, that list of dependencies is correct with one exception -- the file core/tree/rectangle_tree/single_tree_traverser_impl.hpp just isn't in that list of dependencies of allknn_main.cpp.o
< sumedh_>
yes... too old... have to buy a better one now....
< andrewmw94>
ahh, so I just add it and that should be it?
< naywhayare>
well, maybe that would work, but CMake will probably overwrite your change
< naywhayare>
so I am trying to figure out how to convince CMake to automatically add that file
< naywhayare>
or figure out why it ignored single_tree_traverser_impl.hpp in the first place
< naywhayare>
is it missing in your build directory too?
< andrewmw94>
no, it's in trunk/src/mlpack/core/tree/CMakeLists.txt
< naywhayare>
yeah, it's listed there
< naywhayare>
but open the file build/src/mlpack/methods/neighbor_search/CMakeFIles/allknn.dir/depend.make
< naywhayare>
that's a file generated by CMake that lists the dependencies of the allknn_main.cpp.o target (so, basically, everything that will trigger a rebuild of allknn_main.cpp)
< andrewmw94>
that one lists the dual_tree one but not the single_tree
< naywhayare>
(or dual_tree_traverser_impl.hpp, but that isn't used, I don't think)
< andrewmw94>
yes. just the *traverser.hpp files
< naywhayare>
so if you modify single_tree_traverser_impl.hpp (try it -- add a syntax error) and type 'make allknn', then assuming bin/allknn exists, CMake won't build anything
< naywhayare>
but it picks up the traverser implementations for the cover tree and the binary space tree
< andrewmw94>
yeah, syntax errors don't help
< andrewmw94>
funny to be writing that
< andrewmw94>
would it be because it has exactly the same name as the ones in dual tree?
< andrewmw94>
wild and improbably guess
< andrewmw94>
sorry, Binary Space tree, not dual tree
< naywhayare>
I thought that might be the case; the binary space tree ones don't have the _impl files either
< naywhayare>
but the cover tree does
< naywhayare>
I managed to get depend.make to include rectangle_tree/single_tree_traverser_impl.hpp by explicitly adding that as an include to src/mlpack/core/tree/rectangle_tree.hpp
< naywhayare>
but that still doesn't expose the actual issue of why CMake is ignoring it
< naywhayare>
I moved rectangle_tree/single_tree_traverser_impl.hpp to rectangle_tree/single_tree_traverser_impl_blah.hpp and removed it from src/mlpack/core/tree/rectangle_tree.hpp (so src/mlpack/core/tree/rectangle_tree.hpp is the same as what is in svn)
< naywhayare>
and then depend.make had single_tree_traverser_impl_blah.hpp in it
< naywhayare>
the first answer is correct, but we are using include_directories()
< naywhayare>
I think I can reduce this to a simple test case, then file a cmake bug
< andrewmw94>
perhaps I change the name to something like single_tree_traverser__impl.hpp for now
< andrewmw94>
to keep the syntax similar and make it work
< andrewmw94>
(note the double __)
< naywhayare>
nah, if anything adding the _impl files to src/mlpack/core/tree/rectangle_tree.hpp is probably the better solution
< naywhayare>
because explicitly specifying them there seems to make CMake pick them up okay
< naywhayare>
I bet what is happening is this:
< andrewmw94>
yeah. I forgot that works too
< andrewmw94>
I assume we should do the same for BinarySpaceTree
< naywhayare>
CMake scans all the #includes of the .cpp file and then all of its #includes recursively, generating a list
< sumedh_>
naywhayare: I am down to 0.80 in 60 secs...
< naywhayare>
but it just has the actual include file name, not the path
< naywhayare>
so then it removes duplicates... and wipes out all of the other single_tree_traverser_impl.hpp files, and only leaves one behind
< andrewmw94>
yeah, sounds probable.
< sumedh_>
but I am always returning false in termination policy...
< naywhayare>
but if, somewhere, we write "#include "rectangle_tree/single_tree_traverser_impl.hpp"", then that won't get wiped out
< naywhayare>
sumedh_: hm; so I know that the termination should happen when the validation RMSE starts to increase
< andrewmw94>
should we count on the cover tree always being there or should I add something like that for it too?
< naywhayare>
how much does it increase before you get to 0.80?
< naywhayare>
andrewmw94: we should probably add one for the cover tree too. the actual include it chooses probably has to do with the lunar cycle
< naywhayare>
and we don't have any support code for that
< andrewmw94>
next release
< naywhayare>
haha
< sumedh_>
very little ... here and there...
< naywhayare>
sumedh_: ok, how much magnitude increase in RMSE? on the order of 1e-5, or 1e-3, or bigger/smaller?
< sumedh_>
no... bigger but they are like kinks... 2.05, 1.78, 1.73, 1.80, 1.72, 1.74...
< sumedh_>
1.65...
< sumedh_>
then again decreasing...
< sumedh_>
so its one point jump...
< sumedh_>
some smoothening is required...
< naywhayare>
yeah, what if we change the termination condition to require two successive increases in RMSE?
< naywhayare>
do you think that would fix it?
< sumedh_>
okay... three will definitely solve it...
< naywhayare>
I'm glad I found the bug, because this means I don't have to go through and make my own bug report
< sumedh_>
naywhayare: with 3... 0.87... :) in 80 seconds....
< naywhayare>
if it's jumping around like 1.78, 1.73, 1.80 then maybe the momentum or step size is too large?
< sumedh_>
yes... but those jumps are only at the start... after that... its a almost smooth decrease...
< sumedh_>
i will try it with different momentum...
< sumedh_>
yes.. you are right... with momentum 0.5... its a smooth decrease...
< naywhayare>
I think I am going to file a CMake bug anyway, because this situation may be a little different, and the one I linked to is backlogged which means it'll never be addressed...
< sumedh_>
but now not getting till 0.87... very slow decrease...
< sumedh_>
still at 0.97...
< naywhayare>
yeah, a slower momentum will mean slower convergence
< naywhayare>
that's the problem with optimizers; they are very picky about their parameters :)
< sumedh_>
I guess 3 increase policy seems fine....
< sumedh_>
I mean its faster...
< sumedh_>
and I shifted test point generation to constructor...
< sumedh_>
I agree that const MatType& is better in Apply()
< jenkins-mlpack>
* sumedhghaisas: * faster implementation of SVDBatchWithMomentum
< jenkins-mlpack>
* tolerance termination policy is modified according to new policy
< jenkins-mlpack>
* test point selection in validation RMSE termination is shifted to constructor
< jenkins-mlpack>
* Ryan Curtin: Rename parameters, clean up documentation; very minor tweaks. No functionality changes.
< jenkins-mlpack>
* Ryan Curtin: Rearrange parameters -- maxIterations is probably the most likely to be changed, and also maxIterations tends to be the first parameter for other algorithms.
< jenkins-mlpack>
* Ryan Curtin: Remove documentation for MoveDistributionType.