naywhayare changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< jenkins-mlpack> Project mlpack - svn checkin test build #1931: SUCCESS in 33 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/1931/
< jenkins-mlpack> * Ryan Curtin: Oops, this actually happens in two places.
< jenkins-mlpack> * Ryan Curtin: Typo which causes a segfault.
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Ping timeout: 255 seconds]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Ping timeout: 240 seconds]
< jenkins-mlpack> Project mlpack - nightly matrix build build #475: STILL UNSTABLE in 1 hr 33 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20nightly%20matrix%20build/475/
< jenkins-mlpack> * Ryan Curtin: Oops, this actually happens in two places.
< jenkins-mlpack> * Ryan Curtin: Typo which causes a segfault.
< jenkins-mlpack> * andrewmw94: change name of leafSize to maxLeafSize. more stuff for the R-tree. Some name changes, some more node splitting, a start on traversal.
Anand has joined #mlpack
< marcus_zoq> Anand: Good Morning, sounds good.
Anand has quit [Ping timeout: 240 seconds]
udit_s has joined #mlpack
cuphrody has joined #mlpack
udit_s has quit [Quit: Leaving]
andrewmw94 has joined #mlpack
< andrewmw94> naywhayare: so I got several emails about the build becoming unstable. It listed two of your commits and one of mine. Mine compiled on my computer, but I wanted to make sure I didn't break it. Actually, I'd like to learn how it works either way, for when I probably break it later. Do you have a link on how to find what the error was, etc.?
< andrewmw94> well, I found the error, but it seems it's on an older version of armadillo that I don't have so I'll just have to wait to learn this I guess.
< naywhayare> andrewmw94: I'm surprised it sent it to the right email, usually it'll try <username>@gatech.edu (and of course andrewmw94@gatech.edu doesn't work)
< naywhayare> you can ignore those emails; it's something that needs to be looked into
< naywhayare> but the instability (which is probably a failing test or two) isn't caused by any of your commits
< naywhayare> it's been doing that for months, but I haven't had a chance to address what the issue actually is
< andrewmw94> ahh.
< andrewmw94> why don't my other commits cause me to get emails then?
< andrewmw94> it's only happened for two of them
< naywhayare> yeah, that's the part that confuses me
< naywhayare> I'll turn off email notifications for unstable builds for now, because it will be a little while until I can debug the actual issues there
Anand has joined #mlpack
Anand has quit [Ping timeout: 240 seconds]
govg has joined #mlpack
govg has quit [Changing host]
govg has joined #mlpack
Anand has joined #mlpack
< Anand> Marcus : I have implemented the metric as discussed. It should work now
govg has quit [Ping timeout: 264 seconds]
govg has joined #mlpack
govg has quit [Quit: leaving]
< marcus_zoq> Anand: Great, can you commit the changes?
< Anand> I will do it soon
< Anand> Just figured out a small glitch
< Anand> Committed
< Anand> A small change required though
govg has joined #mlpack
govg has quit [Changing host]
govg has joined #mlpack
< marcus_zoq> Anand: Okay, I will take a look into the code in a few minutes.
< Anand> Ok
< marcus_zoq> Anand: What needs to be changed?
< Anand> For the AvgMeanPredictive method the 'i' represents the labels index in the CM and not the actual value of the label. I have to replace the 'i' in the call to MeanPredictiveInformationClass(i,..,..) with the actual label value
< Anand> I am extracting labels from the file
< marcus_zoq> Anand: Okay, so nothing to worry about?
< Anand> No.
< Anand> :)
< marcus_zoq> :)
< Anand> Done!
Anand has quit [Ping timeout: 240 seconds]
Anand has joined #mlpack
< Anand> Marcus : It seems like we are done with class conversions. I will probably add unit tests for the new implementations tomorrow.
< marcus_zoq> Anand: Great, test would be awesome!
< Anand> Yup!
govg has quit [Quit: leaving]
Anand has quit [Ping timeout: 240 seconds]
sumedhghaisas has joined #mlpack
< jenkins-mlpack> Starting build #1932 for job mlpack - svn checkin test (previous build: SUCCESS)
< jenkins-mlpack> Project mlpack - svn checkin test build #1932: FAILURE in 2.6 sec: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/1932/
< jenkins-mlpack> andrewmw94: a few miscellanious small changes. Added to CMake.
< naywhayare> andrewmw94: the relevant error is ' Cannot find source file: ... /rectangle_tree/r_tree_descent_heuristic.hpp'; you probably just forgot to add that file?
< sumedhghaisas> naywhayare: Trying to compare NMF results with SVD batch learning... SVD batch learning performs better when initialized in a certain way, and even the prediction function works differently... for that, the user needs to input the min and max values of the ratings...
< sumedhghaisas> Is it a good idea to accept the min and max values in the update rule constructor??
< sumedhghaisas> this would force the user to have min and max values for running SVD batch learning...
udit_s has joined #mlpack
< andrewmw94> sumedhghaisas: you can always have a default value set if you need to use them anyways
< sumedhghaisas> yes, but what default values should I use?? as the scale can be anything??
< sumedhghaisas> another way would be to find the min and max values from the matrix itself...
< andrewmw94> ahh, I assume that's the best
< sumedhghaisas> as we already know predicted values cannot exceed those values...
< andrewmw94> armadillo has support for min and max
< andrewmw94> max(A, dim) and min(A, dim)
< andrewmw94> actually, nevermind, that function returns a matrix not a value
< sumedhghaisas> yeah... but as the update function accepts the factorization matrix in each call, the min and max have to be calculated in each call... this is a huge overhead...
< sumedhghaisas> value can be accessed by mat[0][0]...
< sumedhghaisas> or I guess it's mat(0)(0)... don't remember...
< jenkins-mlpack> Starting build #1933 for job mlpack - svn checkin test (previous build: FAILURE -- last SUCCESS #1931 18 hr ago)
< andrewmw94> I think it's mat(0,0). But as regards your question I would probably overload the function to have one version that doesn't require them and one that does
< sumedhghaisas> maybe it's better to shift the acceptance of the factorization matrix to the constructor...
< sumedhghaisas> then its reference can be stored along with min and max...
< andrewmw94> yeah. Or depending on how the code is already structured you could have the object for running SVD be a subclass of some class that holds the matrix. Then the parent class can calculate the min and max and the SVD code can get it from the parent
< andrewmw94> but I probably don't know enough about this to help you, I just wanted to make sure you knew about default arguments in C++
< andrewmw94> I learned it this summer, they skipped it in my classes
< sumedhghaisas> Ohh yeah I know about the default arguments... I was just thinking how to restructure the module to accommodate this new feature...
< sumedhghaisas> I have been coding in C++ for a long time now...
< sumedhghaisas> it's my preferred language now :)
< naywhayare> sumedhghaisas: I agree with andrew's ideas; it's probably a good idea to look through the rating matrix with arma's min() and max() functions, unless the user specifies a known range of possible values
< sumedhghaisas> yes I agree...
< sumedhghaisas> There is one problem...
< sumedhghaisas> right now the update function accepts factorization matrix...
< sumedhghaisas> I don't understand why it's designed that way...
< sumedhghaisas> cause in this way min and max have to be computed each time...
< sumedhghaisas> cause the user may provide a different factorization matrix...
< sumedhghaisas> Isn't it better to accept the factorization matrix in the constructor??
< naywhayare> accepting the factorization matrix in the constructor would mean that a particular instantiated update rules class would be limited to working with that particular factorization matrix
< sumedhghaisas> yes... so that object will be created and destroyed in amf Apply() function...
< sumedhghaisas> if certain user parameters are required then we take advantage of copy constructors...
< sumedhghaisas> Like I mean the constructor will accept all the user-defined parameters... so the user can set them before passing that object to AMF...
< naywhayare> the other problem with accepting the factorization matrix in the constructor is that the simple update rules classes can't have static update functions
< sumedhghaisas> copy constructor will accept the factorization matrix...
< sumedhghaisas> user details will be copied....
< naywhayare> if I store a reference to the factorization matrix in the update rules class, then call UpdateW(w, h) and then UpdateH(w, h), then those functions cannot be static
< naywhayare> and those optimization benefits you might get from having a static function are gone
< naywhayare> in the case of your svd batch learning implementation, the rules can't be static anyway because you have to store the min and the max
< naywhayare> but you could have the user manually pass the min/max values or use defaults, or alternately calculate those in the constructor and just store those min and max values
< sumedhghaisas> yes... right...
< sumedhghaisas> anyway, the AMF class creates a new object of the update rule in its constructor...
< sumedhghaisas> as a default parameter...
< sumedhghaisas> is calling a static function on an object optimized??
< naywhayare> what do you mean? can you describe further? I'm not sure exactly what situation you're referring to
< sumedhghaisas> okay... this is the amf constructor...
< sumedhghaisas> AMF(const size_t maxIterations = 10000,
< sumedhghaisas> const double minResidue = 1e-10,
< sumedhghaisas> const InitializationRule initializeRule = InitializationRule(),
< sumedhghaisas> const UpdateRule update = UpdateRule());
< sumedhghaisas> here update rule object is getting initialized...
< naywhayare> right; if the UpdateRule class is only having its static functions used, the compiler should be able to see that the AMF class doesn't actually even need to store an UpdateRules class
< naywhayare> so the whole UpdateRule class is optimized out, with only the static calls to UpdateW() and UpdateH() remaining
< sumedhghaisas> yeah I agree... the compiler can do that optimization... but then how to compute min and max?? compute them when the first call to the update function is made, and store them??
< naywhayare> well in the case where min and max is stored, the UpdateRule class can't be static, so that optimization can't be done
< naywhayare> that optimization only works for the standard rules like NMFALSRules
< naywhayare> in your case, I'd think maybe a decent solution is to compute min and max in the constructor, or take default values
< naywhayare> or have a user pass in min and max
< naywhayare> but, I'm a little confused why you need the min and max values in the update rules
< naywhayare> if I decompose the matrix X -> W * H, and I want the values of W * H to all lie in the range [a, b],
< naywhayare> that doesn't necessarily mean that all of the values of W lie in [a, b] or all of the values of H lie in the range [a, b]
< naywhayare> so how does this update rule work?
< sumedhghaisas> Yeah I was wondering about that... okay, could you look at that paper?? umm... A Guide to Singular Value Decomposition
< sumedhghaisas> by Chih-Chao Ma
< sumedhghaisas> In that SVD batch learning... there they have considered min and max... in the prediction function...
< sumedhghaisas> I don't seem to understand why...
< naywhayare> yeah -- that's in the prediction function, not in the decomposition
< sumedhghaisas> delta calculation will require prediction function...
< sumedhghaisas> in alternating update...
< naywhayare> basically what equation 3 is saying is that when U*M returns a value that's outside the range [a, b], then return the closest value
< naywhayare> ok; where is the equation for the delta calculation?
< sumedhghaisas> equation 5 and 6...
< sumedhghaisas> basically gradient descent...
< jenkins-mlpack> Yippie, build fixed!
< jenkins-mlpack> Project mlpack - svn checkin test build #1933: FIXED in 33 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/1933/
< jenkins-mlpack> andrewmw94: add the missed files.
< naywhayare> ok, I see what you mean now
< sumedhghaisas> ohh okay... that closest value calculation does make sense now...
< naywhayare> ok, I think the way to go, then, is to take the min and max values in the constructor for the rules
< naywhayare> for the default constructor that takes no arguments, you can just assume an infinite range (i.e. [-DBL_MAX, DBL_MAX])
< naywhayare> which would mean that all scores are allowed
< naywhayare> but if the user wants to restrict scores to a certain level, then they can construct the rules object manually and set the range
< naywhayare> does that make sense? or have I overlooked something?
< sumedhghaisas> hummm.... okay that sounds right...
< sumedhghaisas> setting it to infinity would be better than calculating min and max from matrix...
< sumedhghaisas> as actual range could be higher than max value in the matrix...
< naywhayare> yeah; I think that if the user does not specify a range, we should not assume that their matrix should have bounded values
< sumedhghaisas> it's the user's responsibility to specify the range if it's required :)
< naywhayare> when we actually make an executable for this technique, we can include a flag that says whether or not to calculate the range based on the data, so we can do it for them automatically
< naywhayare> but for the C++ interface, yeah, we should make it their responsibility (at this level of the code, at least)
< sumedhghaisas> Okay and for creating SVDBatchLearning hpp...
< sumedhghaisas> should I create a new file or svn cp??
< naywhayare> if you're not copying it from somewhere else, you should create a new file
< sumedhghaisas> cause SVD would be a completely new addition...
< sumedhghaisas> okay then I will create a new file...
sumedhghaisas has left #mlpack []
sumedhghaisas has joined #mlpack
udit_s has quit [Quit: Leaving]
udit_s has joined #mlpack
udit_s has quit [Client Quit]
< sumedhghaisas> naywhayare: I just updated my Ubuntu from 13.10 to 14.04... I am recompiling mlpack and getting a 'uint32_t undefined' error...
< sumedhghaisas> in file random.hpp in core/math
< sumedhghaisas> I fixed it by defining it in that file... but I guess this should be taken care of somewhere else...
< sumedhghaisas> Or by adding inttypes.h include??
< naywhayare> uint32_t... that should be in stdint.h
< sumedhghaisas> all the int types can be imported with inttypes.h??
< sumedhghaisas> wait ... i will add that and check...
< sumedhghaisas> recompiling ... ... ...
< naywhayare> I don't think inttypes.h has uint32_t defined in it
< naywhayare> after you finish your current test, can you try adding #include <cstdint> to random.hpp and see what that does?
< sumedhghaisas> okay I opened inttypes.h to be sure... it includes stdint.h...
< naywhayare> okay, but stdint.h should be included by mlpack/prereqs.h
< naywhayare> which is included by random.hpp
< sumedhghaisas> #include <stdlib.h>
< sumedhghaisas> #include <math.h>
< sumedhghaisas> #include <float.h>
< sumedhghaisas> #include <inttypes.h>
< sumedhghaisas> #include <boost/random.hpp>
< naywhayare> are you using an up-to-date svn version? I made some changes recently with the includes
< sumedhghaisas> okay no... how recently??
< naywhayare> probably a week... just do an svn update and I think that will fix your problem
< naywhayare> make sure you do the svn update from the root of the repository
< sumedhghaisas> okay will do that...
andrewmw94 has quit [Quit: Leaving.]
< sumedhghaisas> okay that solved the problem...
< naywhayare> okay, good to hear that
< sumedhghaisas> naywhayare: Okay, I know you're busy... when you get time, can you take a look at my library? uploaded yesterday on GitHub... any comments are welcome...
< naywhayare> I'll take a look over the weekend
< sumedhghaisas> Thanks :)
< naywhayare> sure, no problem
sumedhghaisas has quit [Ping timeout: 260 seconds]