naywhayare changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
Anand has joined #mlpack
< Anand> Marcus : The nan errors fixed! Sent you a mail. Have a look. I think we should make another merge before starting with the representation part. How about doing it today?
Anand has quit [Ping timeout: 246 seconds]
Anand has joined #mlpack
< marcus_zoq> Anand: Thanks for the mail; go ahead and merge :)
< Anand> Marcus : Ok. Will do it in a few minutes!
< Anand> Also run scikit perceptron and see the metrics working.
< marcus_zoq> Anand: Looks good, no more nan's.
< Anand> Marcus : Yeah!
< marcus_zoq> Anand: Maybe it is a good idea, to add a test tor the edge cases?
< marcus_zoq> Anand: Btw. Thanks for the blog post!
< Anand> Marcus : Ok, yeah, we can add some tests
< Anand> Marcus : Added it today. My machine crashed yesterday! :P
< marcus_zoq> Anand: No problem; Everything fine with your machine?
< Anand> Yes, it is now.
< marcus_zoq> Anand: Okay, good to hear; Hopefully you don't need to buy a new one.
< Anand> Oh no. Just the OS crashed!
< Anand> I re installed it
< marcus_zoq> Anand: Sounds like a lot of work :(
< Anand> Haha. Not really. Takes time though!
Anand has quit [Ping timeout: 246 seconds]
Anand has joined #mlpack
< Anand> Marcus : merge done
< marcus_zoq> Anand: Great!
Anand has quit [Ping timeout: 246 seconds]
andrewmw94 has joined #mlpack
andrewmw94 has quit [Quit: Leaving.]
andrewmw94 has joined #mlpack
< jenkins-mlpack> Starting build #2048 for job mlpack - svn checkin test (previous build: SUCCESS)
< andrewmw94> naywhayare: I have a probably silly question about C++. Are you free?
< naywhayare> andrewmw94: yeah, I'm here
< naywhayare> waiting on the release candidate to build...
< naywhayare> I was hoping to finish it over the weekend, but then a friend showed up with a car and we spent the weekend working on it...
< andrewmw94> Hopefully fun though :) Basically, I'm fixing the copy constructor so it will be what the user expects. In doing so I need to provide another way for me to get an exact copy of a node, so I wrote a function ExactClone()
< naywhayare> okay
< andrewmw94> *"in doing so" should have been "to replace what I currently use"
< naywhayare> I remember our previous discussion, so I remember why this is necessary
< andrewmw94> yeah. So I have a function that makes the new node, using a different constructor to avoid creating a bunch of new child nodes. I then go through each member variable and set them to be the same.
< andrewmw94> Isn't that all I should need to do?
< naywhayare> yeah, that should work fine, presuming that none of the members are references
< naywhayare> if any members are references, you'll need to use a constructor with an initialization list to set them, since references can't be re-set after they are initially assigned
< andrewmw94> Hmm. So I only have one variable that's a reference--the central dataset. And that's set correctly by the constructor I am using, so I don't try to change it.
< andrewmw94> my guess for things that don't work would be the pointer or the vectors
< andrewmw94> but from what I've read, I can just use vector1 = vector2 to copy them
< andrewmw94> for the pointer I do this:
< andrewmw94> arma::mat* pointer = &(newNode->LocalDataset());
< andrewmw94> delete pointer;
< andrewmw94> pointer = localDataset;
< naywhayare> okay, this seems reasonable
< naywhayare> the vector will do a deep copy; that shouldn't be a problem
< naywhayare> you only need to delete the pointer if you are actually creating a new matrix in the constructor
< andrewmw94> yeah, I am doing that.
< naywhayare> ok
< naywhayare> also, can you explain again why there is a separate local dataset and also the main dataset?
< andrewmw94> The main dataset is used to order the points for the BaseCase
< andrewmw94> the local dataset is used to have contiguous memory while allowing insertions and deletions
< naywhayare> okay, but for an insertion, we still have to call insert_col() on the main dataset, right?
< andrewmw94> contiguous meaning contiguous for say 15 points at a time
< andrewmw94> Currently yes. You mentioned that we could perhaps in the future make a different implementation of the matrix that allowed dynamic resizing
< naywhayare> right
< naywhayare> okay
< naywhayare> so I think that the extra cost of holding the local dataset will outweigh the gains you get from having those points in contiguous memory, but we can see when we get there
< naywhayare> I think that's pretty easy code to modify either way
< andrewmw94> yeah, it's just changing LocalDataset(i) to Dataset.col(Points()[i])
< andrewmw94> Could you clarify what deep copying does?
< andrewmw94> I suspect that's the problem
< naywhayare> deep copying is not the behavior of the default copy constructor
< naywhayare> suppose you have this struct:
< naywhayare> struct a { int* b; }
< naywhayare> then you use this struct with this code:
< naywhayare> struct a myObject;
< naywhayare> myObject.b = new int[500];
< naywhayare> for (size_t i = 0; i < 500; ++i) { myObject.b[i] = i; }
< naywhayare> then you try to copy it...
< naywhayare> struct a myOtherObject = myObject;
< naywhayare> the code that is actually executed here is a shallow copy:
< naywhayare> myOtherObject.b = myObject.b
< naywhayare> so now the pointer myOtherObject.b points to the same memory as myObject.b
< andrewmw94> ahh, that's what I want to happen
< naywhayare> but if I then do 'delete myObject.b', then myOtherObject.b is invalid
< andrewmw94> is there a way to have it do a shallow copy? Or do I have to do that by hand?
< naywhayare> a shallow copy is just copying the pointer value
< naywhayare> so 'pointer = localDataset', which you referenced before, is a shallow copy
< andrewmw94> yeah. But for a vector of pointers
< naywhayare> i.e. std::vector<RectangleTree*>?
< andrewmw94> Yeah. Should I just write a loop. Or is there a standard way to do this?
< naywhayare> I'm not sure what the vector constructor will do there
< naywhayare> it shouldn't make deep copies of each RectangleTree* object though
< naywhayare> so if you do std::vector<RectangleTree*> b = a; // a is std::vector<RectangleTree*> too
< naywhayare> then b should contain pointers to the same locations as a
< andrewmw94> hmm. Ok, that's what I wanted.
< andrewmw94> let me see if I can figure out how to use pastebin really quickly...
< naywhayare> it's pretty simple :)
< andrewmw94> as a random guess, perhaps my constructor:
< andrewmw94> explicit RectangleTree(RectangleTree<SplitType, DescentType, StatisticType, MatType>* parentNode);
< andrewmw94> is causing problems again?
< naywhayare> line 13 is calling your copy constructor; is that what you want it to do?
< naywhayare> or, sorry, that's calling the explicit constructor you just referenced
< andrewmw94> ahh. I was very confused
< naywhayare> when you say it doesn't work, what do you mean?
< naywhayare> is it a compilation error, or a runtime error?
< andrewmw94> runtime.
< andrewmw94> Do you want me to checkin the code?
< naywhayare> what is wrong with the tree after line 41?
< andrewmw94> it has a memory access violation
< andrewmw94> the address is 0x00000000, which I thought would never occur in normal code, so I suspect I have something set to NULL that I shouldn't
< andrewmw94> but I could just be wrong about address 0
< naywhayare> the address of the node, or some member of the node?
< andrewmw94> it just says: "unknown location(0): fatal error in "RectangleTreeConstructionCountTest": memory access violation at address: 0x00000000: no mapping at fault address"
< naywhayare> ok
< naywhayare> try running the test with gdb, then get a backtrace
< naywhayare> have you used gdb before?
< andrewmw94> I've tried :)
< andrewmw94> I do know how to get a backtrace though
< naywhayare> ok
< naywhayare> that should probably help clarify what's been set to null
< naywhayare> don't forget to save yourself some time by passing the -t RectangleTreeTest/RectangleTreeConstructionCountTest argument to 'run'
< naywhayare> so that gdb doesn't try to run all the tests, which will take forever and a half...
< andrewmw94> I don't trust myself to interpret it correctly
< andrewmw94> :)
< andrewmw94> But it seems that it is in GetPointSeeds() which should use LocalDataset() and possibly Points()
< andrewmw94> just LocalDataset
< andrewmw94> also Bound() if that could be the issue
< naywhayare> you can inspect the variables that are local to the frame you are in
< naywhayare> so, if you are in frame 0, which I think you are, you can do things like
< naywhayare> 'inspect tree'
< naywhayare> 'inspect tree.bound'
< naywhayare> 'inspect tree.localDataset'
< naywhayare> and so forth
< naywhayare> and that should help determine which one of those is null
< andrewmw94> yeah. It's the vector.
< naywhayare> the vector itself is null, or the elements in it?
< andrewmw94> the elements. I'll write a for loop to copy the elements to make sure that fixes it
< andrewmw94> no
< andrewmw94> but the output of the original program was:
< andrewmw94> children = std::vector of length 6, capacity 6 = {0x0, 0x0,
< andrewmw94> 0x0, 0x0, 0x0, 0x0},
< andrewmw94> I assume I am correct in thinking those are the elements?
< naywhayare> yes, I think those are the elements
< naywhayare> are you sure the elements are not null at the time the vector is being copied?
< andrewmw94> actually, they probably are
< andrewmw94> yeah, numChildren is 0, so they should be
sumedhghaisas has joined #mlpack
< naywhayare> are you maintaining numChildren as something separate than children.size()?
< sumedhghaisas> naywhayare: hey ryan... you free??
< naywhayare> sumedhghaisas: I am about to step out for lunch, unfortunately
< naywhayare> I'm probably here for 10 or 12 more minutes
< sumedhghaisas> ohh okay ... I just wanted to ask about iterator code...
< sumedhghaisas> did you check it??
< naywhayare> unfortunately I have not had a chance yet. I was busy all weekend
< sumedhghaisas> no problem ... release pressure??
< sumedhghaisas> let me know if I can help :)
< naywhayare> I'm just about done with it... I noticed some warnings that I am taking care of now
< naywhayare> but other than that it's just about ready
< sumedhghaisas> great :) mlpack 1.0.9 then :)
< naywhayare> yeah
< jenkins-mlpack> Project mlpack - svn checkin test build #2048: SUCCESS in 1 hr 27 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2048/
< jenkins-mlpack> * Ryan Curtin: Propagate documentation fixes to trunk.
< jenkins-mlpack> * Ryan Curtin: Update documentation.
< jenkins-mlpack> * andrewmw94: code cleanup for R tree
< jenkins-mlpack> Starting build #2049 for job mlpack - svn checkin test (previous build: SUCCESS)
sumedhghaisas has quit [Ping timeout: 245 seconds]
udit_s has joined #mlpack
Anand has joined #mlpack
Anand has quit [Ping timeout: 246 seconds]
< jenkins-mlpack> Project mlpack - svn checkin test build #2049: SUCCESS in 1 hr 32 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2049/
< jenkins-mlpack> Ryan Curtin: Merge in changes from release branch. I guess I should have done all this
< jenkins-mlpack> before I branched.
< jenkins-mlpack> Starting build #2050 for job mlpack - svn checkin test (previous build: SUCCESS)
< naywhayare> marcus_zoq: are you or Anand using shoeshine? or can I start the mlpack benchmarks on it?
< naywhayare> if you want me to wait that's just fine :)
< naywhayare> maybe it will motivate me to set up other benchmark systems...
udit_s has quit [Quit: Leaving]
< andrewmw94> naywhayare: I figured it out, I needed to have an arma::mat**. That didn't work (I assume because of type checking in C++ that differs from C) so I added a function that returns the pointer to the localDataset. I have a few questions about cleaning up the API, but first I was wondering about the operator=( const RectangleTree& other ) function. The BinarySpaceTree doesn't seem to have one, but I thought I was supposed to make one
< naywhayare> (your message was truncated at "supposed to make one"; I don't know if there was more)
< andrewmw94> just a few words: "if I make a copy constructor."
< naywhayare> I think operator=(const RectangleTree& other) will implicitly call the copy constructor
< naywhayare> ...I think. I'm not certain on that
< naywhayare> you could test it... does 'RectangleTree<...> b = a;' work? (if a is some other rectangle tree)
< andrewmw94> it seems to
< naywhayare> okay
< andrewmw94> but then what's the operator=( ... ) for in HRectBound?
< andrewmw94> just confused
< naywhayare> okay, I had to refresh myself on why it is the way it is
< naywhayare> operator=() can't really be provided for the BinarySpaceTree or the RectangleTree, because if the dataset is different, the reference can't be re-pointed
< naywhayare> HRectBound doesn't have any reference members, so an operator=() can be written there
< naywhayare> I think the code you wrote optimizes to RectangleTree<...> b(a)
< naywhayare> but had I written this code below, it would be invalid...
< naywhayare> RectangleTree<...> b = a; // valid
< naywhayare> b = c; // c is a tree, too, and this fails because the reference can't be reassigned.
< andrewmw94> ahh
< andrewmw94> ok, then I think the copy constructor is fine.
< naywhayare> (also, I found this post on MLOSS that references your project / the mlpack projects, except for Siddharth's which was somehow omitted: http://mloss.org/community/blog/2014/jun/03/google-summer-of-code-2014/ )
< naywhayare> (you have to follow the link to the ipython notebook)
< andrewmw94> cool
< andrewmw94> I had a few other questions about the API, hopefully quick ones...
< naywhayare> sure, go ahead... I already ate lunch so I am not going anywhere :)
< andrewmw94> right. Eat, sleep, code. :) So the first one is about the PointerToLocalDataset() function I added to get the code I showed you earlier working
< andrewmw94> basically I replaced this:
< andrewmw94>   // delete the LocalDataset to avoid memory leaks.   arma::mat* pointer = &(newNode->LocalDataset());   delete pointer;   pointer = localDataset;
< andrewmw94> ugh, let's try that again:
< andrewmw94>   // delete the LocalDataset to avoid memory leaks.
< andrewmw94>   arma::mat* pointer = &(newNode->LocalDataset());
< andrewmw94>   delete pointer;
< andrewmw94>   pointer = localDataset;
< andrewmw94> With this:
< andrewmw94> // delete the LocalDataset to avoid memory leaks.
< andrewmw94> delete newNode->PointerToLocalDataset();
< andrewmw94> newNode->PointerToLocalDataset() = localDataset;
< andrewmw94> I don't like adding the function PointerToLocalDataset() when this is the only use I can think of for it, but I tried to get it working with something like:
< naywhayare> it seems like the problem with the first snippet is that you aren't ever setting the newNode's local dataset
< naywhayare> but you can't do it with the old version, because &newNode->LocalDataset() isn't an lvalue
< andrewmw94> yeah. I have that fixed, but I'm trying to remove the function PointerToLocalDataset()
< andrewmw94> exactly
< andrewmw94> is there a workaround?
< naywhayare> there is a very ugly workaround... you can do something like newNode->LocalDataset().memptr() = localDataset
< naywhayare> but really that is a bad thing
< andrewmw94> is it worse than having a function that is only used for this?
< naywhayare> I would say so
< naywhayare> hang on, I'm trying to think of a way to sidestep the problem entirely
< naywhayare> right now you have a function called, I think, ExactClone(), right?
< andrewmw94> yeah
< andrewmw94> that's where this code is
< naywhayare> and the key difference between this and the copy constructor is that you are not making a deep copy of the children
< naywhayare> is that right?
< naywhayare> i.e. instead of for(size_t i = 0; i < other.children.size(); ++i) { children.push_back(new RectangleTree<...>(...)); }
< naywhayare> you want to do for(size_t i = 0; i < other.children().size(); ++i) { children.push_back(other.children[i]); }
< andrewmw94> No deep copy of the child nodes or of the local dataset. Yes.
< naywhayare> why not modify the copy constructor to take a boolean 'deepCopy', defaulting to true?
< naywhayare> then have an if statement in that copy constructor
< naywhayare> when the compiler looks for the default copy constructor, it takes deepCopy = true
< naywhayare> but in your one particular case where you want a shallow copy, you set that argument to false
< naywhayare> do you think that could work/
< naywhayare> *?
< andrewmw94> I'm not sure how that would work around the problem. Wouldn't I then have to access the memory address of other.localDataset
< naywhayare> it changes the problem a little bit
< naywhayare> now you can do 'localDataset = &other.localDataset'
< andrewmw94> ahh, right, it's no longer a lvalue
< naywhayare> but then you have to ensure that the other matrix doesn't delete the local dataset
< andrewmw94> ok, I think that will work
< naywhayare> what does LocalDataset() return?
< naywhayare> arma::mat* or are you dereferencing the pointer?
< andrewmw94> arma::mat&
< andrewmw94> I wanted it to match Dataset
< naywhayare> right
< andrewmw94> So my other question was about the Child() function
< andrewmw94> In BinarySpaceTree, I think this would return BinarySpaceTree&
< naywhayare> ok, go ahead. I think we still need to consider how to ensrue that the other node doesn't delete the copied local dataset, but we can consider that later
< naywhayare> *ensure
sumedhghaisas has joined #mlpack
< andrewmw94> However, I have a Children() function which matches the std::vector<RectangleTree*> children
< andrewmw94> and thus returns std::vector<RectangleTree*>&
< andrewmw94> so I wrote Child() to return RectangleTree*, so that Child(i) would be shorthand for Children()[i]
< andrewmw94> Should I leave it as it is now, or change Child(i) to be different from Children()[i]?
< naywhayare> you should dereference the pointer so it matches BinarySpaceTree
< andrewmw94> ok
< andrewmw94> so back to copying nodes. Whenever I use it, I set one of the LocalDataset's to be NULL, since it will no longer be a leaf node since that node moves up the tree
< andrewmw94> I could do the same thing here right?
< naywhayare> yes, that is a possibility
< marcus_zoq> naywhayare: You can use shoeshine, but I need to upgrade the config.yaml file just give me a few minutes
< naywhayare> marcus_zoq: ok, no hurry
< naywhayare> it takes a long time to run :)
< naywhayare> sumedhghaisas: row_col_iterator looks fine to me... before we send it to Conrad we have to adapt it to the weird Whitesmiths style though
< naywhayare> we also need to add a row_col_iterator typedef for SpMat and add begin_row_col() and end_row_col() functions for SpMat too
< sumedhghaisas> naywhayare: I don't know what is Whitesmith style :( let me just google it...
< sumedhghaisas> ohh yes... I will add them too...
< sumedhghaisas> naywhayare: Whitesmiths style is only for indentation right??
< jenkins-mlpack> Project mlpack - svn checkin test build #2050: SUCCESS in 1 hr 28 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2050/
< jenkins-mlpack> Ryan Curtin: Fix warnings.
< jenkins-mlpack> Starting build #2051 for job mlpack - svn checkin test (previous build: SUCCESS)
< naywhayare> sumedhghaisas: yeah, mostly indentation. you can look at the armadillo source
< naywhayare> it looks pretty weird in my opinion, but Conrad says it "decreases cognitive load" or something like that
< sumedhghaisas> yeah... its looks weird to me too... probably cause we are too familiar with normal indentation style...
< naywhayare> :)
< sumedhghaisas> okay... I will do that...
< marcus_zoq> naywhayare: I thought I've created all necessary centroids files ... maybe it takes more than a few minutes
< naywhayare> oh, crap. I said I was going to do that and then I never did
< naywhayare> I need a better todo list or something. do you want me to help out?
< marcus_zoq> naywhayare: no need only one file left
< naywhayare> ok. sorry about that :(
< naywhayare> sumedhghaisas: I expanded the tests for row_col_iterator in r16907 and fixed a little bug in r16908
< sumedhghaisas> naywhayare: bug??
< sumedhghaisas> in tests??
< naywhayare> sumedhghaisas: in row_col_iterator::operator--(int)
< naywhayare> just a little misspelling, it should be row_col_iterator not const_row_col_iterator
< sumedhghaisas> ohh... sorry about that :(
< naywhayare> not a problem :)
< naywhayare> it only surfaced when I explicitly tested operator--() in the tests
< sumedhghaisas> I have a one hour test right now... I will complete the styling after that and commit the code...
< sumedhghaisas> I should have written more tests ...
< sumedhghaisas> I guess I only checked ++ operator...
< naywhayare> yeah; unfortunately testing is very time-consuming to do comprehensively
sumedhghaisas has quit [Ping timeout: 264 seconds]
< marcus_zoq> naywhayare: Benchmark started!
< naywhayare> marcus_zoq: awesome, thank you. now we wait two weeks :)
< marcus_zoq> There is an fork with R-bindings interresting
< naywhayare> yeah, Qiang produced the first version of that last week I think
< naywhayare> I should have announced it on the list for him
< marcus_zoq> the matlab bindings are outdated, right?
sumedhghaisas has joined #mlpack
< naywhayare> yeah, I should probably remove them from src/
< jenkins-mlpack> Project mlpack - svn checkin test build #2051: SUCCESS in 1 hr 29 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2051/
< jenkins-mlpack> * Ryan Curtin: Fix botched LaTeX formula.
< jenkins-mlpack> * Ryan Curtin: This warning only appears on i386.
< jenkins-mlpack> Starting build #2052 for job mlpack - svn checkin test (previous build: SUCCESS)
< naywhayare> ok, Trac is fixed
< jenkins-mlpack> Project mlpack - svn checkin test build #2052: SUCCESS in 1 hr 29 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2052/
< jenkins-mlpack> * Ryan Curtin: Fix typo.
< jenkins-mlpack> * Ryan Curtin: More comprehensive tests.
< jenkins-mlpack> * Ryan Curtin: Set date of release.
< jenkins-mlpack> Starting build #2053 for job mlpack - svn checkin test (previous build: SUCCESS)
< sumedhghaisas> naywhayare: okay I just need to typedef iterator to row_col_iterator right//
< sumedhghaisas> means similarly const_iterator
< sumedhghaisas> and add begin_row_col and end
< sumedhghaisas> okay done with that... can I send you all the files?? my commit still not working... :(
< sumedhghaisas> I guess its my college proxy problem...
andrewmw94 has quit [Quit: Leaving.]
< jenkins-mlpack> Project mlpack - svn checkin test build #2053: SUCCESS in 1 hr 30 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2053/
< jenkins-mlpack> andrewmw94: R tree clean up. Change Child() to match BSP tree. Copy constructor takes a second argument to specify whether it is a deep copy.
sumedhghaisas has quit [Ping timeout: 272 seconds]