naywhayare changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
Anand has joined #mlpack
< Anand>
Marcus : The nan errors fixed! Sent you a mail. Have a look. I think we should make another merge before starting with the representation part. How about doing it today?
Anand has quit [Ping timeout: 246 seconds]
Anand has joined #mlpack
< marcus_zoq>
Anand: Thanks for the mail; go ahead and merge :)
< Anand>
Marcus : Ok. Will do it in a few minutes!
< Anand>
Also run scikit perceptron and see the metrics working.
< marcus_zoq>
Anand: Looks good, no more nan's.
< Anand>
Marcus : Yeah!
< marcus_zoq>
Anand: Maybe it is a good idea, to add a test tor the edge cases?
< marcus_zoq>
Anand: Btw. Thanks for the blog post!
< Anand>
Marcus : Ok, yeah, we can add some tests
< Anand>
Marcus : Added it today. My machine crashed yesterday! :P
< marcus_zoq>
Anand: No problem; Everything fine with your machine?
< Anand>
Yes, it is now.
< marcus_zoq>
Anand: Okay, good to hear; Hopefully you don't need to buy a new one.
< Anand>
Oh no. Just the OS crashed!
< Anand>
I re installed it
< marcus_zoq>
Anand: Sounds like a lot of work :(
< Anand>
Haha. Not really. Takes time though!
Anand has quit [Ping timeout: 246 seconds]
Anand has joined #mlpack
< Anand>
Marcus : merge done
< marcus_zoq>
Anand: Great!
Anand has quit [Ping timeout: 246 seconds]
andrewmw94 has joined #mlpack
andrewmw94 has quit [Quit: Leaving.]
andrewmw94 has joined #mlpack
< jenkins-mlpack>
Starting build #2048 for job mlpack - svn checkin test (previous build: SUCCESS)
< andrewmw94>
naywhayare: I have a probably silly question about C++. Are you free?
< naywhayare>
andrewmw94: yeah, I'm here
< naywhayare>
waiting on the release candidate to build...
< naywhayare>
I was hoping to finish it over the weekend, but then a friend showed up with a car and we spent the weekend working on it...
< andrewmw94>
Hopefully fun though :) Basically, I'm fixing the copy constructor so it will be what the user expects. In doing so I need to provide another way for me to get an exact copy of a node, so I wrote a function ExactClone()
< naywhayare>
okay
< andrewmw94>
*"in doing so" should have been "to replace what I currently use"
< naywhayare>
I remember our previous discussion, so I remember why this is necessary
< andrewmw94>
yeah. So I have a function that makes the new node, using a different constructor to avoid creating a bunch of new child nodes. I then go through each member variable and set them to be the same.
< andrewmw94>
Isn't that all I should need to do?
< naywhayare>
yeah, that should work fine, presuming that none of the members are references
< naywhayare>
if any members are references, you'll need to use a constructor with an initialization list to set them, since references can't be re-set after they are initially assigned
< andrewmw94>
Hmm. So I only have one variable that's a reference--the central dataset. And that's set correctly by the constructor I am using, so I don't try to change it.
< andrewmw94>
my guess for things that don't work would be the pointer or the vectors
< andrewmw94>
but from what I've read, I can just use vector1 = vector2 to copy them
< naywhayare>
the vector will do a deep copy; that shouldn't be a problem
< naywhayare>
you only need to delete the pointer if you are actually creating a new matrix in the constructor
< andrewmw94>
yeah, I am doing that.
< naywhayare>
ok
< naywhayare>
also, can you explain again why there is a separate local dataset and also the main dataset?
< andrewmw94>
The main dataset is used to order the points for the BaseCase
< andrewmw94>
the local dataset is used to have contiguous memory while allowing insertions and deletions
< naywhayare>
okay, but for an insertion, we still have to call insert_col() on the main dataset, right?
< andrewmw94>
contiguous meaning contiguous for say 15 points at a time
< andrewmw94>
Currently yes. You mentioned that we could perhaps in the future make a different implementation of the matrix that allowed dynamic resizing
< naywhayare>
right
< naywhayare>
okay
< naywhayare>
so I think that the extra cost of holding the local dataset will outweigh the gains you get from having those points in contiguous memory, but we can see when we get there
< naywhayare>
I think that's pretty easy code to modify either way
< andrewmw94>
yeah, it's just changing LocalDataset(i) to Dataset.col(Points()[i])
< andrewmw94>
Could you clarify what deep copying does?
< andrewmw94>
I suspect that's the problem
< naywhayare>
deep copying is not the behavior of the default copy constructor
< naywhayare>
suppose you have this struct:
< naywhayare>
struct a { int* b; }
< naywhayare>
then you use this struct with this code:
< naywhayare>
struct a myObject;
< naywhayare>
myObject.b = new int[500];
< naywhayare>
for (size_t i = 0; i < 500; ++i) { myObject.b[i] = i; }
< naywhayare>
then you try to copy it...
< naywhayare>
struct a myOtherObject = myObject;
< naywhayare>
the code that is actually executed here is a shallow copy:
< naywhayare>
myOtherObject.b = myObject.b
< naywhayare>
so now the pointer myOtherObject.b points to the same memory as myObject.b
< andrewmw94>
ahh, that's what I want to happen
< naywhayare>
but if I then do 'delete myObject.b', then myOtherObject.b is invalid
< andrewmw94>
is there a way to have it do a shallow copy? Or do I have to do that by hand?
< naywhayare>
a shallow copy is just copying the pointer value
< naywhayare>
so 'pointer = localDataset', which you referenced before, is a shallow copy
< andrewmw94>
yeah. But for a vector of pointers
< naywhayare>
i.e. std::vector<RectangleTree*>?
< andrewmw94>
Yeah. Should I just write a loop. Or is there a standard way to do this?
< naywhayare>
I'm not sure what the vector constructor will do there
< naywhayare>
it shouldn't make deep copies of each RectangleTree* object though
< naywhayare>
so if you do std::vector<RectangleTree*> b = a; // a is std::vector<RectangleTree*> too
< naywhayare>
then b should contain pointers to the same locations as a
< andrewmw94>
hmm. Ok, that's what I wanted.
< andrewmw94>
let me see if I can figure out how to use pastebin really quickly...
< naywhayare>
line 13 is calling your copy constructor; is that what you want it to do?
< naywhayare>
or, sorry, that's calling the explicit constructor you just referenced
< andrewmw94>
ahh. I was very confused
< naywhayare>
when you say it doesn't work, what do you mean?
< naywhayare>
is it a compilation error, or a runtime error?
< andrewmw94>
runtime.
< andrewmw94>
Do you want me to checkin the code?
< naywhayare>
what is wrong with the tree after line 41?
< andrewmw94>
it has a memory access violation
< andrewmw94>
the address is 0x00000000, which I thought would never occur in normal code, so I suspect I have something set to NULL that I shouldn't
< andrewmw94>
but I could just be wrong about address 0
< naywhayare>
the address of the node, or some member of the node?
< andrewmw94>
it just says: "unknown location(0): fatal error in "RectangleTreeConstructionCountTest": memory access violation at address: 0x00000000: no mapping at fault address"
< naywhayare>
ok
< naywhayare>
try running the test with gdb, then get a backtrace
< naywhayare>
have you used gdb before?
< andrewmw94>
I've tried :)
< andrewmw94>
I do know how to get a backtrace though
< naywhayare>
ok
< naywhayare>
that should probably help clarify what's been set to null
< naywhayare>
don't forget to save yourself some time by passing the -t RectangleTreeTest/RectangleTreeConstructionCountTest argument to 'run'
< naywhayare>
so that gdb doesn't try to run all the tests, which will take forever and a half...
< jenkins-mlpack>
Ryan Curtin: Merge in changes from release branch. I guess I should have done all this
< jenkins-mlpack>
before I branched.
< jenkins-mlpack>
Starting build #2050 for job mlpack - svn checkin test (previous build: SUCCESS)
< naywhayare>
marcus_zoq: are you or Anand using shoeshine? or can I start the mlpack benchmarks on it?
< naywhayare>
if you want me to wait that's just fine :)
< naywhayare>
maybe it will motivate me to set up other benchmark systems...
udit_s has quit [Quit: Leaving]
< andrewmw94>
naywhayare: I figured it out, I needed to have an arma::mat**. That didn't work (I assume because of type checking in C++ that differs from C) so I added a function that returns the pointer to the localDataset. I have a few questions about cleaning up the API, but first I was wondering about the operator=( const RectangleTree& other ) function. The BinarySpaceTree doesn't seem to have one, but I thought I was supposed to make one
< naywhayare>
(your message was truncated at "supposed to make one"; I don't know if there was more)
< andrewmw94>
just a few words: "if I make a copy constructor."
< naywhayare>
I think operator=(const RectangleTree& other) will implicitly call the copy constructor
< naywhayare>
...I think. I'm not certain on that
< naywhayare>
you could test it... does 'RectangleTree<...> b = a;' work? (if a is some other rectangle tree)
< andrewmw94>
it seems to
< naywhayare>
okay
< andrewmw94>
but then what's the operator=( ... ) for in HRectBound?
< andrewmw94>
just confused
< naywhayare>
okay, I had to refresh myself on why it is the way it is
< naywhayare>
operator=() can't really be provided for the BinarySpaceTree or the RectangleTree, because if the dataset is different, the reference can't be re-pointed
< naywhayare>
HRectBound doesn't have any reference members, so an operator=() can be written there
< naywhayare>
I think the code you wrote optimizes to RectangleTree<...> b(a)
< naywhayare>
but had I written this code below, it would be invalid...
< naywhayare>
RectangleTree<...> b = a; // valid
< naywhayare>
b = c; // c is a tree, too, and this fails because the reference can't be reassigned.
< andrewmw94>
ahh
< andrewmw94>
ok, then I think the copy constructor is fine.
< naywhayare>
(you have to follow the link to the ipython notebook)
< andrewmw94>
cool
< andrewmw94>
I had a few other questions about the API, hopefully quick ones...
< naywhayare>
sure, go ahead... I already ate lunch so I am not going anywhere :)
< andrewmw94>
right. Eat, sleep, code. :) So the first one is about the PointerToLocalDataset() function I added to get the code I showed you earlier working
< andrewmw94>
basically I replaced this:
< andrewmw94>
// delete the LocalDataset to avoid memory leaks. arma::mat* pointer = &(newNode->LocalDataset()); delete pointer; pointer = localDataset;
< andrewmw94>
ugh, let's try that again:
< andrewmw94>
// delete the LocalDataset to avoid memory leaks.
< andrewmw94>
I don't like adding the function PointerToLocalDataset() when this is the only use I can think of for it, but I tried to get it working with something like:
< naywhayare>
it seems like the problem with the first snippet is that you aren't ever setting the newNode's local dataset
< naywhayare>
but you can't do it with the old version, because &newNode->LocalDataset() isn't an lvalue
< andrewmw94>
yeah. I have that fixed, but I'm trying to remove the function PointerToLocalDataset()
< andrewmw94>
exactly
< andrewmw94>
is there a workaround?
< naywhayare>
there is a very ugly workaround... you can do something like newNode->LocalDataset().memptr() = localDataset
< naywhayare>
but really that is a bad thing
< andrewmw94>
is it worse than having a function that is only used for this?
< naywhayare>
I would say so
< naywhayare>
hang on, I'm trying to think of a way to sidestep the problem entirely
< naywhayare>
right now you have a function called, I think, ExactClone(), right?
< andrewmw94>
yeah
< andrewmw94>
that's where this code is
< naywhayare>
and the key difference between this and the copy constructor is that you are not making a deep copy of the children
< naywhayare>
is that right?
< naywhayare>
i.e. instead of for(size_t i = 0; i < other.children.size(); ++i) { children.push_back(new RectangleTree<...>(...)); }
< naywhayare>
you want to do for(size_t i = 0; i < other.children().size(); ++i) { children.push_back(other.children[i]); }
< andrewmw94>
No deep copy of the child nodes or of the local dataset. Yes.
< naywhayare>
why not modify the copy constructor to take a boolean 'deepCopy', defaulting to true?
< naywhayare>
then have an if statement in that copy constructor
< naywhayare>
when the compiler looks for the default copy constructor, it takes deepCopy = true
< naywhayare>
but in your one particular case where you want a shallow copy, you set that argument to false
< naywhayare>
do you think that could work/
< naywhayare>
*?
< andrewmw94>
I'm not sure how that would work around the problem. Wouldn't I then have to access the memory address of other.localDataset
< naywhayare>
it changes the problem a little bit
< naywhayare>
now you can do 'localDataset = &other.localDataset'
< andrewmw94>
ahh, right, it's no longer a lvalue
< naywhayare>
but then you have to ensure that the other matrix doesn't delete the local dataset
< andrewmw94>
ok, I think that will work
< naywhayare>
what does LocalDataset() return?
< naywhayare>
arma::mat* or are you dereferencing the pointer?
< andrewmw94>
arma::mat&
< andrewmw94>
I wanted it to match Dataset
< naywhayare>
right
< andrewmw94>
So my other question was about the Child() function
< andrewmw94>
In BinarySpaceTree, I think this would return BinarySpaceTree&
< naywhayare>
ok, go ahead. I think we still need to consider how to ensrue that the other node doesn't delete the copied local dataset, but we can consider that later
< naywhayare>
*ensure
sumedhghaisas has joined #mlpack
< andrewmw94>
However, I have a Children() function which matches the std::vector<RectangleTree*> children
< andrewmw94>
and thus returns std::vector<RectangleTree*>&
< andrewmw94>
so I wrote Child() to return RectangleTree*, so that Child(i) would be shorthand for Children()[i]
< andrewmw94>
Should I leave it as it is now, or change Child(i) to be different from Children()[i]?
< naywhayare>
you should dereference the pointer so it matches BinarySpaceTree
< andrewmw94>
ok
< andrewmw94>
so back to copying nodes. Whenever I use it, I set one of the LocalDataset's to be NULL, since it will no longer be a leaf node since that node moves up the tree
< andrewmw94>
I could do the same thing here right?
< naywhayare>
yes, that is a possibility
< marcus_zoq>
naywhayare: You can use shoeshine, but I need to upgrade the config.yaml file just give me a few minutes
< naywhayare>
marcus_zoq: ok, no hurry
< naywhayare>
it takes a long time to run :)
< naywhayare>
sumedhghaisas: row_col_iterator looks fine to me... before we send it to Conrad we have to adapt it to the weird Whitesmiths style though
< naywhayare>
we also need to add a row_col_iterator typedef for SpMat and add begin_row_col() and end_row_col() functions for SpMat too
< sumedhghaisas>
naywhayare: I don't know what is Whitesmith style :( let me just google it...
< sumedhghaisas>
ohh yes... I will add them too...
< sumedhghaisas>
naywhayare: Whitesmiths style is only for indentation right??
< jenkins-mlpack>
andrewmw94: R tree clean up. Change Child() to match BSP tree. Copy constructor takes a second argument to specify whether it is a deep copy.
sumedhghaisas has quit [Ping timeout: 272 seconds]