verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
seeni has joined #mlpack
< seeni>
can you say why this happens while building mlpack: " from Cython.Distutils import build_ext ModuleNotFoundError: No module named 'Cython' "? But I have Cython installed on my machine.
seeni_ has joined #mlpack
seeni has quit [Quit: Page closed]
seeni_ is now known as seeni
seeni has quit [Quit: seeni]
< rcurtin>
seeni: do you have Cython installed for the correct version of python?
< rcurtin>
and which version is installed?
manish7294 has joined #mlpack
< manish7294>
rcurtin: It's probably late, but are you there?
< rcurtin>
yeah, I am about to go to bed though, but I can stay up for a few more minutes :)
manish7294_ has joined #mlpack
< manish7294_>
rcurtin: It's regarding the distance caching in Impostors().
< manish7294_>
Do you mean the distance matrix we pass to the knn search?
< rcurtin>
right, there are a couple little complexities there
< manish7294_>
this one, right? knn.Search(k, neighbors, distances);
< rcurtin>
but yes, when we do knn.Search(), it returns the distances between the point and its nearest neighbors in that matrix
< manish7294_>
?
< rcurtin>
right, exactly
< rcurtin>
if we cache the distance results, we can avoid the recalculation, does that make sense?
< manish7294_>
But I looked at the knn search code and it reinitializes the distances every time.
< manish7294_>
If I got it right here it is
< manish7294_>
arma::Mat<size_t>* neighborPtr = &neighbors;
arma::mat* distancePtr = &distances;
if (!oldFromNewReferences.empty() &&
    tree::TreeTraits<Tree>::RearrangesDataset)
{
  // We will always need to rearrange in this case.
  distancePtr = new arma::mat;
  neighborPtr = new arma::Mat<size_t>;
}
// Initialize results.
neighborPtr->set_size(k, referenceSet->n_cols);
distancePtr->set_size(k, referenceSet->n_cols);
< rcurtin>
right, and the same with the neighbors matrix
< manish7294_>
Ah! indentation!
manish7294 has quit [Ping timeout: 260 seconds]
< rcurtin>
but in Impostors() you are extracting the results of that neighbors matrix into the outputMatrix object
< manish7294_>
Right
< rcurtin>
no worries, I know the code you are talking about :)
< rcurtin>
so the idea would be, also extract the distances into some other output matrix
< manish7294_>
but if knn reinitializes the distances every time, how would it help?
< rcurtin>
and then they can be used by the other parts of EvaluateWithGradient()
< manish7294_>
Right, I got that idea, but I'm worried about the knn search code.
< rcurtin>
yeah, I am not sure I understand why that is a problem though
< manish7294_>
Those are the two lines at the start of the search code.
< rcurtin>
right, but what I'm saying is the exact same thing is done for the neighbors matrix
< rcurtin>
yet you use the neighbors matrix just fine
< manish7294_>
So, basically we can use the previous distance matrix to relieve knn search from some calculation, right?
< rcurtin>
ah, sorry I think I see the confusion now
< rcurtin>
the idea is not to give the KNN object something that will help the search
< rcurtin>
the idea is to store the distances output from the KNN object so that we can avoid some metric.Evaluate() calls later in the EvaluateWithGradient() function
< manish7294_>
Right, thanks got the point
< rcurtin>
sure, hope that clarified it
< rcurtin>
let me know if not
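A minimal sketch of the caching idea discussed above (the names cachedNeighbors, cachedDistances, and the EvaluateWithGradient() details are illustrative placeholders, not the actual LMNN code):

    #include <mlpack/core.hpp>
    #include <mlpack/methods/neighbor_search/neighbor_search.hpp>

    using namespace mlpack;
    using namespace mlpack::neighbor;

    // Run the search once and keep both outputs; the distances matrix is the
    // part that would otherwise be discarded and recomputed later with
    // metric.Evaluate().
    void SearchAndCache(
        NeighborSearch<NearestNeighborSort, metric::EuclideanDistance>& knn,
        const size_t k,
        arma::Mat<size_t>& cachedNeighbors,
        arma::mat& cachedDistances)
    {
      // Column i holds the k nearest neighbors of point i and the matching
      // distances, in the same order.
      knn.Search(k, cachedNeighbors, cachedDistances);
    }

    // Later, inside EvaluateWithGradient(), a recomputation such as
    //   metric.Evaluate(transformedData.col(i), transformedData.col(l))
    // could be replaced by a lookup of cachedDistances(j, i), provided the
    // cache was filled on the same transformed dataset.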
< manish7294_>
Thanks for keeping up this late :)
< rcurtin>
sure, it's no problem :)
< rcurtin>
I will head to bed now if there's nothing else for now
< manish7294_>
Yeah, I got these two ideas while reading that comment but just got too deep into the one I was talking about. :)
< rcurtin>
it's ok, I know how it goes :)
< manish7294_>
good night :)
< rcurtin>
I would say 'good night' but it is morning for you, so good morning :)
< manish7294_>
:)
manish7294_ has quit [Quit: Page closed]
vivekp has left #mlpack []
< ShikharJ>
zoq: Sorry for troubling you again, but can we merge the two PRs now? That would also help us in our code for GAN Optimizer and WGAN.
< zoq>
ShikharJ: Sure, what do you think about adding a simple test?
< ShikharJ>
zoq: Test for the GANs?
< zoq>
ShikharJ: Batch support.
< zoq>
ShikharJ: Ahh, I see we already test GAN with batchSize > 1
< ShikharJ>
zoq: What I was thinking of doing was to uncomment the GANMNISTTest that we have, and set some low hyperparameters.
< zoq>
ShikharJ: Agreed, that sounds reasonable.
< ShikharJ>
zoq: Now with the batch support PR, it takes far less time to compute something like a batch of 10, for one epoch, with 20 pre-training and 50 maximum inputs.
< zoq>
ShikharJ: Okay, the batch support is merged; would you like to incorporate the test in the DCGAN PR?
< zoq>
ShikharJ: We can also open a new PR.
< ShikharJ>
zoq: Sure, I'll uncomment all the tests and change the test documentation a bit there. I'm guessing some merge conflicts would also arise in the DCGAN PR after batch support is merged.
< zoq>
ShikharJ: yes
< zoq>
ShikharJ: okay, modifying the test is a good idea, let's do that :)
< ShikharJ>
zoq: Really happy with the work we've achieved. I'll also tmux a session to see how we currently fare against other libraries!
< zoq>
ShikharJ: Yeah, these are all really nice additions and improvements.
< Atharva>
zoq: I have been facing a strange issue since yesterday.
< Atharva>
Certain gradient check tests in ANNLayerTest fail or pass based on their position in the file among other tests.
< Atharva>
With no code changed
< Atharva>
Also, a similar issue: if I change the loss to mean squared error in GradientLinearLayerTest, then the Atrous Convolution test fails.
< Atharva>
What I found out was that the model.Gradient() call from these tests returns all zeros when they fail, but I can't figure out why; nothing else is changing.
< ShikharJ>
Atharva: I also found an issue like that some time back, though it wasn't showing up on Travis so I ignored it.
< Atharva>
ShikharJ: So the tests don't give any problems on Travis?
< Atharva>
I might as well ignore it then.
< ShikharJ>
They didn't for me. But keep in mind this was some time back; the codebase has changed considerably since then.
< Atharva>
I will try and push a commit once and see if they fail.
< ShikharJ>
zoq: Could we have access to the AppVeyor builds? They don't seem to have an auto branch cancellation feature, and I pushed a couple of unnecessary builds that I wish to cancel.
seeni has joined #mlpack
seeni has quit [Quit: seeni]
< zoq>
ShikharJ: hm, I thought every mlpack member should be able to start/stop the job; did you use the same GitHub login?
< zoq>
Atharva: What version (last commit) do you use?
< ShikharJ>
zoq: Yes.
< zoq>
ShikharJ: hm, let me disable/enable the setting.
< zoq>
ShikharJ: Okay, can you test again?
< ShikharJ>
zoq: I'll need a running job for that.
< manish7294>
It seems we can't use a custom k value with the MATLAB LMNN implementation, though I have not dug into the reason behind it.
< manish7294>
And the MATLAB run is taking a huge amount of memory.
< manish7294>
rcurtin: It's regarding the tree building optimization: I have noticed that the total tree building time is always very low (merely half a second on the letters dataset). So, do you think this optimization will be effective?
< manish7294>
And regarding the distance caching --- we need to calculate the distances after every iteration since metric.Evaluate() is called on the transformed dataset (which changes after every iteration), but taking your idea, we can avoid this calculation at least on the iterations (decided by the range parameter) when we call Impostors() (here we will need to cache the distances every time Impostors() is called) and then use them instead of metric.Evaluate(). Does i
< ShikharJ>
zoq: I have tmux'd a session, let's see if it shows any improvement over the 3-day runtime that we saw earlier.
< Atharva>
sumedhghaisas: I know we decided on Thursdays 8pm IST, but is 10pm IST possible for you?
< Atharva>
or about 9:30?
< manish7294>
rcurtin: Just a bumpy thought. It may sound weird, but I am writing it anyway :) ---- Regarding your bounds idea, we are facing the problem of deciding on a particular value for it, right? Is it possible to have an adaptive bounding value, just like the adaptive step size?
< ShikharJ>
zoq: As expected, the smaller GAN tests pass within the time bound, can we also merge the DCGAN PR now?
< zoq>
ShikharJ: Okay, left some comments regarding the test.
< ShikharJ>
zoq: Cool.
manish7294 has quit [Ping timeout: 260 seconds]
ImQ009 has joined #mlpack
< sumedhghaisas>
Atharva: Hi Atharva
< sumedhghaisas>
Sure. 10pm works for me as well.
< sumedhghaisas>
If you get free earlier let me know
travis-ci has joined #mlpack
< travis-ci>
manish7294/mlpack#29 (lmnn - d05cfd3 : Manish): The build has errored.
< rcurtin>
manish7294: a couple comments, sorry that I was not able to respond until now
< rcurtin>
don't worry about a lack of custom k---if the MATLAB script doesn't support it, it's not a huge deal
< rcurtin>
and I am not surprised it takes a huge amount of memory
< rcurtin>
for the tree building optimization, you are right, in some cases tree building can be fast (depends on the dataset)
< rcurtin>
at the same time, unless you've modified the code, it isn't counting the time taken to build the query trees
< rcurtin>
on, e.g., MNIST, tree building takes a much longer time
< rcurtin>
so I think it will be a worthwhile optimization on larger datasets
< rcurtin>
for the distance caching, you are right---we can only avoid the calculation exactly when Impostors() is called
< rcurtin>
for the bumpy thought, I'm not sure I fully understand---for bounding values, the bound will depend on | L_t - L_{t + 1} |_F^2, which is fast to calculate
manish7294 has joined #mlpack
< manish7294>
rcurtin: Regarding the bumpy thought - if I am right, we need to bound that expression under some value like "exp < b", and as per my understanding this b varies a lot from one dataset to another. So my earlier comment was about this b.
< rcurtin>
it may vary, but I think it may not be all that much
< rcurtin>
basically the quantity I am talking about bounding is 'eval' in 'eval < -1'
< rcurtin>
I think it will not be hard to adapt the bounds from the notes that I wrote to show that the last iteration's 'eval' can be used to make a lower bound on this iteration's eval
< rcurtin>
I think it will look something like 'eval_t < eval_{t - 1} + \| L_t - L_{t + 1} \| * (some function of \| x_i \| and \| x_l \| or something like this)'
< rcurtin>
but I need to compute the exact value, unless you'd like to do that theory part :)
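Written out in LaTeX, the rough form above is (f stands for the still-unspecified "some function" of the point norms; this only restates the sketch, not the worked-out bound):

    \mathrm{eval}_t \;<\; \mathrm{eval}_{t-1} \;+\; \lVert L_t - L_{t+1} \rVert \, f\!\left( \lVert x_i \rVert, \lVert x_l \rVert \right)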
< manish7294>
Ah! I mixed it up with something else, so you can see what will happen if I go and do that part, right ;)
< rcurtin>
I think there are so many optimizations that we confuse ourselves a little bit talking about them... do you think that it would be easier if we went ahead, merged the LMNN code, then opened issues for each possible optimization?
< rcurtin>
then each optimization could be handled separately in its own PR, making discussion a lot easier (I think)
< rcurtin>
at least from my end I am always getting mixed up which part we are talking about :)
< manish7294>
It would be a lot better than the current situation ;)
< rcurtin>
right, ok, so then let me know when you've got the current code pushed, and I'll review it and we can do the merge in the next handful of days
< manish7294>
I will push the modified distance caching; then, after the build passes, you can go ahead.
< manish7294>
sure
< rcurtin>
right, that sounds good. you should have a review in a handful of hours; I think any remaining issues will be little ones, like documentation or options for lmnn_main.cpp
< rcurtin>
and then I'll also open issues for each of the possible optimizations and we can discuss there which are good ideas, which are bad ideas, and which are worth implementing :)
< manish7294>
ya there may be many of those :)
< manish7294>
great
< rcurtin>
I'm not too worried about the timeline, since the actual LMNN implementation you did was super fast, I think it will be similar for BoostMetric
< rcurtin>
most of the time we've spent has been optimization, but I just need to glance at the timeline again and make sure I don't suggest 1000 optimizations that there's not time for :)
< rcurtin>
as you know it's already quite fast by comparison, I just know that there is more there :)
< manish7294>
though the optimization took time, I think it's worth it
< manish7294>
Do you think we have achieved at least the bare minimum for a workshop paper? :)
< rcurtin>
almost---it's hard to publish a paper if it is just a fast implementation, but when we can start adding clever bounds and computation reductions, we have something much better
< rcurtin>
so if we can start caching the query trees, and computing when no impostors will change (even if that only happens for some datasets), that plus what we already have is definitely something novel and publishable, I think
< manish7294>
I hope we achieve that :)
< rcurtin>
and given the speedups we are already showing, the experiments section will look great almost regardless
< rcurtin>
in some cases it looks like 50x-100x with comparable resulting kNN accuracies
< manish7294>
ya there are some datasets
< manish7294>
I think balance is the most visible one.
< Atharva>
sumedhghaisas: Hey Sumedh
< sumedhghaisas>
Atharva: Hi Atharva
< sumedhghaisas>
How are things going?
< Atharva>
I took a lot of time to debug the gradient check of ReconstructionLoss, but it's done now.
< Atharva>
What's next?
< sumedhghaisas>
haha... that's great. :) gradient errors are the worst
< Atharva>
yeah they are
< Atharva>
but the most important :P
< sumedhghaisas>
Did you send the CL for NormalDistribution layer?
< sumedhghaisas>
sorry no layer...
< sumedhghaisas>
just NormalDistribution
< Atharva>
what do you mean by CL?
< Atharva>
sorry
< sumedhghaisas>
Also we can perform JacobianTest for NormalDistribution log prob
< sumedhghaisas>
ahh... PR
< sumedhghaisas>
at work we have CLs
< sumedhghaisas>
so usually I get confused
< Atharva>
oh, okayy
< Atharva>
Yeah, I opened a new PR
< Atharva>
please take a look at it when you get time, I had to change a lot of things because the input can also be negative
< sumedhghaisas>
hmm... I see
< sumedhghaisas>
I think this class is becoming too specific to ANN module
< sumedhghaisas>
maybe we should move it inside the module
< sumedhghaisas>
for now
< sumedhghaisas>
until we figure out how to generalize it for outer dists
< Atharva>
It would have to be because we have to keep the ReconstructionLoss layer generic for other distributions
< Atharva>
yeah
< sumedhghaisas>
We could keep the ReconstructionLayer generic for ANN dists
< sumedhghaisas>
for now
< Atharva>
I didn't get it; do you mean that we should create a separate folder for ANN dists?
< sumedhghaisas>
yes... 'dists' in ANN folder
< Atharva>
okay
< Atharva>
So, next we will do a Jacobian test, after that?
< sumedhghaisas>
And regarding the PR, adding softplus depending on the input is wrong; the input can sometimes be positive and sometimes negative
< sumedhghaisas>
Add a boolean, defaulting to true, which determines whether softplus is applied or not
< Atharva>
okayy
< Atharva>
got it
< sumedhghaisas>
also is there a specific reason you are not using the implemented SoftplusFunction?
< Atharva>
Yes, I tried that first but the build kept erroring. I guess it's because the dists are core files
< Atharva>
It said Softplus wasn't defined
< sumedhghaisas>
that's weird
< sumedhghaisas>
could you send me the line you used to import the file?
< sumedhghaisas>
ahh I see
< Atharva>
Just to make sure I wasn't making any mistakes, I will try that again.
< Atharva>
okay
< sumedhghaisas>
that's maybe due to a circular dependency
< sumedhghaisas>
that should go away when you move it inside the ANN module
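A minimal sketch of the flag suggested above, reusing mlpack's existing SoftplusFunction; the function and parameter names here (ApplyStdDev, preStdDev, applySoftplus) are hypothetical, not the actual NormalDistribution interface from the PR:

    #include <mlpack/methods/ann/activation_functions/softplus_function.hpp>

    #include <armadillo>

    // Map the raw (possibly negative) standard-deviation parameters to
    // strictly positive values with softplus, unless the caller disables it.
    arma::vec ApplyStdDev(const arma::vec& preStdDev,
                          const bool applySoftplus = true)
    {
      if (!applySoftplus)
        return preStdDev;

      arma::vec stdDev(preStdDev.n_elem);
      for (size_t i = 0; i < preStdDev.n_elem; ++i)
        stdDev[i] = mlpack::ann::SoftplusFunction::Fn(preStdDev[i]);
      return stdDev;
    }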
< rcurtin>
manish7294: sounds good, I'll review them soon when I have a chance
< zoq>
sumedhghais: We can ignore files, but not lines, at least not now. But we don't have to wait for a green build; we know that we can ignore some issues.
< ShikharJ>
zoq: Great news, with the new BatchSupport changes I'm able to train on the full dataset within 10 hours. This is almost twice as fast as the expected time with TensorFlow on a desktop CPU (though with a server-grade CPU we can still expect around a 30~40% relative speedup)!
< ShikharJ>
zoq: I'll post the results on the BatchSupport PR.
< zoq>
ShikharJ: Great news indeed :)
manish7294 has quit [Ping timeout: 265 seconds]
< Atharva>
zoq: Is it okay if in a PR, I manually add some changes I need from another PR and then later remove them when the other PR is merged?