verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
lozhnikov has joined #mlpack
lozhnikov has quit [Ping timeout: 260 seconds]
robertohueso has left #mlpack []
ShikharJ has quit [Quit: ZNC 1.6.5-elitebnc:6 - http://elitebnc.org]
ShikharJ has joined #mlpack
xa0 has joined #mlpack
jenkins-mlpack has joined #mlpack
manish7294 has joined #mlpack
< manish7294> rcurtin: I have debugged the boostmetric implementation; there were a few small issues. Here's the updated gist - https://gist.github.com/manish7294/3d97be37919658b96bba0125f2f3de84 I also re-ran the simulations and the results are pretty good. simulations - https://gist.github.com/manish7294/2388267666b1159ce261ce7b95dc923c
< manish7294> I think we should definitely have this. If you want I can open a PR.
< manish7294> Maybe after some more optimizations we can make it even faster.
< rcurtin> manish7294: we need to see comparisons with LMNN
vivekp has quit [Ping timeout: 268 seconds]
vivekp has joined #mlpack
xa0 has quit [Excess Flood]
< rcurtin> LMNN with no impostor recalculation that is
xa0 has joined #mlpack
< manish7294> rcurtin: These are the results with eval bounds branch - https://gist.github.com/manish7294/2388267666b1159ce261ce7b95dc923c
< manish7294> Okay, I will do this for no impostor recalculation as well
< jenkins-mlpack2> Project docker mlpack nightly build build #10: FAILURE in 4 hr 40 min: http://ci.mlpack.org/job/docker%20mlpack%20nightly%20build/10/
< manish7294> rcurtin: I have updated the simulations for LMNN with no impostor recalculation as well. https://gist.github.com/manish7294/2388267666b1159ce261ce7b95dc923c
< manish7294> In this case it seems the optimizer converges within one iteration for all the datasets.
< rcurtin> something is not right in those results if it converges in one iteration -- did you run setting the range to some very high number?
< rcurtin> by the way, I am sorry I did not get to responding about BoostMetric yesterday; after lunch the rest of the day ended up being allocated
< rcurtin> we actually went hiking until 4am, so I don't know if I can read it today; I don't think I can stay awake
< manish7294> no, I just removed the impostors recalculation code
< manish7294> no worries :)
< manish7294> as we are already calculating them in the LMNNFunction constructor
< manish7294> I think that's the case for L-BFGS only, as the others are taking quite a number of iterations
< rcurtin> I'd urge you to take a look into it since it seems to me there is definitely a bug there
< manish7294> you were right, I missed one thing. I will update the results soon
< rcurtin> sounds good, thanks
< rcurtin> ok, thanks
< rcurtin> the results are very mixed; it's not clear which of these three is best, and it doesn't seem like there's a consistent pattern
< rcurtin> that's not necessarily a problem, just an observation
< rcurtin> how difficult would it be for you to recalculate impostors in your BoostMetric implementation?
< rcurtin> (and the impostor-recalculating LMNN implementation, was that with range == 1?)
< manish7294> Ya, it's with range 1
< manish7294> Originally, boostmetric doesn't do this, but we can do this at every iteration by recomputing the triplets and then Ar. https://gist.github.com/manish7294/3d97be37919658b96bba0125f2f3de84#file-boostmetric_impl-hpp-L40
< rcurtin> right, would you mind doing this and adding that to the simulations also?
< rcurtin> I want to see if this gives any consistent performance increase to the BoostMetric accuracy results
< manish7294> sure, will update that soon.
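For reference, a rough sketch of what per-iteration impostor recalculation could look like in a BoostMetric-style implementation -- this is not the gist's actual code, and all names below are hypothetical: transform the data with the current metric M, re-find each point's nearest differently-labeled impostor, and rebuild the triplet matrices Ar before the next boosting step.

#include <mlpack/core.hpp>
#include <limits>
#include <vector>

// Hypothetical sketch: rebuild the BoostMetric triplet matrices A_r under the
// current metric M. Assumes `dataset` is d x n, `labels` has length n,
// `targets` holds k target-neighbor indices per point, and M is positive
// definite; none of these names come from the actual gist.
void RecomputeTriplets(const arma::mat& dataset,
                       const arma::Row<size_t>& labels,
                       const arma::Mat<size_t>& targets,
                       const arma::mat& M,
                       std::vector<arma::mat>& Ar)
{
  // With M = L L^T, distances under M equal Euclidean distances on L^T x.
  const arma::mat L = arma::chol(M, "lower");
  const arma::mat transformed = L.t() * dataset;

  Ar.clear();
  for (size_t i = 0; i < dataset.n_cols; ++i)
  {
    // Find the nearest differently-labeled point (impostor) under M.
    size_t impostor = i;
    double bestDist = std::numeric_limits<double>::max();
    for (size_t k = 0; k < dataset.n_cols; ++k)
    {
      if (labels[k] == labels[i])
        continue;
      const double dist =
          arma::accu(arma::square(transformed.col(i) - transformed.col(k)));
      if (dist < bestDist) { bestDist = dist; impostor = k; }
    }

    // One triplet (i, target neighbor j, impostor) per target neighbor of i:
    // A_r = (x_i - x_k)(x_i - x_k)^T - (x_i - x_j)(x_i - x_j)^T.
    for (size_t j = 0; j < targets.n_rows; ++j)
    {
      const arma::vec dij = dataset.col(i) - dataset.col(targets(j, i));
      const arma::vec dik = dataset.col(i) - dataset.col(impostor);
      Ar.push_back(dik * dik.t() - dij * dij.t());
    }
  }
}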
< rcurtin> in addition, if you are willing to add some of the datasets from the BoostMetric paper, it could be really useful to see if your implementation gets the same accuracy as theirs
< manish7294> sure
< rcurtin> thanks, I know it is a lot of work
< rcurtin> but very important to understand the behavior if we eventually want to make any claims about it
< manish7294> rcurtin: Looks like recalculating impostors at every iteration totally destroyed the boostmetric algorithm https://gist.github.com/manish7294/2388267666b1159ce261ce7b95dc923c#file-simulation2-txt
< rcurtin> how do you know there is not a bug?
< manish7294> I just transformed the dataset and applied the usual updates; I don't think there should be a bug in doing just that much.
< rcurtin> I'm almost certain there is a bug given the extremely poor performance
< rcurtin> that, or the algorithm simply cannot handle impostor recalculations
< rcurtin> but I have not been able to investigate it in full
manish7294 has quit [Ping timeout: 252 seconds]
sumedhghaisas2 has joined #mlpack
< rcurtin> sumedhghaisas2: how was ICML and Stockholm? :)
< sumedhghaisas2> rcurtin: Amazing... although tiring. the city is beautiful.
< sumedhghaisas2> and this was my first big conference... so I was mostly lost in the talks :(
< sumedhghaisas2> I loved the poster sessions though
< sumedhghaisas2> zoq: Hey Marcus, got a minute?
< sumedhghaisas2> I was a little confused about the math of VAE when MeanSquaredError is used as the loss. Which distribution models p(x | z) in that case?
< zoq> Gaussian, I'm wondering if MSE is the right choice here, I thought BCE is more common?
< zoq> Atharva: Good news, Ryan fixed the issue :)
< sumedhghaisas2> I also thought it's Gaussian, but as it turns out, if I use NormalDistribution with ReconstructionLoss I get very different and wrong results...
< sumedhghaisas2> I could reproduce the results with tensorflow as well
< sumedhghaisas2> I somehow get negative loss...
< Atharva> zoq: Yeah, I can see the post now :)
< zoq> strange, do you have a minimal sample to reproduce the issue, or perhaps Atharva can provide something?
< sumedhghaisas2> I also thought that MSE is setting a constant variance for the distribution, but even that does not simplify to MSE error...
< Atharva> zoq: How exactly do you mean?
< zoq> Some simple example that I can use to reproduce the issue, I guess I could also use the unit test?
< Atharva> Hmm, I don't think the unit test will help here. I will get back to you on this and give you an example.
< sumedhghaisas2> zoq: Also could we model normal Mnist with BCE?
< sumedhghaisas2> zoq: I could send you a tensorflow code that could be faster?
< sumedhghaisas2> Atharva: Could you change the MeanSquaredError to ReconstructionLoss to reproduce the results?
< Atharva> sumedhghaisas: Yes, I should be able to do that.
< zoq> sumedhghais: I think for MNIST we can use BCE, since the pixels are not continuous.
< sumedhghaisas2> zoq: Maybe I'm getting confused... I thought MNIST is between 0-1 and binary MNIST is the binarized version?
< sumedhghaisas2> for binary we can use the Bernoulli dist, which is the same as BCE I guess
< zoq> sumedhghais: Right, you would have to use the binary version.
< sumedhghaisas2> zoq: This is really strange, I always thought it's Gaussian, just like you. :D
< sumedhghaisas2> but somehow the NLL becomes negative...
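For context, a short derivation sketch -- general math, not specific to the mlpack or TensorFlow code being debugged here -- of how a Gaussian p(x | z) relates to MSE and why its NLL can legitimately go negative, while a Bernoulli decoder (BCE) cannot:

For a Gaussian decoder with fixed variance $\sigma^2$,
\[
  -\log p(x \mid z) = \frac{1}{2\sigma^2} \sum_{d=1}^{D} \bigl(x_d - \hat{x}_d(z)\bigr)^2
                      + \frac{D}{2} \log\bigl(2\pi\sigma^2\bigr),
\]
i.e. MSE up to a scale and an additive constant. Since $p(x \mid z)$ is a density rather than a probability, the constant term is negative whenever $\sigma^2 < 1/(2\pi)$, so the NLL can drop below zero on well-reconstructed continuous data without anything being numerically wrong. For binarized inputs, a Bernoulli decoder gives
\[
  -\log p(x \mid z) = -\sum_{d=1}^{D} \bigl[ x_d \log \hat{x}_d(z) + (1 - x_d) \log\bigl(1 - \hat{x}_d(z)\bigr) \bigr],
\]
which is exactly BCE and is always non-negative.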
< zoq> definitely strange
< rcurtin> sumedhghaisas2: very cool, I was hoping I could go this year but then I quit my job ...
< rcurtin> the poster sessions are great, you can have lots of conversations with great people
< rcurtin> instead I am in your favorite place, Iceland :)
< rcurtin> I think it is a lot nicer here in the summers, but if I remember right you were only here in the winter, which I guess is way different...
sumedhghaisas3 has joined #mlpack
< sumedhghaisas3> rcurtin: ahh it would have been nice to meet you there. Actually what are you working on these days?
sumedhghaisas2 has quit [Ping timeout: 260 seconds]
< sumedhghaisas3> zoq: Atharva is sending the code version, let's see if we spot anything
< rcurtin> sumedhghaisas3: I left Symantec and am now going to start working at a startup focused on fast in-database machine learning
< rcurtin> I will start in August but they were having an internal meetup in Akureyri so I attended :)
< sumedhghaisas3> rcurtin: whaaaaaaat? that's in Iceland right... amazing place.
< sumedhghaisas3> what a place to have an internal Meetup... I wish mine was there
< rcurtin> yeah, it has been incredible
< rcurtin> I have some pictures here: http://www.ratml.org/misc/iceland_pics.html
< rcurtin> last night we went hiking overnight so I am very tired today... we were out from 9pm to 4am and it never got dark
xa0 has quit [Ping timeout: 244 seconds]
< rcurtin> I feel very lucky to be here, so, maybe this means my choice of new company is a good choice :)
< Atharva> rcurtin: The place seems extremely beautiful!
< Atharva> I need to go there.
< rcurtin> I highly recommend it, but I think maybe not during the winter
< Atharva> Oh yes, I guess it will all be covered in snow.
xa0 has joined #mlpack
< sumedhghaisas3> rcurtin: ohh I really miss Iceland.. shouldn't have looked at these pictures :(
< sumedhghaisas3> been to Akureyri twice I think... did you go around the whole Iceland?
< rcurtin> you are not too far away these days :)
< rcurtin> no, we only drove from the airport out to Akureyri (~6 hours)
< sumedhghaisas3> true... but I'm also working, so less chance
< rcurtin> I'll go home after this conference, I didn't make any extra time to tour around (maybe I should have)
< rcurtin> maybe you can work remotely for a week? :)
< sumedhghaisas3> yeah it only takes 4 days to roam around Iceland if you do it properly
< sumedhghaisas3> August is a month for that... going to Menorca and Croatia (hopefully)
< sumedhghaisas3> so what does in-database machine learning entail?
manish7294 has joined #mlpack
< rcurtin> nice, very cool!
< rcurtin> I will have to respond more about the company later, we are actually still in talks today so I should pay attention :(
< Atharva> zoq: sumedhghaisas: When I used a tanh activation after the encoder, the loss didn't go negative. But, the results were still poor.
< Atharva> encoder and decoder both*
< manish7294> rcurtin: Sorry, I don't mean to disturb you. Please ignore the upcoming messages until you have time.
< zoq> Atharva: Instead of Sigmoid?
< manish7294> rcurtin: I got some promising results by starting off with the identity matrix instead of zeros.
< manish7294> that being said, I couldn't find any other error with the implementation.
< Atharva> zoq: No, instead of no activation.
< Atharva> The results I posted are from when I used no non-linearity after the final layers of the encoder and decoder. With a non-linearity, the results weren't as good.
< manish7294> rcurtin: And do you think we can merge #1461, since #1466 is going to have a lot of merge conflicts? So, I was thinking of completing it as well.
< zoq> Atharva: I see, so clipping the output of the last layer helps, but I guess this means that the output is somewhat wrong.
< zoq> Atharva: 'weren't as good' huge difference?
< Atharva> maybe, yeah. I haven't tried training it well with an activation though.
< Atharva> I will train it overnight tonight using mean squared error and some activation and see how the results are.
< rcurtin> manish7294: no problem, I'll look into the BoostMetric stuff when I have time but today is the day I meant to merge #1461, so let me do it now
< zoq> Atharva: Yeah, would be interesting to see what the actual effect is.
< manish7294> This one is off topic: I remember seeing the movie in one of your starting photos; I never thought it was worth that much. Though, that was a pretty funny one :)
< Atharva> I was just trying to train it with the reconstruction loss and tanh after the encoder and decoder, but the loss is still going negative. I will mail you and Sumedh the code.
< zoq> Atharva: Okay, thanks!
sumedhghaisas3 has quit [Ping timeout: 276 seconds]
sumedhghaisas2 has joined #mlpack
< zoq> ShikharJ: Can you reproduce the issue on your system? Pretty sure you have to build with DEBUG=ON
< rcurtin> ha, I have no idea what that movie was even about. some kind of animated pig I guess, I have never seen it anywhere else (maybe I don't look hard enough, for all I know it was a huge blockbuster movie last year)
< ShikharJ> zoq: Actually, I spent most of the time refactoring the code (it's pretty huge and confusing at the moment; hopefully I'll be able to simplify it).
< ShikharJ> rcurtin: Is it Okja (the name of the movie)?
< rcurtin> maybe, it was labeled Syngdu when I saw it in a gas station near Reykjavik
< zoq> ShikharJ: I see, I guess in the process you probably fix the code anyway, do you think I should wait for the updated code?
< rcurtin> don't know what it translates as
< zoq> I think it's called "Sing"
< manish7294> rcurtin: It's based on town hall shows, but a funny one :)
< ShikharJ> zoq: You may go with the existing code, I'll probably just be fixing the aesthetics for the rest of the day.
< rcurtin> oh, I see, yeah, it must be Sing
< rcurtin> looks like indeed it was a giant hit I never heard of...
< zoq> I was looking it up to get the price in Euro :)
< manish7294> rcurtin: Right, that's the one
< ShikharJ> zoq: Most of the codebase would be pretty similar after the refactor, so I should be able to make out the differences.
< Atharva> zoq: sumedhghaisas: Just mailed it to you.
< zoq> ShikharJ: Would be great if it fixed the issue I saw on my system and on travis.
< zoq> Atharva: Okay, I can link against the latest code from the PR?
< Atharva> zoq: Yes, the latest code in the reconstruction loss PR
< ShikharJ> zoq: I'll see what I can do by the end of the day. The reason I haven't started working on that is that with the current code, it's really difficult keeping track of what variables are available where. For example, batchSize might be available inside the RBM class, but not inside the policy classes, which needs a change every time I wish to try something new.
< manish7294> zoq: If I remember correctly, I probably got it from torrent ;)
ImQ009 has joined #mlpack
< Atharva> Just one thing, the mnist_full.csv I have used is the one Shikhar uploaded.
< ShikharJ> zoq: So that's a problem, which I should be able to fix by today.
< ShikharJ> zoq: Plus a lot of comments are outdated or incorrect, which need to be looked at as well.
< zoq> ShikharJ: We can take all the time we need to get the code into shape, so no need to hurry.
< zoq> ShikharJ: Yeah, definitely frustrating.
< zoq> Atharva: Okay, so I can use that from the models repo.
< zoq> manish7294: That's one option :)
< Atharva> zoq: Yes
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#5304 (master - 3b7bbf0 : Ryan Curtin): The build was broken.
travis-ci has left #mlpack []
xa0 has quit [Excess Flood]
xa0 has joined #mlpack
< Atharva> zoq: sumedhghaisas: On training with mean squared error and tanh after the encoder and decoder, the loss becomes stagnant at ~200.
< Atharva> Without activation, it went down to ~130 after 1.5 hours.
< Atharva> This loss is the mean squared error averaged over a batch and not over the features of one datapoint.
< Atharva> I modified it locally for that.
< sumedhghaisas2> Atharva: so with just mean squared error it reached 130?
< sumedhghaisas2> should be lower than that
< Atharva> It should be. That was after 1.5 hours; on further training it went down to ~115 and stagnated.
manish7294 has quit [Ping timeout: 252 seconds]
ImQ009 has quit [Quit: Leaving]
sumedhghaisas2 has quit [Ping timeout: 240 seconds]
sumedhghaisas2 has joined #mlpack
sumedhghaisas2 has quit [Ping timeout: 245 seconds]
sumedhghaisas2 has joined #mlpack
< zoq> ShikharJ: Nice refactoring.