verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
Samir has quit [Ping timeout: 256 seconds]
Samir has joined #mlpack
Samir is now known as Guest29252
Guest29252 is now known as Samir
< ShikharJ> zoq: I'll spend the rest of the week implementing the test for SSRBM, and push in the PR when the existing one is merged. I'll look for optimizations as well, though the code already looks pretty optimized to me.
< zoq> ShikharJ: Sounds like a good plan to me, will think about optimizations too.
vivekp has quit [Ping timeout: 276 seconds]
vivekp has joined #mlpack
< Atharva> sumedhghaisas: you there?
zoq_ has joined #mlpack
zoq has quit [Read error: Connection reset by peer]
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
< sumedhghaisas> Atharva: Hi Atharva
< Atharva> sumedhghaisas: I tested the reconstruction loss with conv network and it works!
< Atharva> on bernoulli
< sumedhghaisas> great!!!!
< sumedhghaisas> huh... phewww
< Atharva> But still, the error only went down to 100, of which 25 was KL, so the samples weren't that good, but some were
< sumedhghaisas> 100?? it doesn't go lower?
< Atharva> I am currently training it on the feedforward model
< Atharva> to be sure
< Atharva> no
< sumedhghaisas> could you try another architecture please? just to check
< Atharva> Should we get a better conv architecture
< Atharva> Yeah
< sumedhghaisas> I know the values for that arch
< Atharva> Can you suggest some architecture or point me to some link?
< sumedhghaisas> yeah... I'm thinking about how to send you that arch
< Atharva> maybe write it on paper and send me a photo on hangouts
< sumedhghaisas> conv2d = snt.nets.ConvNet2D( output_channels=[32, 32, 32, 64], kernel_shapes=[[3, 3]], strides=[1, 2, 1, 2], paddings=[snt.SAME], activate_final=True, initializers=INITIALIZER_DICT, activation=tf.nn.relu, use_batch_norm=False) # convolved [7, 7, 64] convolved = conv2d(input_) # output is [8, 8, 64] output = tf.pad(convolved, [[0, 0], [0, 1], [0, 1]
< sumedhghaisas> this doesn't work :P
< sumedhghaisas> okay I will mail it to you
< sumedhghaisas> Atharva: Sent a mail
< sumedhghaisas> the arch is a Sonnet definition but it's easy to decode
< Atharva> okay, i will check
< sumedhghaisas> let me know if you don't understand anything in that definition
< Atharva> sumedhghaisas: Sorry, I can't understand it fully. Is it only the encoder you mailed me?
< sumedhghaisas> The decoder will be the exact reverse of this
< Atharva> Okay, what does `output_channels=[32, 32, 32, 64]` mean? Only 64 seems to be the number of output channels
< sumedhghaisas> umm... okay so the input is in shape [batch, height, width, channels]
< sumedhghaisas> now we keep applying a conv layer and a non-linearity
< sumedhghaisas> so [32, 32, 32, 64] means 4 conv + ReLU layers
< sumedhghaisas> their output channels are those values, respectively
< Atharva> Okayy
< sumedhghaisas> all 4 layers will have common stride [3, 3]
< sumedhghaisas> and no padding will be applied
< Atharva> paddings=[snt.SAME]
< Atharva> this is mentioned, how will it remain SAME then?
< sumedhghaisas> yes... that means no padding
< sumedhghaisas> ahh wait... i was wrong
< sumedhghaisas> strides are [1, 2, 1, 2]
< sumedhghaisas> kernel shape is [3, 3]
< Atharva> I have a doubt, doesn't SAME padding mean that the output will have the same height and width as the input?
< sumedhghaisas> so I can write down the output shape [-1, 28, 28, 1] -> [-1, 28, 28, 32] -> [-1, 14, 14, 32] -> [-1, 14, 14, 32] -> [-1, 7, 7, 64]
< Atharva> Yes, so to achieve 28, 28 -> 28, 28, we will need some padding, right?
< sumedhghaisas> ahh yes... zero padding
< sumedhghaisas> I read VALID
< sumedhghaisas> I usually get confused between the 2
< Atharva> Haha, I just remember it as: SAME padding means the height and width will be the same
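(For reference, the encoder described above, with four 3x3 convolutions, output channels [32, 32, 32, 64], strides [1, 2, 1, 2], SAME zero padding and ReLU activations, taking [-1, 28, 28, 1] down to [-1, 7, 7, 64], could be sketched with mlpack's ANN layers roughly as below. This is only a sketch: the Convolution constructor is assumed to take (inSize, outSize, kW, kH, dW, dH, padW, padH, inputWidth, inputHeight), and a padding of 1 is used to approximate SAME padding for a 3x3 kernel.)

    #include <mlpack/methods/ann/ffn.hpp>
    #include <mlpack/methods/ann/layer/layer.hpp>

    using namespace mlpack::ann;

    // Encoder sketch: 28x28x1 -> 28x28x32 -> 14x14x32 -> 14x14x32 -> 7x7x64.
    // Padding of 1 on each side stands in for SAME padding with a 3x3 kernel;
    // the stride-2 layers halve the spatial size.
    FFN<> encoder;
    encoder.Add<Convolution<>>(1,  32, 3, 3, 1, 1, 1, 1, 28, 28);  // [28, 28, 32]
    encoder.Add<ReLULayer<>>();
    encoder.Add<Convolution<>>(32, 32, 3, 3, 2, 2, 1, 1, 28, 28);  // [14, 14, 32]
    encoder.Add<ReLULayer<>>();
    encoder.Add<Convolution<>>(32, 32, 3, 3, 1, 1, 1, 1, 14, 14);  // [14, 14, 32]
    encoder.Add<ReLULayer<>>();
    encoder.Add<Convolution<>>(32, 64, 3, 3, 2, 2, 1, 1, 14, 14);  // [7, 7, 64]
    encoder.Add<ReLULayer<>>();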
< sumedhghaisas> Actually we should have this kind of API to define a ConvNet
< sumedhghaisas> this is common in research
< Atharva> Yeah, maybe we should
< Atharva> This is a big network, it will take time to train. I think it will also be useful for celebA
< Atharva> I downloaded the dataset yesterday
< Atharva> Also, about conditional VAE, how should we append labels to the data when we use conv nets?
< sumedhghaisas> hmm... there are many ways to do that, the simplest is to treat the label as another channel
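(One way to read "treat the label as another channel": broadcast the label over the spatial dimensions and stack it as an extra input channel before the first convolution. A minimal Armadillo sketch, assuming mlpack's usual layout of one flattened 28x28 image per column; the helper name and the scalar label encoding are illustrative only.)

    #include <mlpack/core.hpp>

    // Append a constant label plane to a batch of flattened 28x28 images.
    // images: (28 * 28) x batchSize, labels: 1 x batchSize (e.g. digit / 10.0).
    // The result is (28 * 28 * 2) x batchSize, i.e. two channels per image.
    arma::mat ConditionOnLabel(const arma::mat& images, const arma::rowvec& labels)
    {
      arma::mat labelPlanes(images.n_rows, images.n_cols);
      for (size_t i = 0; i < images.n_cols; ++i)
        labelPlanes.col(i).fill(labels(i));  // constant plane holding the label

      return arma::join_cols(images, labelPlanes);  // stack as a second channel
    }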
< sumedhghaisas> are you starting with conditional VAEs as well?
< Atharva> I am planning to do it before I go to celebA
< sumedhghaisas> I would suggest the reverse
< sumedhghaisas> conditional VAE might be a little harder and we might run out of time
< Atharva> I don't think they will be that hard, we just need to append labels, right?
< Atharva> I just need to change repar layer a bit
< sumedhghaisas> wouldn't be that hard, that's true
< sumedhghaisas> why would we need to change the repar?
< sumedhghaisas> we should also get the open PRs in shape. :)
< Atharva> We would need to add the labels to the output of the repar layer as well, wouldn't we?
< Atharva> Yeah, I will do that today, reviewing the PRs. Also, I will open another PR for the last commits I put in the reconstruction loss PR
< Atharva> Also, do you think we can merge that PR now?
< sumedhghaisas> I am not sure if we need to change anything in Repar layer but I have to think about that again
< sumedhghaisas> and yes, about the ReconstructLoss
< sumedhghaisas> we can merge that PR. :)
< sumedhghaisas> if we remove the extra commits
< sumedhghaisas> we need to also remove NormalDistribution from it
< Atharva> Yeah, I will do that in a while
< Atharva> Oh, why?
< sumedhghaisas> It's not exactly required and we still haven't figured out the problem with that distribution in VAE
< sumedhghaisas> We shouldn't keep that distribution if we aren't sure it works with VAE
< Atharva> Okay
< Atharva> sumedhghaisas: I was just thinking, as it's an ANN dists folder and we haven't specifically said that it's for VAEs, should we keep the normal distribution?
< sumedhghaisas> umm... we could. But where will it be used at the current moment?
< sumedhghaisas> Atharva:
< Atharva> sumedhghaisas: It won't be used anywhere, at least for now
< Atharva> Another thing, the results aren't very good with bernoulli either
< Atharva> I am hoping they would get better as I train it more
< sumedhghaisas> But for bernoulli you are using binary MNIST, right?
< Atharva> Yes
< sumedhghaisas> Did you check if the binary mnist images look okay?
< Atharva> Yeah, they look fine
< sumedhghaisas> great. So what is the loss you are getting?
< Atharva> wait I will send it to you on hangouts
< sumedhghaisas> Sure thing
< Atharva> the loss now is 130, out of which 8 is KL; this is for feedforward nets
< sumedhghaisas> And does it seem to be going down?
< sumedhghaisas> although a KL of 8 does not seem correct
< sumedhghaisas> should be higher
< Atharva> Yes, it's still going down
< Atharva> I have sent an image on hangouts
< sumedhghaisas> I know we have checked this... but could you check again whether we are summing the correct dimension for KL?
< sumedhghaisas> the first dimension should be summed and the 0th dimension should be averaged
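(Concretely, in mlpack's column-per-sample layout this means summing the KL term over the latent dimension (rows) and averaging over the batch (columns), which matches the sum-over-dimension-1 / mean-over-dimension-0 convention in a TensorFlow [batch, latent] layout. A small Armadillo sketch of that convention for a diagonal Gaussian against a standard normal prior; mu and logVar are assumed to be latentSize x batchSize.)

    #include <mlpack/core.hpp>

    // KL(N(mu, sigma^2) || N(0, 1)), summed over the latent dimension and
    // averaged over the batch.  mu, logVar: latentSize x batchSize.
    double KLDivergence(const arma::mat& mu, const arma::mat& logVar)
    {
      // Per-element term: 0.5 * (exp(logVar) + mu^2 - 1 - logVar).
      const arma::mat perElement =
          0.5 * (arma::exp(logVar) + arma::square(mu) - 1.0 - logVar);

      // Sum over latent dimensions and divide by the number of samples,
      // i.e. sum over rows, mean over columns.
      return arma::accu(perElement) / perElement.n_cols;
    }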
< Atharva> sumedhghaisas: But that's how it has been for feedforward vae models with normal MNIST as well, and the results were good
< Atharva> the kl then too was under 10
< sumedhghaisas> I agree... Wait but we haven't changed MeanSquaredError right?
< sumedhghaisas> I am sensing some mistake here
< sumedhghaisas> so lets see
< Atharva> Sorry, what change?
< sumedhghaisas> so mean_squared_loss is (total_batch_loss) / (batch_size * num_attributes)
< sumedhghaisas> is that right?
< sumedhghaisas> I think so
< Atharva> but I changed it to (total_batch_loss) / (batch_size)
< Atharva> locally
< sumedhghaisas> okay... so KL_loss is (total_loss) / (batch_size)
< sumedhghaisas> this would be awkward for other users though
< sumedhghaisas> I think we should shift to (total_loss) / (batch_size) for MeanSquaredLoss as well
< Atharva> I changed Meansquared to (total_batch_loss) / (batch_size) as well
< Atharva> locally
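(The two normalizations being discussed, side by side, as an Armadillo sketch; prediction and target are assumed to be numAttributes x batchSize, and these free functions are illustrative rather than mlpack's actual MeanSquaredError implementation.)

    #include <mlpack/core.hpp>

    // total_batch_loss / (batch_size * num_attributes): the existing
    // MeanSquaredError scaling being discussed.
    double MeanOverAll(const arma::mat& prediction, const arma::mat& target)
    {
      return arma::accu(arma::square(prediction - target)) /
          (prediction.n_rows * prediction.n_cols);
    }

    // total_batch_loss / batch_size: sum over attributes per sample, then
    // mean over the batch, matching how the KL term is scaled.
    double MeanOverBatch(const arma::mat& prediction, const arma::mat& target)
    {
      return arma::accu(arma::square(prediction - target)) / prediction.n_cols;
    }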
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#15 (tree - 7116beb : Manish): The build is still failing.
travis-ci has left #mlpack []
ImQ009 has joined #mlpack
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#80 (tree - 7116beb : Manish): The build is still failing.
travis-ci has left #mlpack []
zoq_ is now known as zoq
chrisanthamum has joined #mlpack
yaswagner has joined #mlpack
pd09041999 has joined #mlpack
wenhao has joined #mlpack
pd09041999 has quit [Ping timeout: 268 seconds]
chrisanthamum has quit [Ping timeout: 252 seconds]
< rcurtin> ok, just an update from my end, I still don't have any clarity on the Symantec build systems yet... I simply haven't gotten a response
< rcurtin> also, there is interest in the new company in mlpack, so I think the first work I do there will be to write Julia wrappers for mlpack
< rcurtin> I saw that the Julia machine learning software offerings are quite small, so I think that mlpack can be a nice addition that will see adoption in that community
< ShikharJ> rcurtin: Nice idea.
< rcurtin> there are neural network toolkits available already for Julia (most wrapped from other languages), but lots of "traditional" techniques like HMMs, GMMs, nearest neighbor search, and others don't seem to be readily available
< rcurtin> ShikharJ: thanks, I hope that the company will not be the only users of the bindings :)
ImQ009 has quit [Quit: Leaving]
< zoq> rcurtin: I guess, in this case, no response is a good thing; what are the chances they forgot it even existed.
< zoq> rcurtin: I'm wondering, is Julia a 'database' language?
< zoq> ShikharJ: Do you need any help with the SSRBM test?
yaswagner has quit [Quit: Page closed]