ChanServ changed the topic of #mlpack to: "mlpack: a fast, flexible machine learning library :: We don't always respond instantly, but we will respond; please be patient :: Logs at http://www.mlpack.org/irc/"
< KimSangYeon-DGU[>
The training time took about an hour and a half
< kartikdutt18[m]>
Hmm, I can't get through even a single epoch in an hour. The model is a bit different from Darknet 19.
< KimSangYeon-DGU[>
Yes, it's a simpler model than Darknet 19.
< KimSangYeon-DGU[>
Maybe there is a bottleneck in the DarkNet implementation.
< kartikdutt18[m]>
What do you suggest?
< KimSangYeon-DGU[>
When you trained, can you use multiple cores?
< KimSangYeon-DGU[>
* When you trained, could you use multiple cores by any chance?
< kartikdutt18[m]>
I am using OpenMP.
< KimSangYeon-DGU[>
Ok
< kartikdutt18[m]>
I'll just check the number of threads.
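(For reference, a minimal standalone sketch of checking how many threads OpenMP will actually use; compile with -fopenmp. This is generic OpenMP code, not part of the training script.)

```cpp
// Quick check of the OpenMP thread count (compile with -fopenmp).
#include <omp.h>
#include <cstdio>

int main()
{
  std::printf("max threads: %d\n", omp_get_max_threads());

  #pragma omp parallel
  {
    #pragma omp single
    std::printf("threads in this parallel region: %d\n", omp_get_num_threads());
  }
  return 0;
}
```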
< KimSangYeon-DGU[>
My laptop spec is i7, 16gb (8 cores)
< kartikdutt18[m]>
I am using 12 threads according to activity manager.
< kartikdutt18[m]>
Should I send you the training subset, or use zoq's machine if that would be faster?
< KimSangYeon-DGU[>
Let me try to train from scratch. Can I reproduce the training by using the DarkNet model PR?
< KimSangYeon-DGU[>
Did you make any change in your local repo?
< KimSangYeon-DGU[>
, not the remote repo
< kartikdutt18[m]>
Let me just push it again ( I think only training params might be different).
< KimSangYeon-DGU[>
Ok, and I found that in the Darknet framework the loading time is very fast. It only loads as many images as the batch size at a time.
< KimSangYeon-DGU[>
The batch size was 128
< kartikdutt18[m]>
We load the whole dataset first and then train the model.
< KimSangYeon-DGU[>
Right
< KimSangYeon-DGU[>
I'll reproduce the small network in mlpack and then we'll see what's happening.
< kartikdutt18[m]>
Sure, Let me know if I need to make the model.
< KimSangYeon-DGU[>
:)
< kartikdutt18[m]>
I can reuse the functions in the DarkNet class.
< KimSangYeon-DGU[>
But your laptop is already busy training the model, so I think it's good to leave it as it is for the time being.
< kartikdutt18[m]>
Sure :)
< KimSangYeon-DGU[>
Without stopping
< KimSangYeon-DGU[>
:)
< KimSangYeon-DGU[>
I was trying to find any Darknet 19 references other than the official one, but I haven't found anything yet. I'll share it with you if I find something.
< kartikdutt18[m]>
Thanks, I'll check if we can load images in parallel as well. Will keep posting updates about training.
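(One possible shape for parallel loading, sketched with OpenMP around mlpack's image `data::Load` overload; the function name and file list are illustrative, and whether concurrent `Load` calls are safe would still need to be verified.)

```cpp
// Illustrative sketch: load each image into its own column of the dataset in
// parallel. Assumes every file exists and has the same width/height/channels.
#include <mlpack/core.hpp>
#include <vector>
#include <string>

void LoadImagesParallel(const std::vector<std::string>& files,
                        arma::mat& dataset,
                        mlpack::data::ImageInfo& info)
{
  // Load the first image serially to fix the number of rows.
  mlpack::data::Load(files[0], dataset, info, true);
  dataset.resize(dataset.n_rows, files.size());

  // Each file goes into its own column, so the writes do not overlap.
  #pragma omp parallel for schedule(dynamic)
  for (long i = 1; i < (long) files.size(); ++i)
  {
    arma::mat image;
    mlpack::data::ImageInfo localInfo = info;  // Per-thread copy.
    mlpack::data::Load(files[i], image, localInfo, true);
    dataset.col(i) = image;
  }
}
```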
< KimSangYeon-DGU[>
Thanks!
ImQ009 has joined #mlpack
< KimSangYeon-DGU[>
kartikdutt18: Let me train the model using Darknet 19 in the Darknet framework first.
< kartikdutt18[m]>
Ok, on Cifar 10?
< KimSangYeon-DGU[>
Yes
< kartikdutt18[m]>
I think you would have to change a few pooling layers (increase padding) because for a 32 x 32 input the size will go to 0.
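(For reference, the output-size arithmetic behind that remark: a 32 x 32 input is exhausted by Darknet-19's five stride-2 pooling stages, so any later unpadded layer would compute a zero-sized output unless padding is increased.)

```cpp
// Spatial output size of a conv/pool stage:
//   out = floor((in + 2*pad - kernel) / stride) + 1.
// For a 32 x 32 input and five 2x2, stride-2 max pools (pad = 0):
//   32 -> 16 -> 8 -> 4 -> 2 -> 1
// so a further unpadded 3x3 convolution or pooling stage would drop to 0.
int OutSize(const int in, const int kernel, const int stride, const int pad)
{
  return (in + 2 * pad - kernel) / stride + 1;
}
```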
< saksham189Gitter>
@kartikdutt18 Did you find any examples of using Darknet on CIFAR dataset online? Do you think we should try with a different dataset?
< KimSangYeon-DGU[>
And the initial loss is `2.419931`
< kartikdutt18[m]>
I didn't find darknet 19 or 53 with cifar 10. There is only darknet small. I think I got nearly the same loss with Xavier Initialization but it didn't change after 2 epochs for various learning rates.
< KimSangYeon-DGU[>
I'm training the model with cifar-10 and Darknet 19.
< KimSangYeon-DGU[>
saksham189 (Gitter): Would it be better for us to change the dataset to imagenet?
< saksham189Gitter>
Yes but I guess the training time would be even longer, right?
< KimSangYeon-DGU[>
I think so
< kartikdutt18[m]>
I think so, yes
< saksham189Gitter>
and ImageNet has 1000 classes, so it's definitely a much harder problem.
< kartikdutt18[m]>
Yes.
sakshamb189[m] has joined #mlpack
< KimSangYeon-DGU[>
I'm training the Darknet-19 model for CIFAR-10 in the Darknet framework. After the first epoch, let me check the validation error.
< KimSangYeon-DGU[>
* I'm training the Darknet-19 model for CIFAR-10 in the Darknet framework. After the first epoch, let me check the validation accuracy.
< KimSangYeon-DGU[>
To make the benchmark
< KimSangYeon-DGU[>
Currently, the loss is not decreasing
< KimSangYeon-DGU[>
* Currently, the loss is not decreasing easily
< HimanshuPathakGi>
But I have to make some changes to support Gaussian in this implementation
< saksham189Gitter>
I think the Gaussian kernel is already implemented in `src/mlpack/core/kernels/` and you would be adding the radial basis function kernel?
< HimanshuPathakGi>
Yup that's why I am adding template parameter KernelType
< HimanshuPathakGi>
I will use that implementation
< saksham189Gitter>
and you would be adding a radial basis function kernel right?
< HimanshuPathakGi>
The radial basis function kernel and the Gaussian kernel are the same thing.
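(For reference, the Gaussian/RBF kernel is k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)); a minimal sketch of evaluating it with the `GaussianKernel` already in `src/mlpack/core/kernels/`. The bandwidth value here is arbitrary.)

```cpp
// Minimal sketch: evaluate the Gaussian (RBF) kernel between two points.
#include <mlpack/core.hpp>
#include <mlpack/core/kernels/gaussian_kernel.hpp>
#include <iostream>

int main()
{
  arma::vec a = {1.0, 2.0, 3.0};
  arma::vec b = {1.5, 2.5, 2.0};

  // k(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2)); bandwidth chosen arbitrarily.
  mlpack::kernel::GaussianKernel kernel(0.5);
  std::cout << "k(a, b) = " << kernel.Evaluate(a, b) << std::endl;
  return 0;
}
```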
< saksham189Gitter>
oh ok.. I see
< saksham189Gitter>
Have you written the blog for this week?
< HimanshuPathakGi>
I want to write it after completing the implementation
< HimanshuPathakGi>
So that I have something valuable to add
< saksham189Gitter>
okay sure :) just make sure you share it here when you are done
< HimanshuPathakGi>
Yup I will try to work tonight on it:)
< HimanshuPathakGi>
I have just one concern
< saksham189Gitter>
yup sure, we can discuss right now
< HimanshuPathakGi>
After completing it, I have to compare it with libsvm.
< HimanshuPathakGi>
But there they were using an optimised version of SMO.
< HimanshuPathakGi>
So, I think it will work better than our SVM.
< HimanshuPathakGi>
I think we should discuss this after the implementation; that will be a better approach :)
< saksham189Gitter>
I think we can first implement the basic functionality of the kernel svm and then do the comparison and discuss which optimisations if any we might want to implement. Let me know what you think.
< HimanshuPathakGi>
Yup, that makes sense; we will discuss it after implementing the basic functionality :)
< saksham189Gitter>
also are you following the implementation here
< HimanshuPathakGi>
Yup, but I have to make some changes in it to support the Gaussian kernel as well.
< HimanshuPathakGi>
So the implementation will not be exactly the same.
< saksham189Gitter>
yup sure I see.
< saksham189Gitter>
Let me know when the PR is ready for a review.
< saksham189Gitter>
Is there anything else we should discuss?
< HimanshuPathakGi>
Sure I will tag you for a review :)
< HimanshuPathakGi>
Anything else you'd like to discuss?
< saksham189Gitter>
If there are any blockers you are facing then let me know.
< HimanshuPathakGi>
Yup, right now I am not sure of any; I will ask for help if I get stuck.
< saksham189Gitter>
alright sure. Then we will meet next time. Have a great week ahead.
< HimanshuPathakGi>
Yeah, until next time. Have a nice week.
< HimanshuPathakGi>
:)
< sakshamb189[m]>
kartikdutt18: do we have a meeting right now?
< kartikdutt18[m]>
Hi sakshamb189 , we do.
< sakshamb189[m]>
I guess you are still training the DarkNet model, right?
< kartikdutt18[m]>
Right, Let me post another update on the PR.
< sakshamb189[m]>
Ok sure. Is there any improvement in the validation accuracy?
< sakshamb189[m]>
I think the model might be overfitting on the CIFAR dataset since we have also reduced the training size.
< kartikdutt18[m]>
Both are increasing but the increase is very slow.
< kartikdutt18[m]>
* Both training and validation accuracy; however, they are not good enough for classification.
< sakshamb189[m]>
so what's the final validation accuracy you are getting after 3 epochs?
< kartikdutt18[m]>
The third epoch isn't complete yet. It's at 94%.
< kartikdutt18[m]>
For the second it was a bit over 11%
< sakshamb189[m]>
and the training set size was 12.5k right?
< kartikdutt18[m]>
Yes.
< sakshamb189[m]>
did you check the class distribution in the train set?
< sakshamb189[m]>
Is it uniform?
< kartikdutt18[m]>
Each class has the number of images. (1250)
< kartikdutt18[m]>
* Each class has the same number of images. (1250)
< sakshamb189[m]>
and I am guessing you have a uniform distribution with the validation set right?
< kartikdutt18[m]>
Hmm, I don't think so. I used mlpack's data split and I don't think that gives a uniform distribution.
< kartikdutt18[m]>
It randomly picks indices from the dataset.
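(For context, a minimal sketch of that split with `mlpack::data::Split`: it draws shuffled indices, so per-class proportions in the two parts are only uniform in expectation rather than guaranteed; a stratified split would have to be done class by class.)

```cpp
// Sketch: random (non-stratified) train/validation split with mlpack.
#include <mlpack/core.hpp>
#include <mlpack/core/data/split_data.hpp>

void SplitExample(const arma::mat& dataset, const arma::Row<size_t>& labels)
{
  arma::mat trainData, validData;
  arma::Row<size_t> trainLabels, validLabels;

  // 20% of the columns go to the validation set, chosen by shuffled indices;
  // class balance is not enforced.
  mlpack::data::Split(dataset, labels, trainData, validData,
                      trainLabels, validLabels, 0.2);
}
```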
< sakshamb189[m]>
when we used the entire train set for training we got 76% accuracy on the validation set right?
< kartikdutt18[m]>
On training set.
< kartikdutt18[m]>
We didn't have the validation part there then. I added it in the next trial
< sakshamb189[m]>
alright and right now our train accuracy is around 11% as well .
< kartikdutt18[m]>
Yes.
< sakshamb189[m]>
I am a bit confused, since even if the model is overfitting, the train accuracy should have been higher (compared to our previous trial with the full train set), but right now it is very low.
< kartikdutt18[m]>
I can try a higher learning rate. I tried 0.1, 0.01 and 0.0001 for about 300 iterations each, and only 0.001 led to a decrease in loss.
< kartikdutt18[m]>
Hmm, I don't think this version is overfitting, I think the one with the full dataset did.
< sakshamb189[m]>
Did you make any changes to the implementation after that?
< kartikdutt18[m]>
I tried a few things. I tried different initializations; right now I am using random, since the model didn't change its loss with a different one. Since I was trying high learning rates, I also added a batch norm layer before the last linear layer.
< kartikdutt18[m]>
* I tried a few things. I tried different initializations; right now I am using random, since the model didn't show a decrease in loss with a different one. Since I was trying high learning rates, I also added a batch norm layer before the last linear layer.
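(For reference, a rough sketch of "a batch norm layer before the last linear layer" with mlpack's ANN layers; the 1024-dimensional feature size and the loss/initialization choices here are placeholders, not the actual DarkNet tail.)

```cpp
// Sketch of a classifier tail with BatchNorm just before the final Linear layer.
#include <mlpack/core.hpp>
#include <mlpack/methods/ann/ffn.hpp>
#include <mlpack/methods/ann/layer/layer.hpp>
#include <mlpack/methods/ann/loss_functions/negative_log_likelihood.hpp>
#include <mlpack/methods/ann/init_rules/random_init.hpp>

using namespace mlpack::ann;

void AddClassifierTail(FFN<NegativeLogLikelihood<>, RandomInitialization>& model)
{
  // ... convolutional feature extractor assumed to be added already ...
  model.Add<BatchNorm<>>(1024);    // Normalize the 1024-dim feature vector.
  model.Add<Linear<>>(1024, 10);   // 10 classes (CIFAR-10).
  model.Add<LogSoftMax<>>();
}
```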
< sakshamb189[m]>
why do you think the previous one was overfitting while this one is not?
< kartikdutt18[m]>
The darknet 19 model converged very quickly, If that was the case it should have converged again with nearly the same params. I don't training accuracy of nearly 80% can be achieved on a single pass on the dataset.
< kartikdutt18[m]>
* The Darknet 19 model converged very quickly; if that was the case it should have converged again with nearly the same params. I don't think a training accuracy of nearly 80% can be achieved in a single pass over the dataset.
< sakshamb189[m]>
but we used almost the same architecture again with a smaller training set right ?
< kartikdutt18[m]>
Right.
< kartikdutt18[m]>
I even tried on the full set before training on the subset; I wasn't able to reproduce the results of a few hundred iterations, hence I said it was overfitting.
< sakshamb189[m]>
yeah but the only changes that were made in between were adding a batch-norm layer and changing some hyper-parameters right?
< kartikdutt18[m]>
Yes. I also don't understand how the accuracy was achieved.
< sakshamb189[m]>
IMO we can remove the batch norm layer and try training on the small test set and see if that helps to improve the validation accuracy.
< sakshamb189[m]>
Let me know what you think.
< kartikdutt18[m]>
Sure, but then we can't use the current weights. I can stop the training and make the change. Also, I'll try experimenting with the learning rate. I know mlpack has a hyperparameter tuner class; can we use that?
< kartikdutt18[m]>
and ensmallen has grid search
< sakshamb189[m]>
Alright just save the current weights.
< kartikdutt18[m]>
Ok, the weights will be saved after three epochs, in 15-20 minutes.
< kartikdutt18[m]>
*5-10 mins
< sakshamb189[m]>
yes so we can wait for that to finish and then restart the training :)
< sakshamb189[m]>
then I guess we could go ahead with Xavier? what do you think?
< kartikdutt18[m]>
Sure, If we can find a good learning rate for that we won't have to do a lot of epochs.
< kartikdutt18[m]>
Also, about transferring weights from the Darknet framework: I'll try to develop a proof of concept for it. If it works we can avoid training completely.
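(A rough sketch of where such a proof of concept could start. It assumes, which would need to be verified against the Darknet source, that a `.weights` file is a small integer header followed by raw 32-bit floats in layer order; the function name is hypothetical, and mapping the floats onto mlpack layer parameters is the real work.)

```cpp
// Hypothetical starting point for a Darknet -> mlpack weight-transfer POC.
// Assumed layout (to be verified): 3 x int32 header (major, minor, revision),
// a "seen" image counter, then raw float32 weights in layer order.
#include <mlpack/core.hpp>
#include <cstdint>
#include <fstream>
#include <vector>

arma::vec ReadDarknetWeights(const std::string& path)
{
  std::ifstream in(path, std::ios::binary);

  int32_t header[3];
  in.read(reinterpret_cast<char*>(header), sizeof(header));

  // Newer Darknet versions store the "seen" counter as 64 bits, older as 32.
  in.ignore((header[0] * 10 + header[1] >= 2) ? sizeof(uint64_t)
                                              : sizeof(uint32_t));

  // Everything that follows is raw float32 data.
  std::vector<float> raw;
  float value;
  while (in.read(reinterpret_cast<char*>(&value), sizeof(float)))
    raw.push_back(value);

  // mlpack parameters are doubles, so widen while copying.
  return arma::conv_to<arma::vec>::from(arma::fvec(raw));
}
```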
< KimSangYeon-DGU[>
So sorry...
< sakshamb189[m]>
Alright sure. Is there anything else we should discuss?
< sakshamb189[m]>
KimSangYeon-DGU: it's fine. no worries
< kartikdutt18[m]>
Not really. KimSangYeon-DGU , did you get a validation accuracy?
< KimSangYeon-DGU[>
It'll be done after 2 mins
< kartikdutt18[m]>
Ahh nice.
< KimSangYeon-DGU[>
I had difficulty getting accustomed to the time difference...
< kartikdutt18[m]>
No worries.
< KimSangYeon-DGU[>
8000 images left
< KimSangYeon-DGU[>
So far, the accuracy is 10%
< kartikdutt18[m]>
On training or on validation?
< KimSangYeon-DGU[>
I did 4 epochs
< KimSangYeon-DGU[>
on the test dataset in cifar/test directory
< KimSangYeon-DGU[>
for the 10,000 images
< kartikdutt18[m]>
Hmm, I think the mlpack implementation gives the same result in nearly the same number of epochs.
< KimSangYeon-DGU[>
Yes, I think the image size of CIFAR-10 is too small for the Darknet-19 architecture.
< kartikdutt18[m]>
Would resizing help? I think that would degrade the image quality.
< KimSangYeon-DGU[>
Yes, it degrades the quality and in the end it doesn't help...
< KimSangYeon-DGU[>
The validation accuracy is 10%
< KimSangYeon-DGU[>
Can we find more higher resolution image classification dataset?
< KimSangYeon-DGU[>
* Can we find a higher-resolution image classification dataset?
< KimSangYeon-DGU[>
Imagenet is too much... haha 122GB, as far as I know
< kartikdutt18[m]>
Hmm, I am also not sure if Darknet handles it internally. In the object detection script, I used MinMaxScaler on the images.
< KimSangYeon-DGU[>
I'm not sure either. It needs to be checked.
< KimSangYeon-DGU[>
zoq: As an alternative, we thought about the possibility of migrating Darknet's pretrained weights to mlpack. Does that make sense to you?
< zoq>
KimSangYeon-DGU[: Yes sounds like a good idea to me, I guess we "just" have to make sure the format is the same.
< zoq>
Personally, I would work on the pre-trained model option first; I guess that includes image scaling as well.
< KimSangYeon-DGU[>
Ok, what do you think kartikdutt18?
< kartikdutt18[m]>
I will try developing a POC for pretrained weights.
< KimSangYeon-DGU[>
Ok, if we succeeded to do that, we can do in other models as we said :)
< KimSangYeon-DGU[>
* Ok, if we succeeded to do that, we can also apply it to other models as we said :)
< walragatver[m]>
jeffin143: Hi.
< jeffin143[m]>
walragatver: hi
< jeffin143[m]>
@kimsangyeon-dgu:matrix.org: @kartikdutt18:matrix.org are you done with your meet :)
< jeffin143[m]>
If not we can wait
< KimSangYeon-DGU[>
Yes, we're done :) thanks for asking
< KimSangYeon-DGU[>
kartikdutt18: thanks for the meeting and please ping me if you have anything you want to discuss!
< jeffin143[m]>
Also, if possible, I need your review on the image logging PR too.
< jeffin143[m]>
Also I have updated the blog for this week :)
< jeffin143[m]>
That's everything from my side
< walragatver[m]>
jeffin143: Sorry, I got sidetracked.
< walragatver[m]>
I will give you a review soon.
< jeffin143[m]>
walragatver: no issues :) all good
< jeffin143[m]>
I have opened two PRs for that reason, so that work doesn't stop.
< walragatver[m]>
jeffin143: What's the plan ahead? What would you be implementing next?
< jeffin143[m]>
walragatver: if we are done with image and testing
< jeffin143[m]>
I will go for text and audio next
< jeffin143[m]>
I am planning to complete the image and testing PRs before the first phase ends.
< walragatver[m]>
jeffin143: And I saw your blog. I think from this year participants are allowed to write blogs anywhere
< walragatver[m]>
<jeffin143[m] "I will go for text and audio nex"> Okay
< jeffin143[m]>
> jeffin143: And I saw your blog. I think from this year participants are allowed to write blogs anywhere
< jeffin143[m]>
Yes, but I guess that was true every year, we just went with irc last year :)
< jeffin143[m]>
Sorry, the blog repo*
< RyanBirminghamGi>
jeffin143: I'll also give another pass at reviews soon!
< jeffin143[m]>
<jeffin143[m] "Sry the blog repo*"> Ryan Birmingham (Gitter): sure :)
< walragatver[m]>
jeffin143: Okay, are you sure that this time we are allowed to use the blog repo?
< jeffin143[m]>
I guess so
< walragatver[m]>
Because no one else is using it. And I think there was no mention of the blog repo in the introductory mails this time.
< walragatver[m]>
It might happen that we stop maintaining it. So just confirm it.
< jeffin143[m]>
Maybe I can send it through the mailing list too.
< jeffin143[m]>
walragatver: maybe I can send it through the mailing list too
< walragatver[m]>
jeffin143: It's fine; I would say continue with the blog. Just change the location if something happens. Don't use the mailing list, because it's a communication medium.
< jeffin143[m]>
I will confirm it with Ryan and let you know
< walragatver[m]>
jeffin143: Are you also going to add graph support to the library?
< jeffin143[m]>
walragatver: we don't follow the graph convention in mlpack, right?
< walragatver[m]>
Graphs in the sense of histograms, etc.
< jeffin143[m]>
Yes, histogram and PR curve
< jeffin143[m]>
Text audio
< jeffin143[m]>
Embedding
< jeffin143[m]>
There are 5 more things on my list.
< walragatver[m]>
Okay
< walragatver[m]>
The path ahead would be quite smooth now as the CI and testing implementation is over.
< jeffin143[m]>
walragatver: yes, I am sure we can speed up a little bit :)
< jeffin143[m]>
I can write more tutorials then, maybe, if I am left with some time.
< walragatver[m]>
jeffin143: Have you given any thought to callbacks? Any implementation ideas or anything?
< jeffin143[m]>
walragatver: no , I will try to sketch it this week
< jeffin143[m]>
Maybe then we can schedule a meeting with zoq to get his input as well?
< walragatver[m]>
jeffin143: And what about your joining date? When would you be joining the firm?
< jeffin143[m]>
If a user wants to log something else, he has to create it as a custom function and pass it to the callback.
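(One possible shape for that: an ensmallen-style callback whose `EndEpoch()` hook receives the epoch number and the current objective; the console print stands in for whatever user-supplied logging sink would be plugged in.)

```cpp
// Sketch of a user-defined logging callback in the ensmallen callback style.
#include <iostream>
#include <cstddef>

class LossLoggingCallback
{
 public:
  // ensmallen calls EndEpoch() at the end of every epoch.
  template<typename OptimizerType, typename FunctionType, typename MatType>
  void EndEpoch(OptimizerType& /* optimizer */,
                FunctionType& /* function */,
                const MatType& /* coordinates */,
                const size_t epoch,
                const double objective)
  {
    // A custom sink (e.g. a TensorBoard-style writer) could replace this.
    std::cout << "epoch " << epoch << ": objective = " << objective << std::endl;
  }
};

// Usage sketch: model.Train(data, labels, optimizer, LossLoggingCallback());
```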
< walragatver[m]>
jeffin143: I am not sure about the TF callbacks; I will check them out.
< walragatver[m]>
jeffin143: If it's all about accuracy and loss then it would be quite easy to implement.
< jeffin143[m]>
> jeffin143: And what about your joining. When would you be joining the firm?
< jeffin143[m]>
They contacted me and told me that hiring is paused and they wouldn't be offering me a full-time job.
< jeffin143[m]>
They said they will try a contract offer, so as not to leave me stranded midway.