ChanServ changed the topic of #mlpack to: "mlpack: a fast, flexible machine learning library :: We don't always respond instantly, but we will respond; please be patient :: Logs at http://www.mlpack.org/irc/"
[DONG]Gibby is now known as d1
mistiry_ has joined #mlpack
mistiry has quit [Ping timeout: 240 seconds]
ImQ009 has joined #mlpack
< himanshu_pathak[> Hey everyone, here is the link to my weekly blog post: https://medium.com/@hpathak336/gsoc2020-week-8-9-3ecd3cb26eb4
< say4n> Does mlpack participate in GSoD?
< rcurtin> say4n: we haven't in the past, but I think it might be a cool idea
< say4n> Ah, cool!
< chopper_inbound[> Hello everyone, here is my weekly update for weeks 9 and 10: https://mrityunjay-tripathi.github.io/gsoc-with-mlpack/coding_period/week9_and_10.html Please have a look :)
< rcurtin> himanshu_pathak[: chopper_inbound[: awesome, thanks for the link!
< rcurtin> I like the git merge gif :)
< rcurtin> chopper_inbound[: I agree about reading, I always prefer to read something that's actually printed vs. on a screen
< chopper_inbound[> @rcurtin:matrix.org: 👍
ImQ009 has quit [Read error: Connection reset by peer]
ImQ009 has joined #mlpack
< himanshu_pathak[> <rcurtin "I like the git merge gif :)"> :)
< himanshu_pathak[> Hey zoq, I didn't get this: http://ci.mlpack.org/job/pull-requests-mlpack-static-code-analysis/6196/cppcheck/ What can I do? I can't remove this variable, and we don't have to do anything else with it??
< zoq> himanshu_pathak[: You can ignore the issue, sometimes we get false warnings; what we usually do in those cases is add the error to: https://github.com/mlpack/jenkins-conf/blob/master/static/clean_cppxml.py#L14
< himanshu_pathak[> <zoq "himanshu_pathak: You can ignore "> Oh ok. when you get time can you do a review on my It will be a quick one . I can rebase my DBN pr after merging of this :)
ImQ009 has quit [Quit: Leaving]
< zoq> himanshu_pathak[: The RBM one?
< himanshu_pathak[> <zoq "himanshu_pathak: The RBM one?"> Yes, removing the r-value reference
< zoq> himanshu_pathak[: Alright, I'll take a look at the changes.
< himanshu_pathak[> <zoq "himanshu_pathak: Alright, I'll t"> Thanks
< say4n> For some reason the release script for ensmallen doesn't work on macOS. I think it is because `git diff | wc -l` on BSD outputs a tab followed by 0, which fails the string comparison against "0". A quick fix would be to strip any whitespace characters from the output before comparing.
< say4n> Should I add it to the current PR updating the release script or make a separate one?
< rcurtin> say4n: sure, feel free :)
< rcurtin> I certainly haven't tested it on OS X so any fixes are totally appreciated
< say4n> Alrighty! :)
< say4n> Also, sed is pretty mysterious on BSD. I remember from the last time I was playing with the release script that it was missing some flags the script used.
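A minimal sketch of the fix say4n describes (the variable name `changes` and the exact comparison are illustrative, not taken from the actual release script):

```sh
# BSD wc pads its output with whitespace, so a raw string comparison
# against "0" fails even when the diff is empty.
changes=$(git diff | wc -l)
changes=$(echo "$changes" | tr -d '[:space:]')  # strip the padding
if [ "$changes" = "0" ]; then
  echo "working tree is clean"
fi
```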
ak has joined #mlpack
< ak> hello, I am having some difficulty training a logistic regression model one point at a time; it seems like the model never gets better / the parameters reset after each point
< ak> Say my model has 5 variables; I am calling `model.Train(input, labels);`, where `input` is a column vector of doubles and `labels` is a `Row<size_t>` of size 1.
< zoq> ak: So my idea would be to do the training outside the LR class itself, and do the initialization once,
< zoq> then reassign the trained parameters after the optimization process.
< zoq> something like: `model.Parameters() = parameters;`
< zoq> that way you can reuse the LR class to call Classify() etc.
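A rough sketch of what zoq is suggesting (the SGD optimizer choice, the mlpack 3.x type names, and the random data are assumptions for illustration, not the exact code):

```cpp
#include <mlpack/methods/logistic_regression/logistic_regression.hpp>
#include <mlpack/methods/logistic_regression/logistic_regression_function.hpp>
#include <ensmallen.hpp>

using namespace mlpack::regression;

int main()
{
  const size_t dimensionality = 5;
  LogisticRegression<> model(dimensionality, 0.0 /* lambda */);

  // Keep one parameter vector alive across batches (warm start).
  arma::rowvec parameters = model.Parameters();

  // Pretend this is a newly arrived batch: one point per column.
  arma::mat X(dimensionality, 10, arma::fill::randu);
  arma::Row<size_t> y =
      arma::randi<arma::Row<size_t>>(10, arma::distr_param(0, 1));

  // Run the optimization outside the class, starting from the
  // previous parameters instead of a fresh zero vector.
  LogisticRegressionFunction<> lrf(X, y, 0.0);
  ens::SGD<> sgd;
  sgd.Optimize(lrf, parameters);

  // Reassign so Classify() etc. use the updated parameters.
  model.Parameters() = parameters;
}
```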
< ak> ah that makes a lot of sense, thank you. Does that mean I will lose out on any of the LR optimization stuff then?
< ak> optimization as in vector operations / other speed ups
< zoq> or you can modify the class itself; I think adding `if (parameters.is_empty())` right before https://github.com/mlpack/mlpack/blob/0ed562bd52f70bbdca02f62484836ed56e553f27/src/mlpack/methods/logistic_regression/logistic_regression_impl.hpp#L90-L91 should do the trick as well.
< zoq> Hm, depending on the optimizer you might want to adjust the step size and batch size.
< zoq> But the default settings should work as well.
< zoq> Actually, I'm wondering if we should just add that line; I'll test it out tomorrow.
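A paraphrased sketch of that guard (the reset line is approximated from the linked code, not quoted verbatim):

```cpp
// Only zero-initialize when the model has never been trained, so that
// repeated Train() calls continue from the previous solution.
if (parameters.is_empty())
  parameters = arma::rowvec(predictors.n_rows + 1, arma::fill::zeros);
```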
< rcurtin> ak: it sounds like you are only giving one point for `input` and one label; maybe you should pass the entire dataset in the call to `Train()`?
< ak> Yes, I sadly think I misread the docs. I am guessing https://www.mlpack.org/doc/mlpack-3.0.4/doxygen/classmlpack_1_1regression_1_1LogisticRegression.html#a0f30e4158da412fc6603ad8327c8d258, "This will use the existing model parameters as a starting point for the optimization." is not the same as incremental / online learning?
< rcurtin> it seems like maybe the documentation is incorrect! the code that zoq linked to resets the parameters in each call to Train()
< rcurtin> give me just a second---I will open a PR to fix the behavior
< ak> is there another ML algorithm I could implement that would work with that flow? I am kind of new to this work and want something that will get better whenever I have new data
< rcurtin> so, after I open this PR, LogisticRegression will behave like you expect (and like the documentation says it should)
< rcurtin> not all machine learning models can be trained incrementally, but logistic regression and neural networks can, at least
< rcurtin> I believe SoftmaxRegression (which is just an extension of logistic regression for more than 2 classes) will work that way too
< rcurtin> (and, glancing at the code there, I believe that incremental training will work correctly)
< ak> that's great, thank you! Ha, it is a relief to hear that!
< rcurtin> the bugs always sneak in somehow when we aren't looking :) thank you for reporting this!
< rcurtin> I believe that the model will train and perform best, though, when you can give it as many points as possible at a time when calling Train()
< rcurtin> ("perform best" both in terms of accuracy and speed)
< ak> that makes sense, I am just glad to know it is possible. The plan is to start the program with a fairly trained model, and as it goes it can only get better
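With a fix like that in place, the flow ak describes would look roughly like this (the dataset names and random data are placeholders, not real training data):

```cpp
#include <mlpack/methods/logistic_regression/logistic_regression.hpp>

using namespace mlpack::regression;

int main()
{
  // Start from a reasonably large initial dataset (one point per column).
  arma::mat initialData(5, 1000, arma::fill::randu);
  arma::Row<size_t> initialLabels =
      arma::randi<arma::Row<size_t>>(1000, arma::distr_param(0, 1));
  LogisticRegression<> model(initialData, initialLabels);

  // As new data arrives, Train() warm-starts from the current
  // parameters instead of resetting them; larger batches per call
  // generally help both accuracy and speed.
  arma::mat newBatch(5, 50, arma::fill::randu);
  arma::Row<size_t> newLabels =
      arma::randi<arma::Row<size_t>>(50, arma::distr_param(0, 1));
  model.Train(newBatch, newLabels);
}
```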
< rcurtin> I was reading that comment too, I think that is also inaccurate according to the code I am seeing
< rcurtin> oops, sorry, for some reason my IRC client was scrolled to the wrong message
< rcurtin> ignore what I just wrote :)
< rcurtin> I'm waiting on tests to compile and pass here, then I'll have a patch posted
< ak> =D I may not be able to update my installs for a bit; should I just add zoq's if-statement to enclose those two lines?
< rcurtin> ak: yeah, that's basically exactly what my patch does
< rcurtin> so you could just add that and it should work