ChanServ changed the topic of #mlpack to: "mlpack: a fast, flexible machine learning library :: We don't always respond instantly, but we will respond; please be patient :: Logs at http://www.mlpack.org/irc/"
ImQ009 has joined #mlpack
< Aakash-kaushikAa> Sure, No problem with that.
< jonpsy[m]> @zoq Hi, do you have a minute? So I created a [test file](https://gist.github.com/jonpsy/46b8ad50b8120172830b734e87dddde6) for generating the csv, and it's working great on my local machine. I'm not sure how to compile cpp on the binder instance, particularly the "python" linking part.
< jonpsy[m]> After installing pandas-datareader and setting it up properly, the ```pFunc = Py_GetAttrString(...)``` call is returning NULL :(
< jonpsy[m]> ... which results in a "Call Failed" error
< jonpsy[m]> After installing pandas-datareader and setting it up properly, ```PyCallable_Check(pFunc)``` is returning NULL :(
< zoq> jonpsy[m]: Will look into it.
< zoq> jonpsy[m]: About the compiling question: you don't compile it yourself, it's handled by xeus-cling.
< jonpsy[m]> Actually the xeus-cling notebook isn't verbose enough on errors
< jonpsy[m]> when I was facing errors on my local machine, the Python errors showed up concisely.
< jonpsy[m]> I reckoned if I could run my test.cpp, I could clearly find the Python errors
< zoq> Above you mentioned it works fine on your local machine, but not on the binder instance?
< jonpsy[m]> yep
< jonpsy[m]> point to remember, I used this [script](https://gist.github.com/jonpsy/46b8ad50b8120172830b734e87dddde6)
< zoq> Have you tested the Python code in a Python notebook as well?
< zoq> Currently building your branch with binder, so I can test it in some minutes as well.
< jonpsy[m]> do you mean running locally using xeus-cling?
< zoq> I mean, using lab.mlpack.org, point it at your repo and branch, open up a Python notebook and see if https://github.com/mlpack/examples/blob/231c19bc28f6de535c95a78cfb1546eb27495f79/utils/portfolio.py runs fine.
< jonpsy[m]> Thanks, I'll be on it.
< jonpsy[m]> zoq: it works perfectly fine in a Python notebook (I used custom inputs).
< jonpsy[m]> brb dinner
< zoq> jonpsy[m]: This is what I used inside a C++ notebook - https://gist.github.com/zoq/988cae9ed05844d95826d928d453cfc0
< zoq> jonpsy[m]: Which runs just fine and generates portfolio.csv
< zoq> Also, make sure to run it inside a folder and not in the root folder, because PyRun_SimpleString("sys.path.append(\"../utils/\")"); will fail otherwise.
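(For reference, a minimal sketch of the kind of embedding code being discussed, assuming a `portfolio.py` in `../utils/` that exposes a `Portfolio()` function. The arguments passed below are purely illustrative, and the standard CPython call for looking up the function is `PyObject_GetAttrString()`. The error checks mirror the spots that were returning NULL above.)

```cpp
// Minimal sketch of embedding Python from C++ to call Portfolio() from
// portfolio.py.  The arguments passed to Portfolio() below are purely
// illustrative -- check portfolio.py for the real parameters and their order
// (the bug discussed further down was exactly a wrong argument order).
#include <Python.h>
#include <iostream>

int main()
{
  Py_Initialize();

  // Make ../utils/ importable; this is why the notebook has to run from a
  // subfolder and not from the root folder.
  PyRun_SimpleString("import sys");
  PyRun_SimpleString("sys.path.append(\"../utils/\")");

  PyObject* pModule = PyImport_ImportModule("portfolio");
  if (!pModule)
  {
    PyErr_Print();  // Shows the real Python-side error (e.g. missing pandas-datareader).
    return 1;
  }

  PyObject* pFunc = PyObject_GetAttrString(pModule, "Portfolio");
  if (!pFunc || !PyCallable_Check(pFunc))
  {
    PyErr_Print();
    std::cerr << "Could not find a callable Portfolio()." << std::endl;
    return 1;
  }

  // Hypothetical arguments -- replace with the real ones, in the right order.
  PyObject* pArgs = Py_BuildValue("(ssss)", "AAPL,GOOG,NFLX", "2015-01-01",
                                  "2020-01-01", "portfolio.csv");
  PyObject* pResult = PyObject_CallObject(pFunc, pArgs);
  if (!pResult)
    PyErr_Print();  // A bare "Call failed" hides the traceback; this prints it.

  Py_XDECREF(pResult);
  Py_DECREF(pArgs);
  Py_DECREF(pFunc);
  Py_DECREF(pModule);
  Py_Finalize();
  return 0;
}
```

(Outside the notebook, the "python linking part" usually amounts to something like `g++ test.cpp $(python3-config --includes) $(python3-config --ldflags --embed)` on Python 3.8+.)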
< Aakash-kaushikAa> Btw, have we thought about improving the documentation through this? https://developers.google.com/season-of-docs/docs
< Aakash-kaushikAa> If we are selected as an organisation, we can get good documentation, and the other person gets all the benefits provided by the program
< zoq> Aakash-kaushikAa: We've thought about it, but haven't had the time in the past.
< Aakash-kaushikAa> Why I am saying this is because when I have to look something up, I prefer looking in the codebase rather than the documentation, just because that is far easier. And the documentation looks a bit odd even by today's standards.
< jonpsy[m]> > jonpsy: This is what I used inside a C++ notebook - https://gist.github.com/zoq/988cae9ed05844d95826d928d453cfc0
< jonpsy[m]> Isn't this the test.cpp which I sent previously?
< Aakash-kaushikAa> But that is like the first thing we need to do so it's more inviting to users
< Aakash-kaushikAa> when I am looking at a new ML or DL library, I go to their tutorials and then refer to the documentation very heavily.
< Aakash-kaushikAa> I mostly don't need to visit PyTorch's or TensorFlow's GitHub
< Aakash-kaushikAa> I get the part where you might be tired of people just talking about it, and maybe it just gets left out.
< Aakash-kaushikAa> https://readthedocs.org/ I like these too, and I think it is actually free for open-source projects without any hidden charges or anything
< zoq> jonpsy[m]: Close, except that I included #include <mlpack/xeus-cling.hpp> and ran the code from within a subfolder.
< jonpsy[m]> I did that too
< jonpsy[m]> that's how I came up with the "../utils" idea.
< zoq> Aakash-kaushikAa: You are right, and we touched on the topic - https://github.com/mlpack/mlpack/issues/2922 as well.
< jonpsy[m]> I reckon you compiled it in binder instance?
< zoq> jonpsy[m]: Yes, I just opened a C++14 notebook, and used the code I linked to.
< zoq> jonpsy[m]: It doesn't work for you?
< jonpsy[m]> Do you mind testing this... wait, I'll send it to you
< jonpsy[m]> Let me know if you get the "Call failed" error
< zoq> jonpsy[m]: The order of the arguments you pass to the Portfolio function isn't correct.
< jonpsy[m]> WHAAAAAAAAAAAAAAAAAAAAAAAAAAA
< jonpsy[m]> You're correct.. So I wasted 3 hours on arg ordering :(
< zoq> Wondering if the Python code returns any errors.
< jonpsy[m]> no it doesn't.
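(As an aside on the "no errors" point: when a C-API call like `PyObject_CallObject()` fails, the Python exception is set but not displayed unless something prints or fetches it. Below is a hedged sketch of a small helper, with a hypothetical name, that pulls the current exception into a C++ string so it can be shown even in a not-very-verbose notebook.)

```cpp
// Hypothetical helper: convert the currently-set Python exception into a
// std::string so it can be printed from C++ (useful when "Call failed" is
// all the notebook shows).
#include <Python.h>
#include <string>

std::string FetchPythonError()
{
  if (!PyErr_Occurred())
    return "";

  PyObject *type, *value, *traceback;
  PyErr_Fetch(&type, &value, &traceback);
  PyErr_NormalizeException(&type, &value, &traceback);

  std::string message = "unknown Python error";
  if (value)
  {
    PyObject* str = PyObject_Str(value);  // Equivalent to str(exception).
    if (str)
    {
      const char* utf8 = PyUnicode_AsUTF8(str);
      if (utf8)
        message = utf8;
      Py_DECREF(str);
    }
  }

  Py_XDECREF(type);
  Py_XDECREF(value);
  Py_XDECREF(traceback);
  return message;
}
```

(Calling this right after `PyObject_CallObject()` returns NULL would surface the underlying exception, presumably a complaint caused by the argument order here, instead of a bare "Call failed".)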
< jonpsy[m]> Lemme try specifying custom filepath, that'll be the nail in the coffin. Then it's GTM I suppose?
< rcurtin[m]> Aakash-kaushik (Aakash kaushik): yeah, we've thought about it before, the problem is, it would take a lot of time for someone to oversee it, and I don't think anyone has had the time
< rcurtin[m]> but I totally agree, it would be great if users could mostly follow examples and tutorials on the website, and never need to dive into the source
< rcurtin[m]> unfortunately we don't have the kind of resources that Google or Facebook pour into TF or PyTorch 😃
< rcurtin[m]> but, I do think this is something that will definitely be improved soon (in part by your project and others this year!)
< jonpsy[m]> > Lemme try specifying custom filepath, that'll be the nail in the coffin. Then it's GTM I suppose?
< jonpsy[m]> Done! Custom path works as well. Yipee
< zoq> jonpsy[m]: Nice!
< zoq> Aakash-kaushikAa: Btw, I made some progress with the Windows issue, in the sense that I can reproduce it locally.
< shrit[m]> Aakash-kaushik (Aakash kaushik): Totally agree with rcurtin on this point, we need to provide more and more documentation, and please, whenever you notice something is missing, either open a pull request or an issue so we can keep track of it, since we might forget where the missing things are
< shrit[m]> We need to have some kind of open documentation issue, where we update it constantly and keep track of missing parts that need to be added; this will allow new contributors to pick from the list and contribute to mlpack, basically the same way we have done with the activation layers
< jonpsy[m]> +1, I'd say the issue should be opened on mlpack/examples.
< shrit[m]> I am not sure if there is any active documentation in `mlpack/examples`; I think all of mlpack's docs are in the main mlpack repository
< jonpsy[m]> My line of reasoning is that, since mlpack/examples is meant to be an "introduction" to the library with ready-to-use examples, it's only fitting to have extensive documentation of each method there.
< jonpsy[m]> Of course, do correct me if my vision's limited
< rcurtin[m]> jonpsy: one of the things that makes it tricky is that we have interfaces in lots of different languages
< rcurtin[m]> I think for the bindings for other languages, the generated documentation is pretty good: https://www.mlpack.org/doc/mlpack-3.4.2/cli_documentation.html
< rcurtin[m]> but what would be even more awesome is if we could link each method with an example from the examples repo or something :)
< jonpsy[m]> that'd be cool!
< rcurtin[m]> it's likely that we could redo our C++ documentation in a more clear way, but I'm not exactly sure what the best way to do it would be. it would be a big undertaking
< rcurtin[m]> right now doxygen (or sphinx if we switched to that) does an ok job of showing the API and functions of each class, but it can be hard to navigate and doesn't make it clear what the "important" classes are and what the "support" classes are
< rcurtin[m]> shrit: nice PRs that you just opened, I'll try to review them shortly 😃
< shrit[m]> You are welcome; I did not check the installation one on my machine, let us see if it works correctly on the CI
< rcurtin[m]> yeah, for #2952 I will probably run it locally in a container with only a C++ compiler and openblas installed, then install it and make sure everything is there :)
< zoq> Agreed; also, I encountered another issue where the first CMake pass failed (because it couldn't find some dependencies) but the second pass was successful, since the auto-downloader had downloaded some files.
< Aakash-kaushikAa> Hey @zoq, great that you could reproduce the Windows issue; was it just because of the permissions, or something else too?
< rcurtin[m]> zoq: I've been playing with the `ann-vtable` branch... on this branch, each layer stores its own weights separately, but I think ensmallen would require that all weights are stored in a single `arma::mat` (or equivalent)... so I think maybe we will have to ensure that we set each layer's weights to be an alias before merge?
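(A rough sketch of the aliasing idea, with made-up layer sizes: the network owns one flat parameter matrix, and each layer's weights are a non-owning Armadillo view into it, so ensmallen still sees a single `arma::mat` while the layers keep their own views.)

```cpp
// Sketch of the aliasing idea: one flat parameter matrix owned by the network,
// and per-layer weight matrices that are non-owning views into it.
// Layer sizes here are hypothetical.
#include <armadillo>

int main()
{
  const size_t rows1 = 100, cols1 = 50;   // "layer 1" weights
  const size_t rows2 = 50,  cols2 = 10;   // "layer 2" weights

  // Single contiguous parameter matrix -- this is what ensmallen would optimize.
  arma::mat parameters(rows1 * cols1 + rows2 * cols2, 1, arma::fill::randu);

  // Non-owning aliases into `parameters` (copy_aux_mem = false, strict = true),
  // so updating `parameters` in the optimizer also updates the layers' weights.
  arma::mat weights1(parameters.memptr(),               rows1, cols1, false, true);
  arma::mat weights2(parameters.memptr() + rows1*cols1, rows2, cols2, false, true);

  // Any in-place update on `parameters` is immediately visible via the aliases.
  parameters *= 0.5;

  return 0;
}
```

(The copy-in/copy-out hack mentioned further down is the less efficient fallback when such aliases aren't set up.)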
< zoq> Aakash-kaushikAa: Not sure yet what the issue is; I'll continue my debugging session later today :)
< Aakash-kaushikAa> And coming back to the documentation part: I agree with all of this, and as @zoq said, this could be an issue and can be worked on slowly, like the activation layers. I think we should decide on a clear path and break it down into smaller pieces so newcomers can contribute, and it's easy for us to maintain and actually get somewhere, because I know it's going to be huge; but if we don't break it down, start, and let others know that we have this problem, maybe as a good first issue, it won't really go anywhere.
< Aakash-kaushikAa> also sorry for the big paragraph 😛
< Aakash-kaushikAa> > @Aakash-kaushik: Not sure yet, what the issue is, I'll continue my debugging session later today :)
< Aakash-kaushikAa> sure, I tried changing the install prefix and seeing if I could find something for CMAKE_INSTALL_PREFIX in mlpack's CMake file, because that doesn't error out while installing, but no success there either.
< zoq> rcurtin[m]: Yes, I don't have a good solution for that; the issue I have with the current approach is that all weights have to be kept in memory.
< rcurtin[m]> yeah, you mean that this prevents out-of-core learning?
< zoq> yes, am I wrong?
< rcurtin[m]> no, I think you're correct, you'd have to write some interesting shims to make it work otherwise
< rcurtin[m]> let me think about this a minute, because really it would be great to support that
< zoq> Aakash-kaushikAa: Sounds like a good idea to me, we can use the issue I mentioned above to figure out what our options are and then we can go from there.
< zoq> rcurtin[m]: I guess that would mainly be interesting for resource-constrained devices.
< rcurtin[m]> yeah, there is some trickiness. ensmallen expects some kind of parameters matrix that matches the Armadillo API, and ensmallen will expect to be able to compute the objective and the gradient based on those parameters (and then it will update the parameters)
< rcurtin[m]> inside of `Evaluate()` and `Gradient()`, we can be careful with how we access the parameters (in fact that will generally be one layer at a time), but when the optimizer does some linear algebra expression to update the weights, we will not have control over how memory is accessed (if we are just using an `arma::mat`, that is)
< rcurtin[m]> it almost seems to me like the "right" way to do this would be to have some kind of `CollectionOfMatrices` "shim" class (I chose an awful name, we can do better :)), which presented the same API as Armadillo, but under the hood perhaps was a little bit smarter about memory handling, loading/unloading individual matrices as necessary to handle memory pressure
< zoq> Right, I guess that is not something we should fix on the mlpack side, wondering if Conrad thought about it.
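(To make the shim idea a bit more concrete, a very rough skeleton, with all names hypothetical and covering nowhere near the full Armadillo API, might only forward the handful of operations an in-place SGD-style update needs.)

```cpp
// Very rough skeleton of a "collection of matrices" shim.  It only forwards
// the handful of operations an in-place SGD-style update needs; a real
// version would have to cover much more of the Armadillo API (or convert
// to arma::mat on demand).  All names are hypothetical.
#include <armadillo>
#include <vector>

class CollectionOfMatrices
{
 public:
  explicit CollectionOfMatrices(std::vector<arma::mat> blocks)
      : blocks(std::move(blocks)) { }

  // Total element count across all blocks (what n_elem would report).
  size_t TotalElements() const
  {
    size_t total = 0;
    for (const arma::mat& b : blocks)
      total += b.n_elem;
    return total;
  }

  // In-place scaled add: this += stepSize * other.  This is the shape of a
  // vanilla SGD update; a smarter version could load/unload blocks here to
  // handle memory pressure.
  CollectionOfMatrices& AddScaled(const CollectionOfMatrices& other,
                                  const double stepSize)
  {
    for (size_t i = 0; i < blocks.size(); ++i)
      blocks[i] += stepSize * other.blocks[i];
    return *this;
  }

  // Direct access to one layer's block, e.g. inside Evaluate()/Gradient().
  arma::mat& Block(const size_t i) { return blocks[i]; }

 private:
  std::vector<arma::mat> blocks;
};
```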
< rcurtin[m]> another thing that could be done is just the `mmap()` trick, which actually should be pretty easy for FFN parameters
< rcurtin[m]> (in this situation, you create a big mmap-ed file, then open it with `mmap()` and pass the given memory pointer to the Armadillo advanced constructor. then, you just hope the kernel brings things in and out of RAM in a reasonable way :))
< zoq> That is POSIX-only, right?
< rcurtin[m]> `mmap()` could work well if all operations on it are in-place, which at least for SGD-type optimizers should be true
< rcurtin[m]> yeah, so it wouldn't work on Windows
< rcurtin[m]> but I think Windows has similar functionality
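(For concreteness, a rough POSIX-only sketch of the mmap() trick, with a made-up file name and size; the key step is handing the mapped pointer to Armadillo's advanced constructor with copy_aux_mem = false. On Windows, the analogue would presumably be CreateFileMapping/MapViewOfFile.)

```cpp
// Rough POSIX-only sketch of the mmap() trick: back an arma::mat with a
// memory-mapped file and let the kernel page it in and out.  Only safe as
// long as all operations on the matrix are in-place (no copies or large
// temporaries).  File name and dimensions are made up.
#include <armadillo>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main()
{
  const size_t rows = 100000, cols = 1000;
  const size_t bytes = rows * cols * sizeof(double);

  int fd = open("parameters.bin", O_RDWR | O_CREAT, 0644);
  if (fd == -1)
    return 1;
  if (ftruncate(fd, bytes) == -1)  // Grow the backing file to the right size.
    return 1;

  void* mem = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (mem == MAP_FAILED)
    return 1;

  // Non-owning, strict alias over the mapped memory: Armadillo must never
  // reallocate it, so only in-place operations are safe.
  arma::mat parameters(static_cast<double*>(mem), rows, cols, false, true);
  parameters.zeros();  // In-place; pages are written straight to the mapping.

  munmap(mem, bytes);
  close(fd);
  return 0;
}
```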
< zoq> Hm, that sounds like a reasonable solution to me.
< rcurtin[m]> birm and I played with this some years ago, and then another student took over the effort; in part this culminated in a nice workshop paper: https://dl.acm.org/doi/pdf/10.1145/2882903.2914830
< rcurtin[m]> in any case, the mmap solution would require all the parameters to "seem" contiguous inside of one Armadillo matrix, so I guess we would still need some way to force all the layers in the `ann-vtable` branch to use the same memory
< rcurtin[m]> I'm trying to get the serialization tests to pass, so in order to do that I'll first do a big hack (copy out each layer's parameters during `EvaluateWithGradient()`) just to make it work, and then we can make that more efficient later
< zoq> Sounds good to me.
< zoq> I guess if it works for one, it's easy to do the same thing for the other layers as well.
< zoq> The M3 paper mentioned that it was using mlpack for testing.
< zoq> With a minimal modification :)
< rcurtin[m]> yeah, the 'minimal modification' was just creating an `arma::mat` with the advanced constructor using `mmap()` :)
< rcurtin[m]> I think maybe L-BFGS needed to be modified a little bit, but I can't remember
< rcurtin[m]> that `mmap()` hack doesn't work if, e.g., you try to copy the matrix (it'll be too big) or do any operations that cause really big intermediate matrices or something
< zoq> Honestly that sounds like a really simple and neat idea to me.
< rcurtin[m]> it won't work for GPUs, but we can figure out what to do with GPUs some other day :)
< rcurtin[m]> we do have a decent amount of flexibility since ensmallen just requires some type that implements the Armadillo API
< rcurtin[m]> so, a "wrapper class" that, e.g., holds weights for different layers on different GPUs could be reasonably possible
< zoq> Right, good thing the armadillo API is simple.
< rcurtin[m]> yeah, and in the worst case we can define an implicit conversion to some Armadillo type for, e.g., decompositions and similar
< zoq> Sure, in ten years :)
< rcurtin[m]> someday when time permits 😃
ImQ009 has quit [Quit: Leaving]