verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
umberto has quit [Quit: Page closed]
Nilabhra has joined #mlpack
ank_95_ has joined #mlpack
ranjan123 has joined #mlpack
tsathoggua has joined #mlpack
tsathoggua has quit [Client Quit]
< ranjan123> Hello! zoq,rcurtin
Nilabhra has quit [Ping timeout: 260 seconds]
Nilabhra has joined #mlpack
ank_95_ has quit [Quit: Connection closed for inactivity]
Nilabhra has quit [Remote host closed the connection]
< zoq> ranjan123: I'm able to run the code without any problems. I have to write int main() instead of just main(), but if the code compiles on your side, that's okay. Did you change the SGD optimizer somehow? Maybe you can debug the code step by step?
Nilabhra has joined #mlpack
ranjan123_ has joined #mlpack
< ranjan123_> zoq: No, I haven't changed anything in the SGD optimizer!
< ranjan123_> could you please change the number of functions to 55
< ranjan123_> and then compile and run
< ranjan123_> I have only added Log::Warn to sgd_impl.hpp to print some messages; everything else is unchanged
< zoq> ranjan123_: Works without any problems.
< ranjan123_> ohhk
< ranjan123_> thanks zoq
< ranjan123_> let me see
< ranjan123_> :)
< zoq> ranjan123_: If you could narrow down the error to a specific line, we could go from there.
< zoq> ranjan123_: Also, does the error occur at i == 55? In the Evaluate or in the Gradient function?
< ranjan123_> no, you are saying that it works without any problem, so I guess I have changed something as I was playing around with sgd.
< ranjan123_> yes
< ranjan123_> ohh i==55
< ranjan123_> not sure
< ranjan123_> let me check with fresh mlpack. then I will come back
< zoq> ranjan123_: I also checked the code, and it looks good to me, but maybe I missed something. Sounds good.
ranjan123_ has quit [Quit: Page closed]
ank_95_ has joined #mlpack
ranjan123_ has joined #mlpack
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#719 (master - 460e326 : Marcus Edel): The build passed.
travis-ci has left #mlpack []
< ranjan123_> zoq: It is giving the same error. I cloned a fresh mlpack from GitHub, built it, installed it, and then ran my program. But I have no idea why it runs on your system.
< ranjan123_> I have compiled it this way: g++ find_equ.cpp -lmlpack -lboost_serialization -larmadillo -std=c++11
< zoq> ranjan123_: That's weird, can you go through the program step by step, to narrow down the error position? I guess, once we know the exact line, we can solve the error.
< ranjan123_> see this
< ranjan123_> Program received signal SIGSEGV, Segmentation fault. 0x00000000004095e1 in MyFunction::Evaluate(arma::Mat<double> const&, unsigned long) const () (gdb) backtrace #0 0x00000000004095e1 in MyFunction::Evaluate(arma::Mat<double> const&, unsigned long) const () #1 0x000000000040d56a in mlpack::optimization::SGD<MyFunction>::Optimize(arma::Mat<double>&) () #2 0x8000000000c53134 in ?? () #3 0x8000000000000000 in ?? () #4 0xbf1e1d
< ranjan123_> oh! wait a min
< ranjan123_> Compiled it in debug mode. Here you can see the error: http://pastebin.com/8CSW6JsZ
< ranjan123_> at /usr/include/armadillo_bits/Mat_meat.hpp:2588 2588 arma_debug_check( col_num >= n_cols, "Mat::col(): index out of bounds");
< ranjan123_> I can't access n_cols,
< ranjan123_> so I can't print it to see its value.
< zoq> So "index out of bounds"; just to be sure, 'cout<<" "<<data1.n_cols<<" "<<data1.n_rows<<endl;' prints '58 3' on your side, right?
< ranjan123_> yes
< zoq> ranjan123_: Okay, let's make sure you don't link against some old SGD file. Can you comment out everything in Evaluate and just write 'return 1', and also comment out everything in Gradient and just write 'gradient.zeros(4);'?
< ranjan123_> ok
< ranjan123_> zoq: done ! without any error
< zoq> ranjan123_: okay, great, so I think if you just uncomment the Evaluate function it results in a segmentation fault?
< ranjan123_> hmm, yes !
< ranjan123_> uncommenting the Evaluate function where?
< ranjan123_> I have just returned 1 in the definition of the Evaluate function.
< zoq> and it results in a segmentation fault?
< ranjan123_> na, as I already said!
< ranjan123_> [20:31] <ranjan123_> zoq: done ! without any error
< ranjan123_> there is no segmentation fault
< rcurtin> is there a pastebin with the code? I am happy to take a glance and see if I see anything wrong
< rcurtin> or, with the current code, that is
< rcurtin> I realize I am jumping in a little after the session has started here :)
< rcurtin> okay, thanks
< zoq> To be honest, everything looks good to me. So I'm kind of lost ...
< rcurtin> GetInitialPoint() is returning a four-dimensional point, but your data is three-dimensional in traj.txt
< ranjan123_> I don't know what's wrong with my system, whereas zoq gets no error.
< ranjan123_> yes
< zoq> yeah, Ri.insert_rows(3,1); adds another row
< ranjan123_> data is three-dimensional
< ranjan123_> coordinate is 4D
< ranjan123_> sum_over_i (a*x_i + b*y_i + c*t_i + d)^2
< ranjan123_> coordinate=(a,b,c,d)
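Written out, the objective described in the two lines above decomposes into one term per data point. The following is a sketch under assumptions: N stands for whatever NumFunctions() returns, and r_i is a name introduced here for the residual of point i; neither symbol appears in the chat.

```latex
f(a,b,c,d) = \sum_{i=0}^{N-1} f_i(a,b,c,d),
\qquad
f_i(a,b,c,d) = r_i^2,
\qquad
r_i = a\,x_i + b\,y_i + c\,t_i + d

\nabla f_i =
\left(
\frac{\partial f_i}{\partial a},\;
\frac{\partial f_i}{\partial b},\;
\frac{\partial f_i}{\partial c},\;
\frac{\partial f_i}{\partial d}
\right)^{\top}
= 2\,r_i\,\bigl(x_i,\; y_i,\; t_i,\; 1\bigr)^{\top}
```

The gradient of each term always has four components, one per coordinate, regardless of which data point i is selected.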
< rcurtin> oh, I see, sorry
< rcurtin> yeah, I don't see anything wrong here either
< ranjan123_> hmmm.
< ranjan123_> :(
< rcurtin> let me try to run the code on my system
< rcurtin> *** stack smashing detected ***: ./find_equ terminated
< rcurtin> looks like there is some invalid memory access
< ranjan123_> hmm
< ranjan123_> yes
< rcurtin> hang on, I'm digging into it with gdb
< rcurtin> interesting, if I compile with -fno-stack-protector, it runs fine
< ranjan123_> yes
< ranjan123_> it runs on my system too
< ranjan123_> valgrind output: http://pastebin.com/0dJMVQuj
< ranjan123_> Use of uninitialised value of size 8
< rcurtin> ah now you are onto something
< rcurtin> I couldn't get valgrind to give me anything useful
< rcurtin> try valgrind with --track-origins=yes
< rcurtin> or is it --trace-origins? I can never remember
< rcurtin> yeah, --track-origins=yes
< rcurtin> are you sure that your mlpack installation hasn't been modified?
< ranjan123_> yes
< ranjan123_> I am sure
< rcurtin> my guess based on that valgrind output is that Evaluate() is being called with an invalid function index
< rcurtin> okay
< rcurtin> can you attach gdb to the process and inspect what the value of i is when this memory issue happens?
< rcurtin> you can use valgrind and gdb in tandem:
< ranjan123_> i=24
< rcurtin> i = 24 causes that issue?
< rcurtin> but traj.txt has 58 columns
< ranjan123_> yes
< rcurtin> so I don't think that I am running the same code you are
< rcurtin> size_t NumFunctions() const { return 25; }
< rcurtin> is this different in your setup?
< ranjan123_> size_t NumFunctions() const { return 55; }
< rcurtin> (that was 20, but I changed it to 25 to try and reproduce the i=24 error but I could not do it)
< rcurtin> okay, let me try 55
< rcurtin> no, I have no issue with 55
< ranjan123_> ohk
< rcurtin> ah wait hang on
< rcurtin> now if I compile without -fno-stack-protector I can reproduce your issue...
< ranjan123_> for now I can run it with -fno-stack-protector
< ranjan123_> ohh
< ranjan123_> you compiled it with -fno-stack-protector
< rcurtin> yeah, but needing to compile with -fno-stack-protector is not a great solution in the long run, we should figure out what the actual issue is
< ranjan123_> hmm. That is right
< ranjan123_> I guess the problem is in Armadillo's Mat.
< ranjan123_> armadillo_bits/Mat_meat.hpp
Nilabhra has quit [Remote host closed the connection]
< rcurtin> I very much doubt this is a bug in armadillo
Nilabhra has joined #mlpack
< rcurtin> when I look through this with gdb, what I can see is that when I try to inspect data1 at the time of the crash, that fails:
< rcurtin> (gdb) inspect data1
< rcurtin> Cannot access memory at address 0x8002b1be22ce5ae5
< rcurtin> but that is a different memory location than during the previous calls to Evaluate()!
< rcurtin> (where it was 0x7fffffffdfa0)
< rcurtin> hang on, look at Gradient():
< rcurtin> if(i==3)
< rcurtin> {
< rcurtin> gradient[i]=2*val;
< rcurtin> }
< rcurtin> else
< rcurtin> {
< rcurtin> gradient[i]=2*val*Ri[i];
< rcurtin> }
< ranjan123_> yes
< rcurtin> but i takes values from 0 to 54, not 0 to 3, and the gradient is of size 4
< rcurtin> so there is your invalid access
< ranjan123_> na
< ranjan123_> coordinate=(a,b,c,d)
< ranjan123_> so there will be 4 gradient components
< rcurtin> no
< zoq> nice catch
< rcurtin> the 'i' parameter to the Gradient() method indicates which of the functions the gradient is being taken for
< rcurtin> for your SGD situation, 'i' refers to the data point you are looking at
< ranjan123_> ahhh
< rcurtin> yeah
< rcurtin> so I think i is misused in those couple lines
< rcurtin> maybe you meant to use j within the loop, or something? I am not sure
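The distinction rcurtin draws here (i selects the data point, a separate index walks the four coordinates) can be sketched as follows. This is a minimal illustration in plain C++, not the code from the pastebin and not mlpack's or Armadillo's API: the names Point, Coordinates, LineFitFunction and Residual are hypothetical, and std::array stands in for arma::mat so the example has no external dependencies.

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the data and parameter types in the chat.
struct Point { double x, y, t; };
using Coordinates = std::array<double, 4>;  // (a, b, c, d)

class LineFitFunction {
 public:
  explicit LineFitFunction(std::vector<Point> data) : data_(std::move(data)) {}

  // One function per data point, so i can never index past the data.
  std::size_t NumFunctions() const { return data_.size(); }

  // Value of the i-th term: (a*x_i + b*y_i + c*t_i + d)^2.
  double Evaluate(const Coordinates& w, std::size_t i) const {
    const double r = Residual(w, i);
    return r * r;
  }

  // Gradient of the i-th term. Here i picks the data point, while j
  // (not i) indexes the four gradient components.
  void Gradient(const Coordinates& w, std::size_t i,
                Coordinates& gradient) const {
    const Point& p = data_[i];
    const double r = Residual(w, i);
    const std::array<double, 4> features = {p.x, p.y, p.t, 1.0};
    for (std::size_t j = 0; j < gradient.size(); ++j)
      gradient[j] = 2.0 * r * features[j];
  }

 private:
  double Residual(const Coordinates& w, std::size_t i) const {
    const Point& p = data_[i];
    return w[0] * p.x + w[1] * p.y + w[2] * p.t + w[3];
  }

  std::vector<Point> data_;
};
```

With a function object built from a vector of points, calling Gradient(w, i, g) for any valid i writes only g[0] through g[3].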
< rcurtin> I think that what is happening on our systems is this:
< rcurtin> the compiler arranges things in memory such that the memory 26 elements (doubles) after the start of the gradient memory holds the address of the 'data1' matrix
< rcurtin> so when i = 26, we end up overwriting the address of the data1 matrix and everything goes to hell
< ranjan123_> hmmm..
< rcurtin> but on marcus's system which I think is OS X, the compiler probably does something a little bit different which is why he was not able to replicate the issue
< ranjan123_> oops
< rcurtin> OS X or FreeBSD, I don't know what is being used today :)
< zoq> FreeBSD :)
< rcurtin> was the compiler clang or gcc, out of curiosity?
< zoq> clang
< ranjan123_> clang
< rcurtin> ranjan123_: you were using clang too?
< ranjan123_> na
< ranjan123_> gcc
< rcurtin> ahh, okay
< rcurtin> so I guess clang and gcc must be organizing things in memory differently
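An unchecked write past the end of a buffer is undefined behavior in C++, which is consistent with the same program crashing under one compiler and appearing to work under another. One way to make such a bug fail the same way everywhere is a bounds-checked store. The sketch below uses std::vector::at as a stand-in, and CheckedStore is a hypothetical helper, not part of mlpack or Armadillo. (The Armadillo documentation describes element access with () as bounds-checked by default and access with [] or .at() as unchecked, so the gradient[i] writes quoted above would not have been checked.)

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: store value at gradient[index] with a bounds check.
// Returns true on success and false if index is out of range, leaving the
// vector untouched in that case.
bool CheckedStore(std::vector<double>& gradient, std::size_t index,
                  double value) {
  try {
    // at() throws std::out_of_range instead of silently writing
    // outside the buffer.
    gradient.at(index) = value;
    return true;
  } catch (const std::out_of_range&) {
    return false;
  }
}
```

For a gradient of size 4, CheckedStore(g, 3, v) succeeds, while CheckedStore(g, 26, v) reports failure rather than overwriting whatever happens to sit next to the buffer.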
< rcurtin> I gotta get lunch, I'll be back later
< ranjan123_> hmmm
< ranjan123_> Thanks ! zoq , rcurtin
< ranjan123_> I have to go for dinner !
ranjan123_ has quit [Ping timeout: 250 seconds]
tsathoggua has joined #mlpack
tsathoggua has quit [Client Quit]
ank_95_ has quit [Quit: Connection closed for inactivity]
ranjan123_ has joined #mlpack
< ranjan123_> I was kind of lost implementing CG and got confused with sgd. :P Now it is working with sgd, psgd :D
< rcurtin> ranjan123_: glad that you got it working. when I merge #611, if you want to add your name to the list of contributors in src/mlpack/core.hpp and COPYRIGHT.txt and open another PR, I'll merge it
< rcurtin> #611 is maybe a small contribution but it is still a contribution :)
< ranjan123_> haha
< ranjan123_> I will add that later
< ranjan123_> :)
ranjan123_ has quit [Quit: Page closed]
Nilabhra has quit [Remote host closed the connection]
sumedhghaisas has joined #mlpack
K4k has quit [Quit: WeeChat 1.4]
K4k has joined #mlpack
kirizaki has joined #mlpack
< kirizaki> Hello everyone
< kirizaki> sorry for being offline so long
< kirizaki> some difficulties with new job ;)
< rcurtin> kirizaki: hi there, long time no see! :)
< rcurtin> hope the new job is going well :)
< kirizaki> yes
< kirizaki> I'm getting a little bit closer with the IT dept. :P
< rcurtin> haha
< rcurtin> it was the same way when I started my new job in August
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#723 (master - 78f6ac2 : Ryan Curtin): The build was broken.
travis-ci has left #mlpack []
ank_95_ has joined #mlpack
kirizaki has quit [Remote host closed the connection]
ranjan123_ has joined #mlpack
sumedhghaisas has quit [Ping timeout: 276 seconds]
ranjan123_ has quit [Quit: Page closed]