verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< jenkins-mlpack> * jaiswalshikhar87: Rebase to master
< jenkins-mlpack> * jaiswalshikhar87: Implement O'Reilly Test
< jenkins-mlpack> * Ryan Curtin: Handle training sets that don't divide evenly by k.
< jenkins-mlpack> * Ryan Curtin: Add ShuffleData() with weights.
< jenkins-mlpack> * Ryan Curtin: Add 'shuffle' parameter to KFoldCV.
< jenkins-mlpack> * Ryan Curtin: Safer handling of sparse matrix values arrays.
< jenkins-mlpack> * Ryan Curtin: Remove trainingSubsetSize and clarify comments.
< jenkins-mlpack> * Ryan Curtin: Rename variables for clarity.
< jenkins-mlpack> * Ryan Curtin: Expose Shuffle() to users.
< jenkins-mlpack> * Ryan Curtin: Update documentation.
< jenkins-mlpack> * Ryan Curtin: Fix condition, thanks Kirill for pointing it out.
< jenkins-mlpack> * Ryan Curtin: Add section on using the Python bindings without installing.
< jenkins-mlpack> * Ryan Curtin: Clarify any need for LD_LIBRARY_PATH setting.
< jenkins-mlpack> * Ryan Curtin: Be sure to print `output = ` if there are output parameters.
< jenkins-mlpack> * Ryan Curtin: Fix the parameter name we are using. (Thanks @rasbt!)
< jenkins-mlpack> * Ryan Curtin: Don't overwrite the 'perceptron' module in Python. (Thanks @rasbt!)
manish7294 has quit [Quit: Page closed]
vivekp has joined #mlpack
< ShikharJ> zoq: I'll help you :P Imagine an input point of 3x3 dimensions and 3 channels. So you now have a 3x3x3 input cube, with batchSize = 1 (for simplicity). You want 3 output channels as well, so your kernel has (inSize * outSize = 3 * 3 = 9) slices. Let's assume the kernel also has dimensions 3x3, so your final kernel dimensions are 3x3x9, which would give you an output of 1x1x3.
< ShikharJ> zoq: Now try iterating through the Gradient step with the previous approach and the current approach: the first three input slices convolve with the first three kernel slices to give you the first output slice. So the first output slice should be convolved with the first three input slices, to give you the gradients for the first three kernel slices. But this does not happen in the previous approach, as the first three
< ShikharJ> kernel gradients are instead stored at a distance of (outSize) from each other (see the variable s and how it's updated).
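A toy sketch of the two indexing schemes described above (the loop structure and variable names are illustrative, not the actual mlpack ann code):

```cpp
#include <cstdio>

int main()
{
  // inSize input maps and outSize output maps; the kernel cube then has
  // inSize * outSize = 9 slices.
  const int inSize = 3, outSize = 3;

  for (int j = 0; j < outSize; ++j)    // output map (error slice)
  {
    for (int i = 0; i < inSize; ++i)   // input map
    {
      // Current approach: the gradients for output map j occupy
      // contiguous kernel slices (0, 1, 2 for j = 0).
      const int contiguous = j * inSize + i;

      // Previous approach: the slice index advanced by outSize, so the
      // same gradients landed outSize slices apart (0, 3, 6 for j = 0).
      const int strided = i * outSize + j;

      std::printf("input %d x error %d -> slice %d (previously %d)\n",
                  i, j, contiguous, strided);
    }
  }
  return 0;
}
```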
< ShikharJ> zoq: I'll also need to add support for batches in the BilinearInterpolation method
< Atharva> sumedhghaisas: zoq: rcurtin: Should I rebase the PR even if I have resolved the conflicts?
< ShikharJ> Atharva: It's good practice, as it helps to weed out any errors that might come up due to the underlying routines having changed.
< manish7294> rcurtin: whoa, you are up already :)
< manish7294> rcurtin: I am facing some issues with setting up the benchmarks on slake, maybe due to dependency issues.
< manish7294> I think the MySQL server is not there, and the Python version is 2.7
< manish7294> With the shogun script I am continuously getting an IO error
< manish7294> It is originating from SplitTrainData in util/misc.py
sulan_ has joined #mlpack
< rcurtin> manish7294: hmm, I know you can configure the system to use a local sqlite db
< rcurtin> in config.yaml you can do:
< rcurtin> database: 'benchmarks.db'
< rcurtin> driver: 'sqlite'
< rcurtin> and you can comment out databaseHost and port
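Put together, the relevant part of config.yaml would look something like the following (a sketch assembled from the lines above; the commented-out key names are assumptions):

```yaml
# config.yaml (excerpt): use a local sqlite database instead of MySQL.
database: 'benchmarks.db'
driver: 'sqlite'
# databaseHost: ...   # comment these out when using the sqlite driver
# port: ...
```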
< rcurtin> and yeah, it's an early flight today...
< rcurtin> not sure why I picked this one, but it is too late to change it now :)
< manish7294> rcurtin: Now another one popped up: File "/home/manish/benchmarks/util/database.py", line 73, in Execute: return self.Execute(command) -- RecursionError: maximum recursion depth exceeded -- Makefile:176: recipe for target '.run' failed
< manish7294> rcurtin: And it seems the LMNN authors applied PCA before LMNN to reduce the dimensionality of high-dimensional datasets (like MNIST, 784 to 164)
< rcurtin> manish7294: what if you create a very simple configuration with no LMNN runs?
< rcurtin> that will help narrow down whether the issue is specific to the LMNN benchmark scripts
< manish7294> if I turn off the log then it works
< manish7294> sure, I will try to debug more.
< rcurtin> hmm, can you paste your config somewhere?
< rcurtin> looks like maybe it is not specific to the LMNN scripts
< rcurtin> I will try this out and see how well the bound could work
< rcurtin> don't feel obligated to drop what you are doing and try it; I'll let you know what I find out
manish7294 has joined #mlpack
wenhao has quit [Ping timeout: 260 seconds]
sulan_ has joined #mlpack
ImQ009 has joined #mlpack
< manish7294> rcurtin: I read your research and it is precisely explained :)
< manish7294> Just have a doubt (you raised this earlier): finding the violations may be too costly.
< rcurtin> manish7294: there is a problem with the bounding, I think it is backward
< rcurtin> I am revising it now
< rcurtin> I did some quick simulations, and I found that (at least for covertype) || L_{t + 1} - L_t ||_F^2 tends to be something like 1e-5 or smaller with sgd and lbfgs
< rcurtin> of course with sgd it will depend on the step size too
< rcurtin> but that is far smaller than the norm of the points
< rcurtin> anyway, let me fix this bound, I see my logical error
< rcurtin> the results I saw imply that maybe we have to recalculate impostors for the first several iterations, but for later iterations we may be able to go a very long time without recalculating impostors
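As a sketch of the quantity being measured, in Armadillo (the tolerance and the skip heuristic here are illustrative assumptions, not the bound from rcurtin's notes):

```cpp
#include <armadillo>

// Decide whether impostors should be recomputed, based on how far the
// learned transformation L moved between two iterations.
bool ShouldRecomputeImpostors(const arma::mat& lPrev,
                              const arma::mat& lCur,
                              const double tol = 1e-5)
{
  // || L_{t + 1} - L_t ||_F^2; if this stays tiny (as observed for
  // covertype), the impostor sets are unlikely to have changed.
  const double change = arma::norm(lCur - lPrev, "fro");
  return (change * change) > tol;
}
```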
< manish7294> rcurtin: I agree with this.
< manish7294> sgd and pretty much every optimizer show these characteristics
< rcurtin> the flaw is basically that I did the bounding backwards. when I go from (11) to (12), I used a lower bound on d_L_{t + 1}(x_i, x_a) and an upper bound on d_L_{t + 1}(x_i, x_b)
< rcurtin> but it should be the other way around; I need to use an upper bound on d_L_{t + 11}(x_i, x_a) and a lower bound on d_L_{t + 1}(x_i, x_b)
< rcurtin> oops, t + 1 not t + 11...
< manish7294> no worries, got the point :)
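The direction of the corrected bound follows from the triangle inequality; a sketch of the standard argument (not the exact inequalities (11) and (12) from the notes):

```latex
% For any difference vector x, since the spectral norm is bounded by the
% Frobenius norm:
\| L_{t+1} x \| \le \| L_t x \| + \| L_{t+1} - L_t \|_F \, \| x \|, \qquad
\| L_{t+1} x \| \ge \| L_t x \| - \| L_{t+1} - L_t \|_F \, \| x \|.
% Taking x = x_i - x_a in the first inequality gives the needed upper
% bound on d_{L_{t+1}}(x_i, x_a); taking x = x_i - x_b in the second
% gives the lower bound on d_{L_{t+1}}(x_i, x_b).
```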
< manish7294> zoq: rcurtin: Can you please tell me what could be the reason behind this benchmark result: https://pastebin.com/8bTTcUh9
< zoq> okay, so the wine issue is fixed with the latest commit
< manish7294> zoq: the main problem is with the shogun script https://pastebin.com/8bTTcUh9 It does not appear to be a timeout, as there is a line in the script: except timeout_decorator.TimeoutError: return -1
< zoq> manish7294: Can you check if the PCA shogun benchmark works?
< zoq> manish7294: The shogun installation on gekko failed, so maybe it failed on the other ones as well?
< manish7294> zoq: started, let's see the results
< manish7294> It is definitely not a timeout, as I cross-checked with a large value.
< manish7294> zoq: It's working
< manish7294> Probably I am doing something wrong.
< zoq> manish7294: Okay, let's see if I can reproduce the issue.
< zoq> Can you post the make command?
< rcurtin> zoq: I think I see a bug; in run_benchmark.py, the block at line 422 implies that a timeout is -1 and a failure is -2, but the block at line 457 implies the opposite
< manish7294> make run BLOCK=shogun METHODBLOCK=LMNN
< rcurtin> plane is landing... I will have to go now
< manish7294> zoq: I think the issue is that shogun's LMNN train needs an initial matrix to be passed, but I haven't been doing that.
< zoq> okay, I get an error that 'definitions' is undefined
< manish7294> Ya, I removed it later
< manish7294> forgot to do so on the remote
< zoq> okay, now I get: 'LMNN' object has no attribute 'X'
< zoq> just added print(e) in the exception block
< manish7294> I think the error is on line 63
< manish7294> self shouldn't be there
< zoq> right
< zoq> numpy.float64 should be np.float64
< zoq> afterwards I get at least some timings; not sure the output is correct
< manish7294> oh! I made a bunch of mistakes, don't know how I let them pass by :)
< manish7294> Now I am wondering why I didn't get any errors while running this?
< manish7294> zoq: Can you please post the timings you got?
< yaswagner> Hey ryan! I've been mimicking the structure of pca.pyx in Go and calling functions in the same order. It is able to call ResetTimers(), EnableTimers(), DisableBacktrace(), and DisableVerbose(), but then when it tries to call RestoreSettings("Principal Components Analysis"), I get the following error: std::invalid_argument: no settings stored under the name 'Principal Components Analysis'. I'm thinking it might be because I haven't
< yaswagner> in mlpack_main.hpp*
< yaswagner> When you hand-bound pca, had you created a PyOption and added a BINDING_TYPE_PYTHON to mlpack_main.hpp ahead of time?
< rcurtin> yaswagner: ah, right, I know what is going on here
< rcurtin> when I did the hand binding I did have to work with the PARAM macros, so I think my email to you was a little incorrect; sorry about that
< rcurtin> for the CLI and Python bindings, what the PARAM macros do is actually declare some Option type that then registers itself with the CLI singleton when its constructor is called
< rcurtin> I am not 100% sure, but I think before including pca_main.cpp, you could '#define BINDING_TYPE BINDING_TYPE_PYX' for now, and we can make a different type later as needed
< rcurtin> then I think the only thing remaining would be to set the programName string
< rcurtin> unfortunately I'm not in a great place to dig in and help right now, but maybe that can help get you started
< yaswagner> Ok, that makes sense! Thanks, I'll add the #define
< rcurtin> yeah, and I think if you have a way to set 'programName = "Principal Components Analysis"' that could also be necessary
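A minimal sketch of what rcurtin is suggesting, assuming the macro names from mlpack's Python bindings; the include path and registration mechanics are taken from this conversation, not verified against the bindings code:

```cpp
// Sketch: reuse the binding machinery from a hand-written wrapper.
// BINDING_TYPE must be defined before the binding's main file is pulled
// in, so that the PARAM_* macros register their options (and the program
// name) with the CLI singleton at static-construction time.
#define BINDING_TYPE BINDING_TYPE_PYX
#include <mlpack/methods/pca/pca_main.cpp>  // illustrative path

// Later, settings are looked up by the same program name string:
//   mlpack::CLI::RestoreSettings("Principal Components Analysis");
```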
manish7294 has quit [Quit: Page closed]
< yaswagner> Ya, simply adding the #define BINDING_TYPE BINDING_TYPE_PYTHON doesn't work. I'll look into adding programName. Thanks!
< zoq> manish7294: I just timed the iris dataset, maybe it's just taking a long time?
< zoq> manish7294: I can post the modified script if that helps.
manish7294 has joined #mlpack
< manish7294> zoq: Is that with shogun?
< zoq> manish7294: yes
< zoq> shogun
< zoq> iris 0.332911
< manish7294> zoq: can I ask you one more favor
< zoq> yeah, sure
< manish7294> can you please try the mlpack script too
< manish7294> I think I messed up something here, so I will probably need to do a fresh build
< manish7294> zoq: And did you make any other changes to the PR code, other than what you mentioned earlier, to run this?
< zoq> I added metrics_folder = os.path.realpath(os.path.abspath(os.path.join(
< Atharva> rcurtin: Is it okay to have a normal_distribution_impl.hpp file in dists rather than normal_distribution.cpp? I am having some trouble with template functions.
< zoq> Atharva: Sounds fine.
< zoq> manish7294: Running the script now.
< Atharva> zoq: Thanks!
< zoq> Atharva: Do not forget to include the impl inside the hpp at the end :)
< Atharva> zoq: Did that :)
< Atharva> Is there some reason template functions cause errors when built in cpp files?
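Template definitions must be visible at the point of instantiation, so they belong in headers; definitions placed in a separate .cpp compile but fail at link time with undefined-symbol errors. The pattern zoq refers to, as an illustrative sketch (file contents are hypothetical, not the actual mlpack header):

```cpp
// --- normal_distribution.hpp (illustrative sketch) ---
template<typename DataType>
class NormalDistribution
{
 public:
  double Probability(const DataType& observation) const;
};

// Included at the end of the header so every translation unit that
// instantiates the template sees the definitions.
#include "normal_distribution_impl.hpp"

// --- normal_distribution_impl.hpp (illustrative sketch) ---
template<typename DataType>
double NormalDistribution<DataType>::Probability(
    const DataType& /* observation */) const
{
  return 0.0;  // placeholder body
}
```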
< manish7294> zoq: rcurtin: Thanks to you, I somehow managed to run the script on iris, and the results are quite unbelievable: https://pastebin.com/DRHS6F5u
< manish7294> there is a large difference even though I have not used our N-iterations functionality
< manish7294> zoq: Please confirm if you get similar results. Thanks!
< Atharva> zoq: As I am modifying core files, it takes a lot of time to build. Is there a quicker way to do this?
< zoq> Atharva: Not really; you can disable the executables and the tests if you're not going to use them: -DBUILD_TESTS=OFF and -DBUILD_CLI_EXECUTABLES=OFF
< zoq> Atharva: Also, you can build with multiple cores: make -j4 will build the code using 4 cores.
< zoq> ohh, and you can also disable the Python bindings: -DBUILD_PYTHON_BINDINGS=OFF
< Atharva> Okay, I will disable the executables, tests, and Python bindings. As for the cores, I am using all 8 of them.
< rcurtin> manish7294: results look great so far! another thing to do would be to add the resulting kNN accuracy as a metric, so that we can know that mlpack and shogun are providing roughly the same improvement
< Atharva> zoq: Sorry to disturb again. The NegativeLogLikelihood class has not been moved to the loss functions folder. Is there some reason, or should I do it? I will open a PR tomorrow with the inputParameter member removed from layers that don't need it; I can throw in this change too.
manish7294 has quit [Ping timeout: 260 seconds]
< zoq> Atharva: No reason, we can move it.
< Atharva> zoq: okay
witness_ has joined #mlpack
< Atharva> zoq: Should I also remove the delta member of the loss layers? I don't think it's being used there.
< zoq> Atharva: yes, sure
< Atharva> zoq: Okay!
ImQ009 has quit [Quit: Leaving]
< Atharva> zoq: rcurtin: I added some files to core/dists and built mlpack with them. Now I am working on a different branch where those files don't exist, but due to CXX.includecache files my build is failing. How do I clear the cache?
< zoq> Atharva: Either remove the build folder, or perhaps 'make clean' works as well.