--- Log opened Fri Jul 05 00:00:18 2013 | ||
@sonney2k | pickle27, seen this CJADiag.diagonalize | 00:04 |
---|---|---|
@sonney2k | mathematics/ajd/JADiag_unittest.cc:111: Failure | 00:04 |
@sonney2k | Value of: true | 00:04 |
@sonney2k | Expected: isperm | 00:04 |
@sonney2k | Which is: false | 00:04 |
@sonney2k | [ FAILED ] CJADiag.diagonalize (5 ms) | 00:04 |
@sonney2k | https://travis-ci.org/shogun-toolbox/shogun/jobs/8746957 | 00:05 |
-!- gsomix [~gsomix@109.188.126.210] has quit [Ping timeout: 248 seconds] | 00:05 | |
pickle27 | sonney2k: yeah I did see that | 00:05 |
pickle27 | sonney2k: it creates new test data each time but it's never failed when I've run it | 00:06 |
pickle27 | sonney2k: it should use a chi square but there wasn't an easy way to do that so I left it as gaussian which may be the problem | 00:06 |
pickle27 | sonney2k: but like I said its never happened to me before and it passed the other builds right | 00:07 |
pickle27 | sonney2k: is there an easy way to do chi squared in Shogun? there is a nice way to do it with C++11... | 00:09 |
@sonney2k | pickle27, maybe you didn't initialize the rng then? | 00:11 |
shogun-buildbot | build #1318 of deb3 - modular_interfaces is complete: Failure [failed test python_modular] Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb3%20-%20modular_interfaces/builds/1318 blamelist: Soeren Sonnenburg <sonne@debian.org> | 00:11 |
@sonney2k | if test data is random you will get different results all the time | 00:11 |
pickle27 | sonney2k: yeah but the final result from this test should always be a permutation matrix | 00:11 |
pickle27 | even with random input | 00:11 |
pickle27 | I mean it is constrained random input so that it should work | 00:12 |
@sonney2k | pickle27, welcome to the wonderful world of the numerics of float/double | 00:12 |
pickle27 | sonney2k: haha yup | 00:12 |
@sonney2k | pickle27, so just do a fixed seed | 00:12 |
@sonney2k | if that works locally then it should work remotely | 00:13 |
pickle27 | sonney2k: okay I'll do that | 00:13 |
@sonney2k | pickle27, CMath::init_random(17) | 00:13 |
pickle27 | sonney2k: I have another PR up for the second alg so I'll just push to that | 00:13 |
@sonney2k | pickle27, I also have some code style comments | 00:13 |
@sonney2k | pickle27, please do for (int i...) | 00:14 |
@sonney2k | space between for and ( | 00:14 |
pickle27 | ah right | 00:14 |
@sonney2k | and also when a for loop has more than 1 line use { } | 00:14 |
pickle27 | sonney2k: should be fixed up in my latest PR | 00:24 |
pickle27 | im testing it on my system right now | 00:24 |
pickle27 | sonney2k: looks like setting the random seed didn't fix the second build | 00:33 |
pickle27 | sonney2k: could it be a clang problem? | 00:34 |
pickle27 | sonney2k: my is_perm function isn't the best because the matrix will have a random scale | 00:34 |
pickle27 | it would help if I could reproduce on my computer, can I just apt-get clang and try? | 00:35 |
pickle27 | nvm gcc failed on the other one.. | 00:36 |
shogun-buildbot | build #1319 of deb3 - modular_interfaces is complete: Failure [failed test python_modular] Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb3%20-%20modular_interfaces/builds/1319 blamelist: Soeren Sonnenburg <sonne@debian.org> | 00:39 |
pickle27 | sonney2k: I sent a commit to print the matrix, hopefully this will help me figure out whats up! | 01:02 |
pickle27 | how long do travis logs stay for? I have to step out for a bit | 01:03 |
pickle27 | got the logs saved before I had to head out! | 01:12 |
-!- shogun-notifier- [~irker@7nn.de] has quit [Quit: transmission timeout] | 02:25 | |
-!- iglesiasg [~nando@s83-179-44-135.cust.tele2.se] has quit [Quit: Ex-Chat] | 02:36 | |
shogun-buildbot | build #391 of nightly_none is complete: Success [build successful] Build details are at http://www.shogun-toolbox.org/buildbot/builders/nightly_none/builds/391 | 03:19 |
-!- FSCV [~FSCV@50.7.50.60] has quit [Quit: Leaving] | 03:36 | |
shogun-buildbot | build #383 of nightly_all is complete: Success [build successful] Build details are at http://www.shogun-toolbox.org/buildbot/builders/nightly_all/builds/383 | 03:46 |
-!- zxtx_ [~zv@cpe-75-83-151-252.socal.res.rr.com] has joined #shogun | 03:55 | |
-!- Netsplit *.net <-> *.split quits: zxtx | 03:59 | |
shogun-buildbot | build #448 of nightly_default is complete: Failure [failed test] Build details are at http://www.shogun-toolbox.org/buildbot/builders/nightly_default/builds/448 | 04:33 |
-!- nube [~rho@49.244.93.13] has quit [Quit: Leaving.] | 05:01 | |
-!- nube [~rho@116.90.239.3] has joined #shogun | 06:03 | |
-!- nube [~rho@116.90.239.3] has quit [Quit: Leaving.] | 06:43 | |
-!- nube [~rho@116.90.239.3] has joined #shogun | 06:44 | |
-!- nube [~rho@116.90.239.3] has quit [Quit: Leaving.] | 07:44 | |
@sonney2k | pickle27, for eternity | 08:17 |
@sonney2k | wiking_, ping again | 08:17 |
wiking_ | sonney2k: pong | 09:04 |
-!- nube [~rho@116.90.239.3] has joined #shogun | 09:06 | |
-!- votjakovr [~votjakovr@host-46-241-3-209.bbcustomer.zsttk.net] has joined #shogun | 09:07 | |
sonne|work | wiking_: good morning! | 09:20 |
sonne|work | wiking_: could you please give me some url for the feed? | 09:20 |
-!- hushell [~hushell@c-24-21-141-32.hsd1.or.comcast.net] has joined #shogun | 09:20 | |
hushell | sonney2k: Hi, I got a strange problem. After including SGObject.h, I have to include Parameter.h to use SG_ADD, but CParameter has already been declared | 09:24 |
votjakovr | sonne|work: good morning :) I see that you added evaluate_*() methods to the blacklist for C# modular, is that a temporary solution? | 09:25 |
sonne|work | votjakovr: good enough for some time :) | 09:26 |
hushell | Another question, how can I register some member whose type is const char*, or I have to use SGString? | 09:27 |
votjakovr | sonne|work: ok | 09:27 |
sonne|work | votjakovr: it is totally unclear why using 2 SGVectors or other SG* datatypes works with all other typemaps but our csharp one | 09:28 |
sonne|work | votjakovr: needs to be bug reported / toy example created | 09:28 |
sonne|work | votjakovr: anyway not so bad | 09:28 |
sonne|work | hushell: well depends :) what do you use your char* for | 09:29 |
sonne|work | hushell: best is to use SGVector<char> | 09:29 |
hushell | sonne|work: I want to have an identity | 09:32 |
-!- nube [~rho@116.90.239.3] has quit [Quit: Leaving.] | 09:33 | |
hushell | "astring" cannot be converted to SGVector<char> implicitly | 09:33 |
hushell | but std::string is possible | 09:34 |
-!- nube [~rho@116.90.239.3] has joined #shogun | 09:35 | |
sonne|work | hushell: asstring? | 09:37 |
hushell | sonne|work: for the const char* member, I need to pass a "name" in the argument of constructor | 09:37 |
hushell | "astring" means a string in c++ :) | 09:38 |
hushell | I mean in a function call | 09:39 |
sonne|work | hushell: as in void foo(int x, const char* name) ? | 09:40 |
hushell | sonne|work: yep | 09:41 |
wiking_ | sonne|work: http://maeth.com:8000/shogun_workshop.ogg | 09:41 |
sonne|work | hushell: yeah sure | 09:41 |
hushell | so const char* cannot be used as a member? if we have to register it | 09:42 |
-!- iglesiasg [~iglesias@2001:6b0:1:1041:fda4:69d9:9772:7713] has joined #shogun | 10:01 | |
-!- mode/#shogun [+o iglesiasg] by ChanServ | 10:01 | |
sonne|work | hushell: register means? | 10:22 |
sonne|work | hushell: without more context I cannot really answer this | 10:23 |
hushell | sonne|work: I solved it by using SGString, but I am wondering why we need to register member variables? register here means SG_ADD | 10:35 |
sonne|work | you only need that when you want to say that this variable needs to be saved (serialization) or could be used in modelselection | 10:36 |
hushell | sonne|work: Thanks! then not everybody need to do that | 10:41 |
-!- van51 [~van51@athedsl-408350.home.otenet.gr] has joined #shogun | 11:07 | |
sonne|work | hey van51 | 11:07 |
sonne|work | good morning | 11:07 |
van51 | sonne|work: hello | 11:07 |
van51 | sonne|work: good morning to you too | 11:07 |
sonne|work | I was wondering how the training time is now and what parameters you used | 11:08 |
van51 | sonne|work: I think it's still taking a lot of time | 11:08 |
van51 | sonne|work: all I did was to specify C=4 | 11:08 |
sonne|work | van51: and epsilon? | 11:09 |
sonne|work | van51: any normalization? | 11:09 |
sonne|work | as in make vectors norm =1 ? | 11:10 |
sonne|work | no right? | 11:10 |
sonne|work | then it is no wonder :) | 11:10 |
van51 | sonne|work: no nothing like that :) | 11:10 |
sonne|work | van51: so what epsilon did you set then? | 11:11 |
sonne|work | default? | 11:11 |
van51 | sonne|work: yea I just wanted to get it running first | 11:11 |
sonne|work | ok 1e-3 with not properly scaled data will kill you | 11:11 |
sonne|work | van51: a standard trick is to divide the vector by the number of non-zero elements | 11:12 |
sonne|work | van51: so you should implement support for that in your features (optional of course) | 11:13 |
sonne|work | for n-grams it is rather easy since it is a constant | 11:14 |
sonne|work | for delimited words it depends on #words | 11:14 |
sonne|work | van51: OK? | 11:14 |
van51 | sonne|work: ok | 11:15 |
sonne|work | van51: just add a hack for the moment to see how fast it becomes | 11:15 |
van51 | sonne|work: so just before it returns the vector in the dense_dot? | 11:16 |
-!- nube [~rho@116.90.239.3] has quit [Quit: Leaving.] | 11:16 | |
van51 | sonne|work: actually it doesn't return a vector | 11:16 |
-!- nube [~rho@116.90.239.3] has joined #shogun | 11:17 | |
sonne|work | van51: it returns a scalar so just multiply with that normalization const | 11:17 |
sonne|work | i.e. norm_const = 1.0/num_ngrams | 11:18 |
sonne|work | van51: with add_to_dense_vec you have to do it for each element | 11:18 |
sonne|work | van51: and dot the thing squared | 11:19 |
van51 | sonne|work: why dot the thing squared? | 11:27 |
sonne|work | van51: it is (a * norm_const) * (b*norm_const) | 11:29 |
sonne|work | (both a & b are normalized) | 11:29 |
-!- iglesiasg_ [~iglesias@n131-p244.kthopen.kth.se] has joined #shogun | 11:37 | |
-!- iglesiasg [~iglesias@2001:6b0:1:1041:fda4:69d9:9772:7713] has quit [Read error: Connection reset by peer] | 11:37 | |
-!- iglesiasg_ [~iglesias@n131-p244.kthopen.kth.se] has quit [Client Quit] | 11:37 | |
-!- iglesiasg [~iglesias@2001:6b0:1:1041:fda4:69d9:9772:7713] has joined #shogun | 11:37 | |
-!- mode/#shogun [+o iglesiasg] by ChanServ | 11:37 | |
van51 | sonne|work: so in a setting of 500 examples, c=4 , default e | 11:49 |
van51 | sonne|work: with converter it takes like 2-3 secs | 11:49 |
van51 | sonne|work: dot-features took -last night- 4 mins | 11:49 |
van51 | sonne|work: and I don't see an improvement with normalization | 11:50 |
sonne|work | yeah, C=1 is probably OK for scaled data, but C=4 for unscaled data as you have is way too high. I guess more in the range of 1e-3 | 11:51 |
sonne|work | van51: that cannot be | 11:51 |
van51 | sonne|work: there is a significant speedup with C=0.001 | 11:59 |
van51 | sonne|work: but with normalization the results seem worse | 11:59 |
sonne|work | van51: sure results are not comparable | 12:00 |
sonne|work | you need different C | 12:00 |
-!- HeikoS [~heiko@nat-180-11.internal.eduroam.ucl.ac.uk] has joined #shogun | 12:00 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 12:01 | |
-!- van51 [~van51@athedsl-408350.home.otenet.gr] has quit [Read error: Connection reset by peer] | 12:03 | |
-!- van51 [~van51@athedsl-408350.home.otenet.gr] has joined #shogun | 12:04 | |
-!- votjakovr [~votjakovr@host-46-241-3-209.bbcustomer.zsttk.net] has left #shogun ["Went to the store!"] | 12:14 | |
sonne|work | van51: can you show me how you normalize? | 12:14 |
-!- nube [~rho@116.90.239.3] has quit [Ping timeout: 248 seconds] | 12:15 | |
van51 | sonne|work: yea | 12:15 |
van51 | sonne|work: one moment | 12:15 |
van51 | sonne|work: https://gist.github.com/van51/5933588 | 12:20 |
-!- lambday [67157e4f@gateway/web/cgi-irc/kiwiirc.com/ip.103.21.126.79] has joined #shogun | 12:23 | |
lambday | HeikoS: hi | 12:23 |
@HeikoS | lambday: hi! | 12:23 |
lambday | HeikoS: I tested with one matrix with pathetic condition number (10^4) | 12:23 |
@HeikoS | and? | 12:23 |
lambday | and the accuracy is 1E-5 | 12:23 |
lambday | the trace | 12:24 |
lambday | of log | 12:24 |
lambday | I think if we want more accuracy, we should use arprec | 12:24 |
lambday | (for shifts, weights etc) | 12:24 |
lambday | the accuracy I wanted is 1E-19 | 12:24 |
@HeikoS | this is with direct solves? | 12:24 |
lambday | yup | 12:24 |
@HeikoS | lambday: and can you easily try arprec? | 12:25 |
lambday | HeikoS: for Jacobi elliptic functions I already have the arprec version | 12:25 |
@HeikoS | lambday: I see | 12:25 |
@HeikoS | lambday: this might actually be the solver | 12:26 |
lambday | HeikoS: brb... a call | 12:26 |
lambday | :( | 12:26 |
@HeikoS | lambday: I think that's fine for now (especially since this is only to test whether things work), for the real deal with sparse matrices and cocg_m, we should get a better accuracy | 12:26 |
lambday | HeikoS: back | 12:29 |
lambday | HeikoS: yes... | 12:29 |
lambday | and also, this is the difference in the trace | 12:29 |
@HeikoS | lambday: what do you mean? | 12:29 |
lambday | not the norm of the difference of the approximated log(m) and actual log(m) | 12:30 |
@HeikoS | ah yes | 12:30 |
@HeikoS | sure | 12:30 |
@HeikoS | but we have the exact trace right? | 12:30 |
lambday | yes... I was checking with octave... I'll add the eigen3 version soon | 12:30 |
@HeikoS | lambday: okay | 12:31 |
@HeikoS | lambday: good! | 12:31 |
lambday | not too bad, right? | 12:31 |
lambday | hmm :) | 12:31 |
@HeikoS | sounds good, yes :) | 12:31 |
@HeikoS | so now, the more interesting things begin :D | 12:31 |
lambday | yes :D | 12:31 |
@HeikoS | conjugate gradient pain ;) | 12:31 |
lambday | next two days I can give fully to gsoc.. | 12:31 |
lambday | :D | 12:31 |
lambday | weekends yay :D | 12:31 |
lambday | I should add sparse thing before going into cocg | 12:32 |
lambday | right? | 12:32 |
@HeikoS | lambday: yes thats true | 12:32 |
lambday | oh and what about having a different base for cocg_m? | 12:33 |
@HeikoS | lambday: explain this a bit | 12:33 |
lambday | I don't think we can manage it in the same interface as other solvers | 12:33 |
lambday | since their solve returns an SGVector | 12:33 |
-!- wiking_ is now known as wiking | 12:33 | |
-!- wiking [~wiking@info2k1.hu] has quit [Changing host] | 12:34 | |
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun | 12:34 | |
-!- mode/#shogun [+o wiking] by ChanServ | 12:34 | |
lambday | for cocg_m, we should return a SGMatrix instead | 12:34 |
lambday | and the sum can't get inside the solve | 12:34 |
lambday | because each of the solution vectors need to be multiplied with their corresponding weight before the sum | 12:34 |
lambday | getting the weights inside the solve of cocg_m will go, but I don't think that's a good idea :( | 12:34 |
@HeikoS | lambday: yeah you are right | 12:35 |
@HeikoS | lambday: damn ;) | 12:35 |
@HeikoS | do you have a suggestion? | 12:35 |
lambday | tell me about it :'( | 12:35 |
lambday | no :'( | 12:35 |
lambday | except having a different base... | 12:35 |
lambday | it won't cost generality because I moved the m_linear_solver down to the implementation of CRationalApproximation | 12:36 |
lambday | so, CLogRationalApproximationIndividual will have CLinearSolver m_linear_solver, and CLogRationalApproximationCOCG will have C<suggest-something>Solver m_linear_solver | 12:37 |
@HeikoS | cant we use a base class for these two types of solvers? | 12:37 |
@HeikoS | well | 12:38 |
@HeikoS | you know what | 12:38 |
@HeikoS | thats fine, your suggestion | 12:38 |
lambday | how shall we differentiate the signatures? :( | 12:38 |
@HeikoS | it *is* a different solver | 12:38 |
lambday | its just the return type that changes :( | 12:38 |
@HeikoS | which does something different | 12:38 |
@HeikoS | i.e. solve multiple systems | 12:38 |
lambday | yes it is... | 12:38 |
@HeikoS | so thats fine | 12:38 |
lambday | please suggest names (I suck at it :( ) | 12:39 |
-!- iglesiasg [~iglesias@2001:6b0:1:1041:fda4:69d9:9772:7713] has quit [Ping timeout: 245 seconds] | 12:45 | |
lambday | holy crap using eigen3 we get super duper accuracy! :-o | 12:48 |
lambday | this is rational approximation: 4.60517018598809446672 | 12:48 |
lambday | this is rational approximation: 4.60517018598809446672 | 12:49 |
lambday | oops | 12:49 |
lambday | sorry | 12:49 |
lambday | 4.60517018598809180219 | 12:49 |
lambday | this | 12:49 |
van51 | sonne|work: I have to g2g | 12:49 |
van51 | sonne|work: I'll be back in 2-2.5 hours | 12:49 |
lambday | HeikoS: :D | 12:49 |
@HeikoS | lambday: wow :D | 12:50 |
-!- van51 [~van51@athedsl-408350.home.otenet.gr] has quit [Quit: Leaving.] | 12:50 | |
@HeikoS | awesome | 12:50 |
@HeikoS | lambday: wait what did you change there? | 12:50 |
lambday | HeikoS: look at the accuracy we got:: 2.664535259100375697e-15 | 12:51 |
lambday | :-o | 12:51 |
lambday | nothing! I just used eigen3's log instead of testing with what octave gives | 12:51 |
@HeikoS | lambday: nice! | 12:51 |
@HeikoS | yeah octave sucks ;D | 12:51 |
lambday | yahooooooo! | 12:51 |
@HeikoS | lambday: careful about this though | 12:51 |
@HeikoS | eigen3 probably uses a similar trick for computing matrix logs :) | 12:52 |
lambday | errr.... | 12:52 |
lambday | :( | 12:52 |
@HeikoS | so its like running the same code twice | 12:52 |
@HeikoS | but it is still very good! | 12:52 |
lambday | :) :) | 12:52 |
lambday | but | 12:53 |
lambday | umm.. | 12:53 |
@HeikoS | This function computes the matrix logarithm using the Schur-Parlett algorithm | 12:53 |
lambday | eigen3's log gives the whole matrix | 12:53 |
@HeikoS | no its different | 12:53 |
lambday | ahan! | 12:53 |
@HeikoS | its the higham paper | 12:53 |
@HeikoS | I tried that before, ours will be better for large one :) | 12:53 |
@HeikoS | cool | 12:53 |
lambday | hope so :) :) | 12:53 |
@HeikoS | very very encouraging | 12:53 |
@HeikoS | ! | 12:53 |
lambday | yessss!! :D | 12:54 |
lambday | I'll add this unit-test real soon! | 12:54 |
-!- iglesiasg [~iglesias@n131-p244.kthopen.kth.se] has joined #shogun | 12:58 | |
lambday | HeikoS: using arprec we got that ~1e-15 accuracy, using normal float64_t we got ~1e-8 | 13:02 |
@HeikoS | lambday: ok good, these are very useful values for the documentation later on | 13:02 |
@HeikoS | lambday: so keep them, make them into unit tests | 13:02 |
lambday | yes | 13:02 |
lambday | okay :) | 13:02 |
lambday | I'll be back later.... :) | 13:23 |
lambday | see you | 13:23 |
-!- lambday [67157e4f@gateway/web/cgi-irc/kiwiirc.com/ip.103.21.126.79] has quit [Quit: lambday] | 13:23 | |
-!- iglesiasg [~iglesias@n131-p244.kthopen.kth.se] has quit [Quit: Ex-Chat] | 14:27 | |
-!- Netsplit *.net <-> *.split quits: @HeikoS, pickle27, hushell, sonne|work, flxb, shogun-buildbot, zxtx_, naywhayare, @sonney2k, @wiking, (+1 more, use /NETSPLIT to show all of them) | 14:51 | |
-!- Netsplit over, joins: @wiking, @sonney2k, shogun-buildbot | 14:56 | |
-!- Netsplit over, joins: @HeikoS, hushell, zxtx_, pickle27, sonne|work, flxb, naywhayare, sanyam | 14:57 | |
-!- mode/#shogun [-ooo sonney2k wiking HeikoS] by ChanServ | 15:11 | |
-!- Netsplit *.net <-> *.split quits: hushell | 15:14 | |
-!- Netsplit over, joins: hushell | 15:15 | |
-!- Netsplit *.net <-> *.split quits: flxb, naywhayare, HeikoS | 15:18 | |
-!- Netsplit over, joins: @HeikoS | 15:20 | |
-!- Netsplit over, joins: flxb | 15:20 | |
-!- naywhayare [~ryan@spoon.lugatgt.org] has joined #shogun | 15:26 | |
-!- van51 [~van51@athedsl-408350.home.otenet.gr] has joined #shogun | 15:55 | |
-!- van51 [~van51@athedsl-408350.home.otenet.gr] has quit [Client Quit] | 15:55 | |
-!- van51 [~van51@79.131.147.28] has joined #shogun | 15:56 | |
-!- van51 [~van51@79.131.147.28] has quit [Remote host closed the connection] | 16:08 | |
-!- van51 [~van51@athedsl-408350.home.otenet.gr] has joined #shogun | 16:12 | |
-!- kevin_ [~kevin@rcv3-lab-pc.ee.queensu.ca] has joined #shogun | 16:32 | |
-!- pickle27 [~kevin@rcv3-lab-pc.ee.queensu.ca] has quit [Ping timeout: 276 seconds] | 16:36 | |
-!- kevin_ is now known as pickle27 | 16:40 | |
-!- foulwall [~user@2001:da8:215:c252:482c:7add:959d:1be5] has joined #shogun | 17:09 | |
-!- van51 [~van51@athedsl-408350.home.otenet.gr] has left #shogun ["QUIT :Leaving."] | 17:30 | |
-!- sonne|work [~sonnenbu@91-64-72-127-dynip.superkabel.de] has left #shogun [] | 17:40 | |
-!- foulwall [~user@2001:da8:215:c252:482c:7add:959d:1be5] has quit [Remote host closed the connection] | 17:40 | |
-!- lisitsyn [~lisitsyn@109-226-115-134.clients.tlt.100megabit.ru] has joined #shogun | 18:06 | |
-!- nube [~rho@49.244.14.60] has joined #shogun | 18:10 | |
-!- nube [~rho@49.244.14.60] has quit [Quit: Leaving.] | 18:32 | |
-!- van51 [~van51@athedsl-408350.home.otenet.gr] has joined #shogun | 19:13 | |
-!- hushell [~hushell@c-24-21-141-32.hsd1.or.comcast.net] has quit [Ping timeout: 264 seconds] | 19:59 | |
-!- mode/#shogun [+o sonney2k] by ChanServ | 20:02 | |
@sonney2k | van51, I meant how you normalize | 20:04 |
@sonney2k | van51, as gist | 20:04 |
van51 | sonney2k: I'm not following :) | 20:07 |
van51 | sonney2k: there was the normalization in the gist I sent you earlier | 20:08 |
van51 | sonney2k: do you want something else? | 20:11 |
-!- hushell [~hushell@8-92.ptpg.oregonstate.edu] has joined #shogun | 20:17 | |
-!- HeikoS [~heiko@nat-180-11.internal.eduroam.ucl.ac.uk] has quit [Quit: Leaving.] | 20:21 | |
@sonney2k | van51, oops looked at the wrong one | 20:46 |
pickle27 | sonney2k: my unit test is still failing on Travis, I set the random seed for CMath but I'm also using setRandom from Eigen3 | 20:46 |
pickle27 | do you know how to set the random seed for eigen3? I can't find how to do it | 20:46 |
@sonney2k | van51, ok some bug is in there still | 20:46 |
@sonney2k | van51, in line 12 it should be 1.0/((sv1.size()-3)*(sv2.size()-3)) | 20:47 |
@sonney2k | van51, note that you do 1/sv1.size() which will always be 0 | 20:47 |
@sonney2k | in line 44 | 20:47 |
@sonney2k | it should be 1.0/(sv.size()-3) for the same reason | 20:47 |
van51 | sonney2k: woops | 20:48 |
@sonney2k | and line 62 should be | 20:48 |
@sonney2k | alpha/(sv.size()-3) | 20:48 |
@sonney2k | and then line 65 should be removed | 20:48 |
@sonney2k | and line 72 should be just += n_const; | 20:49 |
van51 | sonney2k: idd | 20:49 |
@sonney2k | van51, please fix and show me again | 20:49 |
van51 | sonney2k: I updated the gist | 20:53 |
lisitsyn | pickle27: hey | 20:57 |
@sonney2k | van51, looks good except for the missing ; in line 63 | 20:57 |
@sonney2k | van51, so try again! | 20:58 |
van51 | sonney2k: yea compiler told me! | 20:58 |
@sonney2k | van51, btw how do you compile / what interfaces do you compile for? | 20:58 |
lisitsyn | pickle27: may be srand does the job | 20:58 |
@sonney2k | van51, I am kind of your compiler too | 20:58 |
@sonney2k | van51, quick new results pleas :-) | 20:58 |
@sonney2k | it should be lightning fast now | 20:58 |
pickle27 | lisitsyn: I tried with srand and it didn't fix travis | 20:59 |
lisitsyn | sonney2k: struct a; template <typename T> struct b { }; struct a : b<a> { }; | 20:59 |
lisitsyn | :D | 20:59 |
@sonney2k | lisitsyn, van51's compiler not yours :P | 20:59 |
@sonney2k | pickle27, why srand? | 21:00 |
van51 | haha | 21:00 |
lisitsyn | sonney2k: why am I alone | 21:00 |
@sonney2k | pickle27, CMath::init_random! | 21:00 |
lisitsyn | sonney2k: eigen's random | 21:00 |
@sonney2k | lisitsyn, is that needed? | 21:00 |
@sonney2k | no idea what you do | 21:00 |
lisitsyn | sonney2k: I'd not use it actually | 21:01 |
lisitsyn | didn't notice pickle27 used it | 21:01 |
pickle27 | lisitsyn: sonney2k I don't think thats the problem anymore looking into some other things | 21:01 |
pickle27 | sonney2k: lisitsyn I replaced using Eigens random with CMath, lets see what happens with travis now | 21:03 |
@sonney2k | pickle27, but it works locally? | 21:04 |
pickle27 | yeah | 21:04 |
pickle27 | sonney2k: its never failed for me | 21:04 |
pickle27 | its testing whether or not the end result is a permutation matrix | 21:05 |
pickle27 | on Travis there is a column that is all zeros | 21:06 |
@sonney2k | van51, how do you compile? | 21:07 |
@sonney2k | van51, any results already? | 21:07 |
van51 | sonney2k: on 50 examples, with C=0.001 it takes 75s | 21:08 |
van51 | sonney2k: with C=1 it takes 25s | 21:08 |
@sonney2k | 50k examples? | 21:08 |
@sonney2k | or 50? | 21:08 |
van51 | sonney2k: I compile for static interface | 21:08 |
van51 | just 50 | 21:08 |
van51 | :( | 21:08 |
@sonney2k | errm | 21:08 |
@sonney2k | with or w/o optimizations | 21:08 |
van51 | sonney2k: on that machine right now it's with | 21:09 |
@sonney2k | van51, if you just need C++ | 21:09 |
@sonney2k | van51, then you can do ./configure --interfaces= | 21:09 |
@sonney2k | and then make / make install | 21:09 |
@sonney2k | etc | 21:09 |
van51 | sonney2k: ah ok | 21:09 |
@sonney2k | you sure that it takes the right lib? | 21:09 |
van51 | sonney2k: yeah I believe so | 21:10 |
@sonney2k | van51, how many positive / negative examples has this? | 21:13 |
van51 | 22/28 | 21:14 |
@sonney2k | and ngram-size is what 3? | 21:14 |
van51 | yeap | 21:14 |
pickle27 | sonney2k: lisitsyn why hasn't travis started on my latest commit? | 21:15 |
lisitsyn | pickle27: I guess it is enqueued | 21:16 |
@sonney2k | van51, look at page 89 in http://sonnenburgs.de/soeren/publications/SonRaeRie07.pdf | 21:16 |
@sonney2k | van51, table 4.4 | 21:16 |
pickle27 | doesn't look like anything is queued | 21:16 |
@sonney2k | that is a 'slow' method (compared to what you have) running on webspam | 21:16 |
@sonney2k | it takes 2 secs for 100 examples | 21:17 |
@sonney2k | van51, try with n=8 | 21:18 |
pickle27 | lisitsyn: okay its building now, hopefully Travis likes it this time | 21:24 |
@sonney2k | van51, ok so lets do a quick benchmark | 21:38 |
@sonney2k | van51, take the 50 examples and just call add_to_dense_vec with all of them to some null vector and measure the time | 21:39 |
van51 | sonney2k: ok | 21:39 |
@sonney2k | van51, btw this is a good benchmark for dotfeatures anyway - so it makes a lot of sense to do this in the CDotFeatures class | 21:39 |
@sonney2k | van51, maybe there even is sth like this already in there | 21:39 |
van51 | sonney2k: on it | 21:39 |
@sonney2k | van51, indeed | 21:39 |
@sonney2k | there is | 21:40 |
@sonney2k | van51, just call benchmark_add_to_dense_vector() | 21:40 |
@sonney2k | and benchmark_dense_dot_range() | 21:40 |
@sonney2k | van51, I would expect it takes <1s | 21:41 |
van51 | sonney2k: with default number of repeats? | 21:41 |
@sonney2k | van51, yeah | 21:41 |
@sonney2k | it is averaging | 21:41 |
van51 | sonney2k: http://pastebin.com/Jua8ZKxZ | 21:43 |
lisitsyn | pickle27: something is happening with your PR | 21:47 |
lisitsyn | ;0 | 21:48 |
lisitsyn | ;) | 21:48 |
@sonney2k | van51, ok then if liblinear is taking > 1000 iterations you can get such bad results | 21:48 |
@sonney2k | van51, lets try SVMOcas instead of liblinear | 21:48 |
@sonney2k | van51, same syntax just CSVMOcas(C,data,labels) | 21:49 |
van51 | sonney2k: ok and I was looking for the class reference now :P | 21:49 |
@sonney2k | van51, I will have to leave in 10 minutes - so please give me a result before :) | 21:50 |
lisitsyn | sonney2k: we need to fix lua detection | 21:51 |
lisitsyn | let me try to do that | 21:51 |
@sonney2k | van51, in any case you should update olivier/benoit on your progress and even send them the example you wrote and describe what you did | 21:51 |
@sonney2k | lisitsyn, hmmhh so I guess I broke it | 21:51 |
lisitsyn | if it finds lua it tries to compile *even* if no headers are there | 21:51 |
@sonney2k | lisitsyn, I was adding support for lua52 some months back | 21:52 |
@sonney2k | I guess I broke sth | 21:52 |
van51 | sonney2k: now it takes 0.88s for 100 examples | 21:52 |
lisitsyn | sonney2k: well it should just fail with no headers | 21:52 |
lisitsyn | I will try to patch it now | 21:52 |
van51 | sonney2k: it's much much faster | 21:52 |
@sonney2k | van51, ok then give it say 10k examples | 21:53 |
@sonney2k | van51, it might be that liblinear recovers with many more examples | 21:53 |
@sonney2k | van51, liblinear is numerically not that stable | 21:53 |
lisitsyn | btw I can confirm now iphone has libsvm inside :D | 21:54 |
lisitsyn | kind of huge success for these guys | 21:54 |
@sonney2k | weird though | 21:57 |
@sonney2k | lisitsyn, what do they learn with libsvm/liblinear | 21:57 |
@sonney2k | van51, btw did you enable progress output? | 21:57 |
lisitsyn | sonney2k: no idea but license is inside | 21:57 |
lisitsyn | sonney2k: face recognition? who knows | 21:58 |
van51 | sonney2k: no I did not | 21:58 |
van51 | sonney2k: I have a run that finished in 112s | 21:58 |
van51 | sonney2k: but the first one segfault'ed | 21:58 |
van51 | sonney2k: and the next one said corrupted double-linked list | 21:58 |
@sonney2k | lisitsyn, I mean I can understand they *learn* some models on some cluster(s) but then just applying stuff doesn't need a license or anything | 21:58 |
lisitsyn | sonney2k: no it is in license of any iphone | 21:59 |
lisitsyn | so some code is running on iphone | 21:59 |
@sonney2k | lisitsyn, no face recog etc that is all pretrained | 21:59 |
@sonney2k | van51, sounds bad | 21:59 |
@sonney2k | van51, enable progress output! | 21:59 |
lisitsyn | sonney2k: I know | 21:59 |
@sonney2k | van51, svm.io.set_progress_enabled() or so | 21:59 |
@sonney2k | van51, not good about the crash - valgrind on some subset... | 22:00 |
@sonney2k | van51, 10k examples took >6000s with the 'old' approach so yes about 100 sounds right | 22:01 |
@sonney2k | van51, alright I am off - keep it going! | 22:01 |
van51 | sonney2k: ok! at least that is promising | 22:02 |
van51 | cu! | 22:02 |
lisitsyn | van51: can you explain me the things you are doing in a few words? | 22:03 |
van51 | lisitsyn: sure | 22:03 |
van51 | lisitsyn: right now we are trying to benchmark CHashedDocDotFeatures which stores internally a CStringFeatures object and whenever a dot product is required it tokenizes the appropriate string feature vector on the fly | 22:04 |
van51 | lisitsyn: then hashes the tokens to a dimension d | 22:05 |
van51 | which is much smaller than the dimension of the entire document collection | 22:05 |
van51 | and the idea then is that you train a linear model on that smaller dimension | 22:05 |
lisitsyn | so the internal storage is still strings? | 22:06 |
van51 | yes | 22:06 |
lisitsyn | why is it more efficient than just store hashes? | 22:07 |
lisitsyn | I mean sounds like hashes are compressing things | 22:08 |
lisitsyn | van51: just trying to understand ;) | 22:09 |
van51 | lisitsyn: well from what I understand pre-hashing the tokens takes time and space | 22:09 |
van51 | lisitsyn: maybe not that much now that the collection fits in memory | 22:09 |
lisitsyn | say I have | 22:10 |
lisitsyn | 1 mb text file | 22:10 |
lisitsyn | how much space hashed thing takes? | 22:10 |
van51 | it depends on the hash size that you specify | 22:11 |
van51 | imagine you try to fit that text file in a vector of size 0..to 2^16 for instance | 22:12 |
van51 | lisitsyn: this post here explains it well : http://metaoptimize.com/qa/questions/6943/what-is-the-hashing-trick | 22:13 |
lisitsyn | van51: but that's BoW right? | 22:14 |
lisitsyn | I mean transforming doc -> 2^16 binary features | 22:14 |
van51 | lisitsyn: actually it's a count | 22:15 |
van51 | lisitsyn: and the BoW representation would have a large dimension of all possible tokens, say N | 22:15 |
van51 | lisitsyn: here we specify a dimension d << N | 22:15 |
lisitsyn | van51: one question that would clarify | 22:16 |
lisitsyn | BoW is indeed memory inefficient (like N possible tokens) | 22:16 |
lisitsyn | but you say when using hashing we get d<<N, why not to compute them explicitly? | 22:17 |
van51 | lisitsyn: explicitly you mean beforehand? | 22:18 |
lisitsyn | van51: yes | 22:18 |
van51 | lisitsyn: well I'm not an expert, I'll just tell you what I have read and come to understand | 22:19 |
lisitsyn | van51: yes I am not expert at all too :) | 22:19 |
van51 | lisitsyn: precomputing it would take up some time before-hand and also more space | 22:20 |
van51 | lisitsyn: either on disk or in memory | 22:20 |
lisitsyn | van51: so it takes more time with hashing but less space? | 22:20 |
van51 | lisitsyn: I'm guessing it's the good old trade-off, yeah | 22:21 |
lisitsyn | alright thanks | 22:21 |
van51 | lisitsyn: also I think it would be hard if your collection had to be streamed | 22:21 |
-!- shogun-notifier- [~irker@7nn.de] has joined #shogun | 22:22 | |
shogun-notifier- | shogun: Sergey Lisitsyn :develop * 8a34d14 / src/configure: https://github.com/shogun-toolbox/shogun/commit/8a34d146ea5e8ab4eb218f81387fb388f93fa95e | 22:22 |
shogun-notifier- | shogun: Fixed lua detection | 22:22 |
lisitsyn | naywhayare: I guess that ^ fixes the thing you reported on lua | 22:22 |
naywhayare | rockin'. glad I could help a bit :) | 22:24 |
naywhayare | (even though technically I only pointed out the problem and didn't quite help) | 22:24 |
naywhayare | thanks :) | 22:24 |
lisitsyn | naywhayare: thanks for reporting! | 22:24 |
-!- hushell [~hushell@8-92.ptpg.oregonstate.edu] has quit [Ping timeout: 268 seconds] | 22:41 | |
-!- travis-ci [~travis-ci@ec2-23-20-235-49.compute-1.amazonaws.com] has joined #shogun | 22:45 | |
travis-ci | [travis-ci] it's Sergey Lisitsyn's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: http://travis-ci.org/shogun-toolbox/shogun/builds/8778519 | 22:45 |
-!- travis-ci [~travis-ci@ec2-23-20-235-49.compute-1.amazonaws.com] has left #shogun [] | 22:45 | |
shogun-buildbot | build #1197 of bsd1 - libshogun is complete: Failure [failed test_1] Build details are at http://www.shogun-toolbox.org/buildbot/builders/bsd1%20-%20libshogun/builds/1197 blamelist: Sergey Lisitsyn <lisitsyn.s.o@gmail.com> | 22:50 |
-!- iglesiasg [~nando@s83-179-44-135.cust.tele2.se] has joined #shogun | 23:05 | |
-!- mode/#shogun [+o iglesiasg] by ChanServ | 23:05 | |
shogun-buildbot | build #1320 of deb3 - modular_interfaces is complete: Failure [failed test python_modular] Build details are at http://www.shogun-toolbox.org/buildbot/builders/deb3%20-%20modular_interfaces/builds/1320 blamelist: Sergey Lisitsyn <lisitsyn.s.o@gmail.com> | 23:09 |
--- Log closed Fri Jul 05 23:23:30 2013 | ||
--- Log opened Fri Jul 05 23:23:36 2013 | ||
-!- shogun-toolbox [~shogun@7nn.de] has joined #shogun | 23:23 | |
-!- Irssi: #shogun: Total of 13 nicks [2 ops, 0 halfops, 0 voices, 11 normal] | 23:23 | |
-!- Irssi: Join to #shogun was synced in 7 secs | 23:23 | |
pickle27 | lisitsyn: yeah I saw, it still failed though, I don't understand | 23:46 |
pickle27 | lisitsyn: I'll discuss with you later maybe tomorrow? the result should be a permutation matrix and on my systems it is, but on Travis one of the columns doesn't have a one | 23:47 |
pickle27 | lisitsyn: it looks like its usually the first column too | 23:47 |
pickle27 | lisitsyn: I don't know whats up | 23:47 |
-!- pickle27 [~kevin@rcv3-lab-pc.ee.queensu.ca] has quit [Quit: Leaving] | 23:47 | |
--- Log closed Sat Jul 06 00:00:19 2013 |
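
Note on the CJADiag.diagonalize thread (00:04 to 00:36, 20:46 to 21:06, 23:46): pickle27 mentions that the is_perm check is weak because the result has a random scale, and that on Travis one column has no one in it. The following is only a minimal sketch of a scale-tolerant check, assuming a plain std::vector matrix instead of Shogun's or Eigen3's types. The name is_scaled_permutation and the tolerance value are hypothetical and are not taken from JADiag_unittest.cc.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper (not part of Shogun): returns true when every row and
// every column of a square matrix has exactly one entry whose magnitude
// exceeds `tol`, i.e. the matrix is a permutation matrix up to scale and sign.
bool is_scaled_permutation(const Matrix& m, double tol = 1e-6)
{
    const std::size_t n = m.size();
    if (n == 0)
        return false;

    std::vector<int> col_hits(n, 0);
    for (std::size_t i = 0; i < n; ++i)
    {
        if (m[i].size() != n)
            return false; // not square

        int row_hits = 0;
        for (std::size_t j = 0; j < n; ++j)
        {
            if (std::fabs(m[i][j]) > tol)
            {
                ++row_hits;
                ++col_hits[j];
            }
        }
        if (row_hits != 1)
            return false;
    }

    // A column with zero hits is the "all zeros" case seen on Travis.
    for (std::size_t j = 0; j < n; ++j)
    {
        if (col_hits[j] != 1)
            return false;
    }
    return true;
}
```

A check of this shape accepts any nonzero scale or sign and rejects a matrix with an all-zero column. The absolute tolerance is an assumption and would have to match the magnitude of the test data.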
Generated by irclog2html.py 2.10.0 by Marius Gedminas - find it at mg.pov.lt!
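
Note on the normalization thread (11:12 to 11:29 and 20:47 to 20:49): sonney2k points out that 1/sv1.size() is always 0 and that a dot product of two normalized vectors picks up the constant twice, (a * norm_const) * (b * norm_const). The sketch below illustrates both points in isolation. The function names and the num_tokens parameter are hypothetical; the code in van51's gist is not shown in the log and may look different.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: normalization constant for a document with
// `num_tokens` tokens (e.g. n-grams). Requires num_tokens > 0.
double norm_const(std::size_t num_tokens)
{
    // The 1.0 literal forces floating-point division. With an integer
    // literal, 1 / num_tokens truncates to 0 for every num_tokens > 1.
    return 1.0 / static_cast<double>(num_tokens);
}

// Dot product of two normalized vectors, given the dot product of the
// unnormalized ones: each side contributes its own constant.
double normalized_dot(double raw_dot, std::size_t tokens_a,
                      std::size_t tokens_b)
{
    return raw_dot * norm_const(tokens_a) * norm_const(tokens_b);
}
```

When a vector is dotted with itself, both constants are equal, which is the "squared" factor mentioned at 11:19.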