--- Log opened Wed Jul 20 00:00:44 2011
@sonney2k | blackburn, so we have | 00:15 |
@sonney2k | - compile time | 00:15 |
-!- serialhex [~quassel@99-101-148-183.lightspeed.wepbfl.sbcglobal.net] has joined #shogun | 00:16 | |
@sonney2k | - no modules | 00:16 |
@sonney2k | that's it right? | 00:16 |
blackburn | I guess so | 00:16 |
blackburn | but no modules is not a - | 00:16 |
blackburn | it is a ----------- hehe | 00:16 |
blackburn | oh /me just realized that computing svd is just another way to compute eigenvectors of AA' or A'A | 00:19 |
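As an aside, the relation blackburn notices here is easy to verify numerically: the eigenvectors of AA' are the left singular vectors of A, and its eigenvalues are the squared singular values. A minimal NumPy check (illustrative only, not shogun code):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

# Thin SVD: A = U @ diag(s) @ Vt, singular values in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Eigendecomposition of the symmetric matrix A A' (ascending eigenvalues)
w, V = np.linalg.eigh(A @ A.T)

# The top eigenvalues of A A' equal the squared singular values of A
assert np.allclose(np.sort(w)[::-1][:3], s**2)
```

The same holds for A'A and the right singular vectors, which is why the dgesvd (SVD) and dsyev (symmetric eigensolver) routines discussed later in this log are interchangeable for this kind of problem.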
@sonney2k | blackburn, what advantage do you have from using modules? | 00:20 |
blackburn | seems no advantage | 00:21 |
blackburn | but it is unusual to shogun users.. | 00:21 |
blackburn | if there are any | 00:21 |
@sonney2k | there are I am afraid | 00:21 |
blackburn | all of the changes before didn't affect any user interface | 00:22 |
@sonney2k | blackburn, ?? | 00:24 |
@sonney2k | not true | 00:24 |
@sonney2k | classify -> apply | 00:24 |
@sonney2k | etc | 00:24 |
blackburn | okay not so heavy changes | 00:24 |
@sonney2k | all the people using shogun from C++ | 00:24 |
@sonney2k | I don't know just writing from shogun import xxx instead of from shogun.Kernel import xxx is not really a big change | 00:25 |
@sonney2k | considering what we have done before | 00:25 |
blackburn | seems you are right | 00:25 |
@sonney2k | the big issue I see is that compiling the wrapper will now require 2G of memory | 00:26 |
blackburn | 2G????? | 00:26 |
@sonney2k | yes | 00:26 |
blackburn | sh why? | 00:26 |
@sonney2k | because the .cxx file is huge | 00:27 |
blackburn | any way to avoid this? | 00:27 |
@sonney2k | 19M | 00:27 |
@sonney2k | blackburn, yes move as much as possible into .cpp | 00:27 |
@sonney2k | and we could think about not wrapping certain code | 00:27 |
@sonney2k | such as arrays | 00:27 |
blackburn | yes we should hide all the possible things | 00:28 |
@sonney2k | in particular all these %template things are massively duplicating code | 00:28 |
@sonney2k | (the .cxx file has >500000 loc) | 00:30 |
@sonney2k | half a million :D | 00:30 |
blackburn | I'm afraid we can't lower it much with just some .h->.cpp | 00:31 |
@sonney2k | blackburn, you are right | 00:39 |
@sonney2k | there is some std template library stuff that will take a few kloc that could go | 00:40 |
@sonney2k | and array too I guess | 00:40 |
@sonney2k | but 500k lines is a lot | 00:40 |
@sonney2k | ok all examples run now again | 00:40 |
blackburn | any other options? | 00:40 |
@sonney2k | with some fake modularity | 00:40 |
@sonney2k | blackburn, well do a clean split of course :) | 00:41 |
@sonney2k | into modules | 00:41 |
@bettyboo | lolomat sonney2k | 00:41 |
@sonney2k | blackburn, can you do wc -l *wrap* in your python_modular dir? | 00:41 |
@sonney2k | how big is it? | 00:41 |
blackburn | 1018343 total | 00:42 |
blackburn | even more? | 00:42 |
blackburn | ah | 00:42 |
blackburn | without .o? | 00:43 |
@sonney2k | *cxx | 00:43 |
blackburn | 822890 total | 00:43 |
@sonney2k | even more sure | 00:43 |
@sonney2k | but in separate files | 00:43 |
* sonney2k is compiling octave_modular now | 00:43 | |
@sonney2k | gcc is at 2G | 00:44 |
@sonney2k | 2.1G | 00:44 |
@sonney2k | 2.2 | 00:44 |
@sonney2k | 2.3 | 00:45 |
@sonney2k | 2.4 | 00:45 |
blackburn | hey I have only 3G ram | 00:45 |
@sonney2k | 'only' | 00:45 |
@sonney2k | 2.5 | 00:45 |
blackburn | you said you have 8? | 00:45 |
@sonney2k | 2.6 | 00:45 |
@sonney2k | 2.7 | 00:46 |
@sonney2k | 2.8 | 00:46 |
@sonney2k | 2.9 | 00:46 |
blackburn | may be some gcc option should be set? | 00:46 |
@sonney2k | 3.3 | 00:46 |
@sonney2k | octave was always more heavy | 00:46 |
@sonney2k | templates and C++ too | 00:47 |
@sonney2k | blackburn, yeah 8 just recently (cost 40 EUR) | 00:48 |
blackburn | installing more ram to my machine will lead me to reinstall the whole OSs | 00:48 |
@sonney2k | ? | 00:48 |
blackburn | x86 now | 00:48 |
blackburn | not 64 | 00:48 |
@sonney2k | why x86? | 00:48 |
blackburn | I don't know | 00:49 |
@sonney2k | it is compiling for 5minutes now on that file | 00:49 |
@sonney2k | with optimizatons to the max | 00:49 |
blackburn | hey so much memory for compilation is not good | 00:50 |
@bettyboo | roger | 00:50 |
@sonney2k | memory requirements go down now | 00:50 |
@sonney2k | blackburn, I know | 00:50 |
@sonney2k | and compiling for 10 minutes on one file is also not good | 00:51 |
@sonney2k | linking... | 00:52 |
@sonney2k | >1G for the linker even | 00:52 |
@sonney2k | 1.5 | 00:52 |
@sonney2k | 1.8 | 00:53 |
@sonney2k | wait the assembler | 00:53 |
@sonney2k | done | 00:53 |
blackburn | crazy | 00:53 |
@sonney2k | all octave examples run smoothly w/o crashing | 00:55 |
@sonney2k | shogun.oct file is 78MB - hah! | 00:56 |
blackburn | heavier? | 00:56 |
@sonney2k | stripped 20MB | 00:56 |
@sonney2k | 21 | 00:56 |
blackburn | "..let me see you stripped.." | 00:57 |
blackburn | :D | 00:57 |
@sonney2k | (strip -s) | 00:57 |
@sonney2k | lets see what happens with java | 00:58 |
@sonney2k | hmmhh the wrapped src is only 7.8 MB... | 01:03 |
@sonney2k | but compilation failed :D | 01:03 |
blackburn | why? | 01:04 |
blackburn | just glanced over heiko's t-student borrowed code | 01:04 |
blackburn | amazing | 01:04 |
blackburn | zk | 01:04 |
blackburn | fdkd | 01:04 |
blackburn | kdsf | 01:04 |
blackburn | dsfs | 01:04 |
@sonney2k | blackburn, sth with feature type | 01:07 |
@sonney2k | blackburn, one more question: how big is the biggest *cxx file curently (in kloc?) | 01:20 |
blackburn | 198361 673467 7061818 Features_wrap.cxx | 01:21 |
blackburn | 198361 Features_wrap.cxx | 01:21 |
blackburn | 198k | 01:21 |
@sonney2k | so 200k vs 500k now right | 01:22 |
@sonney2k | yeah features must be big because of these many %templates | 01:23 |
@sonney2k | impressive that swig requires <450MB to generate the wrapper | 01:26 |
@sonney2k | yeah java compiled | 01:26 |
blackburn | I should change my daytime from night to day | 01:28 |
@sonney2k | blackburn, good idea | 01:38 |
@sonney2k | we are now facing another issue | 01:38 |
@sonney2k | we have libshogun.so in /usr/lib | 01:38 |
@sonney2k | (the C++ lib) | 01:38 |
blackburn | again.. | 01:38 |
blackburn | and? | 01:38 |
@sonney2k | and libshogun in java | 01:38 |
@sonney2k | I guess that is why the example is not working | 01:38 |
@sonney2k | but note that it is much easier to use shogun only w/ java | 01:39 |
@sonney2k | in python we don't have that problem since the file is called shogun.so | 01:40 |
@sonney2k | so we need a better name | 01:40 |
@sonney2k | libshogun.so -> C++ | 01:40 |
@sonney2k | libshogunif -> modules? | 01:41 |
@sonney2k | as in shogunif aka shogun interface | 01:41 |
blackburn | no idea | 01:41 |
@sonney2k | or sginterface? | 01:41 |
blackburn | not sg, sg is used only internally | 01:41 |
@sonney2k | ok but what name other than just 'shogun' | 01:42 |
blackburn | nuhohs | 01:43 |
blackburn | nugohs | 01:43 |
@sonney2k | :) | 01:43 |
@sonney2k | modshogun ? | 01:43 |
blackburn | :D | 01:43 |
blackburn | i have no idea | 01:43 |
blackburn | I guess we better go sleep ;) | 01:44 |
@sonney2k | I think modshogun will do the job | 01:44 |
blackburn | okay | 01:44 |
@sonney2k | it is clear that this is the modular shogun thing | 01:44 |
blackburn | sonney2k: is AA' always symmetrical? | 01:45 |
blackburn | hmm yes, it should be | 01:45 |
blackburn | :D | 01:45 |
@sonney2k | :) | 01:46 |
@sonney2k | recompiling | 01:46 |
blackburn | I've been staring at these dgesvd and dsyev all night long | 01:46 |
blackburn | can't choose best way :D | 01:47 |
@sonney2k | I know that feeling | 01:47 |
blackburn | the most funny thing about it - it is related to k*k eigenproblem where k is usually ~20 or so | 01:47 |
blackburn | I don't know why I spend so much time on such premature optimizations | 01:48 |
@sonney2k | yay! | 01:49 |
@sonney2k | java modular works now | 01:49 |
@sonney2k | checking python_modular again | 01:49 |
blackburn | everything seems to be better except memory | 01:49 |
@sonney2k | yeah,but lets see | 01:50 |
@sonney2k | I removed array from the swig bindings | 01:50 |
@sonney2k | I guess swig is just not that well tested for modules | 01:51 |
@sonney2k | well wrapper is down to 16MB in python | 01:52 |
@sonney2k | gcc required 'just' 1.7G | 01:52 |
@sonney2k | done | 01:53 |
blackburn | hey it is kind of success ;) | 01:53 |
@sonney2k | I guess we should carefully check whether we really need all types | 01:54 |
@sonney2k | ok python modular works still | 01:54 |
@sonney2k | modshogun is 23M | 01:55 |
@sonney2k | or 13M stripped | 01:55 |
@sonney2k | what doesn't work yet is doxygen support | 01:55 |
@sonney2k | but I guess we are in better shape now for lua to work | 01:56 |
@sonney2k | and just having octave work reliably is a big big advantage | 01:56 |
blackburn | yeah, better to not crash | 01:57 |
@sonney2k | definitely | 01:58 |
@sonney2k | maybe we get the wrapper down by another 3MB or so and then it is fine memory-wise too | 01:59 |
@sonney2k | (currently 421 kloc) | 01:59 |
@sonney2k | blackburn, yay! | 02:02 |
@sonney2k | all lua examples work | 02:02 |
blackburn | already? | 02:03 |
@sonney2k | blackburn, well the ones that sploving translated | 02:03 |
blackburn | I lost the moment when it's done | 02:03 |
@sonney2k | well one example with classifier/kernel/distance works | 02:04 |
@sonney2k | now if r_modular was working w/o crashing - that would be heaven | 02:04 |
@sonney2k | the R wrapper for swig needs more memory | 02:07 |
@sonney2k | 600M | 02:07 |
blackburn | crazy amounts of memory | 02:07 |
@sonney2k | blackburn, the object file is 48M! | 02:07 |
@sonney2k | compiled | 02:08 |
blackburn | so the solution for some issues you couldn't resolve for years is to flatten shogun? | 02:08 |
blackburn | err twisted language | 02:09 |
blackburn | more vodka please | 02:09 |
blackburn | or less | 02:10 |
@sonney2k | blackburn, yes ... swig bugs I would say | 02:11 |
@sonney2k | it doesn't help for R though | 02:11 |
@sonney2k | that wrapper seems to be just very broken | 02:12 |
* blackburn is currently writing the slowest algo ever | 02:12 | |
@sonney2k | or maybe the typemaps don't work for R yet :D | 02:13 |
@sonney2k | could be my fault this time | 02:13 |
@sonney2k | blackburn, anyway I think it is worth committing now or? | 02:13 |
blackburn | up to you | 02:14 |
blackburn | I'll have merging issues anyway | 02:15 |
@sonney2k | not necessarily | 02:15 |
@sonney2k | but we will see | 02:15 |
blackburn | I mean I modified some interfaces when doing HLLE | 02:15 |
blackburn | hmmm HLLE is just like HELL but HLLE | 02:16 |
CIA-87 | shogun: Soeren Sonnenburg master * r4b9fc30 / (36 files in 6 dirs): (log message trimmed) | 02:20 |
CIA-87 | shogun: Just create a single modshogun object for the modular interfaces. While | 02:20 |
CIA-87 | shogun: this increases wrapper size (memory requirements when compiling) - it | 02:20 |
CIA-87 | shogun: avoids all the issues we were having utilizing %import and multiple swig | 02:20 |
CIA-87 | shogun: modules. Namely octave_modular is now working reliably / rock-stable. | 02:20 |
CIA-87 | shogun: Require order in lua_modular decides no longer if sth works or not, etc | 02:20 |
CIA-87 | shogun: To reduce wrapper size the Array* wrappings were removed from the | 02:20 |
CIA-87 | shogun: Soeren Sonnenburg master * rec1939d / src/shogun/preprocessor/LocallyLinearEmbedding.cpp : Merge branch 'master' of github.com:shogun-toolbox/shogun - http://bit.ly/qegG8w | 02:20 |
@sonney2k | pushing the hell out of you | 02:20 |
CIA-87 | shogun: Soeren Sonnenburg master * re31becc / (5 files in 2 dirs): | 02:22 |
CIA-87 | shogun: Merge pull request #214 from karlnapf/master | 02:22 |
CIA-87 | shogun: introduced CStatistics class - http://bit.ly/pGaF4C | 02:22 |
@sonney2k | blackburn, heiko is doing crazy shit... | 02:23 |
blackburn | :D | 02:23 |
blackburn | why he started doing ^ | 02:23 |
blackburn | ? | 02:23 |
blackburn | big=4.503599627370496e15; | 02:24 |
blackburn | biginv=2.22044604925031308085e-16; | 02:24 |
blackburn | maxgam=171.624376956302725; | 02:24 |
blackburn | uh oh | 02:24 |
blackburn | alglib is strange | 02:24 |
@sonney2k | blackburn, now that I merge I see one problem... | 02:26 |
@sonney2k | with doxygen generated documentation | 02:26 |
blackburn | again and again | 02:26 |
@sonney2k | before we had the documentation per module | 02:26 |
@sonney2k | now it will be all in modshogun | 02:26 |
@sonney2k | and I don't see how one could split this up | 02:27 |
@sonney2k | might be a minor issue but anyways | 02:27 |
blackburn | I see | 02:27 |
@sonney2k | I really used help(Kernel) | 02:27 |
@sonney2k | etc | 02:27 |
blackburn | what is it? | 02:27 |
blackburn | :D | 02:27 |
@sonney2k | blackburn, !?!?!?!?!?!?!?!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | 02:27 |
@sonney2k | you ever used help on shogun's objects in python? | 02:28 |
blackburn | nope | 02:28 |
@sonney2k | don't tell me no one uses documentation | 02:28 |
@sonney2k | it is a pain to write | 02:28 |
@sonney2k | then just do | 02:28 |
@sonney2k | from shogun.Classifier import SVM | 02:28 |
@sonney2k | help(SVM) | 02:28 |
blackburn | okay methods and fields | 02:29 |
blackburn | I see | 02:29 |
blackburn | I just usually know the methods I use, in preprocessors there are 2-3 of them ;) | 02:31 |
blackburn | and I wasn't developing any serious stuff with shogun | 02:31 |
@sonney2k | blackburn, I always forget the names of some methods | 02:32 |
blackburn | the most popular method I use is apply_to_feature_matrix :D | 02:32 |
blackburn | and set_k, set_landmark_number | 02:33 |
blackburn | so on | 02:33 |
blackburn | okay it is dawn here already | 02:33 |
@sonney2k | blackburn, uh oh | 02:34 |
@sonney2k | here it is still dark | 02:34 |
@sonney2k | no noise outside... | 02:34 |
blackburn | why aren't you sleeping? do you have to go to work in the morning? | 02:34 |
blackburn | I guess I know the answer for first question, but anyway | 02:35 |
blackburn | :) | 02:35 |
blackburn | yesterday at 4-5am two girls were fighting here on the street :D I was watching through window ahaha | 02:37 |
blackburn | crazy country | 02:37 |
@sonney2k | blackburn, did you figure out what the fight was about? | 02:37 |
blackburn | nope but it was cruel hehe | 02:37 |
blackburn | nice show before going to sleep | 02:38 |
@sonney2k | so you are waiting for it to happen again right? | 02:38 |
@sonney2k | that is why you don't sleep yet | 02:38 |
@sonney2k | blackburn, I just want to get this finished | 02:38 |
@sonney2k | blackburn, it is bad if sploving becomes demotivated because of such problems | 02:39 |
blackburn | no, this time I'm just implementing HELL^W HLLE | 02:39 |
@sonney2k | blackburn, I think I will work from home tomorrow and sleep a bit at day-time | 02:39 |
blackburn | hmm nice that you have ability to work from home | 02:39 |
blackburn | no fight today, sad | 02:42 |
blackburn | no bears with ak-47 | 02:42 |
blackburn | no missiles in the sky | 02:42 |
blackburn | boring | 02:42 |
* sonney2k checks on http://www.flightradar24.com/ which plane cruises over his house | 02:44 | |
@sonney2k | it is SXS923 departing from SXF to Izmir, Adnan Menderes | 02:46 |
blackburn | departing turkish from germany? :D | 02:46 |
@bettyboo | :*) | 02:46 |
@sonney2k | airline SunExpress | 02:47 |
@sonney2k | hmmhh | 02:47 |
@sonney2k | you are right http://data.flight24.com/flights/xq923/ | 02:47 |
blackburn | I heard there are a lot of turkish in germany ;) | 02:48 |
@bettyboo | haha? | 02:48 |
blackburn | no aircrafts near me | 02:49 |
blackburn | samara - frankfurt in ~300km | 02:49 |
@sonney2k | another one going to turkey | 02:50 |
CIA-87 | shogun: Soeren Sonnenburg master * r417cfbb / (18 files in 2 dirs): doxygen support for python_modular's modshogun - http://bit.ly/nmvP9A | 02:52 |
@sonney2k | it all works :D | 02:52 |
blackburn | I've always wanted to know whether aircraft cruise over Siberia | 02:52 |
blackburn | they do | 02:53 |
blackburn | especially sth like Vienna-Seoul | 02:54 |
blackburn | I think it's time to sleep | 02:55 |
@sonney2k | yes I agree | 02:57 |
blackburn | I've never travelled by plane ;( | 02:57 |
@sonney2k | I did way too often but kids grounded me now for quite a while | 02:58 |
@sonney2k | blackburn, you should come to germany at some point maybe do a practical with gunnar in bioinformatics... | 02:58 |
@sonney2k | anyways | 02:58 |
@sonney2k | bed time! | 02:58 |
blackburn | or drink vodka with you | 02:58 |
blackburn | btw not easy to get ehhh | 02:58 |
@sonney2k | water looks very similar ;) | 02:59 |
@sonney2k | blackburn, enough for yesterday/today :) | 02:59 |
blackburn | ah nevermind | 02:59 |
blackburn | can't remember right word | 02:59 |
blackburn | :D | 02:59 |
@sonney2k | I really need to go to bed now - I will be woken up in 3 hrs | 02:59 |
blackburn | see you | 02:59 |
@sonney2k | cu | 02:59 |
-!- blackburn [~blackburn@109.226.102.88] has quit [Quit: Leaving.] | 02:59 | |
-!- alesis-novik [~alesis@cpat001.wlan.net.ed.ac.uk] has quit [Quit: Leaving] | 04:03 | |
-!- f-x [~user@117.192.198.162] has quit [Ping timeout: 260 seconds] | 04:24 | |
-!- in3xes [~in3xes@180.149.49.227] has quit [Ping timeout: 240 seconds] | 06:54 | |
-!- in3xes [~in3xes@180.149.49.227] has joined #shogun | 07:19 | |
-!- gsomix [~gsomix@80.234.26.210] has joined #shogun | 07:54 | |
-!- gsomix [~gsomix@80.234.26.210] has quit [Client Quit] | 07:55 | |
CIA-87 | shogun: Shashwat Lal Das master * rd4aa6d9 / src/shogun/io/InputParser.h : Parser changed to use condition variable waiting in place of the spinlock. - https://github.com/shogun-toolbox/shogun/commit/d4aa6d9a084d32da144b94d2b8a7f3e1b5cd3e61 | 08:46 |
CIA-87 | shogun: Soeren Sonnenburg master * rf2b4092 / src/shogun/io/InputParser.h : | 08:46 |
CIA-87 | shogun: Merge pull request #215 from frx/streaming_1 | 08:46 |
CIA-87 | shogun: Use of pthread_cond_wait in place of busy-loop in parser - https://github.com/shogun-toolbox/shogun/commit/f2b409277e46e9a32fc428d63cda2673278fa62b | 08:46 |
CIA-87 | shogun: Soeren Sonnenburg master * r1e6799c / src/Makefile.template : | 09:16 |
CIA-87 | shogun: install modshogun in base level directory and only underneath in | 09:16 |
CIA-87 | shogun: shogun/* Kernel, Features, ... compatibility helpers - https://github.com/shogun-toolbox/shogun/commit/1e6799c31de7225cce2179692e93d492e39d056c | 09:16 |
-!- in3xes_ [~in3xes@180.149.49.227] has joined #shogun | 09:52 | |
-!- in3xes [~in3xes@180.149.49.227] has quit [Ping timeout: 240 seconds] | 09:55 | |
-!- in3xes_ is now known as in3xes | 10:02 | |
-!- heiko [~heiko@main.uni-duisburg.de] has joined #shogun | 10:50 | |
-!- blackburn [~blackburn@109.226.102.88] has joined #shogun | 12:20 | |
-!- tank_ [d96db927@gateway/web/freenode/ip.217.109.185.39] has joined #shogun | 12:21 | |
-!- tank_ [d96db927@gateway/web/freenode/ip.217.109.185.39] has quit [Quit: Page closed] | 12:37 | |
@sonney2k | heiko, can you still compile shogun? | 13:31 |
heiko | sonney2k, only my pull request | 13:46 |
heiko | did not checkout newly since then | 13:46 |
@sonney2k | heiko, I had to make another drastic change ... everything is in one module now | 13:47 |
@sonney2k | it is not user visible though due to some compatibility layers I did add | 13:47 |
heiko | ok | 13:47 |
heiko | should I checkout and try? | 13:48 |
@sonney2k | heiko, how much memory do you have? | 13:48 |
heiko | 1gb | 13:48 |
@sonney2k | urgs! | 13:48 |
@sonney2k | then you won't be able to compile ... | 13:48 |
heiko | well, i just got an email that my new modules were shipped today | 13:48 |
@sonney2k | do you have access to some more powerful machine? | 13:48 |
heiko | will receive them tomorrow or the day after | 13:48 |
heiko | then i will have 4gb | 13:48 |
@sonney2k | you need now ~2GB to compile shogun | 13:49 |
heiko | oh... | 13:49 |
heiko | mmh | 13:49 |
heiko | for which parts? | 13:49 |
heiko | libshogun consumes so much? | 13:49 |
@sonney2k | for any modular interface | 13:49 |
@sonney2k | libshogun - nope that is the only cheap part | 13:49 |
@sonney2k | if you don't do make -j 8 | 13:49 |
@sonney2k | or so :)( | 13:49 |
heiko | hehe ok :) | 13:50 |
heiko | well I will focus on libshogun the next days then | 13:51 |
heiko | i will do the kernel machine model stuff now anyways | 13:51 |
heiko | confidence intervals now work | 13:51 |
heiko | sonney2k, I got another question regarding this git stuff | 13:55 |
heiko | with the rebase | 13:55 |
heiko | how should I update my local repo? | 13:55 |
heiko | because git pull --rebase just pulls from my shogun fork | 13:56 |
heiko | how do I update the latter? | 13:56 |
@sonney2k | blackburn, you are the expert here | 13:57 |
@sonney2k | I would say some thing but using a different origin | 13:57 |
blackburn | don't understand the issue | 13:57 |
heiko | how to update my local repo? | 13:57 |
heiko | i used to do | 13:57 |
heiko | git fetch upstream | 13:57 |
heiko | git merge upstream/master | 13:57 |
heiko | my local repo is sync with my fork | 13:58 |
heiko | but not with the original shogun repo | 13:58 |
blackburn | ah | 13:58 |
blackburn | git pull origin | 13:58 |
heiko | rebase?? | 13:58 |
blackburn | I don't really understand why you became un-sync with your fork | 13:59 |
heiko | no I am synched with my fork | 13:59 |
blackburn | ah | 14:00 |
heiko | local repo and fork are synched | 14:00 |
blackburn | git pull upstream | 14:00 |
blackburn | :D | 14:00 |
heiko | with rebase or not? | 14:00 |
blackburn | have you any other branches? | 14:01 |
heiko | I would now do | 14:01 |
blackburn | or only master? | 14:01 |
heiko | branches | 14:01 |
blackburn | so just git pull upstream being on master branch | 14:01 |
blackburn | and then rebase your branches | 14:01 |
heiko | thanks, blackburn :) | 14:03 |
@bettyboo | lolomat | 14:03 |
@sonney2k | ok that is how I am also doing it with other git's | 14:05 |
-!- gsomix [~gsomix@109.169.132.112] has joined #shogun | 14:07 | |
heiko | sonney2k, libshogun just compiled fine :) | 14:08 |
@bettyboo | great | 14:08 |
heiko | i will get the new memory hopefully tomorrow | 14:09 |
@sonney2k | at least... | 14:09 |
heiko | funny coincidence that I just bought it :) | 14:09 |
@sonney2k | heiko, I only hope 4G is enough :D | 14:12 |
heiko | really? | 14:12 |
heiko | oh no :) | 14:12 |
heiko | thats my machine limit | 14:12 |
heiko | why exactly did you do this? | 14:12 |
@sonney2k | heiko, because we were running into new issues with swig | 14:13 |
@sonney2k | this time with lua | 14:13 |
heiko | mmh, ok | 14:14 |
@sonney2k | include order did matter if sth works or not | 14:14 |
@sonney2k | having a single module now with all the shogun stuff named 'modshogun' | 14:14 |
@sonney2k | resolves all the issues we ever had | 14:14 |
@sonney2k | i.e. octave_modular is now no longer crashing | 14:14 |
heiko | oh ok | 14:15 |
@sonney2k | the compiled module is smaller, java_modular just works nicely | 14:15 |
heiko | then its pretty cool :) | 14:15 |
@sonney2k | the only downside is that it is generating a wrapper that is 19MB or so in size | 14:15 |
@sonney2k | compiling this C++ code takes time and requires lots of memory | 14:16 |
heiko | sad that swig does not do any intelligent stuff there automatically | 14:16 |
@sonney2k | and that is only because we have soo many templated classes and when we do %template we get yet another copy | 14:16 |
@sonney2k | heiko, the way we did it before is the suggested way for large swig modules | 14:17 |
@sonney2k | (with %include / %import) | 14:17 |
@sonney2k | but either I did it wrong or it doesn't work reliably | 14:18 |
heiko | mmh | 14:18 |
heiko | ok the hopefully 4gb are enough | 14:18 |
heiko | did anyone complain already? | 14:18 |
@sonney2k | I don't know whether you noticed but we had zillions of %includes everywhere suddenly | 14:19 |
@sonney2k | the problem really is that in shogun there are no real seperate modules | 14:19 |
@sonney2k | I mean e.g. a kernel module needs to communicate with features | 14:19 |
@sonney2k | a distance can be created from a kernel | 14:19 |
@sonney2k | a file stream can be created from features | 14:20 |
@sonney2k | heiko, the most demanding interface is octave with about 3GB memory requirements... | 14:20 |
@sonney2k | because octave is written in C++ and heavy templated too | 14:21 |
@sonney2k | heiko, no noone complained and I think we can still reduce wrapper size quite a bit | 14:21 |
heiko | ok | 14:22 |
@sonney2k | the only other viable option is modules again | 14:22 |
@sonney2k | I just don't see how to do a clean split | 14:22 |
heiko | well, one will see if this brings more problems | 14:23 |
heiko | I mean the memory requirement is large but it is not infeasible | 14:23 |
@sonney2k | heiko, all examples etc everything stayed the same and still runs | 14:24 |
@sonney2k | ok java_modular examples will need a change but that lang is new anyways | 14:24 |
heiko | name ok for example: | 14:25 |
heiko | mathematics_statistics_confidence_intervals.cpp | 14:25 |
heiko | ? | 14:25 |
@sonney2k | a bit long | 14:25 |
@sonney2k | maybe drop statistics | 14:25 |
-!- blackburn [~blackburn@109.226.102.88] has quit [Ping timeout: 255 seconds] | 14:26 | |
-!- blackburn [~blackburn@188.122.253.215] has joined #shogun | 14:33 | |
heiko | then pull request incoming ... | 14:34 |
@sonney2k | heiko, btw did you check the gamma function of ALGLIB vs. lgamma etc? | 14:36 |
CIA-87 | shogun: Soeren Sonnenburg master * r43cd548 / (4 files in 2 dirs): | 14:36 |
CIA-87 | shogun: Merge pull request #216 from karlnapf/master | 14:36 |
CIA-87 | shogun: confidence interval example and some minor fixes (+6 more commits...) - https://github.com/shogun-toolbox/shogun/commit/43cd548bfad5a1d09b5fcf309962e3605445937e | 14:36 |
heiko | sonney2k, no I made ALGLIB use the lgamma | 14:37 |
@sonney2k | did you compare the test in alglib vs the one we have in shogun now? | 14:38 |
heiko | because if I used ALGLIB's gamma method, I had to add another 5 or 6 methods on which ALGLIB's gamma relies | 14:38 |
@sonney2k | do they give identical results? | 14:38 |
@sonney2k | or up to 1e-16 precision? | 14:38 |
heiko | at least, the results of the methods that use lgamma are correct | 14:38 |
heiko | but isnt gnu libc reliable? | 14:39 |
heiko | ALGLIB's design is a bit strange | 14:39 |
heiko | but i believe the reason that they do use their own impl is that its available for many languages | 14:40 |
@sonney2k | I don't trust anyone (no longer...) | 14:40 |
heiko | (also for those who do not have an lgamma impementation) | 14:40 |
@sonney2k | could be yes | 14:40 |
heiko | I thought it's more probable that I make a mistake when "inserting" the code into shogun than that the gamma function implementations are different | 14:41 |
heiko | the alglib routines have to be edited at a lot of places | 14:41 |
heiko | and in this numerical stuff, its very hard to find mistakes, which happen easily when porting | 14:41 |
heiko | I also did not compile alglib | 14:42 |
heiko | but the results of the methods which use lgamma are ok | 14:42 |
heiko | i compared to result of R | 14:42 |
@sonney2k | heiko, the t-test? | 14:42 |
heiko | yes | 14:42 |
@sonney2k | and I assume that looks good right? | 14:43 |
heiko | yes | 14:43 |
@sonney2k | up to which precision? | 14:43 |
heiko | let me check ... | 14:43 |
blackburn | hey where do we need this things? | 14:44 |
heiko | when evaluating a classifier via cross-validation | 14:44 |
@sonney2k | can you be a bit more specific? | 14:45 |
@sonney2k | I mean I know one computes mean and variance usually for these things | 14:45 |
heiko | if you do a cross-validation of the same classifier on the same data twice, the results are not equal | 14:46 |
heiko | because of the random splitting of the data | 14:46 |
heiko | so repeat the procedure | 14:46 |
heiko | and use mean | 14:46 |
heiko | but its cooler to have a confidence interval for the true mean | 14:46 |
heiko | so one can really precisely evaluate a classifier | 14:47 |
@sonney2k | I always used to use just std-deviation there | 14:47 |
@sonney2k | now you give the mean and ? | 14:48 |
heiko | one can specify a p-value and one gets the sample mean and the confidence interval for the true mean | 14:48 |
heiko | so it outputs: with 95% probability the true AUC of the classifier is X | 14:49 |
heiko | ehm sorry | 14:49 |
heiko | with 95% prob the true mean lies in [X,Y] | 14:49 |
@sonney2k | OK | 14:49 |
heiko | and if one wants tight intervals and high p-values | 14:49 |
heiko | one can use a large number of repetitions | 14:49 |
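The procedure heiko describes (repeat the cross-validation, then report a confidence interval for the true mean) can be sketched with the standard library alone. Note this sketch uses a normal quantile rather than the t-quantile that the t-test discussion above refers to, so it is only an approximation for small repetition counts; the accuracy values are made up:

```python
import math
from statistics import NormalDist, mean, stdev

def confidence_interval(samples, confidence=0.95):
    """Approximate CI for the true mean of repeated x-val accuracies.

    Uses a normal quantile; a t-quantile (as in the t-test discussed
    above) would give a slightly wider interval for few repetitions.
    """
    m = mean(samples)
    se = stdev(samples) / math.sqrt(len(samples))
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return m - z * se, m + z * se

# e.g. accuracies from 10 repetitions of a cross-validation run
accs = [0.91, 0.93, 0.90, 0.92, 0.94, 0.91, 0.92, 0.93, 0.90, 0.92]
lo, hi = confidence_interval(accs)
# with 95% probability the true mean accuracy lies in [lo, hi]
```

More repetitions shrink the standard error and therefore tighten the interval, which is exactly the trade-off heiko mentions.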
@sonney2k | that's neat - seems you are more expert in x-val business than I am | 14:51 |
@sonney2k | I mean you did already stratified x-val, the general one etc... | 14:51 |
@sonney2k | btw, cwidmer was suggesting yesterday that in addition to grid search random sampling would be a good thing | 14:52 |
@sonney2k | and also hot-starting of e.g. a grid-search | 14:52 |
@sonney2k | that is when you have done 50% of the experiments you could safely break and continue | 14:52 |
@sonney2k | if you knew the index | 14:52 |
heiko | i had these problems in my BA, having different evaluation results on same data; the confidence interval stuff is very useful for comparing classifiers. | 14:53 |
heiko | ah ok | 14:53 |
heiko | yes, that would be cool (hot starting) | 14:53 |
heiko | also nice: having this standard 2level grid-search. First coarse then fine | 14:54 |
@sonney2k | true | 14:54 |
@sonney2k | he also pointed me to some paragraph about random sampling | 14:54 |
heiko | but I will do the kernel machine stuff first. Because if this is not there there is no x-vall at all :) | 14:54 |
heiko | ok | 14:54 |
heiko | which one? | 14:54 |
heiko | i dont know much about random sampling | 14:54 |
@sonney2k | The other is a grid search, i.e., | 14:54 |
@sonney2k | choosing a set of values for each hyper-parameter and training and evaluating a model for | 14:54 |
@sonney2k | each combination of values for all the hyper-parameters. Both work well when the number | 14:54 |
@sonney2k | of hyper-parameters is small (e.g. 2 or 3) but break down when there are many more. | 14:55 |
@sonney2k | More systematic approaches are needed. An approach that we have found to scale better | 14:55 |
@sonney2k | is based on random search and greedy exploration. The idea of random search (Bergstra | 14:55 |
@sonney2k | and Bengio, 2011) is simple and can advantageously replace grid search. Instead of forming | 14:55 |
@sonney2k | a regular grid by choosing a small set of values for each hyper-parameter, one defines a | 14:55 |
@sonney2k | distribution from which to sample values for each hyper-parameter, e.g., the log of the | 14:55 |
@sonney2k | learning rate could be taken as uniform between log(0.1) and log(10^-6), or the log of the | 14:55 |
-!- sonney2k was kicked from #shogun by bettyboo [flood] | 14:55 | |
-!- sonney2k [~shogun@7nn.de] has joined #shogun | 14:55 | |
-!- mode/#shogun [+o sonney2k] by ChanServ | 14:55 | |
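The random-search idea from the pasted paragraph (sample each hyper-parameter from a distribution, e.g. log-uniform, instead of forming a regular grid) amounts to something like the following sketch; the parameter names and ranges are illustrative only:

```python
import math
import random

random.seed(0)

def sample_hyperparams():
    # log-uniform: uniform in log-space between 1e-6 and 0.1,
    # as in the learning-rate example from the quoted paragraph
    log_lr = random.uniform(math.log(1e-6), math.log(0.1))
    # e.g. an SVM-style regularization constant C in [0.1, 100]
    log_c = random.uniform(math.log(0.1), math.log(100.0))
    return {"learning_rate": math.exp(log_lr), "C": math.exp(log_c)}

# draw a fixed budget of random configurations instead of a full grid
configs = [sample_hyperparams() for _ in range(20)]
```

Unlike a grid, the budget here is independent of the number of hyper-parameters, which is why this scales better than the 2- or 3-parameter grids mentioned above.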
heiko | hi again :) | 14:55 |
heiko | Also, bisection would be cool for model selection. is about twice as fast as a 2-level grid-search and more precise | 14:55 |
@sonney2k | I didn't know betty got empowered again | 14:55 |
@bettyboo | sonney2k: sorry got disconnected again | 14:55 |
@sonney2k | har har | 14:55 |
heiko | hehe ;) | 14:56 |
@sonney2k | heiko, you are an R person right? | 14:56 |
@sonney2k | I wish this having one extension would help for R too | 14:56 |
heiko | sonney2k, yes | 14:56 |
heiko | but more on the statistics side, | 14:57 |
heiko | I once had a lot of struggle with C-code, called from R, arhgh! | 14:57 |
@sonney2k | yes sure from the R side not the extension side - that is what I meant | 15:00 |
@sonney2k | I would wish R was working rock stable too | 15:00 |
heiko | sonney2k, do you know by heart how to tell SG_PRINT to write more digits after the coma? | 15:00 |
@sonney2k | %.18g | 15:00 |
@sonney2k | 18 digits | 15:00 |
heiko | like this? | 15:02 |
heiko | SG_SPRINT("%.18g\n", 3.345678987654321234); | 15:02 |
heiko | works, thanks | 15:03 |
heiko | so precision is at least 10^-16 | 15:04 |
@sonney2k | perfect :) | 15:05 |
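The `%.18g` format sonney2k suggests (note the dot: 18 is a precision, i.e. significant digits, not a field width) is enough to expose a double's full stored value. A small standalone check, using plain `snprintf` instead of `SG_SPRINT`:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Render a double with 18 significant digits, as "%.18g" does.
// Exactly representable values print short ("0.5"), while values
// like 0.1 reveal their binary rounding error.
std::string full_precision(double x)
{
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%.18g", x);
    return std::string(buf);
}
```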
heiko | t.test in R and in shogun | 15:05 |
heiko | so, I will get a coffee and start on the kernel machine stuff then. | 15:06 |
@sonney2k | yes | 15:06 |
@sonney2k | my suggestion would be to just add a function to kernel / distance machine | 15:06 |
heiko | concrete? | 15:07 |
@sonney2k | that does a conversion of the features based on the support vector idx's | 15:07 |
@sonney2k | so it returns a new broken down feature object | 15:09 |
@sonney2k | I guess you could do this by setting the sv idx as subset of the features (but beware that when there is already a subset defined it needs to be a subset of the subset) | 15:09 |
@sonney2k | maybe best is to just have a get_feature_matrix_subset(subset) function | 15:10 |
@sonney2k | etc | 15:10 |
@sonney2k | then you only need to specify m_svs and overwrite features in some reshape code. | 15:11 |
* sonney2k is thinking how to do that in the most memory efficient way | 15:11 | |
@sonney2k | of course one more way would be to actually modify the input data | 15:12 |
@sonney2k | but that should be a separate function then (squeeze to subset inplace or so) | 15:12 |
@sonney2k | that could then in case of e.g. stringfeatures just delete[] the unneeded strings | 15:12 |
@sonney2k | then you need to set sv_idx to 0...len(svidx)-1 and all good | 15:14 |
@sonney2k | and we need a flag to disable/enable that behavior in kernel machine | 15:14 |
@sonney2k | it might be that someone doesn't want to do this due to memory overhead | 15:14 |
@sonney2k | that's it... | 15:15 |
* sonney2k thinks he is babbling too much w/o anyone listening | 15:15 | |
blackburn | I'm reading ;) | 15:16 |
-!- gsomix [~gsomix@109.169.132.112] has quit [Quit: Ухожу я от вас (xchat 2.4.5 или старше)] | 15:20 | |
* sonney2k thinks that heiko had some coffee machine incident | 15:33 | |
CIA-87 | shogun: Soeren Sonnenburg master * r9cd2e5e / (5 files in 2 dirs): | 15:35 |
CIA-87 | shogun: Merge pull request #217 from sploving/master | 15:35 |
CIA-87 | shogun: add some more lua examples - https://github.com/shogun-toolbox/shogun/commit/9cd2e5e1378c72c57b4a8711a7721d3bb28ae804 | 15:35 |
blackburn | sonney2k: what about CoffeeMachine? | 15:35 |
@sonney2k | CCoffeeMachine you mean | 15:35 |
blackburn | yeap | 15:35 |
@sonney2k | methods cook(), incident(), crash_and_burn() | 15:35 |
@sonney2k | blackburn, btw how far are you with the russian translation? | 15:40 |
blackburn | 5% I guess | 15:41 |
blackburn | :D | 15:41 |
@bettyboo | :) | 15:41 |
blackburn | we should update our english doc btw | 15:41 |
heiko | heavy coffee machine GAU here | 15:42 |
heiko | ;) | 15:42 |
@bettyboo | lol | 15:42 |
-!- gsomix [~gsomix@109.169.132.112] has joined #shogun | 15:44 | |
heiko | sonney2k, sorry, i wasn't listening, but just read through all of what you wrote :) | 15:44 |
heiko | sonney2k, all this index translation stuff is possible. | 15:45 |
heiko | but we also talked about kernel machines that just store their SVs | 15:46 |
heiko | (copies of them) | 15:46 |
@sonney2k | heiko, yeah but that is what I was also talking about | 16:05 |
@sonney2k | you just create a clone of the feature objects and store it | 16:05 |
heiko | yes ok | 16:05 |
heiko | but only the SV features | 16:06 |
@sonney2k | yes | 16:06 |
@sonney2k | that is what you need this index stuff for | 16:06 |
@sonney2k | subset -> features function | 16:06 |
heiko | ehm ... | 16:06 |
* heiko doesn't get it | 16:06 | |
heiko | the machine learns on all features | 16:07 |
heiko | then has the sv indices | 16:07 |
@sonney2k | yes, then you will have m_svs filled | 16:07 |
heiko | then copies all these features | 16:07 |
heiko | yes | 16:07 |
heiko | ah ok | 16:07 |
@sonney2k | that is the index with the non-zero alphas | 16:07 |
heiko | in the (possibly subsetted) features | 16:07 |
heiko | and then the conversion function to the LOCAL array of features (the copied features) | 16:08 |
@sonney2k | so you write a function in the feature object that enables you to get a subset of the subset copy of the features | 16:08 |
@sonney2k | you take these and store them locally in kernelmachine | 16:08 |
heiko | ok | 16:08 |
heiko | I think something like this is already there in some feature classes | 16:08 |
@sonney2k | and modify apply* code to use that | 16:08 |
heiko | ok then | 16:09 |
@sonney2k | don't forget to SG_REF the newly created one and SG_UNREF in destructor | 16:09 |
heiko | sure thing :) | 16:09 |
@sonney2k | this all has to be optional - to enable the old behavior in low-memory cases | 16:12 |
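The plan discussed above — copy only the support-vector columns, remap `m_svs` to 0..len(svidx)-1, and keep the behavior optional — can be sketched generically (hedged: `DenseFeatures` and `copy_sv_subset` are illustrative stand-ins, not the real Shogun `CFeatures` API, and the SG_REF/SG_UNREF ref-counting is omitted here):

```cpp
#include <cassert>
#include <vector>

// Minimal stand-in for a dense, column-major feature matrix:
// dim rows, data.size()/dim column vectors.
struct DenseFeatures
{
    int dim;
    std::vector<double> data;
};

// Copy only the support-vector columns into a new feature object
// and remap the SV indices in place to 0..n_sv-1, so apply* code
// can index straight into the reduced matrix.
DenseFeatures copy_sv_subset(const DenseFeatures& all,
                             std::vector<int>& sv_idx)
{
    DenseFeatures sub{all.dim, {}};
    sub.data.reserve(all.dim * sv_idx.size());
    for (std::size_t i = 0; i < sv_idx.size(); ++i)
    {
        const double* col = &all.data[sv_idx[i] * all.dim];
        sub.data.insert(sub.data.end(), col, col + all.dim);
        sv_idx[i] = static_cast<int>(i);  // remap to 0..len(svidx)-1
    }
    return sub;
}
```

Keeping a flag to skip this copy, as sonney2k notes, matters because the subset is an extra allocation on top of the training data in memory-constrained settings.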
CIA-87 | shogun: Soeren Sonnenburg master * reeec47b / src/interfaces/csharp_modular/swig_typemaps.i : comment these typemaps - they just don't work yet - https://github.com/shogun-toolbox/shogun/commit/eeec47b285abbed7c9123737589f015b03243c7f | 16:30 |
heiko | sonney2k, are you there? | 18:30 |
-!- heiko [~heiko@main.uni-duisburg.de] has quit [Ping timeout: 258 seconds] | 18:58 | |
CIA-87 | shogun: Alesis Novik master * rdc33d39 / (2 files in 2 dirs): Example and fix - https://github.com/shogun-toolbox/shogun/commit/dc33d39f1697de5fd52662911d1ac7e54b114264 | 19:57 |
CIA-87 | shogun: Soeren Sonnenburg master * ref357f2 / (2 files in 2 dirs): | 19:57 |
CIA-87 | shogun: Merge pull request #219 from alesis/gmm | 19:57 |
CIA-87 | shogun: Example and fix - https://github.com/shogun-toolbox/shogun/commit/ef357f2a6ff48da25c96f19b17242742c0a87555 | 19:57 |
CIA-87 | shogun: Soeren Sonnenburg master * r541e13e / (4 files in 3 dirs): | 19:59 |
CIA-87 | shogun: Merge pull request #218 from karlnapf/master | 19:59 |
CIA-87 | shogun: use SGVector for KernelMachine svs and alphas - https://github.com/shogun-toolbox/shogun/commit/541e13e1d053411e8b8e742fc405fcb4312bab38 | 19:59 |
-!- gsomix [~gsomix@109.169.132.112] has quit [Quit: Ухожу я от вас (xchat 2.4.5 или старше)] | 20:09 | |
-!- blackburn [~blackburn@188.122.253.215] has quit [Read error: Connection reset by peer] | 23:39 | |
-!- blackburn [~blackburn@188.122.253.215] has joined #shogun | 23:39 | |
--- Log closed Thu Jul 21 00:00:50 2011 |
Generated by irclog2html.py 2.10.0 by Marius Gedminas - find it at mg.pov.lt!