IRC logs of #shogun for Wednesday, 2017-06-14

--- Log opened Wed Jun 14 00:00:12 2017
-!- mikeling [uid89706@gateway/web/irccloud.com/x-gfijddxvyivyyzyj] has quit [Quit: Connection closed for inactivity]00:22
-!- olinguyen [81615ad9@gateway/web/freenode/ip.129.97.90.217] has quit [Quit: Page closed]00:42
-!- OXPHOX [92bd305b@gateway/web/freenode/ip.146.189.48.91] has quit [Quit: Page closed]01:13
-!- TingMiao [uid229534@gateway/web/irccloud.com/x-aexwfarcpagulqsp] has quit [Quit: Connection closed for inactivity]01:28
-!- lisitsyn_ [~lisitsyn@37.139.2.75] has joined #shogun03:02
-!- lisitsyn [~lisitsyn@37.139.2.75] has quit [Write error: Broken pipe]03:03
-!- lisitsyn_ is now known as lisitsyn03:03
-!- mikeling [uid89706@gateway/web/irccloud.com/x-vejvlfbugbsyqyxg] has joined #shogun04:20
-!- OXPHOS [401e476d@gateway/web/freenode/ip.64.30.71.109] has joined #shogun04:58
@sukeyPull Request #3832 "use std::vector instead of DynArray(on going)" - https://github.com/shogun-toolbox/shogun/pull/383205:09
@wikingmikeling, ping05:13
@wikingaround?05:13
mikelingwiking: pong05:13
mikelingyes...05:13
@wikingmikeling, so i was looking into the bug yesterday05:13
@wikingcouldn't nail it down where the memory thing goes bad05:14
mikelingwiking: I got some clue05:14
@wikingbut it's definitely some memory problem05:14
@wikingas gdb fails with a simple malloc05:15
mikelingactually I find, first stupid thing I do is shuffle(a.begin(), a.end())05:15
mikelingit will make all the elements become zero because05:15
@wiking?05:16
mikelingit will shuffle all the elements in that vector rather than only the elements being used05:16
@wikingok i dont understand :)05:16
@wikingcould you explain maybe with code or rephrase?05:17
mikelinglike, if we have vector and its size is 2005:17
@wikingyes05:17
@wikingstd::vector v(20); :)05:17
@wikingand you do std::shuffle(v.begin(), v.end());05:17
mikelingbut things in it actually is v{1,2,3,4,5,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0},05:18
mikelingbecause we only use the first 6 elements05:18
mikelingthe number of elements is 605:18
mikelingbut the size is 2005:18
mikelingsee?05:18
@wikingah ok05:18
mikelingso I barely found the problem05:18
@wikingso you are saying std::shuffle(v.begin(), v.begin()+num_elements)05:18
@wiking?05:18
mikelingI really don't know why it failed until I output them one by one05:18
mikelingyep05:19
mikelingit works05:19
mikelingand I'm working on the other one05:19
@wikingi see :)05:19
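A minimal sketch of the fix discussed above, assuming the array is backed by a std::vector whose size() is the allocated length and a separate count tracks the slots actually in use. The helper name shuffle_used and the RNG parameter are hypothetical and not taken from the Shogun sources; note that std::shuffle always needs a random number generator as its third argument.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: shuffle only the first num_elements slots of a
// buffer whose size() may be larger than the number of slots in use.
void shuffle_used(std::vector<int>& buffer, std::size_t num_elements,
                  std::mt19937& rng)
{
    assert(num_elements <= buffer.size());
    // Shuffling up to buffer.end() would mix the unused zero-filled tail
    // into the used part, which is the bug described in the chat.
    std::shuffle(buffer.begin(),
                 buffer.begin() + static_cast<std::ptrdiff_t>(num_elements),
                 rng);
}
```

With a buffer of size 20 and 6 used slots, the 14 trailing zeros stay where they are and only the first 6 values are permuted.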
mikelinganother problem s05:19
mikelingis05:19
mikeling(let me paste it on somewhere05:20
@wikingk05:21
mikelinghere is the output of CrossValidation_multithread.LibSVM_unlocked https://pastebin.com/0xu7PP2a. And the vector in line 17 is the breakpoint in https://github.com/shogun-toolbox/shogun/blob/develop/src/shogun/evaluation/CrossValidation.cpp#L30805:33
mikelingyou can see the last element05:33
mikelingbecome extremely large05:33
mikelingwhich makes things go wrong I guess05:33
@wikingyes05:37
mikelingSplittingStrategy looks like the starting point that broke everything, and I displayed things like https://pastebin.com/hnHcxwpJ05:39
mikelingyou can see05:39
mikelingthere is always an element that looks like null05:39
mikelingor something else05:39
mikelingI guess I'm trying to figure it out now05:41
mikeling* I'm trying to figure it out now05:42
mikelingall the output is in StratifiedCrossValidationSplitting.cpp like  https://github.com/shogun-toolbox/shogun/blob/develop/src/shogun/evaluation/StratifiedCrossValidationSplitting.cpp#L11705:44
-!- sonney2k [~shogun@7nn.de] has quit [Ping timeout: 260 seconds]06:30
-!- sonney2k [~shogun@7nn.de] has joined #shogun06:31
mikelingwiking: ping06:35
mikelingI got a question06:35
@wikingpong06:35
mikelingif we have a SGVector like SGVector<10, true>06:35
mikelingwhat's the content in there? I will got 10 random element in there at begining ?06:36
mikelingbeginning06:36
@wikingyes06:36
@wikingif you want it to be all 006:36
@wikingthen you need to call .zero()06:36
@wikingas it just calls SG_MALLOC06:36
mikelingalright, so I probably know why we have those extremely large number in the list06:37
@wikingbut we didnt touch SGVector :P06:37
mikelingno06:39
mikelingnot directly related with it, yes. But https://github.com/shogun-toolbox/shogun/blob/develop/src/shogun/evaluation/SplittingStrategy.cpp#L11206:39
mikelingwhat if I haven't initialized result with the right length06:40
mikelingby returning a wrong num_elements of to_invert06:40
@wikinghow?06:41
@wikingi  mean ok06:41
@wikinglook06:41
@wikingthere's an assertion missing06:42
@wikingbecause06:42
@wikingwhat if to_invert->find_element(i) returns with -1 more times than result.vlen06:43
mikelingmmmm06:43
@wikingyou see what i mean?06:43
mikelingok,06:44
mikelingI see06:44
mikelingforget it :)06:44
@wikingso say to_invert contains only -106:44
@wikingin that case there's gonna be a serious problem06:44
@wikingas if (to_invert->find_element(i)==-1) will be true06:44
@wikingalways06:44
@wikingso it wants to set result m_labels->get_num_labels() times06:45
@wikingwhich is not possible06:45
@wikingbecause the size of result is only m_labels->get_num_labels()-to_invert->get_num_elements()06:45
@wiking:S06:45
mikelingmmmm alright06:45
mikelingI see06:45
@wikingso there should be a check06:46
@wikingthat06:46
@wikingindex < result.vlen06:46
@wikingbecause if that's not the case06:46
@wikingthere will be a memory problem06:46
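The missing check can be sketched as follows. This is a standard-library stand-in, not the actual code in SplittingStrategy.cpp: invert_indices is a hypothetical name, std::vector replaces both the dynamic array and SGVector, and throwing an exception is just one possible way to report the violated invariant.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the inversion step: return every index in
// [0, num_labels) that does not occur in to_invert.
std::vector<int> invert_indices(const std::vector<int>& to_invert,
                                int num_labels)
{
    assert(num_labels >= 0);
    if (to_invert.size() > static_cast<std::size_t>(num_labels))
        throw std::invalid_argument("to_invert is larger than the label set");

    // Sized as in the discussion: num_labels - to_invert.size().
    std::vector<int> result(static_cast<std::size_t>(num_labels) -
                            to_invert.size());
    std::size_t index = 0;
    for (int i = 0; i < num_labels; ++i)
    {
        const bool found =
            std::find(to_invert.begin(), to_invert.end(), i) != to_invert.end();
        if (found)
            continue;
        // The guard asked for in the chat: never write past the end of result.
        if (index >= result.size())
            throw std::out_of_range(
                "to_invert has duplicate or out-of-range entries");
        result[index++] = i;
    }
    return result;
}
```

If to_invert contains only -1 entries, no index is ever found in it, so without the guard the loop would try to write num_labels values into a shorter result.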
@sukeyIssue #3844 "Out of memory error caused by linalg with vs2017 " opened by OXPHOS - https://github.com/shogun-toolbox/shogun/issues/384407:00
ironstarkwiking: I have anaconda installed on my system and I am using the python that comes bundled with that07:12
@wikingironstark, ok i see07:14
@wikingif you can give me instructions07:14
@wikinghow to setup your setup07:14
@wikingi can replicate it locally07:14
@wikingand then i can fix it07:14
@wikingironstark, cloud.shogun.ml is actually running on such setup07:15
@wikingso it should work07:15
@wiking;)07:15
ironstark:) I just installed it using the following command bash  ~/Downloads/Anaconda3-4.4.0-Linux-x86_64.sh07:21
@wikingok cool07:35
@wikingthnx07:35
-!- OXPHOS [401e476d@gateway/web/freenode/ip.64.30.71.109] has quit [Ping timeout: 260 seconds]07:59
-!- geektoni [~geektoni@93-34-234-212.ip52.fastwebnet.it] has joined #shogun09:01
mikelingwiking: oh my god09:16
@wikingwhat'sup? :)09:17
mikelingstd::vector's implementation cheat me again09:17
@wikinglol09:18
mikelingso, when I use std::find09:18
mikelingstd::find(m_array.begin(), m_array.end(), e);09:18
@wikingyes09:18
mikelingit doesn't work actually, because all the elements in the array will have been initialized to 009:18
@wiking?09:19
@wikingwhat do you mean by it doesn't work09:19
@wikingstd::vector<int> v = {0, 1, 2, 3, 4}09:19
@wikingshould return you v.begin()09:19
@wikingif you do find(v.begin(), v.end(), 0)09:20
@wikingright?09:20
mikelingno, actually it will be std::vector<int> v = {0, 1, 2, 3, 4, 0,0,0,0,0} if it has size 1009:20
mikelingyep09:20
@wikingah you mean that you should actually only do find09:20
@wikingif num_elements > 009:20
@wikingotherwise return -109:20
@wiking?09:20
@wikingand just do the09:20
mikelingso, if I want it return -1 for std::vector<int> v = {5, 1, 2, 3, 4}09:21
mikelingyes09:21
@wikingfind(v.begin(), v.begin()+num_elements, e)09:21
mikelingyes.....09:21
@wikingokok09:21
@wikingi see09:21
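A minimal sketch of the bounded search agreed on above, assuming the same layout as before: a std::vector buffer plus a separate count of used slots. The free function find_element here is a hypothetical stand-in for the member function of the same name, not its real implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical find_element: search only the num_elements slots in use and
// return the position of e, or -1 when it is absent.
int find_element(const std::vector<int>& buffer, std::size_t num_elements,
                 int e)
{
    assert(num_elements <= buffer.size());
    const auto last =
        buffer.begin() + static_cast<std::ptrdiff_t>(num_elements);
    const auto it = std::find(buffer.begin(), last, e);
    return it == last ? -1 : static_cast<int>(it - buffer.begin());
}
```

For a buffer {5, 1, 2, 3, 4, 0, 0, 0, 0, 0} with 5 used slots, searching for 0 up to buffer.end() would report position 5, while the bounded search reports -1.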
micmnwiking: when you have a minute09:23
@wikingmicmn, here09:23
@wikingwrite and i'll try to get back asap09:23
micmncan you explain me why in linalg do we need that macro/define_for_all_types thing instead of using templates?09:23
@wikingmicmn, how :)09:24
@wikingthat was the problem09:24
@wiking:>09:24
@wikingi wanted to have templates as well09:24
@wikingas it's nicer09:25
micmnwhich is the problem exactly?09:26
@wikinghow do you do it09:29
micmni mean: template <typename T> virtual void add_scalar(SGMatrix<T>& a, T b)09:32
micmnlike in linalgnamespace.h09:33
@wikingand so you would have09:34
@wikingvirtual void add(SGVector<T>& a, SGVector<T>& b, T alpha, T beta, SGVector<T>& result);09:35
@wikingvirtual void add(SGMatrix<T>& a, SGMatrix<T>& b, T alpha, T beta, SGMatrix<T>& result);09:35
@wiking?09:35
@wikingof course i myself as well cannot recall why the hell we had to do that macro hack09:36
@wikingbut there was a particular reason for it :S09:36
@wikingah yeah09:37
@wikinginheritance :D09:37
-!- HeikoS [~heiko@host-92-0-178-129.as43234.net] has joined #shogun09:37
-!- mode/#shogun [+o HeikoS] by ChanServ09:37
@wikingmicmn, so templating virtual functions is09:38
@wikingbla09:38
@wikingthat's why :09:39
@wiking:(09:39
micmnok I'll read something about that thx09:40
@wikingif you find a way around that09:41
@wikingthen we should definitely clear that macro hack09:41
@wiking:)09:41
@wikingas it's obviously super awful09:42
micmnnonetheless compiling all that stuff for each file that includes linalg it's insane XD09:42
@wikingindeed09:42
@wikingit is very fucking insane09:42
@wikingbad design :(09:43
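The constraint behind the macro is that C++ does not allow a member function template to be virtual. A rough sketch of the workaround, assuming a small hypothetical type list (FOR_ALL_SKETCH_TYPES stands in for the real define_for_all_types macro, and Backend/CpuBackend are illustrative names, not Shogun classes):

```cpp
#include <vector>

// A member function template cannot be virtual, so this is ill-formed:
//   template <typename T> virtual void add_scalar(std::vector<T>& a, T b);
// One workaround is to stamp out one virtual overload per supported type.

// Hypothetical type list for this sketch.
#define FOR_ALL_SKETCH_TYPES(METHOD) \
    METHOD(int)                      \
    METHOD(float)                    \
    METHOD(double)

struct Backend
{
    virtual ~Backend() = default;
#define DECLARE_ADD_SCALAR(T) \
    virtual void add_scalar(std::vector<T>& a, T b) const = 0;
    FOR_ALL_SKETCH_TYPES(DECLARE_ADD_SCALAR)
#undef DECLARE_ADD_SCALAR
};

struct CpuBackend : Backend
{
#define DEFINE_ADD_SCALAR(T)                               \
    void add_scalar(std::vector<T>& a, T b) const override \
    {                                                      \
        add_scalar_impl(a, b);                             \
    }
    FOR_ALL_SKETCH_TYPES(DEFINE_ADD_SCALAR)
#undef DEFINE_ADD_SCALAR

private:
    // The shared logic can still be an ordinary, non-virtual template.
    template <typename T>
    void add_scalar_impl(std::vector<T>& a, T b) const
    {
        for (auto& x : a)
            x += b;
    }
};
```

Callers still get dynamic dispatch through Backend, and the per-type boilerplate is limited to the forwarding overloads.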
@sukeyNew branch feature/premature-stopping created on shogun-toolbox/shogun09:43
@sukeyNew Commit "Merge pull request #3833 from MikeLing/add_unittest_for_CDynamicArray09:43
@wikinggeektoni, ^09:43
@sukeyunit test for DynamicArray" to shogun-toolbox/shogun by vigsterkr: https://github.com/shogun-toolbox/shogun/commit/9efa3b77147ccbab345c76832fc6fe4534dd4dab09:43
geektoniwiking: thnx ;)09:44
-!- HeikoS [~heiko@host-92-0-178-129.as43234.net] has quit [Ping timeout: 268 seconds]09:50
-!- johklu [c1abba08@gateway/web/freenode/ip.193.171.186.8] has joined #shogun09:54
johkluwiking, hi, I had a closer look at the saved svm in ascii format09:58
johkluthere is a section called "dictionary weights"09:59
johkluwhich seems to be what I'm looking for (the feature weights for the k-mers assigned when training the svm)10:00
johkluhowever it only contains {0}10:00
johkluno proper numbers10:00
johkluDo you think something went wrong while exporting, or is that section something completely different?10:01
@wikingmmm10:01
@wikingjohklu, you are using CommWordStringKernel right?10:02
johkluyes10:04
@wikingokey10:04
@wikingso once you have everything trained10:05
@wikingyou should be able to get the weights10:05
@wikingby10:05
johkluI was thinking, maybe I should use svm$set_store_model_features(TRUE)10:05
johklubefore svm$train()10:05
@wikingkernel$get_dictionary(size, weights);10:05
@wikingwhere size is an integer10:06
@wikingweights is gonna be a float array10:06
@wikingi'm just wondering how this would look like in R10:06
@wikingjust a sec10:06
@wikingbecause the c++ function looks like this: void get_dictionary(int32_t& dsize, float64_t*& dweights)10:07
@wikingi'm not so sure if R can handle that10:07
@wiking:S10:07
@wikingprobably not10:07
@wikingjohklu, it should be enough to basically serialize the kernel10:08
@wikingafter training10:08
@wikingas that one should contain the data10:08
@wikingwhat you are looking for10:08
@wikingjohklu, after train10:09
@wikingjohklu, call this kernel$print_serializable()10:09
@wikingand see what's on the output10:10
johkluok10:10
johkluwiking, it looks like a log10:12
@wikingyes10:12
johkluwith definitions10:12
@wikingit should dump a lot of things10:12
johkluno numbers10:12
@wikingmmm10:13
@wikingok try then just to serialise it to a file10:13
@wikingso10:13
@wikingkernel$save_serializable(....10:14
@wikingthe same way you did with the svm10:14
@wikingmake sure you save after svm$train10:14
johkluoh, sorry10:14
johklui thought by kernel you meant the svm10:14
johklui'll try again with the kernel10:14
johkluok, kernel$print_serializable() looks pretty much the same10:16
@wikingtry saving it10:17
johkludone10:24
johklulooks pretty much the same though10:24
johkluexcept for the label section at the top10:24
mikelingwiking: ping10:25
mikelingping10:25
mikelingall the tests passed !https://pastebin.mozilla.org/902455710:26
* mikeling started crying 10:27
lisitsynhaha congrats10:28
mikelinglisitsyn: thank you!10:30
johkluwiking, don't you think i might need svm$set_store_model_features(TRUE)10:33
johkluwhat is it for?10:33
-!- travis-ci [~travis-ci@ec2-54-224-88-30.compute-1.amazonaws.com] has joined #shogun10:35
travis-ciit's Viktor Gal's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: https://travis-ci.org/shogun-toolbox/shogun/builds/24272391410:35
-!- travis-ci [~travis-ci@ec2-54-224-88-30.compute-1.amazonaws.com] has left #shogun []10:35
@wikingmikeling, ? :)10:36
@wikingmikeling, so what was it?10:36
mikelingwiking: all the tests passed10:36
mikelinghttps://pastebin.mozilla.org/902455710:36
@wikingi mean what was the bug? :)10:36
mikelingno bug comes out for now10:37
mikeling:)10:37
mikelingwiking: oh10:37
mikelingyou mean what's the bug? right?10:37
mikelingok, so the bug is in find_element10:37
mikelinglike I said, the std::find(m_array.begin(), m_array.end(), e) will search all the element10:38
@wiking:)10:38
@wikingok10:38
@wikingso extend the unit test with that10:38
@wikingwith a case where if we would use10:38
mikelingok, I will10:38
@wikingstd::find(m_array.begin(), m_array.end(), e)10:39
@wikingthen that test would fail10:39
@wikingthis way the next time somebody touches DynamicArray10:39
@wikingthis pops up right away10:39
mikelingwiking: I see, I will do it right away :)!10:40
@wikinggood10:40
@wikingthen push it into the pr10:40
mikelingand I owe you a blog10:41
@wikingand let's see what happens on CIs :)))10:41
mikelingand a weekly report10:41
@wikingyeah let's have first this pushed into the PR10:41
mikelingok10:41
@wikingand have it all green10:41
@wikingand then you can write a lot of things10:41
@wikingwhat have you learnt about c++ :))10:41
@wikingand shogun :D10:41
@wikingjohklu, you could try10:42
@wikingit wont hurt :)10:42
@wikingbut imo it's not that10:42
@wikinglisitsyn, m_parameters->add_vector(&dictionary_weights, &dictionary_size, "dictionary_weights",10:42
@wiking"Dictionary for applying kernel.");10:42
@wikingthis should add the dictionary_weights to the parameter fw and it should be serialized10:42
@wikingright?10:42
lisitsynwell I'd guess so10:43
johkluwiking, I tried, but it gives this error: get_num_bits_in_histogram()=4 > get_num_bits()=3 [ERROR] In file /scratch/adm_informatics/rootbuild/shogun/shogun-shogun_6.0.0/src/shogun/features/Alphabet.cpp line 647: ALPHABET too small to contain all symbols in histogram10:45
-!- HeikoS [~heiko@89.105.104.229] has joined #shogun10:45
-!- mode/#shogun [+o HeikoS] by ChanServ10:45
@wikingHeikoS, ping10:45
johkluthese are the commands: svm$set_store_model_features(TRUE)10:45
johklusvm$train()10:45
johkluand then comes the error10:45
@wikingjohklu, how do you set the kernel for the svm?10:45
@wikingor where?10:45
@HeikoSwiking: pong, sorry cant talk as of now10:45
@HeikoShas to be in 2 hrs10:46
@wikingok i'll email u10:46
@HeikoSkk10:46
-!- HeikoS [~heiko@89.105.104.229] has quit [Remote host closed the connection]10:46
-!- WangWang [uid231047@gateway/web/irccloud.com/x-mlezjzdzjfwklcrd] has quit [Quit: Connection closed for inactivity]10:46
johkluwiking, kernel <- CommWordStringKernel(feats, feats, param$usesign)10:46
johklusvm <- SVMLight(param$C, kernel, labels)10:46
@wikingok10:47
@wikingweird10:47
@wikingjohklu, but the trained model is good?10:47
@wikingsvm$apply gives you any good result?10:47
@wikingor any reasonable result? :)10:48
johkluyes10:49
@wikingalthough whatever... the10:49
johkluauc > 0.810:49
@wikingweight should be there10:49
@wiking:S10:49
@wikingoh good10:49
@wikingjohklu, btw since we have barely users10:49
@wikingcan you share in what context are you using shogun?:)))10:49
@wiking:D10:49
@wikingor at least our users barely contact us :)10:50
@wikingif you can share10:50
@wikingjohklu, lemme try to see if i can get it work somehow on my end with some dummy data10:50
johkluits in the context of genomics10:52
johkluI have sequences of ~ 50 bases (A, G, C, or T) in length10:52
johklueach sequence is either categorized as methylated or unmethylated10:53
johkluand I want to predict the methylation status from the sequence10:53
johklusince the prediction seems quite successful (auc >0.8)10:54
@wiking:)10:54
johkluI would like to know the k-mers within my sequences that are the most important for prediction10:54
@wikingcool10:54
@wikingawesome application :D10:54
@wikinglemme try to help10:55
@wikingwill need some minutes10:55
@wikingok?10:55
johklu:) Thanks!10:55
johkluI could also give you the model or data if that would help10:57
@sukeyPull Request #3845 "[PrematureStopping] Add CMake support to search or install RxCpp."  opened by geektoni - https://github.com/shogun-toolbox/shogun/pull/384510:58
@wikingjohklu, data would definitely help :)10:59
johkluwiking, i could give you a matrix containing the sequences in one column and the labels (-1,1) in another11:02
@wikingsure11:02
johkluok, give me some minutes and I will give you a link to download11:03
@wikingk11:03
johkluwiking, here you can download the data: http://medical-epigenomics.org/bocklab/jklughammer/share/sequence_table.tsv11:33
@wikingjohklu, great11:39
@wikingyou used the whole dataset for training?11:39
johkluno, half of it for training and the other half for testing11:42
@wikingk11:42
@sukeyPull Request #3845 "[PrematureStopping] Add CMake support to search or install RxCpp."  synchronized by geektoni - https://github.com/shogun-toolbox/shogun/pull/384511:47
-!- geektoni [~geektoni@93-34-234-212.ip52.fastwebnet.it] has quit [Quit: Leaving.]13:00
-!- geektoni [~geektoni@93-34-234-212.ip52.fastwebnet.it] has joined #shogun13:00
-!- geektoni [~geektoni@93-34-234-212.ip52.fastwebnet.it] has quit [Client Quit]13:00
@iglesiasglisitsyn geektoni, any news about swig and some?13:12
-!- geektoni [~geektoni@93-34-234-212.ip52.fastwebnet.it] has joined #shogun14:19
geektoniiglesiasg: nope, no news14:20
@wikingmikeling, ping14:31
mikelingwiking: pong14:40
@wikingany news on your push?14:43
@sukeyPull Request #3832 "use std::vector instead of DynArray(on going)"  synchronized by MikeLing - https://github.com/shogun-toolbox/shogun/pull/383214:43
-!- leagoetz [~leagoetz@pat-231-65.external.eduroam.ucl.ac.uk] has joined #shogun14:46
@wikingmikeling, does this run for you without error14:54
@wiking?14:55
mikelingwiking: For unit test, yes.  I found it has a serialization error, but I don't know how to solve it.14:55
-!- leagoetz [~leagoetz@pat-231-65.external.eduroam.ucl.ac.uk] has quit [Remote host closed the connection]15:02
mikelingwiking: I write a mock test for the serialization https://gist.github.com/MikeLing/665b961fae759a58535ac07b1b93e39a, but it passed without error15:02
-!- leagoetz [~leagoetz@pat-231-65.external.eduroam.ucl.ac.uk] has joined #shogun15:03
-!- leagoetz [~leagoetz@pat-231-65.external.eduroam.ucl.ac.uk] has quit [Remote host closed the connection]15:06
-!- leagoetz [~leagoetz@eduroam-int-pat-8-52.ucl.ac.uk] has joined #shogun15:24
-!- tctara_ [~quassel@128.199.61.169] has quit [Ping timeout: 245 seconds]15:39
-!- leagoetz [~leagoetz@eduroam-int-pat-8-52.ucl.ac.uk] has quit [Remote host closed the connection]15:47
-!- HeikoS [~heiko@untrust-out.swc.ucl.ac.uk] has joined #shogun16:12
-!- mode/#shogun [+o HeikoS] by ChanServ16:12
-!- leagoetz [~leagoetz@eduroam-int-pat-8-52.ucl.ac.uk] has joined #shogun16:15
micmnwiking: ping16:20
@wikingpong16:20
@HeikoSwiking: jojo16:20
micmnso the linalg insanity is due mostly to the combination of two things16:21
micmnit's header only plus the define_for_all_type macro16:21
micmnhence each translation unit recompile EVERYTHING16:22
@wikingyes16:22
@wikingheader libraries tend to have that drawback16:22
micmnI see no obstacles in separating the implementation from the headers16:22
micmnis there any reason for not doing that?16:23
@wikingimo no...16:24
micmnin fact I tried splitting the eigen implementation and now the files that include linalg compile in reasonable time16:26
micmnI'll push to my repo and put a link in the journal16:27
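One common way to separate a template's implementation from its header is explicit instantiation. The sketch below is only an illustration of that general technique, shown in a single file with comments marking what would go where; it is not micmn's actual patch, and dot_product and the file names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// ---- would live in a header, e.g. a hypothetical linalg_sketch.h ----
// Declaration only: files that include it do not recompile the body.
template <typename T>
T dot_product(const std::vector<T>& a, const std::vector<T>& b);

// Promise that these instantiations exist in some translation unit.
extern template float dot_product<float>(const std::vector<float>&,
                                         const std::vector<float>&);
extern template double dot_product<double>(const std::vector<double>&,
                                           const std::vector<double>&);

// ---- would live in one source file, e.g. a hypothetical linalg_sketch.cpp ----
template <typename T>
T dot_product(const std::vector<T>& a, const std::vector<T>& b)
{
    T sum = T();
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i)
        sum += a[i] * b[i];
    return sum;
}

// Explicit instantiation: the body is compiled once, here, for each type.
template float dot_product<float>(const std::vector<float>&,
                                  const std::vector<float>&);
template double dot_product<double>(const std::vector<double>&,
                                    const std::vector<double>&);
```

The trade-off is that only the listed types are available to callers, which matches a design that already enumerates its supported types in a macro.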
-!- leagoetz [~leagoetz@eduroam-int-pat-8-52.ucl.ac.uk] has quit [Remote host closed the connection]16:31
-!- leagoetz [~leagoetz@eduroam-int-pat-8-52.ucl.ac.uk] has joined #shogun16:32
-!- leagoetz [~leagoetz@eduroam-int-pat-8-52.ucl.ac.uk] has quit []16:45
@wikingmicmn, sounds like a plan16:46
-!- olinguyen [81615ad9@gateway/web/freenode/ip.129.97.90.217] has joined #shogun16:59
@wikingolinguyen, ping?17:05
olinguyenhi!17:08
olinguyenwiking: i'm here17:08
@wikingolinguyen,ok i realised i wrote u an email17:09
@wiking:DD17:09
@wikingnevermind17:09
olinguyenyea, I got it :). I'll add XGboost17:09
@wikingthnx17:09
-!- tctara_ [~quassel@128.199.61.169] has joined #shogun17:22
-!- yamz [400789b6@gateway/web/freenode/ip.64.7.137.182] has joined #shogun17:24
yamzHi all, I  am using the C++ interface with a GaussianNaiveBayesModel. I am wondering if it is possible to save/serialize a trained model to disk.17:37
yamzI've found the save_serializable() function, but it does not save any of the trained state. My goal is to have my application which does machine learning not have to retrain models at application startup17:39
yamzthank you17:39
@HeikoSyamz: that should definitely work17:51
@HeikoSCan you put your code up as a github issue so that we can investigate?17:51
@HeikoSgeektoni, micmn, mikeling one of you guys should definitely look into this at some point with wiking. *Any* Shogun model should be serializable. I think we even had some sort of unit test started for that....17:52
@wikingyamz, should work17:53
@wikingyamz, can u share the part of the code17:53
@wikingthat does the serialization17:53
@wiking?17:53
@wikingjohklu, ok so17:53
@wikingone more question17:53
@wikingsorry had a long working day17:53
@wikingjohklu, can u mail me your R snippet for features and training svm?17:54
@HeikoSolinguyen, wiking you mean SGBoost using a python framework right?17:55
@HeikoSwiking: I will talk to OXPHOS now about linalg17:57
@HeikoSwiking: wanna join?17:57
olinguyenwiking, HeikoS: were you referring to this XGBoost library (https://github.com/dmlc/xgboost) popularly used in Kaggle competitions?17:57
@wikingHeikoS, there's python wrapper for xgboost17:57
@wikingolinguyen, yes17:59
@wikingHeikoS, where?17:59
@HeikoSwiking: hangout17:59
@wikingyamz, if u can share the snippet of serialization we might be able to help17:59
@wikingHeikoS, 11:57pm17:59
@HeikoSwiking: you dont have to17:59
@HeikoSwiking: just asking18:00
@wikingHeikoS, irc i can18:00
@wikingtalking i cannot18:00
@HeikoSah I see18:00
@HeikoSwe will talk18:00
@HeikoSmore efficient18:00
micmnHeikos: *Any* Shogun model should be serializable, yeah I was working on that sometime ago https://github.com/shogun-toolbox/shogun/pull/375118:01
@wikingmicmn, i've pinged u on that :)18:01
@wikingHeikoS, micmn should join18:01
@HeikoSmicmn: maybe a good idea to pick that up soon, especially the test at least18:01
@wikingif he can18:01
@HeikoSwiking: for the meeting?18:01
@HeikoSok18:01
@wikingas he had some ideas mentioned previously18:01
@wikingin fact18:01
@HeikoSmicmn: you want to join a hangout with Pan and me discussing linalg stuff?18:01
@wikinghe has a proposal18:01
@wikingmicmn, right?18:01
@HeikoShttps://gist.github.com/lisitsyn/a6d8ff6e8690431f967c5318c3750919#file-gistfile1-txt-L12918:01
@HeikoSwiking: so I was gonna talk about this idea here18:01
@HeikoSi.e. have un-templated linalg interface18:02
-!- OXPHOS [92bd15c8@gateway/web/freenode/ip.146.189.21.200] has joined #shogun18:02
@wikingyeah18:02
@HeikoSso that the algos do not need to have templates inside18:02
@wikingbut that requires18:02
@HeikoSwiking: there is a few open questions ..18:02
@wikingrefactor18:02
@wikingof the features18:02
@HeikoSwiking: I know18:02
@wiking:)18:02
@HeikoSfeatures?18:02
@wikingyeah that one as well18:02
@HeikoSyeah sure18:03
@wikingand we need Matrix18:03
@wikingand Vector18:03
@HeikoSit requires quite a bit of stuff18:03
@wikingclasses18:03
@HeikoSbut would be good to design that soon18:03
@wikingi mean to decouple from18:03
@wikingSGV18:03
@wikingand SGM18:03
@HeikoSwe can keep templated linalg for now, and then transition once the design is done18:03
@HeikoSwiking: yeah features need iterator access18:03
@wikingyeah18:03
@wikingindeed18:03
@HeikoSno explicit vectors/matrices18:03
@wikingbut18:03
@wikingyeah18:03
@HeikoSOXPHOS: hi there18:04
@wikingmicmn, can u share the idea u had?18:04
@HeikoSmicmn: you wanna join the meeting ?18:04
@wikinganything that is half working is fine as well18:04
micmnsorry18:04
OXPHOSHeikoS hi18:04
@wikingjust to see what is your take on the refactor18:04
@wikingof the horrendous18:04
@wikinglinalg18:04
@HeikoSmicmn: feel free to share things, this is just about brainstorming a few ideas and get OXPHOS up to speed with the latest sruff18:04
@HeikoSstuff18:04
micmnterrible headache I don't think I'm able to join the meeting18:04
@HeikoSmicmn: ok dont worry18:05
@HeikoSOXPHOS: lets just talk the two of us then18:05
@HeikoSill call you18:05
@HeikoSsee you later wiking, micmn18:05
OXPHOSHeikoS sure18:05
@wikingmicmn, no worries18:05
@wikingmicmn, in any case u have something18:06
@wikingfor HeikoS and OXPHOS that'd be great18:06
micmnyep18:06
yamz@wiking18:06
yamzauto features_train = some<CDenseFeatures<float64_t>>(f_feats_train); auto labels_train = some<CMulticlassLabels>(f_labels_train); auto gnb = some<CGaussianNaiveBayes>(features_train, labels_train); gnb->train(); auto saveFile = some<CSerializableAsciiFile>("/home/myamada/tmp/shogun_model2.out", 'w'); gnb->save_serializable(saveFile);18:06
micmnI'll make a recap18:06
yamzoops18:06
@wikingyamz, it's good18:06
yamzso i've noticed the save_serializable() function has the same output regardless of if i call it before or after calling train()18:07
@wikingyamz, and what's in shogun_model2.out?18:07
@wikingyamz, :O18:08
yamz*long lines incoming*18:08
yamz<<_SHOGUN_SERIALIZABLE_ASCII_FILE_V_00_>> max_train_time float64 0 solver_type int32 0 labels SGSerializable* MulticlassLabels [ subset_stack SGSerializable* SubsetStack [ active_subset SGSerializable* null [] active_subsets_stack SGSerializable* DynamicObjectArray [ array Vector<SGSerializable*> 0 () resize_granularity int32 128 use_sg_malloc bool t free_array bool t dim1_size int32 1 dim2_size int32 1 dim3_size int32 1 ] ] labels SGVec18:08
@wikingyamz, use18:08
@wikingyamz, pastebin.com18:08
yamzright18:08
@wikingyamz, for big pastes18:08
-!- geektoni [~geektoni@93-34-234-212.ip52.fastwebnet.it] has quit [Remote host closed the connection]18:08
yamzhttps://pastebin.com/3n8qwNGx18:08
@wikingyamz, 980 = datapoints?18:10
yamzyes 980 rows18:10
yamzand here is the source: https://pastebin.com/2wSb4MsV18:10
@wikingok18:11
@wikingthe data is shareable?18:11
yamzsure18:11
@wikingif u could18:12
@wikingthen i could debug right away18:12
@wikingmaybe we have some serious problem18:12
@wikingwith the serialization fw18:12
@wiking(would not be surprised)18:12
@wiking:DDDDDDDDD18:12
yamzhang on. I think I may just need to specify the separator in the csv18:12
@wikingyamz, sure... have u tested the model itself after training?18:13
yamzNot with this data.18:14
@wikinganything is fine actually18:14
yamzTraining data: https://pastebin.com/7xAKzdLx18:15
yamzLabels: https://pastebin.com/nDA3xqR218:15
@wikingk18:17
yamzOK so. i've changed my program to use the shogun supplied data, data/classifier_4class_2d_linear_features_train.dat18:20
yamzand modified my program to dump model before and after training18:21
yamzauto saveFile = some<CSerializableAsciiFile>("/home/myamada/tmp/shogun_model_before_train.out", 'w');18:21
yamzgnb->save_serializable(saveFile);18:21
yamzgnb->train();18:21
yamzauto saveFile2 = some<CSerializableAsciiFile>("/home/myamada/tmp/shogun_model_after_train.out", 'w');18:21
yamzgnb->save_serializable(saveFile2);18:21
@wikingand?18:22
@wiking(btw i'm just debugging18:22
@wiking)18:22
yamzboth files are the same18:22
yamz[myamada@wtl-lbuild-1 tmp]$ diff shogun_model_before_train.out shogun_model_after_train.out [myamada@wtl-lbuild-1 tmp]$18:22
yamz:(18:22
@wikingk18:22
@wikingyep18:24
@wikingi can see that the model has some info18:25
yamzseems to have the label info only18:25
@wikingyes18:26
@wikingi have to debug a bit18:26
yamzok. i appreciate the help very much18:26
micmnHeikoS, OXPHOS, wiking: sent an email with my thoughts on the current state of linalg :)18:27
@wikingmicmn, got it thnx18:27
@wikingyamz, that's alright18:27
@wikingyamz, ok i see what's the problem18:36
-!- OXPHOS [92bd15c8@gateway/web/freenode/ip.146.189.21.200] has quit [Ping timeout: 260 seconds]18:37
yamzgreat18:37
yamzdetails?18:38
-!- OXPHOS [92bd305b@gateway/web/freenode/ip.146.189.48.91] has joined #shogun18:41
OXPHOSmicmn: thx!18:42
@wikingjust trying to fix it18:42
@wikingyamz, will push the fix soon after i tested it18:42
@wiking:P18:42
@wikingHeikoS, we need to do something about serialization18:42
@wiking:D18:42
yamzawesome18:42
@wikingwe fail biiiiiiiiiiiiiig time18:42
@wiking:>18:42
@HeikoSwiking: yep18:42
@HeikoSwiking: thats why i initiated the unit test thing in the first place18:42
@HeikoSbecause I wanted to know how many models dont work actually18:42
@wikingNaiveBayes has 0 params registered :D18:43
@HeikoSwiking: yep18:46
@HeikoSit was written before that was possible18:46
@HeikoSI made a few model serializable as part of my pre GSoC contributions in 2011 :D18:46
@HeikoSwiking: so first step here is to get the unit test running for that, so that as much as possible, we automate detecting such problems18:47
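Why an unregistered member makes the saved file identical before and after training can be illustrated with a toy registry. This is a deliberately simplified model and not Shogun's parameter framework: ToyModel, add_param and serialize are hypothetical names, and only double members are supported.

```cpp
#include <map>
#include <sstream>
#include <string>

// Toy model, not Shogun's actual classes: only registered members are
// written out by serialize().
class ToyModel
{
public:
    explicit ToyModel(bool register_trained_state)
    {
        add_param("max_train_time", &m_max_train_time);
        if (register_trained_state)
            add_param("mean", &m_mean);
    }

    // The registry stores pointers to members, so copying is disabled.
    ToyModel(const ToyModel&) = delete;
    ToyModel& operator=(const ToyModel&) = delete;

    void train(double a, double b) { m_mean = (a + b) / 2.0; }

    std::string serialize() const
    {
        std::ostringstream out;
        for (const auto& entry : m_params)
            out << entry.first << ' ' << *entry.second << '\n';
        return out.str();
    }

private:
    void add_param(const std::string& name, const double* value)
    {
        m_params[name] = value;
    }

    double m_max_train_time = 0.0;
    double m_mean = 0.0;
    std::map<std::string, const double*> m_params;
};
```

A test that compares the serialized output before and after train() would catch a model that forgets to register its learned state, which is the kind of check the serialization unit tests are meant to automate.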
@wikingHeikoS, micmn should finish :)18:52
@wikingmiju18:52
@wikingyamz, ok fix is coming18:52
micmn:)18:53
@HeikoSmicmn: hi18:53
@HeikoSsaw my email?18:53
micmnfrom what I remember it was more or less working18:53
micmnyep18:53
@HeikoSmicmn: I guess the main thing is to get this serialization testing working for *all* models in shogun18:54
micmnok, I'll look into that18:55
micmnis there a list of *all* models? :D18:56
@wiking:>18:57
@wikingHeikoS, you should run ./bin/shogun-unit-test18:57
@wikingto see the progress heaven18:57
@wiking;)18:57
@HeikoShaha18:57
micmnif I remember correctly i was working on CMachine18:57
@HeikoSmicmn: yeah linear machine18:58
@HeikoSwiking: was curious what happens in ipython :D18:58
@HeikoSmicmn: we would want all classes of: linear machine, svm, kernel machine, gaussian process, multiclass, preprocessing, something like that order18:58
@wikingHeikoS, didn't u say that you wanna ditch python?18:58
@sukeyNew Commit "Fix serialization of GaussianNaiveBayes" to shogun-toolbox/shogun by vigsterkr: https://github.com/shogun-toolbox/shogun/commit/25c71a509cb6746d2b9085fbc1e4190cacbb170b18:58
@wikingyamz, ^ that's your fix18:59
@HeikoSwiking: ? ditch?18:59
@HeikoSwiking: haha18:59
@wikingHeikoS, one day you've joined that you never wanna touch python again18:59
@wiking:)18:59
@HeikoSwould love to18:59
@HeikoSwiking: yeah, just had another nightmare about (portable) exclusive file access18:59
@HeikoSand used a "library"  for that19:00
@wiking:>19:00
@HeikoSwhich messed up our NFS :D19:00
@wikingnice one19:00
@wikingHeikoS, ok so we have another candidate19:00
@wikingwho needs help19:00
shogun-buildbotbuild #279 of trusty - libshogun - viennacl is complete: Failure [failed test]  Build details are at http://buildbot.shogun-toolbox.org/builders/trusty%20-%20libshogun%20-%20viennacl/builds/279  blamelist: Viktor Gal <viktor.gal@maeth.com>19:00
@wikingeeeeeeeeeeeee19:00
@wikingwtf?19:00
@wikingSERIALIZATION error19:01
@wikinglloooooooooool19:01
@HeikoSwiking: these tests serialize everything19:01
@HeikoSall classes19:01
@wikingyeah19:01
@wikingbut seems there's every now and then an error19:01
@wiking:S19:01
@HeikoSif you register a parameter and they fail19:01
@wikingneed to fucking fix this19:01
@wiking:(19:01
@wikingnono unrelated19:01
@HeikoSthis usually means that the parameter wasnt initialised in default constructor19:01
@wikingSerializationAscii.SpectrumMismatchRBFKernel19:01
@HeikoSyeah?19:02
@HeikoSah ok19:02
@wikingi haven't even touched that19:02
@HeikoShaha19:02
@HeikoSnice19:02
@wikingHeikoS, so i was saying19:02
@wikingCCommWordStringKernel19:02
@wikingonce done19:03
@wikingone would like to get access to the weights19:03
@wikingnow the api for that atim is19:03
@wiking*atm19:03
@wikingvoid get_dictionary(int32_t& dsize, float64_t*& dweights)19:03
@wiking{19:03
@wikingdsize=dictionary_size;19:03
@wikingdweights = dictionary_weights;19:03
@wiking}19:03
@wikingthis will just not work in SWIG interface19:03
@wikingshould we switch it to19:04
@wikingSGVector<float64_t> get_dictionary() const;19:04
@wiking?19:04
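The two API shapes being compared can be sketched side by side. ToyKernel is a hypothetical class, not CCommWordStringKernel, and std::vector<double> stands in for SGVector<float64_t>; the point is only that a single returned value carries its own length, whereas reference out-parameters for a size and a raw pointer are harder for generated language bindings to map.

```cpp
#include <utility>
#include <vector>

// Toy kernel, not the real CCommWordStringKernel; std::vector<double>
// stands in for SGVector<float64_t>.
class ToyKernel
{
public:
    explicit ToyKernel(std::vector<double> weights)
        : m_weights(std::move(weights))
    {
    }

    // Old style: two out-parameters (size and raw pointer).
    void get_dictionary(int& dsize, const double*& dweights) const
    {
        dsize = static_cast<int>(m_weights.size());
        dweights = m_weights.data();
    }

    // Proposed style: return one value that knows its own length.
    std::vector<double> get_dictionary() const { return m_weights; }

private:
    std::vector<double> m_weights;
};
```

In this sketch the value-returning overload copies the weights; whether the real accessor copies or shares the underlying buffer is not stated in the chat.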
yamzThank you @wiking!19:09
@wikingyamz, no worries19:10
-!- HeikoS [~heiko@untrust-out.swc.ucl.ac.uk] has quit [Ping timeout: 240 seconds]19:21
@sukeyNew Commit "Expose dictionary of CommWordStringKernel to the modular interfaces" to shogun-toolbox/shogun by vigsterkr: https://github.com/shogun-toolbox/shogun/commit/c65c2c54362c1964d79df67a651e71f55fefd2be19:23
@wikingjohklu, so this commit (https://github.com/shogun-toolbox/shogun/commit/c65c2c54362c1964d79df67a651e71f55fefd2be) should fix your problem of getting the dictionaries via the API in R19:23
@wikingjohklu, once you have your svm trained you should be able to call: weights <- kernel$get_dictionary()19:24
shogun-buildbotbuild #280 of trusty - libshogun - viennacl is complete: Success [build successful]  Build details are at http://buildbot.shogun-toolbox.org/builders/trusty%20-%20libshogun%20-%20viennacl/builds/28019:24
@wikinglet's see what does buildbot say19:25
@wikingsukey flip19:25
@wiking:)19:25
@sukey┬─┬ ノ( ゜-゜ノ)19:25
@wikingjohklu, of course this means that you should use the latest shogun from github19:27
@wikingto be able to have this in your R interface19:27
-!- johklu [c1abba08@gateway/web/freenode/ip.193.171.186.8] has quit [Ping timeout: 260 seconds]20:41
-!- mikeling [uid89706@gateway/web/irccloud.com/x-vejvlfbugbsyqyxg] has quit [Quit: Connection closed for inactivity]20:59
-!- HeikoS [~heiko@host-92-0-178-129.as43234.net] has joined #shogun21:59
-!- mode/#shogun [+o HeikoS] by ChanServ21:59
--- Log closed Thu Jun 15 00:00:14 2017
