IRC logs of #shogun for Friday, 2019-01-18

--- Log opened Fri Jan 18 00:00:32 2019
-!- Netsplit *.net <-> *.split quits: shogun-buildbot03:00
-!- Netsplit over, joins: shogun-buildbot03:07
-!- shubham808 [~atom@14.139.240.247] has joined #shogun08:29
-!- gf712 [90520892@gateway/web/freenode/ip.144.82.8.146] has joined #shogun09:42
-!- gf712 [90520892@gateway/web/freenode/ip.144.82.8.146] has quit [Ping timeout: 256 seconds]09:47
-!- Lefteris [836fb90d@gateway/web/freenode/ip.131.111.185.13] has quit [Ping timeout: 256 seconds]09:47
-!- gf712 [90520892@gateway/web/freenode/ip.144.82.8.146] has joined #shogun09:52
-!- shubham808 [~atom@14.139.240.247] has quit [Ping timeout: 245 seconds]11:36
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun12:04
-!- mode/#shogun [+o wiking] by ChanServ12:04
-!- gf712 [90520892@gateway/web/freenode/ip.144.82.8.146] has quit [Ping timeout: 256 seconds]12:06
-!- gf712 [8028b333@gateway/web/freenode/ip.128.40.179.51] has joined #shogun12:42
-!- gf712 [8028b333@gateway/web/freenode/ip.128.40.179.51] has quit [Quit: Page closed]13:05
-!- gf712 [8028b333@gateway/web/freenode/ip.128.40.179.51] has joined #shogun14:24
-!- HeikoS [5aae0436@gateway/web/cgi-irc/kiwiirc.com/ip.90.174.4.54] has joined #shogun15:47
-!- mode/#shogun [+o HeikoS] by ChanServ15:48
@HeikoSgf712 yoyo15:48
@HeikoSwiking yo15:48
@wikingho15:48
@HeikoSwiking wanted to ping you on the website report thingi15:49
@wikingi know15:49
@wikinghad to jump in eth to some stuff15:49
@HeikoSokok15:50
gf712Hi!15:51
@HeikoSgf712 hi!15:52
gf712I'll work on the PR for the Python **kwargs15:52
@HeikoScool patch, this will make the python examples finally use all your stuff15:52
gf712Yup, and it works!15:52
gf712But yes I am not sure what caused the issue with KMeans15:52
@HeikoSyep the one I merged already used it in the legacy examples15:53
@HeikoSand notebooks (though I didnt run those)15:53
@HeikoSyeah the kmeans is weird15:53
gf712I tried out in my machine and I had the same issue15:53
gf712I didn't look too much into it because it works fine with the new API15:54
@HeikoScan you check the example python code that is generated15:54
@HeikoSbefore and after?15:54
@HeikoSbecause it seems like the .put("distance", d) is missing15:54
@HeikoSwiking you had a look at this c++ committee doc?15:55
@wikingHeikoS: yep, today i'll add stuff15:55
@wikingit's a good initiative15:55
@wikingdunno how long it'll take15:55
@HeikoSc++ 31? :D15:55
@wikingbut basically they are solving all the things15:55
@wikingthat we are doing15:55
@wiking:D15:55
@wikinglinalg etc etc15:55
@wiking:)15:55
@HeikoSthat is good15:57
@HeikoSautodiff?15:57
@wiking:>15:58
@wikingit has been just started15:58
@HeikoSwould be cool to have linear algebra part of c++, at least the basic stuff15:59
@HeikoSand then also build those graphs15:59
@wikingthere's a proposal for graph as well16:02
@wikingcheck the google group16:02
@wikingit's active and interesting16:02
@HeikoScool will do16:04
@HeikoSgf712 are the rhs/lhs of the distance object also null in what you posted?16:05
gf712no, distance populated rhs and lhs16:07
gf712EuclideanDistance(disable_sqrt=false,lhs=DenseFeatures(...),m_lhs_squared_norms=Vector<double>(0): [],m_rhs_squared_norms=Vector<double>(0): [],rhs=DenseFeatures(...))16:07
@HeikoSmmh16:08
@HeikoSI wonder whether "put" is called16:08
@HeikoScan you python debug this?16:08
@HeikoSand see inside monkey patch code what happens?16:08
gf712yup, I'll have a better look16:11
gf712but put is not called with sg.KMeans is it?16:11
gf712it seems like m_rhs_squared_norms and m_lhs_squared_norms aren't initialised when using the distance factory16:14
gf712could that be the issue?16:14
@HeikoSthe default ctor of distance doesnt do that?16:20
@HeikoSgf712 ?16:20
gf712doesn't seem like it16:20
gf712but that isn't the issue16:21
gf712kmeans=sg.KMeans(k=2, distance=d)16:21
gf712doesn't work16:21
gf712but then kmeans=sg.KMeans(2, d) works...16:21
@HeikoSok, the thing you wrote above needs a cleanup. all properties need to be initialized by the default ctors16:21
@HeikoSanyways16:21
@HeikoSwe need to look at what happens in the python code you wrote16:22
@HeikoSthat calls all the put16:22
@HeikoSbecause I think16:22
@HeikoSwith the way things work now16:22
@HeikoSsg.KMeans accepts kwargs16:22
@HeikoSbut just simply ignores them16:22
@HeikoSsince it is never replaced16:22
@HeikoSah yeah that must be it16:23
@HeikoSyou see what I mean?16:23
@HeikoSsg.KMeans just happens to accept kwargs without throwing an error16:23
gf712Ah ok!16:23
gf712so it caused these issues when I changed python.json16:23
@HeikoSyes, pretty subtle16:24
gf712But then everything is ok if we use the factory instead of KMeans?16:24
@HeikoSit is bad that sg.KMeans doesnt moan upon receiving wrong arguments16:24
@HeikoSyes it is16:24
@HeikoSbut changing the behaviour might lead to undefined behaviour of the examples16:25
@HeikoShere we were lucky16:25
@HeikoSthat kmeans asserts for a distance16:25
@HeikoSsome algorithms dont16:25
gf712hmm, how do the other algorithms handle kwargs in python?16:25
@HeikoSidk16:25
@HeikoSusually all ctors throw error16:25
@HeikoStry calling sg.KMeans with some random arguments16:26
@HeikoS(Pdb) sg.KMeans(10, 10, 10) *** NotImplementedError: Wrong number or type of arguments for overloaded function 'new_KMeans'.  Possible C/C++ prototypes are:    shogun::CKMeans::CKMeans()    shogun::CKMeans::CKMeans(int32_t,shogun::CDistance *,bool)    shogun::CKMeans::CKMeans(int32_t,shogun::CDistance *)    shogun::CKMeans::CKMeans(int32_t,shogun::CDistance *,shogun::SGMatrix< float64_t >)16:26
@HeikoSgf712 stuff like that16:26
@HeikoSbut in fact16:26
@HeikoS(Pdb) sg.KMeans(foo=1, bar=2)KMeans(cluster_centers={function},data_locked=false,dimensions=0,distance=null,k=3,labels=null,max_iter=10000,max_train_time=0,radiuses=Vector<double>(0): [],solver_type=0,store_model_features=true)16:26
@HeikoSI never noticed that16:26
@HeikoSit is bad16:26
gf712yup, that's what was happening with the example16:27
@HeikoSwell ok, in your case, changing to the factory solves it16:27
gf712but if we use put we don't have these issues16:27
gf712since it raises an error if the arg doesn't exist16:27
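A minimal sketch of the difference discussed above, in plain Python rather than Shogun's generated bindings. `LenientModel`, `StrictModel` and the parameter names are hypothetical stand-ins: the first mimics a constructor that accepts keyword arguments and silently drops them, the second routes every keyword through a `put`-style setter that rejects unknown names.

```python
class LenientModel:
    """Hypothetical constructor that accepts **kwargs but never applies them."""

    def __init__(self, *args, **kwargs):
        # Defaults only; whatever was passed as a keyword is silently ignored.
        self.params = {"k": 3, "distance": None}


class StrictModel:
    """Hypothetical constructor that forwards every keyword to put()."""

    def __init__(self, **kwargs):
        self.params = {"k": 3, "distance": None}
        for name, value in kwargs.items():
            self.put(name, value)

    def put(self, name, value):
        # Unknown parameter names fail loudly instead of being dropped.
        if name not in self.params:
            raise KeyError(f"unknown parameter: {name!r}")
        self.params[name] = value
        return self


lenient = LenientModel(k=2, distance="euclidean")
strict = StrictModel(k=2, distance="euclidean")
print(lenient.params)  # defaults are unchanged, the keywords were lost
print(strict.params)   # the keywords were applied
```

With the strict variant, a misspelled or unsupported keyword such as `StrictModel(foo=1)` raises immediately, which is the behaviour the channel would like from the real constructors.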
@HeikoSbut we do have places in the examples where explicit ctors are used16:28
@HeikoSwe would need to change all of those to factories first, or not use kwargs16:28
gf712but the plan is to use factories right?16:28
@HeikoSso explicit ctors AND kwargs is a nogo in meta examples16:28
@HeikoSyes it is16:28
@HeikoSjust talking about what there is16:28
@HeikoSbefore we merge your PR, we need to briefly look at all examples16:29
gf712ok, but when we check against the cpp results we would get an error right?16:29
@HeikoSmeta examples that is16:29
gf712because the results would be different16:29
@HeikoSsince only those use the kwargs stuff16:29
@HeikoSlegacy dont16:29
@HeikoSah good point16:29
@HeikoSwell16:29
@HeikoSnot all examples have a test file16:29
gf712true, another reason to move all to meta16:29
@HeikoSI suggest: let's merge and solve this in another PR16:30
@HeikoSlike: quickly look at them all and fix the ones that need16:30
@HeikoSmaybe there are none16:30
gf712yup, it's on the todo list for 7.0.0 right? :p16:30
gf712as in change all to the new API16:31
@HeikoSyep16:31
gf712OK, I'll just make some minor changes for the documentation in case we all get hit by a bus...16:31
@HeikoShehe16:31
@HeikoSyeah and this double string name thing would be good to get rid of16:32
@HeikoSotherwise let's go :)16:32
gf712btw, did you see my "did you mean..." implementation?16:32
@HeikoSlink?16:35
gf712https://github.com/shogun-toolbox/shogun/pull/447316:35
gf712inspired by my lack of ability to spell perceptron16:36
@HeikoSah!16:37
@HeikoShaha :D16:37
@HeikoSchecking16:37
@HeikoSunderstand16:38
@HeikoSgood idea in general16:38
@HeikoSbut doing it in python I am not so sure16:38
@HeikoSan autocomplete would be something done in python16:38
@HeikoSbut this string parsing stuff .... maybe rather import a lib in c++ and do it via that?16:39
gf712yup, was more of a proof of concept16:39
@HeikoSthis would be implemented in swig c++16:39
@HeikoSso not within libshogun16:39
gf712Just didn't have levenshtein implementation in C++ at hand16:39
@HeikoSbut in the swig extension16:39
@HeikoSI think this is a really good idea16:40
@HeikoSwe can even have this for put/get16:40
@HeikoSSo I like the idea16:40
gf712Yup, since we can get the param names16:40
@HeikoSin Python, I would love to see some tab autocomplete16:40
@HeikoSbut I think we shouldn't do this correcting just for the python api16:40
@HeikoStoo much pollution for the effect (imo)16:41
gf712not sure how that works.. would have to dig in to ipython16:41
@HeikoSlisitsyn was talking about this16:41
@HeikoSso I suggest we leave it to him16:41
@HeikoSbut in fact16:41
@HeikoSthis thing you did in python there16:41
@HeikoSwould be a very neat entrance task for GSoC16:41
@HeikoSfor a student to do this in shogun.i16:41
gf712OK, I can leave the PR open as a template16:42
@HeikoSyes, would you mind writing an issue with this?16:42
@HeikoSIll edit it and tag it16:42
gf712Yup, I can do that16:42
@HeikoSAnd say that it is related to the "user experience" GSoC project16:43
@HeikoS(no link needed as those do change)16:43
wuwei[m]HeikoS: hey, I just added some roadmap16:43
@HeikoSwuwei[m] cool will check16:44
@HeikoSnice, yeah this will make it easier for this PR to evolve16:45
@HeikoSwuwei[m] so speaking of GSoC16:45
@HeikoSwhat would you like to see being done in a project by a student?16:45
wuwei[m]I hope the student to help with the ongoing refactor, and better xval (continue the last year's story)16:47
wuwei[m]distance/kernel api16:48
wuwei[m]these might be a bit difficult, but important to the machine refactor16:48
@HeikoScool, so let's make a list maybe?16:49
@HeikoSxvalidation16:49
@HeikoSdistance/kernel16:49
@HeikoSwhat else?16:50
wuwei[m]and maybe string features, but this one can have lower priority16:50
wuwei[m]adding EmbedStringFeatures, and removing SGString16:51
@HeikoSyeah both good points16:51
@HeikoSmaybe some of the substitution errors is not a failure pattern stuff? you saw the recent PRs on this right?16:52
@HeikoSwuwei[m] how do we make the project sound more interesting than simply "refactoring"? :D16:52
wuwei[m]yeah I saw that pr16:52
wuwei[m]does "modernizing" sound better :)16:53
@HeikoSyes that is already a bit better16:54
@HeikoSwe can maybe think a bit why you were interested last year?16:54
@HeikoSand then try to use this also for this year's project16:54
wuwei[m]something like "hacking into a huge codebase"16:55
wuwei[m]that's why I'm interested16:55
@HeikoSAlso, could you remove/prune the things that have been done on the wiki page: https://github.com/shogun-toolbox/shogun/wiki/GSoC_2019_project_detox16:57
@HeikoSI will add something in those hacking a huge codebase lines16:57
wuwei[m]sure will update it16:57
@HeikoSoh wait i pasted wrong project16:57
@HeikoS1 sec16:57
@HeikoSok done16:58
@HeikoSas a first step: remove all that has been solved16:59
@HeikoSnext step. add bullet points of what we want to do16:59
@HeikoSnext step: add links to all open PRs and other resources that we can show so that people have an idea of the work that lies ahead16:59
@HeikoSI will then write some text etc16:59
@HeikoSlisitsyn yo!17:05
wuwei[m]re immutable features, last year I dropped a few non-const methods, the rest can be done after we use feature iterators17:06
@HeikoSI guess we need to package and structure this somehow17:10
@HeikoSso feel free to heavily edit things17:10
@HeikoSput bullet points17:10
@HeikoSand I will turn them into some story later17:10
-!- wiking [~wiking@huwico/staff/wiking] has quit [Remote host closed the connection]17:24
-!- Lefteris [836fb90d@gateway/web/freenode/ip.131.111.185.13] has joined #shogun17:25
@HeikoSLefteris yo!17:25
Lefterishello17:25
LefterisI thought I was in the channel but it was just stuck17:25
@HeikoSlol no worries17:25
@HeikoSso17:26
@HeikoSwe now have this single get method gil added17:26
@HeikoSI put a link to the commit in your PR17:26
Lefterisok17:26
@HeikoSthis tries all the options get_real, get_real_vector, get_int ... etc17:26
@HeikoSit doesnt try one for string list17:26
@HeikoSand actually, that method doesnt even exist17:26
@HeikoS(since we never exposed string lists to the new API yet)17:26
@HeikoSso we need to add this to the swig interfaces and this single get method17:27
@HeikoSso need to map17:27
@HeikoSSGObject::get<SGStringList<bool>> to get_bool_string_list17:27
@HeikoSetc17:27
LefterisI see17:28
@HeikoSSGObject::get<SGStringList<char>> to get_char_string_list17:28
@HeikoSI suggest we start with char and see how it goes?17:28
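A rough Python sketch of the dispatch idea described above: one generic getter that tries each typed getter in turn, with a string-list getter added to the chain. Only the names `get_int`, `get_real` and `get_char_string_list` come from the conversation; `FakeObject`, `generic_get` and the use of `TypeError` to signal a mismatch are assumptions made for illustration, and the real logic lives in the SWIG interface files.

```python
class FakeObject:
    """Hypothetical object with typed getters that raise on a type mismatch."""

    def __init__(self, params):
        self._params = params

    def _typed(self, name, expected):
        value = self._params[name]
        if type(value) is not expected:
            raise TypeError(f"{name!r} is not a {expected.__name__}")
        return value

    def get_int(self, name):
        return self._typed(name, int)

    def get_real(self, name):
        return self._typed(name, float)

    def get_char_string_list(self, name):
        value = self._params[name]
        if not (isinstance(value, list) and all(isinstance(s, str) for s in value)):
            raise TypeError(f"{name!r} is not a list of strings")
        return value


# Order in which the typed getters are attempted.
GETTERS = ("get_int", "get_real", "get_char_string_list")


def generic_get(obj, name):
    """Return the parameter via the first typed getter that accepts it."""
    for getter_name in GETTERS:
        try:
            return getattr(obj, getter_name)(name)
        except TypeError:
            continue
    raise ValueError(f"no typed getter could return {name!r}")


obj = FakeObject({"k": 3, "width": 1.5, "words": ["ACGT", "TTGA"]})
print(generic_get(obj, "k"), generic_get(obj, "width"), generic_get(obj, "words"))
```

Without the `get_char_string_list` entry in the chain, the string-list parameter would fall through every getter and end in the final error, which matches the gap described in the conversation.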
Lefterisgreat thank you17:28
Lefterisyes. there is an issue with the java version too but I will solve the python first17:28
@HeikoSyeah that is usually the best approach17:29
@HeikoSthe shogun.i file contains a lot of magic between c++ and the interfaces17:29
@HeikoSin particular translation of templated methods to python names etc17:30
Lefterisok17:30
@HeikoSLefteris it should be relatively quick to add this method17:34
@HeikoSIIRC all swig interfaces support string lists as a type17:35
Lefterisyes. I am testing now17:35
@HeikoSIt would be great to have this exposed to the API actually, major step forward in porting the string examples/applications17:35
Lefterisyes, many of the examples use it.17:35
gf712Posted the issue in https://github.com/shogun-toolbox/shogun/issues/447517:36
Lefterisa lot of yak shaving for this one17:36
@HeikoSLefteris big codebase for you :D17:37
Lefterisprobably17:37
@HeikoSgf712 btw you think we could clean this thing up a bit, rather than doing all this copy-pasting: https://github.com/shogun-toolbox/shogun/blob/develop/src/interfaces/swig/shogun.i#L23317:37
gf712yup, can use sg._rename_python_function17:38
gf712I can put it all in the python typemaps.i17:38
gf712will be easier to maintain17:38
@HeikoSgood idea17:39
Lefteristhe interfaces confused me a bit but now I am starting to understand it. It is a good idea to implement when developing scientific software17:39
gf712good thing _rename_python_function works for both modules and classes :D17:39
@HeikoSLefteris yeah the interfaces came a long way17:40
@HeikoSgf712 when I was a boy I always dreamed of writing code that modifies itself :D17:40
gf712haha it works until at some point you hit some recursion and everything crashes17:41
@HeikoSLefteris the idea now kinda is that we (the shogun devs) do all the annoying work so that researchers can easily add algorithms17:41
gf712btw shouldn't levenshtein be implemented in shogun (at some point)?17:41
@HeikoSgf712 my inspiration came from "Short Circuit"17:42
LefterisHeikoS: I agree!!17:42
@HeikoSgf712 maybe, but only if it would serve some other purpose as well I guess17:42
gf712It just tends to be more useful than hamming, which has been implemented17:43
@HeikoSgf712 it would be overkill to instantiate CDistance using string features and then do the error msg passing using that or?17:43
@HeikoSoh I see17:43
@HeikoSmaybe then it should be done in a similar way17:43
@HeikoSkinda funny if shogun used itself to generate error msgs17:43
gf712maybe, but can return CStringFeatures instead of std::set<std::string> to get all the shogun objects17:44
gf712that would be halfway there17:44
gf712and it would be more optimised using shogun operations (hopefully)17:45
gf712so it might almost be worth it :D17:45
@HeikoSwe can have a ctor for string features from std17:45
@HeikoSwhere is hamming distance implemented?17:45
gf712and yes, generating its own error messages would almost be an application17:45
gf712in distances17:46
@HeikoSah yes17:46
@HeikoSCHammingWordDistance17:46
@HeikoSsigh17:46
@HeikoSclass CHammingWordDistance: public CStringDistance<uint16_t>17:46
@HeikoSI hate those17:46
gf712https://github.com/shogun-toolbox/shogun/blob/develop/src/shogun/distance/HammingWordDistance.h17:46
gf712yup17:46
@HeikoSfixing template parameters when inheriting17:46
@HeikoSand the algorithm doesnt even make use of that in any way17:47
@HeikoSso completely pointless just creating problems ...17:47
gf712haha17:47
gf712I thought about working from that for lv, but yea I lost motivation.. :D17:48
@HeikoSfor lv?17:48
gf712levenshtein17:48
gf712too complicated to write..17:49
@HeikoSah I see17:49
@HeikoSI am sure there is some c++ out there17:49
@HeikoSwell entrance task it17:49
gf712yup, I linked it on the issue17:49
@HeikoSsometimes there are quite clever people coming along solving those things if these are documented clearly17:49
gf712but I am not sure if it is the dynamic programming solution17:50
gf712which should be much quicker17:50
@HeikoSfor error msg that shouldnt matter17:50
gf712like the one I wrote in Python17:50
@HeikoSfor the ml it does17:50
gf712Yup, exactly17:50
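For reference, a short sketch of the dynamic programming approach to Levenshtein distance mentioned above, together with a "did you mean" lookup built on it. This is not the code from the linked PR; `levenshtein`, `did_you_mean`, the `max_distance` cutoff and the candidate names are illustrative assumptions.

```python
def levenshtein(a, b):
    """Edit distance via dynamic programming, keeping only two rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string along the row to save memory
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def did_you_mean(name, candidates, max_distance=3):
    """Return the closest candidate, or None if nothing is close enough."""
    if not candidates:
        return None
    best = min(candidates, key=lambda candidate: levenshtein(name, candidate))
    return best if levenshtein(name, best) <= max_distance else None


known = ["Perceptron", "KMeans", "LibSVM"]
print(did_you_mean("Percepton", known))
```

The table has one cell per pair of prefixes, so the cost is proportional to the product of the two string lengths, which is negligible for error messages over a list of class or parameter names.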
@HeikoSthere we go17:51
@HeikoShttp://www.jmlr.org/papers/volume17/rieck16a/rieck16a.pdf17:51
@HeikoSthe authors are also shogun committers17:52
@HeikoShttp://www.mlsec.org/harry/17:52
@HeikoSthey also have another tool called "sally" which is complementary17:53
@HeikoSlots of good trivia17:53
gf712The implementation is easy, just need to decide which data structure to use17:53
@HeikoSbtw any updates on the openml glueing script?17:53
@HeikoSI am really curious how that will go17:53
gf712I haven't started on that yet17:54
gf712Was focusing on getting stuff ready for 7.0.017:54
gf712but I can give it a go this weekend/next week17:54
gf712need this to go through first, I think https://github.com/openml/openml-python/pull/54317:55
@HeikoSLefteris any luck with the swig?18:05
Lefterisyes, it is working18:05
@HeikoScool18:05
LefterisI will correct the other mistakes I have and try to make java work18:05
@HeikoSgf712 that is better actually the release is pressing :)18:05
@HeikoSgf712 as said you can just start doing a simple python script18:06
@HeikoSwhere you collect all the calls they do in there18:06
@HeikoSdont worry about the java for now Lefteris18:06
@HeikoSit is ok if the CI fails18:06
Lefterisok18:07
@HeikoSwe can fix that in a second step18:07
@HeikoSonce it works locally send it18:07
@HeikoSand then we can iterate18:07
gf712OK, I can write something to get started18:07
@HeikoSalso the CI tests all the interfaces for you faster than your own machine18:07
@HeikoSgf712 things like description api etc18:07
gf712assuming viktors PR is merged18:07
@HeikoSgf712 sorry that previous was for Lefteris18:10
@HeikoSgf712 I think you can get started quite a bit before viktors PR is merged as we already can see things that are needed from the shogun side18:10
gf712Oh, I just meant using the changes that he is proposing18:10
gf712but yes, I 'll see what is required18:11
gf712and get on it18:11
@HeikoScool18:11
@HeikoSI will write some gsoc projects this coming week18:11
@HeikoSoh and btw we might end up getting giovannit to support both of you gf712 and Lefteris18:12
@HeikoSfrom another angle on the same project18:12
@HeikoSlet's see how that goes18:12
@HeikoSOk, I'm off for now18:12
@HeikoSsee you guys later18:12
Lefterisbye, thanks!!18:13
gf712ok!18:15
gf712see you later18:15
-!- HeikoS [5aae0436@gateway/web/cgi-irc/kiwiirc.com/ip.90.174.4.54] has quit [Ping timeout: 246 seconds]18:22
-!- Lefteris [836fb90d@gateway/web/freenode/ip.131.111.185.13] has left #shogun []18:25
-!- witness [uid10044@gateway/web/irccloud.com/x-dxmrcayopwhxspcd] has joined #shogun18:39
-!- witness [uid10044@gateway/web/irccloud.com/x-dxmrcayopwhxspcd] has quit [Quit: Connection closed for inactivity]20:48
--- Log closed Sat Jan 19 00:00:34 2019
