IRC logs of #shogun for Wednesday, 2013-11-20

--- Log opened Wed Nov 20 00:00:32 2013
-!- zxtx [~zv@129-79-241-148.dhcp-bl.indiana.edu] has quit [Ping timeout: 248 seconds]04:13
-!- zxtx [~zv@c-98-193-83-24.hsd1.il.comcast.net] has joined #shogun04:50
-!- sonne|osx [~sonne@f053043202.adsl.alicedsl.de] has joined #shogun08:15
-!- benibadman [~benibadma@94.135.236.129] has joined #shogun08:42
-!- benibadman [~benibadma@94.135.236.129] has quit []08:59
-!- benibadman [~benibadma@94.135.236.129] has joined #shogun09:03
-!- benibadman [~benibadma@94.135.236.129] has quit [Read error: Connection reset by peer]09:08
-!- benibadman [~benibadma@94.135.236.129] has joined #shogun09:09
-!- benibadman [~benibadma@94.135.236.129] has quit [Read error: Connection reset by peer]09:10
-!- benibadman [~benibadma@94.135.236.129] has joined #shogun09:10
-!- sonne|osx [~sonne@f053043202.adsl.alicedsl.de] has quit [Quit: sonne|osx]11:03
-!- sonne|osx [~sonne@f053043202.adsl.alicedsl.de] has joined #shogun11:03
sonne|osxwiking: lisitsyn1 - guys I am shocked11:07
sonne|osxben taskar died!11:08
-!- lisitsyn [~lisitsin@mxs.kg.ru] has joined #shogun11:09
sonne|osxlisitsyn?11:09
lisitsynsonne|osx: ich11:09
sonne|osxyou what is with your *1 ?11:09
-!- Saurabh7 [~Saurabh7@115.248.130.148] has quit [Read error: Connection reset by peer]11:09
lisitsynsonne|osx: oh I keep forgetting to turn off irc at other machine11:09
sonne|osxlisitsyn: I am shocked - Ben Taskar died!11:10
lisitsynsonne|osx: yes I told you yesterday11:10
lisitsyndid you know him?11:10
sonne|osxlisitsyn: didn't notice that you told me ...11:10
sonne|osxyes of course11:10
sonne|osxhe was the structured output learning god11:10
lisitsynsonne|osx: http://www.shogun-toolbox.org/irclogs/%23shogun.2013-11-18.log.html in the end of the log11:11
lisitsynsonne|osx: I don't know any of his works unfortunately11:11
sonne|osxall the SO stuff in shogun is using his formulation11:12
lisitsynoh11:12
lisitsynI see11:12
-!- mode/#shogun [+o lisitsyn] by ChanServ11:13
-!- lisitsyn1 was kicked from #shogun by lisitsyn [lisitsyn1]11:13
* sonne|osx heard gunfire11:13
@lisitsynhaha11:13
sonne|osxhis tutorials at nips or icml about SO were really excellent11:13
sonne|osxlisitsyn: and he was even younger than me o_O11:14
@lisitsynsonne|osx: yeah that's kind of unfair people die that young11:14
sonne|osxin particular when you don't smoke and your bmi is perfect11:15
sonne|osxohh well11:15
@lisitsynsonne|osx: sometimes I think it doesn't matter whether you smoke or drink or whatever11:17
sonne|osxI guess it matters but of course such sudden young deaths are more shocking and so more remembered11:18
@lisitsynso many things affect you and you anyway just reduce chances not eliminate them11:18
@lisitsynsonne|osx: are you back from being sick?11:19
sonne|osxno :(11:20
sonne|osxstill fever11:20
@lisitsynsonne|osx: oh I hope you will recover soon, you are sick for quite a few days already11:20
sonne|osxlisitsyn: I only get the toughest sicknesses nowadays11:21
@lisitsynsonne|osx: is there any reason?11:21
sonne|osxlisitsyn: I am too strong for the weak ones :D11:22
sonne|osxlisitsyn: and actually I got 2 diseases in a row one opening the doors to dr soeren :/11:25
-!- Saurabh7 [~Saurabh7@115.248.130.148] has joined #shogun11:30
sonne|osxbesser82: any updates?11:31
@wikingi saw it yesterday12:51
-!- Saurabh7 [~Saurabh7@115.248.130.148] has quit [Remote host closed the connection]12:56
sonne|osxwiking: do we have some dev meeting now? or when?13:15
@wikinganybodcould13:26
@wikingcould13:26
@lisitsynwiking: ich!13:29
sonne|osxno idea what this means13:37
-!- sonne|osx [~sonne@f053043202.adsl.alicedsl.de] has quit [Quit: sonne|osx]13:39
@wikinghoh13:40
@wikingso13:40
@wikingtalk13:40
@wikinglisitsyn: here?13:40
@wikingsonney2k_: ping13:40
@lisitsynwiking: yeahs13:40
@wikingok sonney2k_ will be back i think in 2 minutes :)13:40
@wikingas i know from his 'quits'13:40
@lisitsynwiking: lets see if you learnt good hypothesis13:41
@wiking:D13:43
-!- iglesiasg [~iglesias@2001:6b0:1:1da0:28a3:b90c:4a85:f150] has joined #shogun13:44
-!- mode/#shogun [+o iglesiasg] by ChanServ13:44
@iglesiasghello hello13:44
@wikinghellooo13:45
@wikingwe are still w8ing for sonney2k_13:45
@iglesiasgook13:46
@iglesiasgaaah wait13:46
@iglesiasgwe are doing the dev meeting now??13:46
@iglesiasgmy very very bad totally missed that it was going to be now, sorry guys13:47
@wikingno worries13:48
@iglesiasgat 15:15 I am attending a thesis presentation, but I think we will be done by that time13:50
@wikinghopefully13:53
@wikingdepending on sonney2k_13:53
@lisitsyniglesiasg: wiking: how could I know that we would have a meeting now? ;)13:53
@lisitsynam I missing some mail?13:54
@wikinghehehe no13:54
@iglesiasgthere was the doodle13:54
@wikingsonney2k_: just decided but he didn't even fill it out13:54
@wiking:)))13:54
@iglesiasghehehe13:54
@wikingand then he left13:54
@wiking:P13:54
@lisitsynhahah13:54
@wikinganyhow there's tons of stuff we can talk now13:54
@wikingand sonney2k_ can read the logs no?13:54
@wiking+ join later13:54
@iglesiasgI am fine either way, waiting a bit longer or starting now13:54
@wikingeither this or wait or postpone the meeting for another day?13:54
@lisitsynwell lets just talk about stuff that matters instead of managing meeting :D13:55
@wikingok13:55
@wikingcool13:55
@wikingso we were already starting a discussion13:55
@wikingand sonney2k_ had an insight of that as well13:55
@wikingabout chunking up libshogun13:56
@wikingas currently it's really ridiculous how big the .so is13:56
@lisitsynhow do we chunk up?13:56
@wikinglisitsyn: he had the good idea of one lib per subdir under shogun13:56
@wikingso13:56
@wikingshogun/io13:56
@lisitsynwe should have been using some plugin architecture13:56
@wikingshogun/features13:56
@wikingshogun/label13:56
@wikingetc13:56
@wikinglisitsyn: shoulda coulda woulda13:56
@lisitsynI don't really agree13:57
@wikingnow we'll do the opposite way13:57
@wikinglisitsyn: well of course we cannot force this arch13:57
@wikingbut we can go along this line13:57
@wikingtrying to keep and then handle the exception13:57
@lisitsynwell I don't think it is a good way13:57
@wikinglibshogun_machine would be a good idea13:57
@lisitsynthere is a bunch of relations between these .so's13:57
@wikingand then after libshogun_multiclass another?13:57
@lisitsynyes13:58
@lisitsynthere is at least shogun base13:58
@wikingand of course there's a huge cross dependency between feature-machine-label13:58
@lisitsynshogun linear machine13:58
@lisitsynshogun kernel machine13:58
@lisitsynshogun multiclass13:58
@wikinglisitsyn: yeah the question is how fine grained do we want to be13:58
@lisitsynshogun instance based (knn and shit)13:58
@wikingi would go with the more the better13:58
@wikingto be something like gstreamer on the end13:59
@wikingeach little fucking module is a separate .so13:59
@wikingand then if it's needed it's loaded into memory space13:59
@wikingbut if not then it's not loaded at all13:59
@wikingthis would be a vertical split i would say13:59
@lisitsynand all 'apply' modules should be bsd13:59
@lisitsyn(I believe)13:59
@wikingbut then i would go on the end (taking this in mind during design) to have a horizontal split13:59
@wikingexactly14:00
@wikingtrain-apply split14:00
@lisitsynand they are AS fast as possible14:00
@wikingindeed14:00
@wikingmultiproc14:00
@wikingmultieverything14:00
@lisitsynopencl whatever14:00
@lisitsynopenvx no idea14:00
@wikingyeah14:00
@lisitsynsome kind of adapters14:00
@wikingin whatever means14:00
@wikingbut again14:00
@wikinghave the gstreamer like pipelining14:00
@lisitsynand one realtime module may be ;)14:00
@wikingthat actually allows piping some modules to GPU14:01
@wikingsome modules to openmax14:01
@wikingand shit like that14:01
@lisitsynwiking: iglesiasg: I have one idea about training part then14:01
@wikingbut to have this option we need to start drawing up an architecture14:01
@lisitsynso actually plenty of days are spent on porting shit together14:02
@lisitsynI think we should develop a few 'adapters'14:02
@lisitsynto allow training in say matlab14:02
@lisitsyne.g.14:02
@lisitsynyou have shitload of matlab code14:02
@lisitsynmade by famous X researcher14:02
@lisitsynif we want to be the most state of the art thing around14:02
@lisitsynwe could develop an interface14:03
@iglesiasgwould an adapter be pretty much the same idea of a static interface?14:03
@lisitsynthat allows you to tie that matlab function14:03
@lisitsynthat trains model14:03
@lisitsyniglesiasg: well I mean some skeleton that keeps static14:03
@lisitsynlike shogun_train_machine(..) function14:03
@lisitsynbut you tie things in that function14:03
@lisitsynto that downloaded code14:04
@lisitsynsee what I mean?14:04
@iglesiasgmore or less14:04
@wikingbut what would u do with the return code?14:04
@wikingi mean how could u use that in shogun after14:04
@wikingor u dont care14:04
@lisitsynwiking: well it runs matlab externally and retrieves learned model14:05
@wikingjust somehow be able to push features+labels into an external code14:05
@wikinglisitsyn: but how do u know how to interpret that model?14:05
@wikingor it's up to the actual implementation14:05
@wiking?14:05
@lisitsynwiking: you write that code14:05
@wikingah ok got it14:05
@lisitsynyou just add one more layer14:05
@wikingyeah yeah i get it now14:05
@lisitsynlike shogun_train(..) and then retrieve the result14:05
@lisitsynthis way we can have basically everything14:05
@lisitsynas 'plugins'14:06
@wikingyep14:06
@lisitsynif the performance is not good enough you rewrite it14:06
@wikingwell yeah this would be then part of the whole new shogun slice-up task14:06
@wikingyou can plug it in14:06
@wikingif it's really good stuff14:06
@wikingu can reimplement it in a more effective code14:06
@wikingor something14:06
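The adapter idea discussed above could be sketched roughly as follows. This is only an illustration under assumptions: `LinearModel`, `ExternalTrainer`, `AdaptedMachine` and `stub_trainer` are hypothetical names, not shogun API, and the stub stands in for code that would really launch MATLAB or Python and parse the learned model.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model type: whatever the external code returns has to be
// mapped onto something the library knows how to apply.
struct LinearModel {
    std::vector<double> weights;
    double bias;
};

typedef std::vector<std::vector<double> > Examples;

// The adapter: any callable that takes features and labels and returns a model.
// A real one would run external code and parse its output (not shown here).
typedef std::function<LinearModel(const Examples&, const std::vector<double>&)>
    ExternalTrainer;

class AdaptedMachine {
public:
    explicit AdaptedMachine(ExternalTrainer trainer)
        : trainer_(std::move(trainer)), model_() {}

    // Training is delegated entirely to the plugged-in external code.
    void train(const Examples& features, const std::vector<double>& labels) {
        model_ = trainer_(features, labels);
    }

    // Applying the model stays native, so it can be made as fast as needed.
    double apply(const std::vector<double>& x) const {
        double score = model_.bias;
        for (std::size_t i = 0; i < x.size() && i < model_.weights.size(); ++i)
            score += model_.weights[i] * x[i];
        return score;
    }

private:
    ExternalTrainer trainer_;
    LinearModel model_;
};

// Stand-in for external code, used only to exercise the adapter.
inline LinearModel stub_trainer(const Examples& features,
                                const std::vector<double>&) {
    LinearModel m;
    m.weights.assign(features.empty() ? 0 : features[0].size(), 1.0);
    m.bias = 0.5;
    return m;
}
```

The point of the split is that only the trainer is pluggable; if an external method proves useful, the callable can later be swapped for a native reimplementation without touching the apply side.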
@lisitsynwiking: another kind of such an adapter is python adapter14:08
@lisitsynand you run python code14:08
@lisitsynwiking: so it is like integration platform14:08
@lisitsynwiking: it is kind of big change14:09
@lisitsynbut I see it is like a good way to beat every library around14:09
@wikingyeap14:09
@lisitsynif scikits has X14:10
@wikingthat would be great to give some kind of a way14:10
@wikingto be able to call any sorts of external code14:10
@wikingyeah exactly14:10
@wikingnot always reimplement everything14:10
@wikingrather just plug in some stuff that's already available elsewhere14:10
@lisitsynI don't know what to do about distributed computing14:10
@lisitsynI don't have clear vision how should it be done14:10
@wikinglisitsyn: it's already on its way14:10
@wikingneeds a little bit more interfacing14:11
@wikingbut at the moment it's getting there14:11
@lisitsynhow?14:11
@wikingwith the computing fw14:11
@wikingas part of one gaussian project14:11
@wikingwe have this14:11
@lisitsynah14:11
@wikingsrc/shogun/lib/computation/14:11
@lisitsynI need to check if it fits modern shit like hadoop and etc14:11
@wikingwell that's it14:11
@wikingit's just an abstract interface14:12
@wikingheiko implemented that using some batch system14:12
@wikingto do parallel stuff14:12
@wikingi think he even shared the repo14:12
@wikingbut i've started to try to use that14:12
@wikingon a hadoop env14:12
@wikingit'll need some change14:12
@wikingthe thing is14:12
@wikingthat we dont have a clear way to have for example views14:12
@wikingon features14:12
@wikingthe current shogun/lib/computation/ fw works in a way14:13
@wikingthat we dont share anything14:13
@wikinga JOB gets all the data in one package14:13
@wikingand that's it14:13
@lisitsynwell there is a lot of sharing14:13
@wikingand we dont have support for like14:13
@wikingok here's a big feature14:13
@wikingfeature matrix14:14
@wikingand i want just the first n element of it14:14
@wikingbut in the meanwhile another node wants the second n element of it14:14
@wikingwe dont support that atm14:14
@wikingjust by copying14:14
@wikingsee what i mean14:14
@wiking?14:14
@lisitsynyeah sure we don't support views14:14
@wikingand for multitask14:14
@wikingeither it is cluster14:14
@wikingor just14:14
@wikingmore cores14:14
@wikingin one machine14:14
@wikingwe need to support that14:14
@wikingviews that are thread safe14:15
@wikingthe substack arch at the moment is not thread safe at all14:15
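A view of the kind described above, where two workers each see a different block of the same feature matrix without copying it, might look like the sketch below. `FeatureMatrix` and `FeatureView` are hypothetical types for illustration, not the shogun classes; the assumption is that the underlying data is never modified once views exist, which is what makes concurrent reads safe.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical column-major matrix: one column of num_features values per example.
struct FeatureMatrix {
    std::size_t num_features;
    std::size_t num_vectors;
    std::vector<double> data;  // size num_features * num_vectors
};

// Read-only window onto a contiguous range of examples. The view shares
// ownership of the matrix, so no feature data is copied.
class FeatureView {
public:
    FeatureView(std::shared_ptr<const FeatureMatrix> matrix,
                std::size_t first, std::size_t count)
        : matrix_(std::move(matrix)), first_(first), count_(count) {
        if (first_ + count_ > matrix_->num_vectors)
            throw std::out_of_range("view exceeds matrix bounds");
    }

    std::size_t size() const { return count_; }

    // Pointer to the i-th example of the view (num_features consecutive values).
    const double* vector(std::size_t i) const {
        if (i >= count_)
            throw std::out_of_range("index outside view");
        return matrix_->data.data() + (first_ + i) * matrix_->num_features;
    }

private:
    std::shared_ptr<const FeatureMatrix> matrix_;
    std::size_t first_;
    std::size_t count_;
};
```

Because each view holds only a shared pointer plus an offset and a length, the same interface would work for several threads on one machine; distributing views across nodes would still need a transport layer, which this sketch does not address.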
@lisitsynyeah and some things like parameters are completely unsupportable14:15
@wikinghehe14:16
@wikingbut yeah i think we need to modify stuff in shogun/lib/computation/ in a way to be able to support in a unified way14:16
@wiking1 machine with several cores or n machine with n cores14:16
@wikingi mean the whole interface should be the same14:16
@lisitsynI still think we should get rid of getters and setters :D14:16
@wikinglisitsyn: and then?14:16
@wikingpublic properties?14:17
@lisitsynno, names14:17
@lisitsynor objects representing them14:17
@lisitsynor both14:17
@wikingah ok14:17
@wikingso your previous idea?14:17
@wikingset('whatever', value)14:17
@wiking?14:17
@lisitsynwell I had a chance to try it14:17
@lisitsynyeah14:17
@wikingor get('whatever property')14:17
@lisitsyncan't say I liked everything about it14:18
@lisitsynthe main thing is namespaces14:18
@lisitsynyou can't name that keyword 'width'14:18
@lisitsynotherwise you disallow this word in the user code14:19
@wikingmmmm14:19
@lisitsynbut strings give you no info about type14:19
@lisitsynand you have to get('whatever').as_integer() or whatever14:19
@wikingih14:19
@lisitsyndoesn't look good either14:19
@wikingyeah that's nogood14:19
@lisitsyngetBy(keyword.whatever) is like an alternative14:20
@lisitsynor any other namespace14:20
@wikingwell we had an issue for this no?14:21
@lisitsynyes14:21
@wikingwe should continue brainstorming about this further there14:21
@wikingi'm just afraid of this typing problem14:21
@lisitsynjust trying to describe what I've learnt from experience14:21
@lisitsyn:)14:21
@lisitsyntyping is no problem if we use these instances to name parameters14:21
@lisitsynit works in C++/java/python/whatever14:22
@wikingbecause afaik there's no way you can have something like: float get(string); int get(string);14:22
@lisitsynyes there is no way14:22
@wikingc++ would die on this14:22
@lisitsynno, I am speaking about14:22
@lisitsynT get(const Keyword<T>& kw);14:22
@iglesiasggtg guys, I will catch up later14:22
@lisitsynit works in any language14:22
@wikingiglesiasg: okey14:22
@lisitsynsee ya14:22
@wikinglisitsyn: mmm14:23
@lisitsynwiking: this works but you have to keep these 'keywords' in some separate namespace to avoid clashes14:23
@wikingso then you could get a Keyword<float> mykeyword? :)14:23
@lisitsynyes of course14:23
@wikingcool14:23
@lisitsynand get float14:23
@lisitsynwith no need to cast14:23
@wikingok yeah14:23
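The `T get(const Keyword<T>& kw)` idea can be sketched as below. Everything here is illustrative: `Keyword`, the `keywords` namespace and `Parameterized` are hypothetical names, and the storage scheme (a type-tagged `shared_ptr<void>`) is just one simple way to make the sketch self-contained.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <typeinfo>

// Hypothetical typed parameter name: the template argument carries the value type.
template <typename T>
struct Keyword {
    explicit Keyword(const std::string& n) : name(n) {}
    std::string name;
};

// Keywords live in their own namespace so that names like 'width'
// do not clash with identifiers in user code.
namespace keywords {
    const Keyword<double> width("width");
    const Keyword<int> cache_size("cache_size");
}

class Parameterized {
public:
    template <typename T>
    void set(const Keyword<T>& kw, const T& value) {
        Slot slot;
        slot.type = &typeid(T);
        slot.value = std::make_shared<T>(value);
        slots_[kw.name] = slot;
    }

    // The return type follows from the keyword, so the caller never casts.
    template <typename T>
    T get(const Keyword<T>& kw) const {
        auto it = slots_.find(kw.name);
        if (it == slots_.end())
            throw std::out_of_range("unknown parameter: " + kw.name);
        if (*it->second.type != typeid(T))
            throw std::logic_error("type mismatch for parameter: " + kw.name);
        return *std::static_pointer_cast<T>(it->second.value);
    }

private:
    struct Slot {
        const std::type_info* type;
        std::shared_ptr<void> value;
    };
    std::map<std::string, Slot> slots_;
};
```

With this, `double w = obj.get(keywords::width);` compiles without a cast, while a plain string key would have needed something like `get("width").as_double()`.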
@wikingi see now the problem14:24
@wikingi mean u know what's the problem with hits14:24
@lisitsynwiking: name clashes is the most troublesome issue here14:24
@wiking*this14:24
@wikingthat if u want to develop for this kind of program14:24
@wikingit's a fucking bitch14:24
@wikingas there's no way u can use any normal autocompletion14:24
@lisitsynwhy?14:24
@wikingso u are like constantly reading the API14:24
@lisitsynautocomplete should work not that bad14:25
@wikingwhat keywords does CKernel support14:25
@lisitsynah14:25
@wikingso say you have CKernel k;14:25
@wikingand then you do14:25
@wikingk.<tab>14:25
@wikingu r fucked :)14:25
@wikingu get14:25
@wikingget(Keyword...)14:25
@wikingset(Keyword...)14:25
@wikingand then?14:26
@wikingu need to go to API ref manual14:26
@lisitsynwiking: and k.cache_size14:26
@wikingto find out what keywords CKernel has in the first place14:26
@wikinglisitsyn: i suppose u can do that in python14:26
@wikingor u want to do this automapping?14:26
@wikingi mean autogen for c++ interface?14:26
@lisitsynwiking: I don't know yet14:27
@lisitsynwiking: autogen what?14:27
@wikingsay CKernel has like n keywords14:27
@wikingwell let's say we have a class14:27
@wikinglet's call it CPlay14:27
@wikingand we know that it supports 3 different keywords14:27
@wikingthen of course there's a way14:27
@wikingwe could generate the interface14:27
@wikingthat adds14:27
@wikingthe methods for those keywords in c++14:27
@wikingto support14:28
@lisitsynwiking: you don't have to generate any methods14:28
@wikingCPlay c; c.keyword_1 = 11.0;14:28
@lisitsynah no no I don't like it14:28
@lisitsynit won't work in java14:28
@wikingor float k1_value = c.keyword_114:28
@wikinglisitsyn: why not?14:28
@lisitsync.get(c.keyword_1)14:28
@lisitsynwiking: you can't overload = in java14:29
@lisitsynso won't work14:29
@wikingfucker14:29
@wikinglisitsyn: ah ok so then u still generate somehow the class interface14:29
@wikingto have14:29
@wikingc.keyword_114:29
@wikingi mean i would go with generate it14:29
@wikinginstead of relying on the developer14:29
@lisitsynwiking: yeah probably it makes sense14:29
@wikingthe developer should just do14:29
@lisitsynbut c.get(c.keyword) is ugly14:30
@wikingSG_ADD(parameter)14:30
@wikingand then from there we could generate what keywords are supported14:30
@wikingand that's it14:30
@wikingof course this is quite tricky14:30
@wikingbecause we rely on cpp implementation14:30
@wikingto generate the .h of the class14:30
@lisitsynwell not really tricky, it is ok I think14:30
@wikingsee what i mean14:30
@wiking?14:30
@lisitsynwe put that into .h14:31
@wikingso you rely on implementation to have a definition of a class14:31
@wikingmmm14:31
@lisitsynno, you declare them once in h14:31
@wikingi mean this would make things more clean i would say14:31
@lisitsynwiking: but again, c.get(c.keyword) is ugly14:31
@wikinglisitsyn: ok let say there's a way to define them in .h14:31
@lisitsynclassifier.get(classifier.time_limit)14:32
@wikinglisitsyn: better idea? :)14:32
@lisitsynugly!14:32
@lisitsynno14:32
@wikingtrue that it's shit14:32
@lisitsynI doubt there is a way14:32
@wikingi mean better than14:32
@wikingc.funky_function_name_because_i_had_vodkaz_set(a)14:32
@wiking;)14:32
@wikingi mean somehow we should have an autogen between registered parameters14:33
@wikingand their setter/getter14:33
@wikingthat's for sure14:33
@lisitsynthat's eay14:33
@lisitsyneasy14:33
@wikinglisitsyn: we could start with that14:33
@wikingand then take the next step14:33
@lisitsynbut I'd like to find a way around that c.get(c.shit)14:33
@wikingand in the meanwhile start thinking about d-ptrs14:33
@wikingbecause this switch14:34
@wikingshould include that as well14:34
@wikingand then14:34
@wikingactually14:34
@wikingit's noooot that hard anymore14:34
@wikingbecause we could generate the whole api of the public14:34
@wikingclass14:34
@wikingfrom the private class14:34
@wikingor something like that14:34
@wikingif private class gets more params14:35
@lisitsynno, api should be public14:35
@wikingyeah but what i mean here14:35
@wikingis that we use the d-ptrs arch14:35
@wikingi.e. we have private classes14:35
@wikingwhere the real magic is being done14:35
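For reference, the d-pointer (pimpl) arrangement mentioned above usually has the shape sketched below. `ToyKernel` and its single parameter are made up for illustration; in a real library the public class would sit in an installed header and the `Private` class in a .cpp file, so that changing the private members does not change the public ABI.

```cpp
#include <memory>

// Public class: this is all a user of the library would see in the header.
class ToyKernel {
public:
    ToyKernel();
    ~ToyKernel();
    void set_cache_size(int megabytes);
    int cache_size() const;

private:
    class Private;               // only forward-declared in the public header
    std::unique_ptr<Private> d;  // the d-pointer
};

// Private class: would normally live in the .cpp file.
class ToyKernel::Private {
public:
    Private() : cache_size_mb(10) {}
    int cache_size_mb;
};

ToyKernel::ToyKernel() : d(new Private) {}
// The destructor is defined where Private is complete, which unique_ptr requires.
ToyKernel::~ToyKernel() {}
void ToyKernel::set_cache_size(int megabytes) { d->cache_size_mb = megabytes; }
int ToyKernel::cache_size() const { return d->cache_size_mb; }
```

The public accessors here are written by hand; generating them from the registered parameters, as suggested in the discussion, would be a separate step on top of this layout.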
@lisitsynwiking: actually14:35
@lisitsynremember scikit learn14:35
@lisitsynthat we compare to each minute14:35
@lisitsyn:D14:35
@lisitsynhow do they handle it14:36
@wikingdunno14:36
@lisitsynthey have keywords to set parameters14:36
@lisitsynSVC(C=1.0)14:36
@lisitsynlike that14:36
@wikingah yeah14:36
@wikingbut that's really easy within python14:36
@wikingor we go with vargs as well? :)14:36
@lisitsynwiking: but you don't know what parameters14:36
@wikingset(vargs...)14:36
@lisitsynwiking: ah I have a way to support that in C++ actually14:36
@lisitsynsome kind of14:37
@lisitsynI mean C=1.014:37
@wikinghttp://www.cplusplus.com/reference/cstdarg/va_arg/14:37
@wikingthat's it14:37
@lisitsynno, trickery lies in overloading =14:37
@lisitsyn:)14:37
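One way the `C=1.0` style could be imitated in C++ by overloading `=` is sketched here. This is a guess at the kind of trick meant, not lisitsyn's actual code: `Keyword`, `Assignment` and `ToySVC` are hypothetical, and only `double` parameters are handled to keep it short.

```cpp
#include <initializer_list>
#include <string>

// Result of writing 'keyword = value': a name paired with a value.
template <typename T>
struct Assignment {
    std::string name;
    T value;
};

template <typename T>
struct Keyword {
    std::string name;
    // Overloaded '=' does not assign anything; it builds an Assignment.
    // It is const so that it can be used on the shared constants below.
    Assignment<T> operator=(const T& value) const {
        return Assignment<T>{name, value};
    }
};

namespace keywords {
    const Keyword<double> C{"C"};
    const Keyword<double> epsilon{"epsilon"};
}

// Hypothetical classifier that accepts keyword-style arguments.
class ToySVC {
public:
    ToySVC(std::initializer_list<Assignment<double> > args)
        : C_(1.0), epsilon_(1e-3) {
        for (const Assignment<double>& a : args) {
            if (a.name == "C") C_ = a.value;
            else if (a.name == "epsilon") epsilon_ = a.value;
        }
    }
    double get_C() const { return C_; }
    double get_epsilon() const { return epsilon_; }

private:
    double C_;
    double epsilon_;
};
```

A call then reads `ToySVC svm{keywords::C = 10.0, keywords::epsilon = 1e-5};`, which is close to the Python form but, as noted in the discussion, still requires looking up which keywords a class accepts.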
@lisitsynwiking: but what I mean is14:38
@lisitsynyou have to check docz14:38
@lisitsynto know how to call that sklearn.SVC14:38
@lisitsynand people feel ok about it14:38
@lisitsynmay be we can be comfortable too14:38
@wikingmmm yeah true.. i just like if there's autocomplete14:39
@wikingso that i dont have to alt-tab all the fucking time14:39
@lisitsynwiking: I know one way14:39
@lisitsynwiking: we can put that into the doc of the get method14:39
@lisitsynof each class14:39
@lisitsynso adding new parameter doesn't change API14:39
@lisitsynit changes doc14:39
@lisitsynyou look up what you need14:40
@lisitsynand write it14:40
@wikingshiatz14:40
@wikingfucking docs14:40
@lisitsynwhy not?14:40
@wikingyeah i get it14:40
@lisitsynwiking: not that bad I'd say14:41
@wikinglet's think about that more14:42
@lisitsynwiking: ah and if we put that into docs14:42
@lisitsynwe don't put them to classes14:42
@lisitsynwe have separate namespace14:43
@lisitsynyou do14:43
@lisitsynclassifier.<tab>14:43
@lisitsynand get14:43
@lisitsynget(Keyword)14:45
@lisitsynReturns the value of the specified parameter. Supported parameters are:14:45
@lisitsynshogun.parameters.caching.size14:45
@lisitsynshogun.parameters.caching.shit_ratio14:45
@lisitsynshogun.parameters.shit.total_shit_ratio14:45
@lisitsynwiking: ^14:45
@wikingmmm14:45
@wikingcould work14:45
@lisitsynyes it looks a bit better14:45
@wikingcool14:46
@wikingwe just need to implement it :)14:46
@wikingwhat do we do with the slicing up?14:46
@wikingbecause as sonney2k_ pointed out14:47
@wikingit's all great14:47
@wikinguntil we have only c++14:47
@wikingand the problem rises14:47
@wikingwhen we start swiging14:47
@wikingbecause he tried swig modular thing14:47
@wikingbut that was crashing for sonney2k_14:48
@lisitsynwiking: yes it is kind of problem14:48
@wikingas swig has a support to generate separate .cxx instead of one monster cxx14:48
@wikingbut apparently it's really unstable14:48
@wikingbut maaaybe that's like swig 1.x14:48
@wikingand maybe it has matured ever since then14:48
@wikingi mean none of tried that nowadays14:48
@wiking*none of us14:51
@lisitsynwiking: I see some drawback that is related to modularity14:58
-!- Saurabh7 [~Saurabh7@115.248.130.148] has joined #shogun14:59
@lisitsynif everything is that loosely coupled may be we should find a way to support that properly14:59
@lisitsyni.e. everything is defined by global names but how do we call third party (or backported from newer versions) shogun classes15:00
@wikinghehehe15:00
@lisitsynI have an algorithm called15:01
@lisitsynSVCTOTOTOTOTOTO15:01
@lisitsynhow do I import it15:01
@lisitsynif it wasn't around15:01
@lisitsynbut it is here now as .so15:01
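A common answer to the question above is a name-to-factory registry that plugins fill in when they are loaded. The sketch below assumes hypothetical `Machine`, `Registry` and `Registrar` types; in a real setup the last two declarations would be compiled into the plugin's .so, and loading that library at runtime (for example with `dlopen`) would run the static registrar.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical base class that every pluggable algorithm implements.
struct Machine {
    virtual ~Machine() {}
    virtual std::string name() const = 0;
};

// Global table from class name to a factory that creates an instance.
class Registry {
public:
    typedef std::function<std::unique_ptr<Machine>()> Factory;

    static Registry& instance() {
        static Registry registry;  // constructed on first use
        return registry;
    }

    void add(const std::string& name, Factory factory) {
        factories_[name] = factory;
    }

    // Returns a null pointer if no plugin has registered this name.
    std::unique_ptr<Machine> create(const std::string& name) const {
        auto it = factories_.find(name);
        if (it == factories_.end())
            return std::unique_ptr<Machine>();
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// Helper whose constructor performs the registration.
struct Registrar {
    Registrar(const std::string& name, Registry::Factory factory) {
        Registry::instance().add(name, factory);
    }
};

// --- The part below would live in the third-party plugin ---
struct SVCTOTOTOTOTOTO : Machine {
    std::string name() const override { return "SVCTOTOTOTOTOTO"; }
};

static Registrar register_svctoto("SVCTOTOTOTOTOTO", [] {
    return std::unique_ptr<Machine>(new SVCTOTOTOTOTOTO());
});
```

User code then asks for `Registry::instance().create("SVCTOTOTOTOTOTO")` by name, so a class that did not exist when the core library was built can still be instantiated once its plugin is present.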
@wiking:>15:08
@wikingwell i think there's a lot we could learn about this15:08
@wikingfrom gstreamer15:08
@wikingas they have a way15:12
@wikingi mean of course some standard ways15:13
@wikingyou can develop any sorts of gstreamer plugin15:13
@wikingand it'll make sure that if 2 modules can work together it'll make it work together15:13
-!- iglesiasg [~iglesias@2001:6b0:1:1da0:28a3:b90c:4a85:f150] has quit [Ping timeout: 245 seconds]15:14
-!- hushell [~hushell@c-50-188-141-210.hsd1.or.comcast.net] has quit [Ping timeout: 265 seconds]15:43
@wikingwoah fuck16:11
@wikingthis is great16:11
@wikinghttp://docs.docker.io/en/master/api/docker_remote_api_v1.6/#docker-remote-api-v1-616:11
@wikingwe could easily use this for distrib stuff :)16:14
-!- lisitsyn [~lisitsin@mxs.kg.ru] has quit [Quit: Leaving.]16:14
-!- iglesiasg [~iglesias@n181-p223.kthopen.kth.se] has joined #shogun16:23
-!- iglesiasg [~iglesias@n181-p223.kthopen.kth.se] has quit [Client Quit]16:25
-!- sonne|osx [~sonne@f053043202.adsl.alicedsl.de] has joined #shogun17:20
sonne|osxwiking: no I just slept until now...17:37
@wikingsonne|osx: hihi17:44
@wikingsonne|osx: read the logs17:44
@wikingquite entertaining17:44
-!- benibadman [~benibadma@94.135.236.129] has quit [Ping timeout: 272 seconds]17:52
sonne|osxwiking: ok so I see some brainstorming about modularizing and setters/getters17:53
sonne|osxwiking: I would want to add some more down to earth stuff17:53
@wikingcool17:53
@wikingadd17:53
sonne|osxallowing to compile interfaces w/o shogun src dir being available17:53
sonne|osxand the d-ptr stuff17:53
sonne|osxI think we need to have clean interfaces first - and I think *we* need to do this17:54
sonne|osxnot some student17:54
sonne|osxit needs to be well thought through17:54
sonne|osxwhen we have this the split up might be much more obvious17:54
-!- Saurabh7 [~Saurabh7@115.248.130.148] has left #shogun ["Leaving"]17:56
@wikingindeed18:03
@wikingwell this is interconnected imo18:03
sonne|osxwiking: yeah but it is a lot of work too and I think we have to work on this together somehow - because it is a lot of work (the d-ptr stuff)18:05
sonne|osxI mean it is if done well18:05
@wikingwell18:05
sonne|osxit is not if you just copy paste what we have now18:05
@wikinglet's sketch up18:05
@wikingthe work18:05
@wikingand we can slowly start crunching on it imo18:05
sonne|osxyeah but before that I think the most important thing right now is to get into distributions18:06
sonne|osxto become a standard18:06
@wikingyey true18:06
@wikingthat means18:07
@wikingwe need to be able to18:07
@wiking"17:53 < sonne|osx> allowing to compile interfaces w/o shogun src dir being available"18:07
@wiking:D18:07
sonne|osxfor debian yes - for fedora besser82 won't need it18:07
@wikingwe dont have a problem with that "i dont know how many gigs of ram we can consume while compiling the package"18:07
@wiking?18:07
@wikingas far as i remember u told something about a limit18:07
@wikingthat actually swig generated interfaces just eat up too much ram18:08
sonne|osxwiking: we have a problem but at least each interface has different memory requirements18:08
@wikinghence we cannot have a package of those18:08
@wikingi mean official packages18:08
sonne|osxIIRC last time compiling octave and swig made things go >3.5 G or so18:08
sonne|osxso octave_modular might be problematic18:08
@wikingas obviously we can roll our own deb packages18:08
@wikingand there's no such limitation ;P18:08
sonne|osxbut we could try to limit things (say not wrap all classes / all data types) and use clang to compile18:09
@wikingwell i guess first things first18:09
@wikingcmake patch18:09
@wikingto get things compiled separately18:09
@wikingbesser82 is just overloaded with work so i guess we have to roll our own cmake hack :D18:10
@wikingsonne|osx: u still on sick leave?18:10
sonne|osxwiking: yes18:10
@wikingsonne|osx: till?18:10
@wikinggod knows only?18:10
@wikinghope nothing serious18:10
sonne|osxyeah :/18:10
@wikingjust a stupid flu18:10
@wikingget well18:11
@wikingand dont hack too much18:11
@wikinganyhow i'll check on this shiatz18:11
@wikingi just have too many things18:11
@wikingaround me lately and too little time18:11
sonne|osxwiking: no Tonsillitis18:11
@wikingbut will try to give it a go today18:11
@wikingbuuuuh18:12
@wikingi hate that18:12
@wikingsonne|osx: am i right that we are still failing with protobuf as well?18:13
sonne|osxwiking: yes18:13
@wikingcool18:13
@wikingneeds a fix as well18:13
sonne|osxwiking: besser82 said that he wanted to do this by monday18:13
@wikingwell i guess he is just overwhelmed with other shiatz18:13
sonne|osxbut then no idea...18:13
sonne|osxyeah18:13
sonne|osxI know how to fix it18:14
@wikinghow?18:14
sonne|osxso I would just do it also avoiding the static lib18:14
sonne|osxwiking: well I call the protobuf compiler my own18:14
@wikingsonne|osx: with the right flags ;)18:14
@wikingyeah i thought to do the same18:14
sonne|osxwell it only has one flag18:14
@wikingas i was really fed up with the inflexibility of that cmake wrapper18:14
sonne|osx--cpp_out = directory to put output18:15
sonne|osxand that's it18:15
@wikingsonne|osx: go for it i'd say18:15
@wikingas that script is just good for detecting the protobuf itself18:15
@wikingand do the rest normally18:15
@wikinglike u suggested18:15
@wikingi had the same kind of problem with SWIG18:15
sonne|osxwiking: yeah and all that is missing then is to add the 3 generated .h files to the shogun src's18:15
sonne|osxand the .cc's to the stuff to be compiled18:16
sonne|osxthat's it18:16
@wikingthat there was noooo fucking way to add ccache-swig to that cmake swig macro18:16
@wikinghence i've hacked my own18:16
sonne|osxin some way shogun is special - it just has *many* deps ...18:16
@wikingwell normal18:16
-!- thoralf [~thoralf@enki.zib.de] has joined #shogun18:17
@wikinghahaha great bug by thoralf18:17
@wiking:DDD18:17
thoralfHey.18:18
thoralfWhatever I do. ;)18:18
sonne|osxguilty18:18
sonne|osxby definition!18:18
sonne|osxwiking: this reminds me that we need to finish the SGString* referenced data transition18:18
sonne|osxwiking: and I guess Heiko is totally away until january18:18
sonne|osxtoo bad18:18
sonne|osxbecause we have like 5 pages of ideas written on some notebooks18:19
sonne|osxwhat we could/should do18:19
@wikingas well as supporting views for Features and Labels18:19
sonne|osxand also for gsoc etc18:19
@wikingand of course try to think about how to port std::shared_ptr ;)18:19
@wikinglalala18:19
@wikingtoo much good stuff that is needed18:19
sonne|osxexactly such things18:19
sonne|osxall tons of work18:19
@wikingyeps18:19
@wikingi dont know if it's worth it18:19
@wikingas per se18:19
@wikingi dont see that many ppl using shogun18:20
@wikingand of course that's because we dont have package18:20
sonne|osxyeah18:20
@wikingand eeeevery other ML library is like either18:20
@wikingpip install18:20
sonne|osxand I think the best way to increase number of users18:20
@wikingor apt-get install18:20
@wikingand yes18:20
sonne|osxis to have it prepackaged18:20
sonne|osxand then also18:20
@wikingwe neeed fucking native windows port:P18:20
sonne|osxto get it used in teaching!18:20
@wikingsonne|osx: indeed18:20
@wikingsonne|osx: i'm already pushing some profs i know to do this18:21
@wikingthey liked the notebooks18:21
sonne|osxso I think our demos/notebooks should cover all textbook materials18:21
@wikingso lets see18:21
thoralfsonne|osx, wiking: A good way to get more users is something like libsvm-train, libsvm-predict.18:21
sonne|osxI am doing the same18:21
sonne|osxand they all liked it18:21
@wikingsonne|osx: yeah for that we are like missing basic stuff like decision tree :)18:21
thoralfWe have tons of algorithms, but nothing out-of-the-box.18:21
@wikingthoralf: yeps18:21
sonne|osxthoralf: naa18:21
sonne|osxthen they could use libsvm :D18:21
@wikingthoralf: i'm trying to push something like gstreamer pipelining18:21
thoralfIt would be easy to change some examples to look like this.18:21
sonne|osxbut sure one could have libsvm compatible interface but supporting all svms and kernels in shogun18:22
sonne|osx(very easily)18:22
thoralfwiking: gstreamer?  Isn't it audio stuff?18:22
@wikingthoralf: ./shogun input ! reader ! preprocessor ! ML algo ! outputmodel18:22
@wikingthoralf: yes but you can do a command line like that18:22
thoralfHehe.18:22
@wikingthoralf: gstreamer input ! demux ! decode ! push it into videobuffer18:22
@wikingand the library itself will make it sure18:23
@wikingthat the stuff is converted into the right format etc18:23
sonne|osxwiking: the more flexibility you allow the slower it gets :D  but I think underneath it is already like that18:23
@wikingand that the different modules are actually connected18:23
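The "library makes sure the modules fit together" part of the pipeline idea can be illustrated with a very small sketch. `Stage` and `Pipeline` are hypothetical, and the type check is a plain string comparison; GStreamer's real capability negotiation is far richer than this.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stage description: what it consumes and what it produces.
struct Stage {
    std::string name;
    std::string input_type;
    std::string output_type;
};

class Pipeline {
public:
    // Appends a stage only if it accepts what the previous stage produces.
    bool append(const Stage& stage) {
        if (!stages_.empty() && stages_.back().output_type != stage.input_type)
            return false;
        stages_.push_back(stage);
        return true;
    }

    // Renders the chain in the "a ! b ! c" style used on the command line.
    std::string describe() const {
        std::string text;
        for (std::size_t i = 0; i < stages_.size(); ++i) {
            if (i > 0) text += " ! ";
            text += stages_[i].name;
        }
        return text;
    }

private:
    std::vector<Stage> stages_;
};
```

A fuller design would insert a converter stage when types do not match instead of rejecting the connection, which is closer to what the discussion describes.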
sonne|osxyou can choose the reader18:24
@wikingsonne|osx: heheh yeah but still this is pretty easy stuff18:24
@wikingwe just need a handler for it18:24
sonne|osxfeatures and preprocessors18:24
sonne|osxand ml algo and get output18:24
@wikingindeed18:24
sonne|osxand get it evaluated by perf measure the way you want18:24
sonne|osx...18:24
@wikingyeps18:24
@wikingthat's the other pipeline18:24
sonne|osxthere just isn't a cmdline thing for that18:24
thoralfSounds a bit like over-engineering. :)18:24
@wikingthoralf: still there's a good reason why gstreamer is being deployed almost on any linux based multimedia machine18:25
@wikingthoralf: it's just well thought out and very modular18:25
@wikingyou can easily plug in and out stuff18:25
@wikingand things are really loaded in dynamically18:25
@wikingso we dont need like a 500 megz shared lib to hang around in the memory18:25
@wikingjust because we want to do evaluation on a model18:26
@wikingthat actually would require 3 things18:26
thoralfEvery time I try something with shogun, I end up needing valgrind/gdb.18:26
@wikingthoralf: hahahahah18:26
thoralfThat's no good user experience.18:26
@wikingthoralf: welcome to opensource :)18:26
@wikingthoralf: but true18:26
thoralflol18:26
@wikingthoralf: well the best thing is to generate for each gdb session another unit test :)18:26
@wikingjust that it never happens again18:26
@wiking:P18:26
thoralfI don't care about command line stuff as long it's segfaulting as hell. ;)18:26
@wiking:D18:27
thoralfMy example can easily be converted to a test - but my primary objective is to check that it's not (Soeren's words ;)) self-inflicted.18:28
thoralfBtw., StructuredLabels is sucking as well.  Wasting memory and not giving it back.19:02
-!- zxtx [~zv@c-98-193-83-24.hsd1.il.comcast.net] has quit [Ping timeout: 272 seconds]19:07
sonne|osxthoralf: problem really is that the streaming features did not survive the SGVector* refcount refactoring in healthy shape - this whole thing needs conversion19:14
thoralfsonne|osx: All my minimal examples do not even involve streaming features.19:15
* thoralf is just entering another mine field: struct output stuff.19:16
sonne|osxwiking: I think we have this modularity code wise but no good separation into packages19:16
sonne|osxthoralf: that is hardly sth minimal though19:16
thoralfsonne|osx: Which one are you talking about?19:17
sonne|osxthoralf: structured output learning19:17
thoralfhttps://github.com/shogun-toolbox/shogun/issues/1758, https://github.com/shogun-toolbox/shogun/issues/175919:17
thoralfThat's what I found so far. ;)19:18
thoralfThere are other things related to StructuredLabels, but it's hard to track it down.19:18
thoralfCStructuredLabels * sl = new CStructuredLabels(100);19:19
thoralfOops.19:19
thoralfThis one is my next suspect: CStructuredLabels * sl = new CStructuredLabels(num);  for (int idx=0; idx<num; idx++) { sl->set_label(idx, new CRealNumber(idx)); }19:20
thoralfEvery added label consumes 5.8k of memory.19:20
thoralfOnly RealNumbers.19:21
thoralfStops working with 3M output labels on my laptop, since it eats my 16G for breakfast.19:22
thoralfAnd 3M outputs shouldn't be a big deal.19:23
thoralfWoha!19:59
thoralfnew CRealNumber(1); <-- Does 5 (!) allocations of 1024 bytes.19:59
-!- zxtx [~zv@129-79-241-148.dhcp-bl.indiana.edu] has joined #shogun20:05
thoralfDamn.20:08
thoralfsonne|osx: I think we have a problem.20:09
thoralfshogun/base/SGObject.cpp lines 1066-106920:10
thoralfEach parameter creates a DynArray of 1024 bytes.20:10
thoralfAnd struct label inherits from SGObject20:10
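The arithmetic behind that observation can be sketched as follows. This is only a back-of-the-envelope model, not Shogun code: the two constants are the figures quoted above (5 allocations of 1024 bytes per object), and the function names are made up for illustration.

```cpp
#include <cstddef>

// Figures taken from the chat; everything else here is hypothetical.
constexpr std::size_t kBytesPerArray = 1024;   // preallocation per parameter array
constexpr std::size_t kArraysPerObject = 5;    // allocations observed per CRealNumber

// Bytes reserved up front for one object, before any payload is stored.
constexpr std::size_t overhead_per_object(std::size_t bytes_per_array = kBytesPerArray)
{
    return kArraysPerObject * bytes_per_array;
}

// Total reserved bytes for a given number of label objects.
constexpr unsigned long long total_overhead(unsigned long long num_objects,
                                            std::size_t bytes_per_array = kBytesPerArray)
{
    return num_objects * overhead_per_object(bytes_per_array);
}
```

Under these assumptions `total_overhead(3000000ULL)` is roughly 15.4 GB of reserved but mostly unused memory, which fits a 16G laptop giving up at about 3M labels.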
-!- benibadman [~benibadma@port-92-206-116-153.dynamic.qsc.de] has joined #shogun20:20
-!- sonne|osx [~sonne@f053043202.adsl.alicedsl.de] has quit [Quit: sonne|osx]20:49
-!- benibadman [~benibadma@port-92-206-116-153.dynamic.qsc.de] has quit []20:50
-!- sonne|osx [~sonne@f053043202.adsl.alicedsl.de] has joined #shogun21:10
@wikingthoralf: we need serialization :S21:19
-!- sonne|osx [~sonne@f053043202.adsl.alicedsl.de] has quit [Quit: sonne|osx]21:28
-!- lisitsyn [~lisitsyn@80.252.20.67] has joined #shogun21:31
@wikinglisitsyn: somebody just logged in to cloud from here: http://zeliade.com/21:44
@wiking:P21:44
lisitsynwiking: it happens :)21:44
@wikinglisitsyn: heheh .net quant fw :P21:45
@wikingso i guess he would be more interested in the c# interface :P21:45
lisitsynwiking: you still thinking of quant stuff, are you? ;)21:46
@wikinghttp://www.risk.net/journal-of-risk-model-validation/technical-paper/2161296/model-validation-theory-practice-perspectives21:46
@wikingone of the authors ;P21:46
@wikinglisitsyn: yeah but fuck man21:47
@wikingwe need like a super new language21:47
@wikingto do all those things we want21:47
@wikingin 1 click21:47
@wiking;P21:47
@wikingfor sure the wolfram shit will solve it for us21:47
@wiking:DDD21:47
@wiking</irony>21:47
lisitsynwiking: I don't have any things I want :D21:47
lisitsynahhah yeah wolframed21:47
@wikinglisitsyn: what do u mean?21:47
lisitsynwiking: I have absolutely no idea what is needed21:47
@wikinglife21:48
@wikingand ML21:48
@wiking:D21:48
@wikingbut the stuff we were talking about21:48
@wikingwould be great to have21:48
@wikinglike yesterday21:48
@wikingnot in another 1 year21:48
@wikingalthough we must give some credit to ourselves21:48
lisitsynwiking: no idea! :D21:48
@wikingcheck this21:49
@wikinghttps://github.com/shogun-toolbox/shogun/wiki/Future-of-Shogun-Brainstorming21:49
lisitsynwiking: we should have been kaggling or whatever21:49
lisitsynto have some real tasks21:49
@wikinglisitsyn: well there's still some shiatz we could do with kaggle21:49
lisitsynI am currently having troubles with time but I still want to get to that some day21:50
lisitsynI can't be java programmer for ever :D21:50
@wikinglisitsyn: http://www.kaggle.com/c/dogs-vs-cats21:50
@wiking;P21:50
@wikingheheh time is a biatch21:51
@wikinghere21:51
@wikinglet's make 9k usd21:51
@wikinghttp://www.kaggle.com/c/yandex-personalized-web-search-challenge21:51
@wiking;d21:51
@wikingwe have ziltch to do with ALS21:51
lisitsynyandex21:51
lisitsynhah21:51
lisitsynthey were interested in hiring me21:52
@wikinghehehe21:53
@wikinggreat :)21:53
@wikingbtw we could apply to be an apache incubator proj if we r interested21:53
lisitsyndon't know about it21:55
@wikingwell there are ups and downs for being such project21:58
-!- thoralf_ [~thoralf@91-66-33-4-dynip.superkabel.de] has joined #shogun22:02
thoralf_Heyhey.22:02
@wikingthoralf: hola amigo22:02
thoralf_wiking: I read your comment, but I don't know whats best...22:02
@wikingthoralf_: well is it leaking now?22:02
@wikingor 'just' consuming too much memory? :P22:03
@wikingi mean i understand that having 1 fucking float22:03
@wikingis just crazy to have so much overhead22:03
@wikinghence we should do something about this22:03
thoralf_wiking: The leak is caused by something else, so yes, it's leaking now.  But it's not related to this bloat. ;)22:03
@wikingyeah22:03
@wikingwe are becoming a bloat machine :)22:04
@wikingit's almost like matlab22:04
thoralf_My problem is that I cannot evaluate my data set.22:04
@wiking1 float = 1MB :P22:04
thoralf_lol22:04
@wikingso yeah we should kill that22:04
@wikingor something22:04
@wikingcannot evaluate because?22:05
thoralf_1 float = 5*1024 Bytes + something small to hold the float. ;)22:05
thoralf_Not enough memory.22:05
thoralf_2M entries take 20GB of RAM.22:05
@wikingzep22:06
@wikingyep22:07
@wikingunderstand the pain22:07
thoralf_I would try to solve it, but it's too close to shogun's internals...22:08
-!- iglesiasg [~iglesiasg@s83-179-44-135.cust.tele2.se] has joined #shogun22:10
-!- mode/#shogun [+o iglesiasg] by ChanServ22:11
thoralf_Hey iglesiasg22:11
@iglesiasgthoralf_, hi!22:11
@iglesiasgI came around because of your mails :)22:11
@iglesiasgso something is pretty bad with structured labels and dynamic object array, isn't it?22:12
thoralf_Yes.22:13
thoralf_Different things.22:13
thoralf_But you see the tickets. ;)22:13
thoralf_Labels are bad as well.22:13
thoralf_StructuredLabels inherit from CSGObject, but this need 5k/instance.22:14
thoralf_So 2M RealNumber labels are taking 11GB.22:15
@iglesiasgShit22:15
@iglesiasgI never thought about memory overload due to CStructuredData being a CSGObject22:16
@iglesiasgthoralf_, where do the 5k/instance come from? Is it because SGObject has many attributes?22:16
thoralf_new Parameter(...)22:16
thoralf_:)22:17
thoralf_Parameter internally uses DynArray.22:17
thoralf_Which pre-allocates 1024 Bytes.22:17
thoralf_128 entries, 8 bytes each.22:17
@iglesiasgare all these 128 entries used?22:17
thoralf_I could make it 16 entries, but that's all. ;)22:17
thoralf_No, only pre-allocation.  I didn't check, but I'd say the array will hold about 5 entries.22:18
thoralf_But reducing the pre-allocation is a poor fix. ;)22:18
@iglesiasgDo you think so?22:18
@iglesiasgWhy?22:18
@iglesiasgnot in general for DynArray of course22:19
@iglesiasgbut maybe is something we have to fix in this use case22:19
@iglesiasgshrink the DynArray to the memory used22:19
thoralf_But DynArray is used in many places.22:20
lisitsynoh we have one something like one virtual machine for a float22:22
-!- sonne|osx [~sonne@f053043202.adsl.alicedsl.de] has joined #shogun22:22
lisitsynwe are like hypervisor haha22:22
thoralf_But I admit that your fix is pragmatic as hell. :)22:22
thoralf_iglesiasg:22:22
thoralf_lisitsyn: lol22:22
@iglesiasgthoralf, http://en.cppreference.com/w/cpp/container/vector/shrink_to_fit22:22
thoralf_lisitsyn: Pooled floats with auto-failover.22:22
@iglesiasgI mean something like this22:22
lisitsynthoralf_: yeah why not, one instance of QNX to handle floats22:23
@iglesiasgthoralf, a method that allows shrinking, not that all DynArrays are automatically shrunk22:23
lisitsynimagine how stupid some parts of C++ are22:23
thoralf_iglesiasg: Yeah, but we first need to decide what exactly is the bug. ;)22:24
lisitsynshrink to fit just appeared in C++1122:24
-!- FSCV [~FSCV@201.161.7.110] has joined #shogun22:24
lisitsynbefore you had to copy it22:24
lisitsyn:D22:24
@iglesiasglisitsyn, "just"22:24
thoralf_iglesiasg: Parameter stuff?  StructLabels?  CSGObject?22:24
@iglesiasgwe are already in 2013 :D22:24
@iglesiasgfinishing actually22:24
lisitsyniglesiasg: 2011 is way too late for such stupid things :)22:24
@iglesiasglisitsyn, hehe I agree22:25
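The two ways of giving back unused capacity mentioned here can be sketched with `std::vector`. This is illustrative only: it says nothing about how Shogun's DynArray is implemented, and the helper names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Reserve far more slots than needed, then store only a few entries.
inline std::vector<void*> make_overallocated(std::size_t reserved, std::size_t used)
{
    std::vector<void*> v;
    v.reserve(reserved);
    for (std::size_t i = 0; i < used; ++i)
        v.push_back(nullptr);
    return v;
}

// C++11: ask the vector to release unused capacity.
// The standard makes this a non-binding request.
inline void shrink_cxx11(std::vector<void*>& v)
{
    v.shrink_to_fit();
}

// Pre-C++11 idiom: copy the elements into a temporary and swap buffers.
inline void shrink_by_copy(std::vector<void*>& v)
{
    std::vector<void*>(v).swap(v);
}
```

Both helpers keep the elements intact; how much capacity is actually released is left to the standard library implementation.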
sonne|osxiglesiasg: why does RealNumber have to be an SGObject?22:25
@iglesiasgsonne|osx, CStructuredLabels are basically a dynamic object array of CStructuredData22:26
@iglesiasgand CStructuredData is CSGobject22:26
@iglesiasgone could not make CStructuredData inherit from CSGObject, but then we lose advantages like using SG_REF/UNREF22:27
@iglesiasgthoralf_, I am actually not aware how the Parameter stuff works, do you know about it?22:29
thoralf_iglesiasg: No, nothing.22:29
@iglesiasgok22:30
thoralf_I just debugged into it.22:30
thoralf_sonne|osx: What about a class that only cares about SGREF/UNREF methods to inherit from?22:31
@iglesiasgthoralf_, I am checking CSGObject atm, I see there several Parameter*22:31
thoralf_sonne|osx: Would instantly solve the complete issue (exept that we're putting floats into objects, but it's the way SO works ;))22:32
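A base class that only does reference counting could look roughly like the sketch below. `RefCounted` and `LightRealNumber` are hypothetical names, not existing Shogun classes, and the counter is deliberately not thread-safe to keep the example short.

```cpp
// Hypothetical minimal base class: it carries only a reference count and
// none of the parameter bookkeeping of a full framework object.
class RefCounted
{
public:
    RefCounted() : m_refcount(0) {}
    virtual ~RefCounted() {}

    RefCounted(const RefCounted&) = delete;
    RefCounted& operator=(const RefCounted&) = delete;

    // Take a reference; returns the new count.
    int ref() { return ++m_refcount; }

    // Drop a reference; deletes the object when the count reaches zero.
    int unref()
    {
        int remaining = --m_refcount;
        if (remaining <= 0)
        {
            delete this;
            return 0;
        }
        return remaining;
    }

    int ref_count() const { return m_refcount; }

private:
    int m_refcount;
};

// Hypothetical lightweight label wrapping a single float.
class LightRealNumber : public RefCounted
{
public:
    explicit LightRealNumber(double v) : value(v) { ++live_instances; }
    ~LightRealNumber() { --live_instances; }

    double value;
    static int live_instances;  // instance counter, for demonstration only
};

int LightRealNumber::live_instances = 0;
```

On a typical 64-bit build such an object takes a few dozen bytes (vtable pointer, counter, payload) rather than kilobytes, at the price of losing whatever else the full base class provides.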
@iglesiasgthoralf_, yeaah, putting float into objects is overkill indeed. But there is actually no real reason to use SO with the CRealNumber apart from debugging purposes, right?22:33
@iglesiasgif your labels can be put into a real number, then you don22:33
@iglesiasgyou don't have structured output, why to use SO then :P22:33
thoralf_No, RealNumber was just for the minimal example.22:34
thoralf_I'm using something else.22:34
@iglesiasgI understand22:34
@iglesiasgthoralf_, soo, let's see if I got the issue correctly, the problem is that Parameter has a DynArray inside, and every CSGObject has some Parameter attributes22:35
thoralf_Yes.22:36
-!- shogun-notifier- [~irker@7nn.de] has joined #shogun22:36
shogun-notifier-shogun: Soeren Sonnenburg :develop * 0e503de / src/shogun/base/Parameter.h: https://github.com/shogun-toolbox/shogun/commit/0e503dee1eaef2038b6bcd86b5271c4a612605b522:36
shogun-notifier-shogun: slightly decrease memory requirements of (unused) parameters22:36
@iglesiasgand due to memory allocation of the DynArray, that takes much memory22:36
thoralf_sonne|osx: Yeah. ;)22:36
@iglesiasghehehe we just got the fix22:37
thoralf_iglesiasg: Sören was tired of arguing ;)22:37
@iglesiasgso what are these m_params in Parameter supposed to hold22:38
thoralf_iglesiasg: Okay, now the next problem: StructuredLabels->set_label(...)22:38
@iglesiasgI think it is related to the model selection stuff22:38
@iglesiasgall right, next then!22:38
shogun-buildbot_build #2476 of deb1 - libshogun is complete: Failure [failed compile test]  Build details are at http://buildbot.shogun-toolbox.org/builders/deb1%20-%20libshogun/builds/2476  blamelist: Soeren Sonnenburg <sonne@debian.org>22:38
thoralf_iglesiasg: https://github.com/shogun-toolbox/shogun/issues/175922:38
thoralf_set_label does not increase num_labels.  This has different side effects.22:39
thoralf_first: since it thinks, the array is empty, it doesn't free it.22:39
thoralf_second: I cannot check how many entries are in it (found it accidentally with paranoid assertions)22:40
@iglesiasgthoralf_, well, the thing is that in this case add_label should be used22:40
@iglesiasgthoralf_, but I understand that the API should support set_label well22:40
@iglesiasglet me try to remember why I decided to separate add_label and set_label22:40
thoralf_set_label maybe should be renamed "replace_label" and assert that the entry is already set. ;)22:41
@iglesiasgit makes lot of sense22:42
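The add/replace split proposed here can be sketched with a small stand-alone container. `LabelStore` is a hypothetical class, not Shogun's CStructuredLabels, and it stores plain doubles to keep the example short.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical label container illustrating the proposed semantics.
class LabelStore
{
public:
    // 'expected' is only a preallocation hint; the store starts empty.
    explicit LabelStore(std::size_t expected = 0) { m_labels.reserve(expected); }

    // Append a label; this is the only operation that grows the count.
    void add_label(double label) { m_labels.push_back(label); }

    // Overwrite an existing label; refuses indices that were never added.
    void replace_label(std::size_t idx, double label)
    {
        if (idx >= m_labels.size())
            throw std::out_of_range("replace_label: no label at this index");
        m_labels[idx] = label;
    }

    double get_label(std::size_t idx) const { return m_labels.at(idx); }
    std::size_t get_num_labels() const { return m_labels.size(); }

private:
    std::vector<double> m_labels;
};
```

With this split the element count can never disagree with what is actually stored, so cleanup and size checks stay reliable.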
sonne|osxthoralf_: that is not really a fix - it is just not a good thing to use the framework like this - IMHO it should rather be high-level solved22:43
thoralf_Why I need set_label(): When doing computations in parallel, the order of add_label() is not determined.22:43
sonne|osxas in you have a vector of real numbers22:43
sonne|osxnot just a single number22:43
thoralf_sonne|osx: My data is some self-cooked Multilabel stuff.22:43
sonne|osxI mean it is clear that SGObject has a huge overhead22:43
thoralf_RealNumber was just a show-case.22:44
sonne|osxI think realnumber should only be used when you have a handful of dims22:44
sonne|osxfor all the rest you should introduce other high-level objects22:44
thoralf_I have a MultiLabel-Output per Input.  Multilabel internally uses vectors of ints, no objects.  So my problem is just having outputs for 2M inputs.22:46
sonne|osxanyway memory footprint should be down quite a bit22:47
@iglesiasgthoralf_, so I agree with renaming set_label. However, thinking about the parallel computations, DynamicObjectArray does not seem to be thread-safe at all. Maybe I am wrong?22:47
thoralf_sonne|osx: You just removed 4k/instance.  Not bad. :)22:48
thoralf_iglesiasg: If the array-size is known in advance, no resizing takes place.  So threads are no big deal.22:48
thoralf_iglesiasg: Problem occurs when resizing.22:48
@iglesiasgtrue true22:48
@iglesiasgout of curiosity in any case, does it become relevant to parallelize label insertion??22:49
sonne|osxiglesiasg: of course not22:49
thoralf_Creating *one* multilabel is expensive22:49
sonne|osxnothing is thread safe if not otherwise noted22:49
thoralf_When creating 8 at a time, this helps a lot.22:49
thoralf_1 Multilabel consists of >>100 dimensions.22:50
sonne|osxiglesiasg: it is like any java collection - not thread safe22:50
@iglesiasgI see (both of your points guys :)22:50
shogun-notifier-shogun: Soeren Sonnenburg :develop * c9e6013 / src/shogun/base/ParameterMap.h: https://github.com/shogun-toolbox/shogun/commit/c9e60139ba35e55a3734d0a67c868ae9addbf69d22:51
shogun-notifier-shogun: reduce overhead in parameter map22:51
shogun-buildbot_build #2477 of deb1 - libshogun is complete: Failure [failed compile test]  Build details are at http://buildbot.shogun-toolbox.org/builders/deb1%20-%20libshogun/builds/2477  blamelist: Soeren Sonnenburg <sonne@debian.org>22:53
@iglesiasgthoralf_, I am thinking about this22:55
@iglesiasgthoralf_, so how does DynamicObjectArray handles if you try to set, say, element 5 but none of the previous elements 0-4 are already set?22:56
thoralf_iglesiasg: https://github.com/shogun-toolbox/shogun/issues/1758 ;)22:56
thoralf_I'm about to create an array and assign it to dynamicobjectarry just after prediction.22:57
thoralf_A minefield.22:57
@iglesiasgwell that is not exactly what I said, but it is indeed another issue :D22:58
thoralf_set_element() obviously assumes that the element already exists, but doesn't check.22:58
thoralf_Didn't know all this up front.  Running into one mine after another... as usual.22:59
sonne|osxthoralf_: I think they just need to be null'd and all good22:59
thoralf_iglesiasg: But, answering your question: It's possible to set it - the array will be extended to the needed size.22:59
thoralf_sonne|osx: Yeah.23:00
-!- travis-ci [~travis-ci@ec2-54-205-106-73.compute-1.amazonaws.com] has joined #shogun23:00
travis-ci[travis-ci] it's Soeren Sonnenburg's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: http://travis-ci.org/shogun-toolbox/shogun/builds/1427629323:00
-!- travis-ci [~travis-ci@ec2-54-205-106-73.compute-1.amazonaws.com] has left #shogun []23:00
thoralf_sonne|osx: But it's not that easy due to the resizing.23:00
thoralf_sonne|osx: Many error-sources.23:01
@iglesiasgthoralf, extended using nulls I guess23:01
sonne|osxthoralf_: actually the get element there is scary23:01
sonne|osxno resizing will happen there23:02
sonne|osxso it can be an out of bounds access indeed23:02
thoralf_Nono, two different points. ;)23:02
sonne|osxif you in your example do set_element(NULL, 500); it would be a real issue23:02
thoralf_first: setting elements to null prevents the UNREF(undefined value)23:02
@iglesiasgthoralf, I think that for your use case the easiest is going to be if you create your own thread-safe queue :D23:03
thoralf_iglesiasg: Well, having an array and using openmp to iterate; every iteration stores one index.  No problem with threads. ;)23:04
thoralf_second: no checking is done at all23:05
@iglesiasgthoralf, easy peasy then --- use that openmp array with several threads for the object creation, and then just one thread that takes elements from this array and puts them into StructuredLabels using add_label23:07
@iglesiasgthoralf_, what do you think_23:07
@iglesiasg?23:07
thoralf_iglesiasg: I know - this thing is already solved.23:07
thoralf_iglesiasg: Just the set_label() just nagged me.23:08
@iglesiasgall right then23:08
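The pattern agreed on here, parallel creation into a pre-sized array followed by a single-threaded hand-over, might look like this sketch. `compute_label` is a made-up stand-in for the expensive multilabel computation, and the final vector stands in for the add_label calls.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an expensive per-example label computation.
inline std::vector<int> compute_label(int idx)
{
    std::vector<int> label(3);
    label[0] = idx;
    label[1] = idx * 2;
    label[2] = idx * 3;
    return label;
}

// Fill a pre-sized buffer in parallel, then hand results over in index order.
inline std::vector<std::vector<int>> build_labels(int num)
{
    // Sized up front, so no thread ever triggers a resize.
    std::vector<std::vector<int>> buffer(num);

    // Each iteration writes only its own slot. Without OpenMP enabled the
    // pragma is ignored and the loop simply runs serially.
    #pragma omp parallel for
    for (int idx = 0; idx < num; ++idx)
        buffer[idx] = compute_label(idx);

    // Single-threaded pass: append in a deterministic order.
    std::vector<std::vector<int>> labels;
    labels.reserve(buffer.size());
    for (std::size_t i = 0; i < buffer.size(); ++i)
        labels.push_back(buffer[i]);
    return labels;
}
```

Because the buffer never resizes while threads are writing, the container itself does not need to be thread-safe, and the final order does not depend on thread scheduling.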
@iglesiasgyeah sure, we should fix this23:08
@iglesiasgbut first the DynamicObjectArray issue probably23:08
-!- travis-ci [~travis-ci@ec2-54-205-106-73.compute-1.amazonaws.com] has joined #shogun23:10
travis-ci[travis-ci] it's Soeren Sonnenburg's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: http://travis-ci.org/shogun-toolbox/shogun/builds/1427721523:10
-!- travis-ci [~travis-ci@ec2-54-205-106-73.compute-1.amazonaws.com] has left #shogun []23:10
-!- hushell [~hushell@c-50-188-141-210.hsd1.or.comcast.net] has joined #shogun23:18
shogun-notifier-shogun: Soeren Sonnenburg :develop * f037955 / src/shogun/base/ (4 files): https://github.com/shogun-toolbox/shogun/commit/f0379552057d98cbee46e56ac3ebb4a269449e3c23:21
shogun-notifier-shogun: dynamically set reduced granularity23:21
shogun-buildbot_build #2478 of deb1 - libshogun is complete: Success [build successful]  Build details are at http://buildbot.shogun-toolbox.org/builders/deb1%20-%20libshogun/builds/247823:24
thoralf_iglesiasg: So am I right that we won't fix set_label() -- would it make sense to remove it?23:25
shogun-buildbot_build #1960 of bsd1 - libshogun is complete: Failure [failed compile test]  Build details are at http://buildbot.shogun-toolbox.org/builders/bsd1%20-%20libshogun/builds/1960  blamelist: Soeren Sonnenburg <sonne@debian.org>23:27
thoralf_iglesiasg: It can only be used properly if we either initialize the array with NULL or forcing the caller to take care of what has been added so far...23:28
@iglesiasgthoralf_, yes, I think it is best to remove it23:28
@iglesiasgmaybe set_element filling in with nulls makes sense for DynamicObjectArray, but I don't see how it would for StructuredLabels23:29
@iglesiasgthoralf_, let me try to remove it and see if there are many dependencies23:29
thoralf_One second.23:29
@iglesiasgok23:30
thoralf_If we only have add_label() - does it make sense to initialize the StructLabels with a target size?23:30
thoralf_No checking is done; the array grows automatically...23:30
-!- FSCV [~FSCV@201.161.7.110] has quit [Quit: Leaving]23:31
thoralf_No more difference between storage size and number of elements.23:31
@iglesiasgwell, the num_labels is used in the DynamicObjectArray constructor23:32
@wikingnyihaaa23:32
@wikingfuckshitatz23:32
@iglesiasgI guess it reduces the number of resizes23:32
@iglesiasgwiking, everything all right there? :D23:32
@wikingnada23:33
@wikingany solutions? :)23:33
@wikingiglesiasg: been workng on 5 different things today23:33
@wikingfuuuck yeeaaah!!!23:35
@iglesiasgsuperman23:35
@wiking45G     indexing/destination/indexes/default/freebase/data/index/23:35
@wikingso i have 45 gigs which is just fucking inverted index for solr23:36
@wikingniiiize23:36
thoralf_iglesiasg: Right, well.  Mind to add a line of documentation for that?23:36
@wikinganybody has a spare 45 gigs23:36
@wiking? :)23:36
thoralf_iglesiasg: The API doc for the constructor.  num_labels -> reallocate_size23:36
@wikingwith link unlimited BW23:36
thoralf_preallocate23:37
thoralf_wiking: Ehr.  How much traffic do you expect?23:37
sonne|osxthoralf_:  could you please submit this as a test https://github.com/shogun-toolbox/shogun/issues/175823:38
@wikingthoralf_: well i guess 10-20 downloads/day23:38
shogun-notifier-shogun: Soeren Sonnenburg :develop * 6bfb8fc / src/shogun/lib/DynamicObjectArray.h: https://github.com/shogun-toolbox/shogun/commit/6bfb8fc42ba384b8435917629e76d5c780d2245f23:38
shogun-notifier-shogun: potential fix for #175823:38
@iglesiasgthoralf_, the StructuredLabels doc already mention it!23:39
@iglesiasgthoralf_, is it the DynamicObjectArray one which doesn't?23:39
@iglesiasg@param dim1 dimension1 is not very deep indeed...23:39
thoralf_iglesiasg: Oh!23:40
thoralf_iglesiasg: I read this, but I understood it differently.23:40
thoralf_iglesiasg: I thought it's the storage size - wasn't aware that it might grow.23:41
thoralf_wiking: Too much, sorry. ;)23:41
@wikingthoralf_: heheh thought so23:42
thoralf_wiking: Having unlimited traffic at work, but 500G/day... hmm.23:42
@wikingthoralf_: well i guess that would be the harder days23:42
@wikingbut i cannot assure23:42
@wikingthat it's much less23:42
thoralf_What's in this index?23:42
@wikingas it's the index half of freebase.com23:42
@wikingcool stuff man23:42
thoralf_Oh, wow.23:43
@wikingsemantic web is my second favourite thing after shogun23:43
thoralf_Crazy shit.23:43
thoralf_How did I miss that?23:43
@wikingthoralf_: heheh dunno23:45
@wikingit's fucking cool23:45
@wikingapache has some cool tools for semantic web23:45
@wikingonly shame is that it's java23:45
sonne|osxthoralf_:  please check if that fixes the issue preferably by making it a test!23:46
thoralf_sonne|osx: The memory issue?  It's hard to test...23:47
thoralf_A test could be to set ulimit to something low and then let the script create 100000 RealNumbers. ;)23:49
thoralf_==3226==   total heap usage: 44,000,328 allocs, 322 frees, 2,032,034,974 bytes allocated23:50
thoralf_for 4M RealNumber entries23:50
thoralf_That's nice.23:50
thoralf_Only 1016 bytes per float.23:51
@wikingthat's crazy lot23:51
@wikingsizeof(double) = 8 bytes :)23:51
@wikingit's just 127 times more :P23:52
@wikingthat's far from optimal i would say23:52
@wikingthoralf_: do u use swig or directly c++?23:53
thoralf_C++23:53
@wikingthoralf_: u could just throw out the public CSGObject for StructuredData23:53
thoralf_It saved me 80% compared to before.  :)23:54
@wikingthoralf_: still...23:54
@wikingmmm w823:54
thoralf_wiking: Yes, that's what everyone said... I'll try tomorrow. ;)23:54
@iglesiasgwiking, mmm I just rebased I am getting something weird with cmake23:54
thoralf_Losing SG_*REF would be bad.23:54
@wikingthoralf_: but r u DynamicObjectArray23:55
@wikingi mean23:55
@iglesiasgcannot find source file: OBJECT23:55
@wikingiglesiasg: update your cmake23:55
@wiking:P23:55
@wikingthoralf_: so you'll have problem with DynamicObjectArray23:55
@wikingas it's storing DynArray<CSGObject*> m_array;23:55
@iglesiasgwiking, 2.8.7 here and minimum is 2.8.423:55
@iglesiasgwiking, which one should I use?23:55
@wikingso it won't be able to store StructuredData if it's not inherited from SGObject23:55
@wiking:<23:56
@wikingiglesiasg: 2.8.8+23:56
thoralf_wiking: Damn.23:56
@wikingiglesiasg: we are breaking develop branch everywhere23:56
@wikingin any possible way23:56
@wiking;)23:56
thoralf_wiking: Why?  How does it depend on SGObject?23:56
@wikingthat's like sonne|osx and my work for the last couple of days23:56
thoralf_Because of ref/unref or why?23:56
@wikingthoralf_: this is in src/shogun/lib/DynamicObjectArray.h23:56
@wikingprivate: /** underlying array */ DynArray<CSGObject*> m_array;23:56
@wikingso u see23:57
thoralf_Oh.23:57
thoralf_void*? ;)23:57
@wikingihehehehe23:57
@wikinguse a different DynArray and u r fine23:57
@wikingi mean23:57
@wikingdont use DynamicObjectArray23:57
@wikingbut use DynArray23:57
@wikingand u r done23:57
@iglesiasgwiking, arrgh the one in ubuntu repos is 2.8.723:57
@wikingmore or less23:57
@iglesiasgyou killed me for 0.0.123:57
thoralf_wiking: Nice.23:57
@wikingiglesiasg: see the hack in .travis23:57
@iglesiasghaha23:57
@wikingthoralf_: then again we need to do something about this bloat machine23:58
@wiking:)))23:58
thoralf_iglesiasg: wiking is right.  No need to wrap dynarray by dynobjectarray.23:58
@wikingiglesiasg: sudo apt-add-repository -y ppa:kubuntu-ppa/backports23:58
@wikingiglesiasg: and u r good to go23:59
@wikingafter that just23:59
@wikingsudo apt-get update23:59
@wikingsudo apt-get upgrade23:59
@wikingand it'll install u cmake 2.8.923:59
thoralf_wiking: I'll check tomorrow.23:59
@iglesiasgdoing doing23:59
thoralf_Good night!23:59
@iglesiasgwiking, shall we change minimum cmake version then?23:59
@wikingthoralf_: thnx for reporting this major bloat23:59
@wikingiglesiasg: eventually yes :D23:59
@iglesiasgthoralf_, Good night! Thanks for detecting all this mess :)23:59
--- Log closed Thu Nov 21 00:00:12 2013
