IRC logs of #shogun for Thursday, 2019-05-16

--- Log opened Thu May 16 00:00:17 2019
-!- anvan [~androirc@103.252.200.48] has quit [Ping timeout: 268 seconds]05:09
-!- anvan [~androirc@20.203-211-155.idc-office.qala.com.sg] has joined #shogun05:36
-!- anvan [~androirc@20.203-211-155.idc-office.qala.com.sg] has quit [Ping timeout: 250 seconds]05:49
-!- AndroUser2 [~androirc@137.132.214.3] has joined #shogun06:17
-!- AndroUser2 [~androirc@137.132.214.3] has quit [Ping timeout: 255 seconds]06:28
-!- AndroUser2 [~androirc@10.203-211-155.idc-office.qala.com.sg] has joined #shogun06:37
-!- wiking [~wiking@huwico/staff/wiking] has quit [Remote host closed the connection]06:46
-!- AndroUser2 [~androirc@10.203-211-155.idc-office.qala.com.sg] has quit [Ping timeout: 252 seconds]06:50
-!- AndroUser2 [~androirc@58.185.251.86] has joined #shogun06:59
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun07:15
-!- mode/#shogun [+o wiking] by ChanServ07:15
-!- AndroUser2 [~androirc@58.185.251.86] has quit [Ping timeout: 255 seconds]07:15
-!- wiking [~wiking@huwico/staff/wiking] has quit [Ping timeout: 258 seconds]07:20
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun07:37
-!- mode/#shogun [+o wiking] by ChanServ07:37
-!- essam [c5351150@gateway/web/freenode/ip.197.53.17.80] has joined #shogun07:54
-!- anvan [~androirc@103.252.200.48] has joined #shogun08:18
-!- gf712 [c13cdcfd@gateway/web/freenode/ip.193.60.220.253] has joined #shogun09:46
-!- geektoni [c1cdd253@gateway/web/freenode/ip.193.205.210.83] has joined #shogun09:57
-!- HeikoS [~heiko@176.pool85-48-188.static.orange.es] has joined #shogun10:21
-!- mode/#shogun [+o HeikoS] by ChanServ10:21
@HeikoSgf712:  yo10:25
-!- gf712 [c13cdcfd@gateway/web/freenode/ip.193.60.220.253] has quit [Ping timeout: 256 seconds]10:34
-!- wiking [~wiking@huwico/staff/wiking] has quit [Remote host closed the connection]10:34
-!- geektoni [c1cdd253@gateway/web/freenode/ip.193.205.210.83] has quit [Ping timeout: 256 seconds]10:35
-!- gf712 [c13cdcfd@gateway/web/freenode/ip.193.60.220.253] has joined #shogun10:40
gf712HeikoS: hey10:40
@HeikoSgf712: hi!10:40
gf712sorry had a building evacuation at the ati10:40
@HeikoSwhat?10:40
@HeikoSpractice?10:40
gf712fire alarm10:40
@HeikoSah10:40
gf712no, actual thing10:40
gf712but nothing happened10:40
gf712this is when you realise how many ppl go to the British library :D10:41
gf712HeikoS: managed to get a speed up on the parsing btw10:41
@HeikoSyeah I saw10:41
@HeikoSquite nice10:41
@HeikoSgf712: all of euston road blocked?10:42
gf712no, just the courtyard was full of people10:42
gf712but staff can use the side entrance so managed to get back quickly10:42
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun10:44
-!- mode/#shogun [+o wiking] by ChanServ10:44
@HeikoSgf712 wiking you have thoughts on adding domains to variables10:45
@HeikoSlike "positive"10:45
-!- wiking [~wiking@huwico/staff/wiking] has quit [Ping timeout: 248 seconds]10:48
@HeikoSgf712: btw what is the state of the model selection stuff10:51
@HeikoSanything I can help with there?10:52
gf712HeikoS: what do you mean with "positive"?10:52
@HeikoS>=010:52
gf712ah I haven't touched that for a while10:52
@HeikoS>010:52
gf712just getting openml in shogun done10:52
@HeikoSgf712: yeah no worries10:52
@HeikoSjust wondering10:52
gf712to have a nice example in all targets10:52
gf712languages10:52
@HeikoSok cool10:53
gf712mhhh when would you need positive?10:53
@HeikoSmeta example right?10:53
gf712yup10:53
@HeikoSgf712:  like K in KNN10:53
@HeikoSor Kmeans10:53
@HeikoSput("k", -1) // kaboom10:53
gf712oh you mean to enforce that the user gives positive?10:53
gf712ah ok10:54
@HeikoSthere is the option to assert that in the "train" method10:54
@HeikoSwhich is more effective10:54
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun10:54
-!- mode/#shogun [+o wiking] by ChanServ10:54
@HeikoSbut also more indirect10:54
gf712yea, we can add yet another param descriptor10:54
gf712in anyparameter10:54
gf712not sure what the cost is10:54
gf712or do you mean have a separate class for each domain10:55
gf712?10:55
gf712because with the anyparameter I can see having a lambda that does such a check10:55
@HeikoSgf712: yeah that is what I was wondering10:55
gf712and that can be put in anyparameter10:55
@HeikoSif we add more properties10:55
@HeikoSthen we need to check all of them in every put10:55
@HeikoS-> slow10:56
gf712but put is a setter10:56
gf712it doesn't need to be super fast10:56
@HeikoSyep also true10:56
gf712and it's just for interfaces10:56
@HeikoSwell yes and no10:56
@HeikoSalgorithms are supposed to change the internals of models using "put"10:56
gf712ah10:56
gf712but not inside the loop right?10:57
-!- geektoni [c1cdd253@gateway/web/freenode/ip.193.205.210.83] has joined #shogun10:57
@HeikoSiterative methods once per iteration10:57
@HeikoSso that might be once per passing all data10:57
@HeikoSstill negligible I guess10:57
gf712hmm I think a check for an enum is still much cheaper than finding stuff in a map10:58
@HeikoSgf712: point is: more and more properties to be checked in put10:58
@HeikoSok10:58
@HeikoSand what about adding a lambda10:58
gf712I mean its just a switchtable10:58
@HeikoSthat is attached to a variable10:58
@HeikoSlike geektoni suggested10:58
gf712well lambdas are zero cost10:58
@HeikoSSG_ADD("k", &k, "bla", my_positive_lambda)10:59
gf712until they are cast to a function pointer10:59
@HeikoSand then we can have a list of general purpose lambdas where devs can pick from10:59
@HeikoSand if one is provided, then it is executed10:59
geektoniHeikoS: soo discussing put() with constraints? :)10:59
@HeikoSotherwise it isnt10:59
@HeikoSgeektoni: yes :)10:59
@HeikoSgf712: I like the lambda idea more tbh...what do you think?10:59
gf712HeikoS: that is what we did with the auto stuff10:59
@HeikoSgf712: since we can have as many as we want10:59
gf712the lambda I mean10:59
@HeikoSgf712: yes10:59
gf712we added these factories11:00
@HeikoSand then we can also have more complex checks11:00
@HeikoSlike PSD11:00
gf712yes, we can do that11:00
gf712I am not sure what the cost is11:00
@HeikoSand for the constraints, we would only need to have the option to have multiple ones, no need to offer users setting them11:00
@HeikoSso it is a bit easier than the auto stuff, where users should be able to change11:00
gf712from a developer side it is really useful11:01
@HeikoSgf712: yes that is the q. I imagine that all we need is a single check "is_there_a_lambda_attached()", and then execute if there is11:01
gf712from the performance it's hard to tell11:01
gf712yup, that is a single machine operation11:01
gf712it's just a bit shift11:01
@HeikoSwhereas with the properties, we need to check for every single property added11:02
gf712what do you mean?11:02
@HeikoSif we have two properties11:02
@HeikoSsay POSITIVE and PSD11:02
@HeikoSor rather11:02
@HeikoSPOSITIVE  and GREATER_1011:02
@HeikoSthen there need to be two checks11:02
gf712ah ok11:02
gf712that is a bit more difficult11:03
gf712but we can do some hackery to move stuff to compile time11:03
gf712and then the runtime check should be quick11:03
gf712basically we need a container of lambdas11:04
gf712which requires function pointers11:04
gf712but in C++17 you can do some nice iterations over arrays11:04
gf712with lambdas11:04
@HeikoSyes that is what I thought11:04
gf712and then it's not cast to a pointer11:04
gf712and then should be fast at runtime11:04
@HeikoScool, I think that'd be ace11:05
gf712the only thing is that I don't know how we can do that in c++1411:05
@HeikoScan we do it in a way that it is runtime for now, and becomes compile time once we compile with c++17?11:06
gf712yes, we can just cast it to function pointers11:06
gf712put it in a vector11:06
gf712but I think the lambda will have to be stateless11:06
gf712I'll have to check how to do it11:06
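The runtime variant discussed above (stateless lambdas cast to function pointers and kept in a vector that put walks through) could look roughly like the sketch below. `CheckedParameter`, `is_positive` and `is_odd` are hypothetical names made up for illustration; this is not Shogun's `AnyParameter` or `SG_ADD`, only a minimal sketch under the assumption that a parameter owns its list of checks.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a registered parameter (not Shogun's AnyParameter).
template <typename T>
class CheckedParameter
{
public:
    // Captureless lambdas convert implicitly to this plain function pointer.
    using Check = bool (*)(const T&);

    CheckedParameter(std::string name, T value, std::vector<Check> checks = {})
        : m_name(std::move(name)), m_value(std::move(value)),
          m_checks(std::move(checks))
    {
    }

    // Run every attached check before storing; with no checks attached the
    // loop body never executes.
    void put(const T& candidate)
    {
        for (Check check : m_checks)
        {
            if (!check(candidate))
                throw std::invalid_argument(
                    "constraint violated for parameter '" + m_name + "'");
        }
        m_value = candidate;
    }

    const T& get() const
    {
        return m_value;
    }

private:
    std::string m_name;
    T m_value;
    std::vector<Check> m_checks;
};

// General-purpose checks that developers could pick from (illustrative only).
const auto is_positive = [](const int& v) { return v > 0; };
const auto is_odd = [](const int& v) { return v % 2 != 0; };
```

With this, `CheckedParameter<int> k("k", 3, {is_positive});` followed by `k.put(-1)` throws instead of silently storing the value. A check such as GREATER_X needs a captured bound, and a capturing lambda no longer converts to a function pointer, which is the statelessness limitation mentioned above.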
gf712this is what I did for that in c++17 https://github.com/gf712/ProStruct/blob/master/src/prostruct/utils/tuple_utils.h11:07
-!- geektoni_ [c1cdd253@gateway/web/freenode/ip.193.205.210.83] has joined #shogun11:07
gf712that is all compile time11:07
gf712and then execute kernels at runtime11:08
-!- geektoni [c1cdd253@gateway/web/freenode/ip.193.205.210.83] has quit [Ping timeout: 256 seconds]11:08
@HeikoSgf712: stateless should be fine or?11:09
geektoni_HeikoS: LDA can go in the feature branch11:10
gf712HeikoS: well need to capture a value if we have something like GREATER_X11:11
@HeikoSgeektoni_: why not develop?11:11
@HeikoSgf712: ah sorry yes11:11
lisitsynplease no enums :P11:12
@HeikoSgf712: but the param would be read-only11:12
@HeikoSconst ref11:12
@HeikoSlisitsyn: hello!11:12
lisitsynhey11:12
lisitsynyou don't want enums11:12
@HeikoSlisitsyn: discussing to attach a lambda to parameters11:12
@HeikoSto check stuff11:12
lisitsynyes or11:12
lisitsyninterface Constraint11:12
lisitsynadd().positive().lessThan(10)11:13
geektoni_HeikoS: because it uses the observable stuff which is only in the feature branch11:13
lisitsynBuilder positive() { add(PositiveConstraint()); }11:13
@HeikoSlisitsyn: ok and that would be checked at runtime when calling ::put11:13
lisitsynenums will be PITA because you need parameters sometimes11:13
lisitsynyes11:13
lisitsynshould go into AnyParameter11:13
@HeikoSlisitsyn: yeah no enums dont worry11:13
lisitsynjust a list of requirements11:14
@HeikoSlisitsyn: the lambda thing gf712 suggested would be compile time11:14
lisitsyncompile time? que how11:14
lisitsynain't possible in python, no?11:14
@HeikoStrue11:14
gf712as in the lambda would be added at compile time to a tuple11:14
@HeikoSsorry11:14
@HeikoSwhat I mean is more11:15
@HeikoS ^11:15
gf712the requirements would be known at compile time11:15
gf712so for the K in KNN we know at compile time it has to be positive11:15
gf712no need to add a constraint like that at runtime11:15
gf712let the compiler optimise that call11:15
lisitsynohh you had quite a lot messages above11:15
lisitsyncan you outline how?11:15
gf712yes, but only in C++1711:16
@HeikoSgeektoni_: but it doesnt use observable stuff11:16
gf712basically have a tuple11:16
@HeikoSgeektoni_: only put11:16
lisitsynvalue + constraint?11:16
gf712you can then tell the compiler that the tuple has to be executed every time a value is changed11:17
lisitsynit might be a good idea to make them composable and lambdas are not composable11:17
gf712something like (apply(std::get<Idx>(lambda_tuple), lambda_args, result(Idx)), ...)11:17
gf712but why composable? they are independent operations no?11:17
geektoni_you're right indeed11:17
lisitsyne.g. lessThan(10).greaterThan(2)11:17
geektoni_HeikoS: ah! I see what you mean11:17
gf712but you still have two independent operations no?11:18
lisitsynI am not sure if checking constraints needs to be really fast11:18
lisitsynah so is it like you tuple a tuple?11:18
gf712ah no, probably not11:18
gf712mhh not sure what you mean. this just calls the lambdas11:19
gf712without casting them to pointers11:19
gf712so the compiler would inline it properly11:20
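The idea of calling every lambda stored in a tuple, without converting any of them to a function pointer, can also be sketched without the C++17 fold expression by expanding over `std::index_sequence`, which is available in C++14. The names `all_pass`, `all_pass_impl` and `greater_than` are hypothetical and chosen only for this sketch; because the lambdas stay in the tuple with their own closure types, they may also capture state.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Call every check in the tuple on `value`; true only if all of them pass.
template <typename T, typename Tuple, std::size_t... Idx>
bool all_pass_impl(
    const T& value, const Tuple& checks, std::index_sequence<Idx...>)
{
    bool ok = true;
    // Braced-init-list expansion: one entry per check, evaluated left to
    // right. The leading `true` keeps the array non-empty for empty tuples.
    const bool results[] = {
        true, (ok = std::get<Idx>(checks)(value) && ok)...};
    (void)results;
    return ok;
}

template <typename T, typename... Checks>
bool all_pass(const T& value, const std::tuple<Checks...>& checks)
{
    return all_pass_impl(
        value, checks, std::index_sequence_for<Checks...>{});
}

// Hypothetical factory for a GREATER_X style check; the capture makes the
// lambda stateful, which is fine here since no function pointer is involved.
inline auto greater_than(int bound)
{
    return [bound](int v) { return v > bound; };
}
```

For example, `all_pass(11, std::make_tuple(greater_than(0), greater_than(10)))` calls both lambdas directly, so the compiler is free to inline them.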
lisitsynI have very vague understanding atm11:20
lisitsynwhere exactly this apply thing happens?11:20
gf712well that would be inside the call that determines if there is something to apply, so in anyparameter?11:22
gf712in any case, this is c++17 stuff so might not be worth thinking about it for a while11:22
gf712let's just do some composition11:22
gf712it should be some light structs right?11:23
lisitsynI think so11:25
lisitsynwith something like virtual check();11:25
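A minimal sketch of the composable variant suggested here: light constraint structs with a virtual `check()`, collected by a builder so that calls can be chained as in `add().positive().lessThan(10)`. `Constraint`, `PositiveConstraint` and the method names follow the chat; `LessThanConstraint`, `ConstraintBuilder` and the choice of `double` as the value type are assumptions made for the example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Interface for one requirement on a parameter value.
struct Constraint
{
    virtual ~Constraint() = default;
    virtual bool check(double value) const = 0;
};

struct PositiveConstraint : Constraint
{
    bool check(double value) const override
    {
        return value > 0;
    }
};

// Unlike an enum entry, a constraint object can carry its own parameter.
struct LessThanConstraint : Constraint
{
    explicit LessThanConstraint(double bound) : m_bound(bound)
    {
    }
    bool check(double value) const override
    {
        return value < m_bound;
    }
    double m_bound;
};

// Collects constraints; each method returns *this so calls can be chained.
class ConstraintBuilder
{
public:
    ConstraintBuilder& positive()
    {
        m_constraints.push_back(std::make_shared<PositiveConstraint>());
        return *this;
    }

    ConstraintBuilder& lessThan(double bound)
    {
        m_constraints.push_back(std::make_shared<LessThanConstraint>(bound));
        return *this;
    }

    // A value is accepted only if every collected constraint accepts it.
    bool check(double value) const
    {
        for (const auto& constraint : m_constraints)
        {
            if (!constraint->check(value))
                return false;
        }
        return true;
    }

private:
    std::vector<std::shared_ptr<const Constraint>> m_constraints;
};
```

Usage would be along the lines of `ConstraintBuilder c; c.positive().lessThan(10);` with `c.check(value)` called from put.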
@HeikoSlisitsyn: I have another q for you11:33
@HeikoSlisitsyn: it is about labels11:33
lisitsynaha11:33
@HeikoSlisitsyn: you have a min?11:33
lisitsynyes!11:33
@HeikoSok11:33
@HeikoSso the current/old way of dealing with labels is11:34
@HeikoSwe have binary (-1+1), multiclass (0,1,2,...), regression, etc11:34
@HeikoSand then each algo enforces that the labels are exactly this type11:34
-!- gf712 [c13cdcfd@gateway/web/freenode/ip.193.60.220.253] has quit [Ping timeout: 256 seconds]11:34
@HeikoSwhich can be annoying say if someone has a binary labels instance and wants to run knn, they see an error11:35
@HeikoSfurthermore, multiclass labels need to be contiguous11:35
@HeikoShave to be integers11:35
lisitsynyeah thats stupid :)11:35
@HeikoSand also one cannot pass multiclasslabels(0,1,1,1,0) to a binary machine11:35
lisitsynespecially contiguous thing11:35
@HeikoSyep11:35
@HeikoSso ideally we would want11:36
@HeikoSdiscreteLabels11:36
@HeikoSwhich can be anything discrete, represented as say an int11:36
@HeikoSand then this replaces both binary and multiclass11:36
@HeikoSand the check rather is that it contains only two elements11:36
@HeikoSfor binary11:36
lisitsynI am not sure about the name11:36
@HeikoSwell doesnt matter11:36
@HeikoSClassificationLabels11:36
lisitsynhmmm11:37
lisitsynok nevermind11:37
@HeikoSnow the issues start11:37
@HeikoSSVM algorithm11:37
@HeikoSits math formulation needs +1, -111:37
@HeikoSinternally11:37
@HeikoSand naturally, internal apply returns +1, -111:37
lisitsynoh that should never be visible to the user11:37
@HeikoS(sign of w*x)11:37
@HeikoSyes11:37
@HeikoSso we need a conversion11:38
@HeikoSto the user facing representation11:38
@HeikoSin both ways11:38
@HeikoSso the question now is: where to do that11:38
@HeikoSand the problem is: we have these meta-learning algorithms (xvalidation, parameter tuning, multiclass machine)11:38
@HeikoSif we did it in the ::train call, it would be done multiple times11:39
@HeikoSyou see the problem?11:39
lisitsynhmm11:39
lisitsynbut the conversion is really fast11:39
lisitsynso it might be that we do not call get() but get_svm_compatible()11:40
@HeikoSwe have something in place11:41
@HeikoSin some cases11:41
@HeikoS"binary_labels(m_labels)"11:41
@HeikoSthat could change11:41
@HeikoSto do the conversion11:41
@HeikoSand then apply converts back11:41
@HeikoSok now more problems11:41
@HeikoSxvalidation11:42
@HeikoSsay a fold doesnt contain one label instance11:42
@HeikoSi.e. it is missing class "2" of (0,1,2,3,4)11:42
@HeikoScan happen right?11:42
@HeikoSso now the mapping most likely changes11:42
@HeikoSlisitsyn: or you have an idea how to avoid that?11:42
lisitsynI don't get it yet11:43
lisitsynwhy a missing class is a problem?11:43
@HeikoSso when you compute a mapping11:43
@HeikoSsay you have11:43
@HeikoSlabels(0,0,1,1,2,2)11:43
@HeikoSor rather say11:44
@HeikoSlabels(A,A,B,B,C,C)11:44
@HeikoSand pass that to a multiclass machine that needs contiguous11:44
@HeikoSso then we map11:44
@HeikoSA->011:44
@HeikoSB->111:44
@HeikoSC->211:44
@HeikoSand run stuff internally11:44
@HeikoSbut now some fold in xvalidation misses the B in the labels11:44
@HeikoSthen the mapping might become11:45
@HeikoSA->011:45
@HeikoSC->111:45
lisitsynahh ok11:45
lisitsynyes then mapping should not happen after the split11:45
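The fold problem above can be reproduced with a small sketch. `fit_mapping` and `encode` are hypothetical helpers (not Shogun API) that assign contiguous indices to the sorted distinct labels; fitting on the full label set gives C->2, while fitting on a fold that lacks B gives C->1.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using LabelMap = std::map<std::string, int>;

// Assign contiguous indices 0..n-1 to the distinct labels, in sorted order.
LabelMap fit_mapping(const std::vector<std::string>& labels)
{
    LabelMap mapping;
    for (const auto& label : labels)
        mapping.emplace(label, 0);
    int next = 0;
    for (auto& entry : mapping)
        entry.second = next++;
    return mapping;
}

// Translate user-facing labels into the internal contiguous space.
// Throws std::out_of_range for a label the mapping has never seen.
std::vector<int> encode(
    const std::vector<std::string>& labels, const LabelMap& mapping)
{
    std::vector<int> encoded;
    encoded.reserve(labels.size());
    for (const auto& label : labels)
        encoded.push_back(mapping.at(label));
    return encoded;
}
```

Encoding the fold `(A,A,C,C)` with the mapping fitted on `(A,A,B,B,C,C)` yields `(0,0,2,2)`, whereas refitting on the fold alone would yield `(0,0,1,1)`.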
lisitsynuhmm lets think how to ensure that11:45
@HeikoScould argue that xvalidation stuff is internal, but observers and stuff ...11:45
lisitsynI think in sklearn you do fit_transform11:45
@HeikoSlisitsyn: yes exactly11:45
lisitsynso we should do maybe11:45
@HeikoSlisitsyn: it should always happen at the "highest" level11:45
lisitsynah but in sklearn it is a property of model iirc11:46
lisitsynlike you pass a b c and the mapping is stored in the classifier11:46
@HeikoSthat could work11:46
lisitsynbut I don't like the approach11:47
lisitsynI think it should be in labels11:47
@HeikoSthe problem is that we want labels to be const11:47
lisitsynit sounds more reasonable11:47
@HeikoSthread safety etc11:47
@HeikoSso I mean CLabels could store a mapping, and then xvalidation invokes computation of the mapping, and then ::train just reads that11:48
@HeikoSi.e. lazy generation of the mapping11:48
@HeikoSbut again we dont want to modify labels in training, so maybe the model is a better place11:49
@HeikoSbut then what if the labels change ......11:49
@HeikoSand basically, that is where I was discussing this with Gil last time11:49
@HeikoSsuggestions? :D11:49
@HeikoSlisitsyn: disappeared? :)11:52
lisitsynsorry11:53
lisitsynback11:53
lisitsyn1 min :)11:53
lisitsynHeikoS: well immutable is solved with copies11:55
lisitsynI guess we can re-use the same original labels11:56
lisitsynand have a light-weight object that has a mapping11:56
@HeikoScan you pseudo code a bit?11:56
@HeikoSyou are essentially saying that inside ::train(..., const CLabels* labels), we first check whether labels is already in the right space, and if not we create a new instance with the mapped values?11:58
lisitsynso say12:03
lisitsynyou have Labels original_labels12:03
lisitsynah yeah12:03
lisitsynonce we train we create mapped_labels yes12:03
lisitsynHeikoS: something like that maybe12:03
@HeikoSOk and then xvalidation does this12:04
@HeikoSand then splits the converted labels12:04
@HeikoSand passes those on12:04
@HeikoSand every train call also does this, but it is a nop if the labels are already mapped12:05
-!- gf712 [c13cdcfd@gateway/web/freenode/ip.193.60.220.253] has joined #shogun12:06
-!- HeikoS1 [~heiko@221.pool85-48-188.static.orange.es] has joined #shogun12:09
HeikoS1lisitsyn: sorry got disconnected12:09
-!- HeikoS [~heiko@176.pool85-48-188.static.orange.es] has quit [Ping timeout: 258 seconds]12:11
lisitsynHeikoS1: yeah it seems that it could be the easiest way12:24
lisitsynit is important to re-use the memory though12:24
lisitsynbut it seems to be easy12:24
HeikoS1the copy would only store the mapping12:24
HeikoS1and the original vector12:25
HeikoS1although12:25
HeikoS1not sure12:25
HeikoS1as many algos are based on vectorized label access12:25
HeikoS1so there is at least twice the memory12:25
HeikoS1when converted12:25
lisitsynHeikoS1: it should not be a critical issue I think12:29
lisitsynlabels are just like one feature anyway12:29
lisitsynHeikoS1: ok then I guess we can do it lazy with explicit compute12:29
lisitsynso if a method uses no vectorized access we use get() that maps labels12:30
lisitsynonce it gets labels as vectors they are mapped into the new vector12:30
lisitsynI don't remember: what methods do use the vectorized access?12:31
HeikoS1quite a few12:31
HeikoS1kernel stuff mostly12:32
HeikoS1(not SVM)12:32
HeikoS1KRR12:32
-!- geektoni_ [c1cdd253@gateway/web/freenode/ip.193.205.210.83] has quit [Quit: Page closed]12:33
lisitsynHeikoS1: ah ok then yes12:43
lisitsynget_vector() maps12:43
lisitsynand get(i) does not12:43
lisitsynsounds valid :)12:43
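One possible reading of this split, as a sketch: a light-weight object shares the original (immutable) labels and holds the mapping; `get(i)` translates a single element on the fly without allocating, while `get_vector()` materializes the mapped copy that vectorized algorithms need. `MappedLabelsView` and its members are hypothetical names for illustration, not existing Shogun classes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class MappedLabelsView
{
public:
    MappedLabelsView(
        std::shared_ptr<const std::vector<std::string>> original,
        std::map<std::string, int> mapping)
        : m_original(std::move(original)), m_mapping(std::move(mapping))
    {
    }

    std::size_t size() const
    {
        return m_original->size();
    }

    // Element access maps on the fly: no second label vector is allocated.
    int get(std::size_t i) const
    {
        return m_mapping.at(m_original->at(i));
    }

    // Vectorized access builds the mapped copy; this is where the extra
    // memory for the converted labels is spent.
    std::vector<int> get_vector() const
    {
        std::vector<int> mapped;
        mapped.reserve(size());
        for (const auto& label : *m_original)
            mapped.push_back(m_mapping.at(label));
        return mapped;
    }

private:
    // Shared, never modified: the original labels stay const.
    std::shared_ptr<const std::vector<std::string>> m_original;
    std::map<std::string, int> m_mapping;
};
```

Since the view only reads the shared original, the user's labels are never modified during training.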
-!- HeikoS [~heiko@239.pool85-48-188.static.orange.es] has joined #shogun12:53
-!- mode/#shogun [+o HeikoS] by ChanServ12:53
@HeikoSlisitsyn: here is another issue12:53
@HeikoSso say I have trained my svm12:53
@HeikoSI received user-labels (A,A,B,C)12:54
lisitsynaha12:54
-!- HeikoS1 [~heiko@221.pool85-48-188.static.orange.es] has quit [Ping timeout: 252 seconds]12:54
-!- wiking [~wiking@huwico/staff/wiking] has quit [Remote host closed the connection]12:54
@HeikoSI convert internally to CBinaryLabels12:54
@HeikoSI train12:54
@HeikoSand then the user wants to apply12:54
@HeikoSmy internal thing returns +1, +1, -112:54
@HeikoShow do I map back?12:54
@HeikoSI didn't store the labels12:54
lisitsynwe can basically store a bimap12:55
lisitsynso once we map we reverse it12:55
lisitsynthat might solve the problem12:55
@HeikoSyeah sure12:56
@HeikoSbut SVM doesnt store labels12:56
@HeikoSor say another model12:56
@HeikoSstoring labels you mean12:56
lisitsynahhh that's actually a point why mapping might correspond to a machine12:57
@HeikoSI guess so12:57
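A sketch of a mapping that belongs to the machine rather than to the labels, for the binary case: it is built once from the training labels, checks that exactly two distinct values are present, and can translate in both directions, so apply can return user-facing labels without the machine storing the labels themselves. `BinaryLabelMapping` is a hypothetical name, and the convention that the smaller label becomes -1 is an arbitrary assumption of this example.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

class BinaryLabelMapping
{
public:
    explicit BinaryLabelMapping(const std::vector<std::string>& user_labels)
    {
        const std::set<std::string> distinct(
            user_labels.begin(), user_labels.end());
        // The binary check: exactly two distinct values, whatever they are.
        if (distinct.size() != 2)
            throw std::invalid_argument(
                "binary machine needs exactly two distinct labels");
        m_negative = *distinct.begin();
        m_positive = *distinct.rbegin();
    }

    // User space -> internal {-1, +1} space, used before training.
    int to_internal(const std::string& label) const
    {
        if (label == m_positive)
            return +1;
        if (label == m_negative)
            return -1;
        throw std::out_of_range("label not seen during training: " + label);
    }

    // Internal score (e.g. w*x) -> user space via its sign, used in apply.
    const std::string& to_user(double score) const
    {
        return score >= 0 ? m_positive : m_negative;
    }

private:
    std::string m_negative;
    std::string m_positive;
};
```

A machine trained on `(A,A,B)` would keep such an object as a member and map an internal output of `+1, +1, -1` back to `B, B, A`.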
gf712HeikoS: for some reason mkl tests are failing?12:59
gf712as intel mkl build12:59
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun13:00
-!- mode/#shogun [+o wiking] by ChanServ13:00
@HeikoSgf712: since when?13:00
gf712since the kernel merge I think13:04
gf712where the swig stuff was taken out13:04
gf712not sure how..13:04
gf712HeikoS: actually before that13:05
gf712must be some update with mkl?13:05
gf712I don't think it was caused by anything merged to shogun develop13:05
@HeikoSI guess some update then13:10
@HeikoSCI was green on all merged PRs iirc13:10
gf712yea it failed for the first time in the arff pr13:11
gf712so must be caused by an external lib13:12
gf712but mkl hasn't been updated for 2 months https://anaconda.org/intel/mkl-devel13:12
gf712and eigen is from a specific release right?13:12
@HeikoSokok13:15
@HeikoSlisitsyn: http://collabedit.com/4qfwu13:35
@HeikoSgf712: yes eigen is by git hash13:35
lisitsynHeikoS: reading13:36
lisitsynHeikoS: well looks valid so far13:40
@HeikoSlisitsyn: cool13:44
@HeikoSI'll let it sink in a bit13:45
@HeikoSand then discuss again13:45
@HeikoSgf712: we also discussed labels, see above for an idea how to move between internal/user-facing space13:45
gf712HeikoS: having a look13:51
-!- essam [c5351150@gateway/web/freenode/ip.197.53.17.80] has quit [Quit: Page closed]13:52
@HeikoSgf712: cool, I'll have lunch now, might be back later13:53
-!- HeikoS [~heiko@239.pool85-48-188.static.orange.es] has quit [Quit: Leaving.]13:55
-!- wiking [~wiking@huwico/staff/wiking] has quit [Remote host closed the connection]14:00
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun14:07
-!- mode/#shogun [+o wiking] by ChanServ14:07
-!- wiking [~wiking@huwico/staff/wiking] has quit [Read error: Connection reset by peer]14:09
-!- wiking_ [~wiking@huwico/staff/wiking] has joined #shogun14:09
-!- mode/#shogun [+o wiking_] by ChanServ14:09
-!- wiking_ [~wiking@huwico/staff/wiking] has quit [Ping timeout: 252 seconds]14:13
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun14:21
-!- mode/#shogun [+o wiking] by ChanServ14:21
gf712wiking: ping14:48
@wikinghola14:48
gf712hey, do you use instruments much?14:48
gf712I.e. time profiler14:48
@wikingyep14:49
@wikingon the other hand define 'much'14:49
@wikinguse it when i can14:49
@wiking:)14:49
gf712ok!14:50
gf712do you ever look at the highlighted code that is used heavily?14:50
gf712because it doesn't make a lot of sense to me14:50
@wikingah14:51
@wikingthat is tricky14:51
gf712I am seeing calls to function specialisations that weren't even used...14:51
@wikingi always just use the inverse tree shit14:51
@wikingand dont look at the code highlighting14:51
gf712ah ok ok14:51
gf712I didn't see there was this thing14:51
gf712much clearer now14:51
-!- HeikoS [~heiko@73.red-83-46-178.dynamicip.rima-tde.net] has joined #shogun15:00
-!- mode/#shogun [+o HeikoS] by ChanServ15:00
@HeikoSlisitsyn, gf712 actually I realised one thing ... it is actually OK if the mappings are different during xvalidation, as the only thing that matters is the result of the "apply" function15:01
@HeikoSso this thing of precomputing the mapping in that case is only to save some cpu cycles15:01
-!- wiking [~wiking@huwico/staff/wiking] has quit [Remote host closed the connection]15:03
gf712HeikoS: i am not sure I follow15:06
gf712isn't the issue still that a label in test might not have been seen in train?15:06
gf712in one of the folds15:06
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun15:10
-!- mode/#shogun [+o wiking] by ChanServ15:10
-!- geektoni [973e0080@gateway/web/freenode/ip.151.62.0.128] has joined #shogun15:19
@HeikoSgf712: yes, I guess it doesnt matter if the mapping changes in between folds15:25
@HeikoSas the predictions will be mapped back into the original label space in ::apply15:26
gf712ah ok ok, I think I see what you mean15:29
gf712HeikoS: btw what are the types of strings that are supported in shogun then?15:30
@HeikoSover all ptypes15:30
gf712I added the templating now but I just have char15:30
gf712ah ok15:30
@HeikoSbut in practice15:30
@HeikoSchar15:30
gf712and then the EAlphabet?15:30
@HeikoSuint16_t15:30
@HeikoSyeah15:30
@HeikoScan add them one by one15:30
gf712isn't the largest type char?15:30
@HeikoSjust need to make sure that is easy15:30
@HeikoSI added some uint16_t string stuff recently15:31
@HeikoScheck out string.sg15:31
@HeikoSmeta example15:31
gf712ok! thanks :)15:31
geektoniping HeikoS15:37
@HeikoSgeektoni: pong15:37
geektoniHeikoS: when you say that we do not have testing for KMeans, you mean that also the meta example is not useful to check consistency between the two KMeans versions? :/15:38
@HeikoSah no15:38
@HeikoSI meant testing the observable stuff15:38
@HeikoSlike ... does it work what you add there :)15:38
geektoniahh I see I see15:38
@HeikoSno need to have a test15:39
@HeikoSjust check it and put some evidence15:39
geektonisure sure, that's easy then ;)15:39
@HeikoSshould be15:40
geektoniHeikoS: ah btw, what's going on with MKL stuff? :/15:42
@HeikoSwhat do you mean?15:43
geektoniHeikoS: like on the CI, MacOS MKL happens to fail on many tests which are fine on the other environments15:47
geektoniI saw that you and gf712 were discussing about it earlier :)15:48
@HeikoSah ok15:50
@HeikoSyeah idk tbh15:50
@HeikoSstopped working at some point15:50
geektoniHeikoS: kk, I'll merge LDA then since it is the cause of those errors15:52
@HeikoSwhy "since it is the cause of those errors" ?15:53
geektoniahh no15:54
geektonisince it is *not* the cause15:54
geektonican't write today15:54
@HeikoShehe15:55
@HeikoSbut we want the refactors you are doing into develop or?15:55
@HeikoSsince that will compile15:55
geektonias you said before, if I'm using just put() they can go into develop. If I need to use also observe(), then they need to go in the feature branch15:58
@HeikoSah yes the observe15:59
@HeikoSthere was an observe in LDA15:59
@HeikoSbut I didnt get why that was needed15:59
geektoniHeikoS: mmh there is no observe in LDA, you mean KMeans?16:01
@HeikoSah yes16:01
@HeikoSsorry16:01
geektoniHeikoS: I use observe() sometimes since there may be methods which act directly on the registered variable (like mus for KMeans)16:03
geektonitherefore16:03
geektonithere is no need to "put" them again16:03
@HeikoSand so you avoid the put call16:03
geektoniyep16:03
@HeikoSwhich basically would be wasted cpi16:03
@HeikoScpu16:03
@HeikoSto copy16:03
geektoniwell the copy is still done by the observe() method16:04
geektonibut I do not want to have all the (possible) put overhead16:04
@HeikoSboth do an any_cast or?16:04
geektoniobserve() does not16:04
@HeikoSah ok then16:04
@HeikoSdoes observe also copy data if there is no observer registered?16:04
geektonimmh I guess it still does16:05
@HeikoScan that be avoided :D16:05
geektonisince there is no explicit check16:05
geektoniye ye16:05
@HeikoSmmmmh16:05
@HeikoSso just thinking16:05
@HeikoSI mean basically all algorithms directly modify member variables right?16:05
@HeikoSand the put can always be avoided16:05
@HeikoSinside an algorithm16:05
@HeikoS?16:05
geektoniyep, most of them so far16:07
@HeikoSit is a bit weird16:07
@HeikoSlisitsyn: you still here?16:08
-!- wiking [~wiking@huwico/staff/wiking] has quit [Remote host closed the connection]16:08
@HeikoSgeektoni: you see, we wanted to avoid adding all these observe calls for model parameters16:08
@HeikoSvia putting "observe" into put16:09
@HeikoSbut now we don't use it16:09
@HeikoSthen there wouldn't have been the need to add it (which simplifies things quite a bit, as this makes parameter framework and observable framework interdependent)16:09
@HeikoSbut also.... in general we want to observe all model parameters *by default* in every iteration16:10
@HeikoSwithout us changing the algorithms16:10
geektoniThe problem is that we would need to refactor those algorithms to use put() instead of access directly the member variables16:10
geektoniwe would need to touch them anyway :)16:11
@HeikoSyes but just once16:11
@HeikoSadding all those observe calls makes it much more likely that we need to touch everything again16:11
@HeikoSI wonder now whether we shouldnt just remove the observe call inside ::put16:12
@HeikoSand then instead put observe calls inside CIterativeMachine16:12
@HeikoSeither explicit16:12
@HeikoSor implicit with a filter on parameter properties16:12
@HeikoSbut we basically dont need the observe call inside put if we are adding those things anyways16:13
@HeikoSyou get my point?16:13
geektonimmh kind of16:13
geektonithe problem with the iterative machine approach is that we do not know exactly which variables it will have16:14
geektonisince it is a mixin16:14
@HeikoSwe know16:14
@HeikoSParameterProperties::MODEL16:15
geektoniahh I see what you mean16:15
geektonilike16:15
geektonievery iteration, we emit every registered variable, without actually caring if it was modified or not16:16
@HeikoSyes16:17
@HeikoSthis way we can avoid this explicit observe stuff for the registered model parameters at least16:18
@HeikoSand we dont need observe inside put16:18
geektoniokay soo16:18
@HeikoSmmh16:18
@HeikoSbut wait16:18
@HeikoSon the other hand16:18
@HeikoSemitting things that did not change is not good either16:18
@HeikoSso that is where the old design was nice in the sense that only if put is called, something is emitted16:19
@HeikoSbut now some algos dont call put but modify their members directly16:19
@HeikoSso now you do the "observe" call16:19
@HeikoSto avoid put overhead16:20
@HeikoSokok16:20
geektoniI guess we need to find the best tradeoff16:20
@HeikoSyeah seems like ups and downs16:21
@HeikoSmaybe leave it as it is for now16:21
@HeikoSand we see whether there are more problems16:21
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun16:21
-!- mode/#shogun [+o wiking] by ChanServ16:21
geektoniI mean, we would need to use at least two approaches to observe things16:21
geektoniHeikoS: ah btw, https://gist.github.com/geektoni/487fa1c3eac5fbd31c70c9dc54d67fb116:22
@HeikoSi think maybe use "put" instead of observe for the changed parameters at least16:22
@HeikoSbecause that makes it clear16:22
@HeikoSput means that a parameter was changed16:22
geektonikmeans works. at the end of the gist there is the output16:22
@HeikoSotherwise it gets convoluted16:22
@HeikoScool that it works :)16:22
@HeikoSbecause we might use the fact that put means parameter has been modified later on16:23
geektoniI see16:23
@HeikoSgeektoni: one thing: it would be good if there was no copy being performed if no observers are subscribed16:23
@HeikoSyou agree?16:23
geektoniI agree16:24
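Avoiding the copy when nobody is subscribed could be sketched as follows: the caller hands over a callable that produces the snapshot, and it is only invoked if at least one observer is registered. `Emitter`, `subscribe` and the callable-based `observe` signature are assumptions for this illustration and do not reflect the actual observable code in Shogun.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class Emitter
{
public:
    using Observer = std::function<void(const std::vector<double>&)>;

    void subscribe(Observer observer)
    {
        m_observers.push_back(std::move(observer));
    }

    // `make_snapshot` performs the (possibly expensive) copy. It is only
    // called when someone is listening, so an unobserved run pays nothing
    // beyond the emptiness check.
    template <typename MakeSnapshot>
    void observe(MakeSnapshot&& make_snapshot) const
    {
        if (m_observers.empty())
            return;
        const std::vector<double> snapshot = make_snapshot();
        for (const auto& observer : m_observers)
            observer(snapshot);
    }

private:
    std::vector<Observer> m_observers;
};
```

An algorithm that updates a member in place, such as the cluster centres in KMeans, would call `observe([&] { return mus; })` once per iteration, and the copy of `mus` happens only if an observer has subscribed.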
@HeikoSok cool16:24
@HeikoSso let's do that16:24
@HeikoSand also make observe->put16:24
@HeikoSfor model parameters16:24
geektonikk, even if there are changes in-place?16:25
@HeikoScould you benchmark it?16:25
@HeikoSfor say kmeans on a reasonably sized problem?16:25
geektonisure, let me do some testing16:26
-!- wiking [~wiking@huwico/staff/wiking] has quit [Read error: Connection reset by peer]17:00
-!- wiking_ [~wiking@huwico/staff/wiking] has joined #shogun17:00
-!- mode/#shogun [+o wiking_] by ChanServ17:00
-!- wiking_ [~wiking@huwico/staff/wiking] has quit [Remote host closed the connection]17:05
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun17:07
-!- mode/#shogun [+o wiking] by ChanServ17:07
-!- geektoni [973e0080@gateway/web/freenode/ip.151.62.0.128] has quit [Quit: Page closed]17:09
-!- wiking [~wiking@huwico/staff/wiking] has quit [Ping timeout: 250 seconds]17:11
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun17:29
-!- mode/#shogun [+o wiking] by ChanServ17:29
-!- wiking [~wiking@huwico/staff/wiking] has quit [Ping timeout: 264 seconds]17:34
-!- HeikoS [~heiko@73.red-83-46-178.dynamicip.rima-tde.net] has quit [Ping timeout: 246 seconds]17:43
-!- geektoni [973e524e@gateway/web/freenode/ip.151.62.82.78] has joined #shogun17:49
-!- HeikoS [~heiko@73.red-83-46-178.dynamicip.rima-tde.net] has joined #shogun17:51
-!- mode/#shogun [+o HeikoS] by ChanServ17:51
-!- HeikoS [~heiko@73.red-83-46-178.dynamicip.rima-tde.net] has quit [Ping timeout: 255 seconds]17:59
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun18:32
-!- mode/#shogun [+o wiking] by ChanServ18:32
-!- wiking [~wiking@huwico/staff/wiking] has quit [Ping timeout: 264 seconds]18:36
-!- gf712 [c13cdcfd@gateway/web/freenode/ip.193.60.220.253] has quit [Ping timeout: 256 seconds]18:44
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun18:46
-!- mode/#shogun [+o wiking] by ChanServ18:46
-!- wiking [~wiking@huwico/staff/wiking] has quit [Ping timeout: 252 seconds]18:51
-!- lambday [a7dcee98@gateway/web/freenode/ip.167.220.238.152] has quit [Ping timeout: 256 seconds]19:22
-!- HeikoS [~heiko@73.red-83-46-178.dynamicip.rima-tde.net] has joined #shogun19:26
-!- mode/#shogun [+o HeikoS] by ChanServ19:26
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun19:31
-!- mode/#shogun [+o wiking] by ChanServ19:31
@HeikoSgeektoni: hi19:34
@HeikoSyou still here?19:34
geektoniHeikoS: yes yes still here19:34
@HeikoSI sent you an email19:34
geektonilet me check19:35
-!- wiking [~wiking@huwico/staff/wiking] has quit [Ping timeout: 246 seconds]19:35
geektoniHeikoS: okay, everything makes sense to me19:37
geektonithis way we could also remove ObservedValue from SGObject19:37
@HeikoSyes19:37
@HeikoSI think this is worth the effort19:38
geektoniI mean yeah, surely it will bring less problems in the future19:38
geektonisoo I guess the benchmark for KMeans put is not exactly needed anymore, since we will just use put()19:39
@HeikoSyes19:39
@HeikoSthat's why I came back19:39
@HeikoShoping that you hadnt written that yet :)19:40
geektoniahaha too late man19:40
@HeikoSnooooooooo19:40
geektoniI mean19:40
geektoniI found an undocumented cpp example which was basically doing the job19:41
geektoniso nw19:41
@HeikoSah ok19:41
@HeikoSgood :)19:41
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun19:45
-!- mode/#shogun [+o wiking] by ChanServ19:45
@HeikoSlisitsyn: still here?19:50
-!- wiking [~wiking@huwico/staff/wiking] has quit [Remote host closed the connection]20:16
-!- geektoni [973e524e@gateway/web/freenode/ip.151.62.82.78] has quit [Quit: Page closed]20:20
-!- HeikoS [~heiko@73.red-83-46-178.dynamicip.rima-tde.net] has quit [Ping timeout: 258 seconds]20:34
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun20:48
-!- mode/#shogun [+o wiking] by ChanServ20:48
-!- wiking [~wiking@huwico/staff/wiking] has quit [Remote host closed the connection]21:06
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun21:13
-!- mode/#shogun [+o wiking] by ChanServ21:13
-!- HeikoS [~heiko@73.red-83-46-178.dynamicip.rima-tde.net] has joined #shogun23:12
-!- mode/#shogun [+o HeikoS] by ChanServ23:12
-!- anvan [~androirc@103.252.200.48] has quit [Read error: Connection reset by peer]23:41
-!- HeikoS [~heiko@73.red-83-46-178.dynamicip.rima-tde.net] has quit [Ping timeout: 245 seconds]23:42
--- Log closed Fri May 17 00:00:18 2019
