IRC logs of #shogun for Thursday, 2018-07-05

--- Log opened Thu Jul 05 00:00:55 2018
10:01 -!- HeikoS [~heiko@host109-151-250-28.range109-151.btcentralplus.com] has joined #shogun
10:01 -!- mode/#shogun [+o HeikoS] by ChanServ
10:33 <lisitsyn> HeikoS: the failed test in lazy any is interesting
10:33 <@HeikoS> whats up with it?
10:33 <lisitsyn> it seems I accidentally found some null dereference
10:33 <lisitsyn> because I added a check to avoid that
10:33 <lisitsyn> and exactly that failed
10:33 <lisitsyn> I am debugging
10:34 <lisitsyn> https://github.com/shogun-toolbox/shogun/pull/4343/files#diff-8ea96286d95b52029d31636117e0fe55R341
10:34 <lisitsyn> this one
10:37 <lisitsyn> HeikoS: once I am done, I will add a flag to ignore that in clone/equals
10:38 <lisitsyn> ah sorry that's 'Empty'
10:38 <lisitsyn> :D
10:47 <@HeikoS> sure
10:48 <@HeikoS> mmmh
10:48 <@HeikoS> seems weird but I cannot see through all this atm
13:05 <wuwei> heiko: hi, i still have some problems with string features meta examples
13:05 <wuwei> https://github.com/shogun-toolbox/shogun/blob/31ed13beba78984199a361756506178dd866ac57/examples/undocumented/python/kernel_comm_word_string.py#L18
13:06 <wuwei> maybe we need a bit of refactoring on the string features apis
16:09 <@HeikoS> wuwei: hi
16:09 <@HeikoS> ah yes
16:09 <@HeikoS> specialized method
16:09 <@HeikoS> what does it do?
16:09 <wuwei> hi
16:10 <wuwei> it maps the string list to high order real vectors
16:11 <@HeikoS> what does that mean?
16:12 <wuwei> there's actually a problem with alphabet, which i have asked viktor
16:12 <wuwei> https://github.com/shogun-toolbox/shogun/blob/31ed13beba78984199a361756506178dd866ac57/src/shogun/features/Alphabet.cpp#L755
16:13 <wuwei> it will call CAlphabet::translate_from_single_order
16:13 <@HeikoS> I feel like we will need another factory for this or?
16:13 <wuwei> take a subsequence of string, and compute a real value
16:14 <@HeikoS> sure ok
16:14 <@HeikoS> this feels like some user decision to represent the string in a particular way
16:15 <wuwei> i think we need some clean up with string features first
16:16 <@HeikoS> I mean let's face it
16:16 <@HeikoS> some of the things that the string features API offers
16:16 <@HeikoS> is actually required to be accessible by the user
16:16 <@HeikoS> so there is very little way around somehow offering an API in swig that does similar things
16:19 <wuwei> you mean we should create factory method to wrap obtain_*?
16:21 <@HeikoS> along these lines
16:21 <@HeikoS> a factory for converting string features
16:21 <@HeikoS> OR parametrize the existing string_features factory to be able to do such things
16:21 <@HeikoS> you see, we need an API for doing similar things
16:21 <@HeikoS> so I guess a good idea would be to go through the string examples
16:21 <@HeikoS> and see what is needed
16:21 <@HeikoS> then design an API for it
16:22 <@HeikoS> and implement it :)
16:22 <wuwei> yeah
16:22 <@HeikoS> one thing to remember is that the new API is still experimental, so we can change it
16:22 <@HeikoS> if the approach doesnt work
16:22 <wuwei> but i have another problem with obtain*
16:22 <@HeikoS> we are currently changing it all the time
16:22 <@HeikoS> ok what is it?
16:22 <wuwei> that's still not fixed in transformers
16:23 <wuwei> let me check quickly
16:24 <@HeikoS> sure
16:28 <wuwei> heiko: that's histogram of alphabet
16:28 <wuwei> for example, https://github.com/shogun-toolbox/shogun/blob/31ed13beba78984199a361756506178dd866ac57/examples/undocumented/python/preprocessor_sortulongstring.py#L19
16:28 <wuwei> when you create a string features with some alphabet
16:28 <wuwei> and then call obtain_from_char
16:28 <@HeikoS> so the features are transformed to a space
16:29 <@HeikoS> that has one component per alphabet entry
16:29 <@HeikoS> and then it just counts
16:29 <wuwei> then you will have data beyond the alphabet
16:29 <@HeikoS> i.e. the actual features are histograms
16:29 <wuwei> that will cause a problem
16:29 <@HeikoS> not sure I understand
16:31 <wuwei> i mean, after calling obtain_from_char, what you have is actually real vectors, which are not in the alphabet
16:31 <@HeikoS> i see
16:31 <wuwei> so check_alphabet_size() will fail
16:32 <@HeikoS> but that is a problem that is already there right?
16:32 <wuwei> when you try to create a copy of string features, it will call check_alphabet_size() then fails
16:32 <@HeikoS> i get it
16:32 <wuwei> yes that's a problem with transformers
16:32 <@HeikoS> well it would be the dimension of the real vectors or?
16:33 <wuwei> the reason check_alphabet_size() fails is that the data is not in alphabet
16:33 <@HeikoS> yes it is transformed
16:33 <wuwei> since that's real values, instead of something like DNA characters
16:34 <@HeikoS> it should be CDenseFeatures then or?
16:34 <@HeikoS> or even sparse
16:34 <@HeikoS> because string features by construction have an alphabet
16:34 <@HeikoS> they are discrete objects
16:34 <wuwei> yes i think that should be dense features
16:34 <wuwei> but i'm not sure, since many algorithms use transformed string features
16:35 <@HeikoS> mmh
16:35 <wuwei> for example string kernels
16:35 <@HeikoS> but string kernels use the alphabet as well no?
16:35 <@HeikoS> so how does it work
16:35 <@HeikoS> it just replaces the string list
16:35 <@HeikoS> with a list of SGString<index_t> ?
16:35 <@HeikoS> and then the elements contain the counts?
16:36 <wuwei> yes
16:36 <wuwei> it replaces the string list
16:36 <@HeikoS> so all strings are of the same length
16:36 <@HeikoS> I would call this an embedding
16:36 <@HeikoS> and yes, the resulting features should be dense/sparse
16:36 <@HeikoS> i.e. real
16:37 <wuwei> current workaround in transformers is to prevent creating a new copy, but that means string preprocessors work differently from other preprocessors
16:37 <@HeikoS> the concept of CDenseFeatures is that all vectors are of the same length
16:37 <@HeikoS> and all vectors are dense
16:37 <@HeikoS> in stringFeatures, we allow the elements to have different lengths
16:37 <@HeikoS> tbh I think in the case of say histogram embedding, the output should be dense/sparse
16:38 <@HeikoS> anyways, you are right, the API is inconsistent
16:39 <@HeikoS> if there is a way of StringFeatures to be defined in a real space, without an alphabet, then there should not be a method to access the alphabet
16:39 <@HeikoS> and also I think there should be a distinction of discrete and numerical spaces
16:39 <wuwei> ah yeah the embedding will have different length as well
16:40 <@HeikoS> will they?
16:40 <@HeikoS> how so?
16:40 <@HeikoS> maybe then we just need another base class
16:40 <@HeikoS> DiscreteStringFeatures
16:40 <@HeikoS> ContinuousStringFeatures
16:40 <@HeikoS> or something along those lines
16:41 <@HeikoS> uh all this code looks scary
16:41 <@HeikoS> long time it wasnt touched :D
16:41 <@HeikoS> so you have a suggestion how to proceed with this?
16:41 <@HeikoS> seems like this is a different problem to the one with the API or?
16:43 <wuwei> ah no I don't have idea now
16:43 <@HeikoS> usually, the best strategy is to not solve everything at once
16:43 <@HeikoS> but one thing after the other
16:45 <@HeikoS> so maybe lets start with the factory API
16:46 <@HeikoS> and then deal with the transformer branch later?
16:48 <wuwei> sure
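
The histogram embedding HeikoS describes at 16:28-16:29 (one component per alphabet entry, then it just counts) can be sketched in a few lines of C++. This is only an illustration and not Shogun code: the function name `histogram_embedding`, its signature, and the choice to skip symbols outside the alphabet are all assumptions made for the sketch. It shows why the result is a better fit for dense features than for string features: every output vector has the alphabet's length, whatever the length of the input string.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of Shogun: count how often each alphabet
// symbol occurs in each input string.
// Assumption of this sketch: symbols outside the alphabet are skipped.
std::vector<std::vector<double>> histogram_embedding(
    const std::vector<std::string>& strings, const std::string& alphabet)
{
    std::vector<std::vector<double>> result;
    result.reserve(strings.size());
    for (const std::string& s : strings)
    {
        // One component per alphabet entry, so all rows have equal length.
        std::vector<double> counts(alphabet.size(), 0.0);
        for (char c : s)
        {
            const std::string::size_type pos = alphabet.find(c);
            if (pos != std::string::npos)
                counts[pos] += 1.0;
        }
        result.push_back(counts);
    }
    return result;
}
```

Calling `histogram_embedding({"ACGT", "AAC"}, "ACGT")` gives two vectors of length four even though the input strings have lengths four and three. The counts are real values rather than alphabet symbols, which matches wuwei's point that an alphabet check on the transformed data has nothing meaningful to check.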
16:57 <lisitsyn> HeikoS: check the latest commit of https://github.com/shogun-toolbox/shogun/pull/4343
16:57 <@HeikoS> lisitsyn: checking
16:57 <lisitsyn> oops continue missed
16:58 <lisitsyn> HeikoS: ok i basically add ignore ifs for the non-cloneable and non-visitable anys
17:00 <@HeikoS> lisitsyn: commented!
17:00 <@HeikoS> but yes!
17:00 <@HeikoS> that is it!
17:01 <lisitsyn> HeikoS: uhmm I don't get your comment
17:01 <lisitsyn> ah you mean re-using the message Cloning/Comparing?
17:01 <lisitsyn> in case of race it would be useful
17:02 <lisitsyn> if two objects are compared at the same time for example
17:02 <@HeikoS> ah i see
17:02 <@HeikoS> yeah ok sure
17:02 <@HeikoS> the first one needs continue!
17:02 <lisitsyn> HeikoS: yes I fixed that
17:02 <@HeikoS> otherwise I am fine!
17:02 <@HeikoS> merge it!
17:02 <@HeikoS> we can port more examples then
17:03 <lisitsyn> let me build once again
17:03 <lisitsyn> I would not wait for travis
17:03 <lisitsyn> because it will take infty
17:03 <@HeikoS> sure
17:03 <@HeikoS> just merge
17:03 <@HeikoS> dont wait
17:07 <wuwei> Heiko: btw currently mixin base classes broke swig, e.g. in python, perceptron.train is undefined
17:15 <@HeikoS> really
17:15 <@HeikoS> thx for letting me know
17:15 <@HeikoS> so the python meta example for perceptron doesnt work?
17:15 <@HeikoS> wuwei: ^
17:16 <@HeikoS> Test #452: generated_python-binary-perceptron ...   Passed
17:16 <@HeikoS> wuwei: can you tell me how to reproduce the error?
17:17 <wuwei> let me check
17:17 <wuwei> i didn't test meta examples locally
17:18 <wuwei> in python, you create a perceptron instance
17:18 <@HeikoS> where did you see the error?
17:18 <wuwei> and then call train
17:18 <wuwei> it will throw an error
17:18 <wuwei> in my machine
17:19 <wuwei> there's a warning about swig
17:20 <wuwei> like base class CIterativeMachine is unknown
17:21 <@HeikoS> I am checking
17:21 <@HeikoS> can you run
17:21 <@HeikoS> ctest -R perceptron
17:22 <wuwei> Nothing known about base class 'CIterativeMachine< CLinearMachine >'. Ignored.  Maybe you forgot to instantiate 'CIterativeMachine< CLinearMachine >' using %template.
17:22 <wuwei> ^ the warning
17:23 <@HeikoS> when running what?
17:23 <wuwei> i'm building the develop branch
17:24 <@HeikoS> what is the command that gives the error?
17:24 <@HeikoS> is it running the meta example?
17:25 <@HeikoS> and more importantly
17:25 <@HeikoS> how do you instantiate the Perceptron
17:25 <@HeikoS> using new?
17:25 <wuwei> the warning above is thrown when building with `make all`
17:25 <@HeikoS> or using machine("Perceptron")
17:25 <wuwei> shogun.Perceptron
17:25 <wuwei> using the ctor
17:25 <@HeikoS> yes
17:25 <@HeikoS> ok
17:26 <@HeikoS> no problem then
17:26 <@HeikoS> we actually do not want to expose this into swig anymore anyways
17:26 <@HeikoS> this is why the meta example works
17:26 <@HeikoS> it uses the factory
17:26 <@HeikoS> wuwei: but thanks for letting me know anyways
17:27 <wuwei> oh yeah i see
17:31 <@HeikoS> wuwei:  https://github.com/shogun-toolbox/shogun/issues/4354
17:32 <@HeikoS> wuwei: btw
17:32 <@HeikoS> are there any more RealFeatures instances in undocumented/python ?
17:33 <@HeikoS> wuwei: because once we have brought that down to zero, we can remove RealFeatures from swig
17:33 <@HeikoS> (big compile speedup)
17:33 <wuwei> there are still many
17:43 <@HeikoS> wuwei: any of those are already portable?
17:44 <wuwei> yeah, that's much work to be done
17:45 <@HeikoS> wuwei: btw do you remember which example was in need of a lazy evaluated member
17:45 <@HeikoS> so I can do 'get'
17:46 <@HeikoS> but then it computes something
17:46 <@HeikoS> I forgot which examples this was
17:47 <wuwei> one is GaussianKernel::get/set width
17:47 <wuwei> cuz it's log_width stored
17:48 <@HeikoS> we have only passive
17:48 <@HeikoS> so could do
17:48 <@HeikoS> get('width')
17:48 <@HeikoS> but that is not the best illustration
17:48 <@HeikoS> something with distance
17:50 <wuwei> i remember another example is kmeans
17:51 <@HeikoS> kmeans
17:51 <@HeikoS> cool
17:51 <@HeikoS> let me check thx
17:51 <@HeikoS> you remember the method?
17:51 <@HeikoS> compute_cluster_variances
17:51 <@HeikoS> ?
17:53 <wuwei> get_cluster_center
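
The width example wuwei mentions at 17:47 is a small case of a computed member: the object stores `log_width`, and `width` has to be derived on every read. A minimal sketch of that idea follows. The class `ToyGaussianKernel` and its methods are hypothetical stand-ins written for this illustration and do not reproduce Shogun's GaussianKernel.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>

// Hypothetical toy class, not Shogun's GaussianKernel: only the logarithm
// of the width is stored, and the width itself is computed on access.
class ToyGaussianKernel
{
public:
    explicit ToyGaussianKernel(double width)
    {
        set_width(width);
    }

    void set_width(double width)
    {
        if (width <= 0.0)
            throw std::invalid_argument("width must be positive");
        log_width_ = std::log(width);
    }

    // Not a plain stored value: every call recomputes exp(log_width).
    double get_width() const
    {
        return std::exp(log_width_);
    }

    double get_log_width() const
    {
        return log_width_;
    }

private:
    double log_width_ = 0.0;
};
```

A purely passive parameter registry that only holds a pointer to stored data could expose `log_width` this way, but not `width`, which is why a getter that runs code on access comes up in the discussion.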
17:58 <@HeikoS> lisitsyn: jo
17:58 <lisitsyn> HeikoS: yes
17:58 <@HeikoS> so now watch_param
17:58 <@HeikoS> watch_param("cluster_centers", std::function<SGMatrix<float64_t>()>(get_cluster_centers));
17:58 <@HeikoS> like this?
17:58 <@HeikoS> or how?
17:59 <@HeikoS> lisitsyn: doesnt compile ;0
18:00 <lisitsyn> HeikoS: you gotta bind 'this'
18:00 <lisitsyn> std::bind(&Object::computed_member, obj)
18:00 <@HeikoS> lemme try
18:01 <@HeikoS> watch_param("cluster_centers", std::bind(&KMeansBase::get_cluster_centers, this));
18:01 <lisitsyn> yes
18:02 <@HeikoS> doesnt like it
18:02 <@HeikoS> watch_param that is
18:02 <lisitsyn> ah
18:02 <lisitsyn> yeah
18:02 <lisitsyn> watch_param does not work probably
18:02 <@HeikoS> error: no matching function for call to 'shogun::CKMeansBase::watch_param(const char [16], std::_Bind_helper<false, shogun::SGMatrix<double> (shogun::CKMeansBase::*)(), shogun::CKMeansBase*>::type)'
18:02 <@HeikoS>   watch_param("cluster_centers", std::bind(&CKMeansBase::get_cluster_centers, this));
18:02 <@HeikoS> mmh
18:02 <@HeikoS> do we need a new watch_lazy maybe?
18:02 <lisitsyn> yes
18:02 <lisitsyn> I think so
18:03 <lisitsyn> something like watch_method("cluster_centers", get_cluster_centers);
18:03 <@HeikoS> lisitsyn: mind adding that? :D
18:03 <lisitsyn> yeah lemme try
18:03 <lisitsyn> not now though
18:03 <lisitsyn> interviewing someone right now
18:03 <lisitsyn> :D :D
18:07 <@HeikoS> ok
18:07 <@HeikoS> pose as interview q!
18:07 <@HeikoS> lisitsyn: enjoy!
18:07 <lisitsyn> HeikoS: I think I am just in the middle of my worst interview ever
18:08 <@HeikoS> why is she so bad?
18:08 <lisitsyn> ¯\_(ツ)_/¯
18:09 <lisitsyn> is irc still logged? :D
18:11 <@HeikoS> lol
18:11 <@HeikoS> it is
18:12 <lisitsyn> then I'd stop at this point heh
18:40 -!- HeikoS [~heiko@host109-151-250-28.range109-151.btcentralplus.com] has quit [Ping timeout: 260 seconds]
19:22 -!- travis-ci [~travis-ci@ec2-54-158-152-243.compute-1.amazonaws.com] has joined #shogun
19:22 <travis-ci> it's Sergey Lisitsyn's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: https://travis-ci.org/shogun-toolbox/shogun/builds/400484409
19:22 -!- travis-ci [~travis-ci@ec2-54-158-152-243.compute-1.amazonaws.com] has left #shogun []
--- Log closed Fri Jul 06 00:00:56 2018
