IRC logs of #shogun for Friday, 2011-06-17

--- Log opened Fri Jun 17 00:00:53 2011
@sonney2kf-x, could you please fix the copyright in the way john suggested so that I can merge your patch?00:08
CIA-32shogun: Soeren Sonnenburg master * r5c71d64 / examples/undocumented/java_modular/classifier_libsvm_minimal_modular.java : polish minimal example to use jblas' signum / mean etc and simplify code slightly - http://bit.ly/lvOBau00:27
CIA-32shogun: Soeren Sonnenburg master * r40c6576 / (7 files in 4 dirs):00:27
CIA-32shogun: add missing functions to Distance/Kernel/LinearMachine such that00:27
CIA-32shogun: all examples run through at least. Minor fixes to example files. - http://bit.ly/mP5myl00:27
f-xsonney2k: submitted the patch. though i forgot to change the (C) -- it should be for Berlin Institute of Technology and Max Planck Society, right?00:59
f-xand thanks for the test - i have no idea why my laptop gives me totally opposite results...01:00
@sonney2kf-x, yeah and what john said01:00
@sonney2kf-x, I only see that both methods have the same speed01:00
@sonney2kdifferences are totally negligible01:00
f-xi did add the part john mentioned in the mail, i guess01:01
f-xanything more to be added there with respect to that?01:01
@sonney2knope01:03
@sonney2kthen I guess only append the GPL header and the (C) line01:04
@sonney2k(W)01:04
@sonney2ktoo01:04
f-xthere. that should hopefully do it.01:06
f-xbtw the StreamingSimpleFeatures class seems to be working, but the parent StreamingDotFeatures class is currently purely abstract01:09
f-xi've defined all the functions in StreamingSimpleFeatures01:09
f-xi'll make a dummy pull request from that branch later just so you can see the changes easily01:11
@sonney2kok01:14
@sonney2kbut I guess it should be purely abstract anyways01:14
@sonney2kI mean it is only to provide interfaces to StreamingSGD etc01:14
f-xagreed.. DotFeatures are supposed to necessarily provide a dot, dense_dot, add_to_dense_vec() operation for float64_t* vectors, right?01:18
f-xi mean those are the functions in DotFeatures.h01:18
CIA-32shogun: Shashwat Lal Das master * rfe6ab74 / (41 files in 11 dirs): Merge remote-tracking branch 'upstream/master' into streaming - http://bit.ly/iHPzmp01:20
CIA-32shogun: Shashwat Lal Das master * re14b29f / (src/libshogun/lib/IOBuffer.cpp src/libshogun/lib/IOBuffer.h): Changed copyrights in IOBuffer and made it derive from CSGObject. - http://bit.ly/iFvbpW01:20
CIA-32shogun: Shashwat Lal Das master * r6ce47b0 / (src/libshogun/lib/IOBuffer.cpp src/libshogun/lib/IOBuffer.h): More copyright changes to IOBuffer. - http://bit.ly/jEFy5901:20
@sonney2kf-x, yes though now the algorithm probably first needs to fetch an SGVector01:21
@sonney2kand then call these helper functions01:21
@sonney2kI am not so sure about how this can work well with sparse / strings01:22
@sonney2kI think one needs to assume that the current (feature) object is in memory as a member variable somehow01:23
f-xpull request made..01:26
f-xit was ok with dense features01:26
f-xsparse seems to be the challenge01:27
f-xi haven't seen properly how shogun handles sparse features yet01:27
f-xbut in the current implementation, the features object always works with a "current example" only01:27
f-xand the dot(), dense_dot(), add_to_dense_vec() etc operate using that and any other specified vector01:28
@sonney2klet me see01:30
@sonney2kf-x, yeah makes sense01:31
@sonney2kso there is no real challenge then01:31
@sonney2kjust operate on the fetched example01:31
@sonney2khowever you don't need to add the start/end parser stuff or?01:31
@sonney2kI mean it now really makes sense to have a StreamingFeatures object in that hierarchy01:32
f-xsonney2k: that can be avoided01:32
f-xbut then we should agree that start_parser should be called automatically sometime01:32
f-xend_parser will automatically run on object destruction01:32
@sonney2kI mean CFeatures -> CStreamingFeatures (with the get_next*) -> whatever StreamingFeatures01:32
f-xsonney2k: problem with having a CStreamingFeatures is that anything having the parser as a member must be templated01:33
@sonney2kf-x, one might want to manually start the parsing process01:33
@sonney2kf-x, I see - so then again interfaces only!01:33
f-xbut get_next_feature_vector(type** vec)01:34
@sonney2kf-x, I think there is no way other than in the learning algorithm start the parser01:34
f-x sonney2k: yes.. i've done it explicitly in the gist i sent01:34
f-xdata->start_parser()01:34
@sonney2kf-x, yes that is not possible01:34
@sonney2kok01:35
f-xsonney2k: but it really would be convenient to be able to dump that generic stuff into a parent class01:35
@sonney2kso if SGD is modified to only use the operations from StreamingDotFeatures then it means it will never call get_next_feature_vector explicitly01:36
f-xexactly!01:36
@sonney2kthat will be done in the respective dotfeature class01:36
f-xit only uses the operations it provides01:36
f-xlike dot, dense_dot, etc01:37
@sonney2kand there only to compute the dense_dot etc01:37
f-xget_vector is a specialized function of the StreamingSimpleFeatures class01:37
@sonney2kso all it needs to call is parser start / end01:37
@sonney2kand fetch_next_example()01:37
@sonney2kthat's it01:37
f-xsonney2k: yes.. but is that sufficient for all algorithms?01:37
f-xnever having to call get_feature_vector?01:37
@sonney2kf-x, surely not - but these will require special feature objects then01:39
@sonney2kand thus can use specific get_vector() etc functions01:39
f-xsonney2k: in the algorithms which currently work on DotFeatures, isn't this the case too?01:39
f-xget_feature_vector() is defined only for float64_t vectors (in DotFeatures)01:40
f-xfor others, i guess it is up to the algorithm to do the conversion to CSimpleFeatures* and use the specialized functions01:40
f-xsonney2k: oh - and i have some news.. i will be out of town (compulsorily) on Sunday and Monday.. So I'm sorry, don't think i can work then :(01:42
@sonney2kf-x, np - just announce (like you do now) that you are / when you are away01:43
f-xsonney2k: okay sure :) thanks01:43
@sonney2kf-x, but the algorithms don't need the specialized get_feature_vector functions01:43
@bettyboo:)01:43
@sonney2kI mean it does not need to know if it is operating on strings / sparse vectors etc01:44
f-xsonney2k: hmm.. i'm beginning to see it now...01:44
@sonney2kok01:45
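A minimal C++ sketch of the design discussed above, assuming a learner that only calls parser start/end, fetch_next_example() and the dot-style operations on a "current example". The class names StreamingDotFeaturesSketch and DenseStreamSketch, the current_label() accessor, and the perceptron-style update are illustrative assumptions, not Shogun's actual classes or its SGD code:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical interface: the learner sees only these operations and never
// the raw example type (dense, sparse, string, ...).
class StreamingDotFeaturesSketch {
public:
    virtual ~StreamingDotFeaturesSketch() = default;
    virtual void start_parser() = 0;
    virtual void end_parser() = 0;
    // Advance to the next example; returns false when the stream is exhausted.
    virtual bool fetch_next_example() = 0;
    virtual double current_label() const = 0;
    // Dot product of the current example with a dense vector w.
    virtual double dense_dot(const std::vector<double>& w) const = 0;
    // w += alpha * current example
    virtual void add_to_dense_vec(double alpha, std::vector<double>& w) const = 0;
};

// Dense in-memory stand-in for a parser-backed feature class.
class DenseStreamSketch : public StreamingDotFeaturesSketch {
public:
    DenseStreamSketch(std::vector<std::vector<double>> x, std::vector<double> y)
        : x_(std::move(x)), y_(std::move(y)) {}
    void start_parser() override { pos_ = 0; started_ = true; }
    void end_parser() override { started_ = false; }
    bool fetch_next_example() override {
        if (!started_ || pos_ >= x_.size()) return false;
        cur_ = pos_++;
        return true;
    }
    double current_label() const override { return y_[cur_]; }
    double dense_dot(const std::vector<double>& w) const override {
        assert(w.size() >= x_[cur_].size());
        double s = 0.0;
        for (std::size_t i = 0; i < x_[cur_].size(); ++i) s += x_[cur_][i] * w[i];
        return s;
    }
    void add_to_dense_vec(double alpha, std::vector<double>& w) const override {
        assert(w.size() >= x_[cur_].size());
        for (std::size_t i = 0; i < x_[cur_].size(); ++i) w[i] += alpha * x_[cur_][i];
    }
private:
    std::vector<std::vector<double>> x_;
    std::vector<double> y_;
    std::size_t pos_ = 0, cur_ = 0;
    bool started_ = false;
};

// One pass of a perceptron-style update (a simple stand-in for SGD).
// Note that it never asks for the feature vector itself.
std::vector<double> train_one_pass(StreamingDotFeaturesSketch& data,
                                   std::size_t dim, double eta) {
    std::vector<double> w(dim, 0.0);
    data.start_parser();
    while (data.fetch_next_example()) {
        const double y = data.current_label();
        if (y * data.dense_dot(w) <= 0.0)   // misclassified or on the boundary
            data.add_to_dense_vec(eta * y, w);
    }
    data.end_parser();
    return w;
}
```

Under these assumptions a sparse or string-based class would only need to implement the same five operations; train_one_pass() would stay unchanged.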
f-xsonney2k: what should be my priority now? i see (after running the online SGD example) that the parser can be improved speed-wise, and it's probably easier to do now.. should i do the parser optimization or more features/algorithms conversion? maybe even a proper clean version of StreamingSGD?01:47
@sonney2kI don't know how much time it will take to get streamingsgd to work01:48
@sonney2kbut definitely you should do the streamingstring/sparse/simple features01:48
@sonney2kas we figured out it is not a lot of work01:48
@sonney2kif these get a CStreamingFile (from proper type) then you can do ascii / binary etc support01:49
@sonney2kthough I would say for now just ascii - fancy stuff later01:49
f-xsonney2k: okay.. streamingsgd "works" minimally, and the code for that shouldn't change; only the code beneath it will01:49
f-xso now - streamingstring/sparse features01:49
@sonney2kyou won't have an algorithm for strings yet01:50
@sonney2kI guess at some point you need to rip off code from some DotFeatures based object that is string based01:50
f-xfor sgd? no. just inserted a few lines into whatever was there in the original SGD code01:50
@sonney2kf-x, yeah well you have to do that properly rather soon01:51
f-xsonney2k: before/after streamingstring/sparse features?01:52
@sonney2kf-x, streamingfeatures should be very little effort01:54
@sonney2kso I would rather do these proper first01:55
@sonney2kthen SGD / streaming sgd proper ( I guess you need to rip out some code to not have too much code duplication - e.g. loss functions in some CLoss class)01:55
@sonney2kand then streaming ascii reliability / speed improvements and some real test data set from e.g. http://largescale.ml.tu-berlin.de/01:56
f-xok.. sounds like a plan. first i'll concentrate on doing these streaming features properly01:57
f-xsonney2k: guess it's time to sleep now.. will have those string and sparse features ready as soon as possible01:58
f-xgood night, see you!01:59
@sonney2kjust keep us updated - send an email before you leave what the status (this weekly email thingy) is01:59
@sonney2kf-x, cu01:59
f-xyeah, i haven't forgotten that :)01:59
f-xbye01:59
-!- f-x [~user@117.192.198.36] has quit [Quit: ERC Version 5.3 (IRC client for Emacs)]01:59
-!- f-x [~user@117.192.198.36] has joined #shogun02:05
-!- f-x [~user@117.192.198.36] has quit [Client Quit]02:05
-!- in3xes_ [~in3xes@180.149.49.227] has joined #shogun05:16
-!- in3xes [~in3xes@59.163.196.121] has quit [Ping timeout: 240 seconds]05:19
-!- in3xes_ is now known as in3xes06:12
-!- in3xes_ [~in3xes@210.212.58.111] has joined #shogun06:16
-!- in3xes [~in3xes@180.149.49.227] has quit [Ping timeout: 258 seconds]06:19
-!- in3xes_ is now known as in3xes06:22
-!- Netsplit *.net <-> *.split quits: @mlsec10:22
-!- Netsplit over, joins: @mlsec10:26
-!- Netsplit *.net <-> *.split quits: @mlsec10:29
-!- heiko [~heiko@infole-06.uni-duisburg.de] has joined #shogun10:29
-!- Netsplit over, joins: @mlsec10:34
heikohello, anybody here?10:42
heikohas someone a copy of lib/v_array.h ? I cannot compile without it10:42
heikoand it is not in the repo10:43
CIA-32shogun: Baozeng Ding master * r54bb0a1 / (25 files in 2 dirs): add some distance examples and kernel examples, fix kerenl.i to support distance - http://bit.ly/lfqqpM12:06
CIA-32shogun: Baozeng Ding master * r0724fd9 / examples/undocumented/java_modular/kernel_auc_modular.java : add kernel_auc_modular example, this example crash jvm, please help check it - http://bit.ly/inTtDW12:06
@sonney2kheiko, around?12:10
heikohi, yes12:10
@sonney2ktime to chat / talk?12:10
@sonney2kif so I will call you12:11
heikoyes, in 5 mins?12:11
@sonney2kk12:13
heikok ready12:17
CIA-32shogun: Soeren Sonnenburg master * r3bbb4a5 / (2 files in 2 dirs): temporary fix for compiler errors - http://bit.ly/m6TuEM12:19
-!- heiko [~heiko@infole-06.uni-duisburg.de] has quit [Quit: Leaving.]15:36
CIA-32shogun: Shashwat Lal Das master * r9a5e66e / (3 files in 2 dirs): Removed StreamingFeatures.*, added v_array.h - http://bit.ly/lKjqjr16:56
CIA-32shogun: Shashwat Lal Das master * r4c1d91d / src/libshogun/lib/v_array.h : Fixed license of v_array.h - http://bit.ly/irvJ5J16:56
CIA-32shogun: Shashwat Lal Das master * r48ea63c / (27 files in 3 dirs): Commit for fixing compile errors. - http://bit.ly/k63TTY16:56
CIA-32shogun: Shashwat Lal Das master * r8768749 / src/libshogun/lib/v_array.h : v_array.h fix. - http://bit.ly/jnuOed16:56
-!- f-x [~user@117.192.200.96] has joined #shogun17:42
-!- blackburn [~blackburn@31.28.40.202] has joined #shogun19:02
blackburnwow, how active the mailing list is today!19:04
-!- Netsplit *.net <-> *.split quits: @mlsec19:07
-!- Netsplit over, joins: @mlsec19:08
-!- blackburn [~blackburn@31.28.40.202] has quit [Ping timeout: 255 seconds]19:12
-!- Netsplit *.net <-> *.split quits: @mlsec19:14
-!- Netsplit over, joins: @mlsec19:16
-!- blackburn [~blackburn@31.28.40.202] has joined #shogun19:23
-!- blackburn [~blackburn@31.28.40.202] has quit [Client Quit]19:28
-!- blackburn [~blackburn@31.28.40.202] has joined #shogun19:28
-!- f-x [~user@117.192.200.96] has quit [Remote host closed the connection]20:00
-!- blackburn [~blackburn@31.28.40.202] has quit [Read error: No route to host]20:47
-!- blackburn1 [~blackburn@31.28.40.202] has joined #shogun20:47
blackburn1sonney2k: wondering do we really need apply_to_feature_vector in preprocessors?20:48
@sonney2kblackburn1, sure20:52
@sonney2kconsider there is no feature matrix in memory20:52
@sonney2kbut you just have a single vector at a time20:53
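A small C++ sketch of why a per-vector entry point is useful, assuming a preprocessor that scales each vector to unit Euclidean norm. UnitNormPreprocSketch and apply_to_feature_matrix as written here are illustrative assumptions, not Shogun's actual preprocessor classes; only the name apply_to_feature_vector comes from the discussion:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical preprocessor: scales each vector to unit Euclidean norm.
struct UnitNormPreprocSketch {
    // Works on one vector at a time, so it can be used when examples are
    // produced on demand and no full feature matrix exists in memory.
    std::vector<double> apply_to_feature_vector(const std::vector<double>& v) const {
        double sq = 0.0;
        for (double x : v) sq += x * x;
        assert(std::isfinite(sq));          // guard against NaN/inf input
        const double norm = std::sqrt(sq);
        std::vector<double> out(v);
        if (norm > 0.0)
            for (double& x : out) x /= norm;  // zero vectors are left as-is
        return out;
    }

    // The in-memory case can simply reuse the per-vector routine.
    void apply_to_feature_matrix(std::vector<std::vector<double>>& m) const {
        for (auto& vec : m) vec = apply_to_feature_vector(vec);
    }
};
```

With this split, a streaming caller would invoke apply_to_feature_vector() on each fetched example, while the matrix variant remains a thin loop over the same code.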
-!- blackburn1 [~blackburn@31.28.40.202] has quit [Ping timeout: 255 seconds]20:56
-!- blackburn [~blackburn@31.28.40.202] has joined #shogun21:31
blackburnbad bad bad21:31
blackburnsonney2k: LLE is wrong :(21:31
blackburnbad bad bad21:36
blackburnSUCCESS!21:54
blackburnsonney2k: http://imageshack.us/photo/my-images/26/image3d2d.png/22:08
@sonney2kthe swiss roll :)22:33
blackburnsonney2k: yes, now it is working *right*22:34
blackburnI'm very disappointed that I did it wrong :(22:34
blackburnsonney2k: so please merge pull request with fixes :)22:35
CIA-32shogun: Sergey Lisitsyn master * r7778ddf / src/libshogun/preprocessor/LocallyLinearEmbedding.cpp : Fixes for LLE - http://bit.ly/k3X4oW22:37
blackburnsonney2k: as you can see - abs(id_vector[j]) instead of id_vector[j] caused everything to go wrong :D22:38
blackburnsonney2k: do you have some new ideas about temporary matrices like distance matrix?22:38
@sonney2kblackburn, I think I should add a flag 'do_free' to these SGTypes22:39
@sonney2kif true the caller has to delete[] the matrix22:40
blackburnsonney2k: so if do_free then on SGMatrix deletion it will delete it?22:40
blackburnnice. like it22:40
@sonney2kblackburn, that is not so easy22:42
blackburnwhy?22:42
@sonney2kI mean currently we create SGMatrix object on the stack and then return a copy of it22:44
blackburnah22:44
blackburnoh shit22:44
blackburn:D22:44
@sonney2kso when the object on stack is deleted and then the other one later kaboom22:44
blackburnI see, yes22:44
@sonney2konly 'fix' would be to modify the object on copy constructor22:44
@sonney2kI mean disable deletion in the object that is to be copied, e.g. A = SGMatrix() ;  B = A; then A won't have the delete flag set but only B - but I don't like it22:46
blackburnyes, don't like it too22:46
blackburnI have two suggestions about our projects and your project with heiko22:47
@sonney2kor we add a member function release_matrix()22:47
@sonney2kand everyone has to call it22:47
blackburnsonney2k: I am not very familiar with how swig works22:49
blackburnwhat's with SGMatrices in say python?22:49
blackburnI mean when I do get_some_matrix() what I use?22:49
blackburnsome pointer to SGMatrix or so?22:49
blackburnsaying it just because some idea is to place some warning in case we forgot to release it22:53
@sonney2kthen get_some_matrix() will return the SGMatrix object to some C code - there it is converted to to a python numpy matrix22:53
@sonney2kthat's it22:53
blackburnsonney2k: well but we anyway have to do 'release_matrix()' whenever we use float64_t** stuff or SGMatrix, right?22:54
blackburnso I think it is ok22:54
@sonney2kblackburn, well no22:55
@sonney2kbefore we always did copy22:55
@sonney2kso we always deleted22:55
blackburnit looks like we are trying to add some garbage collector :D22:56
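A C++ sketch of the explicit-release option from the discussion above, assuming a handle that is copied shallowly and therefore must not free its buffer in a destructor. MatrixHandleSketch, free_matrix(), view() and make_distance_matrix() are hypothetical names for illustration, not Shogun's SGMatrix API:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an SGMatrix-like handle.
struct MatrixHandleSketch {
    double* matrix;
    std::size_t num_rows, num_cols;
    bool do_free;  // true: the holder of this handle must release the buffer

    // No destructor on purpose: copies share the same pointer, so freeing in
    // a destructor would delete[] the buffer once per copy (the "kaboom").

    // Explicit release; only a handle with do_free set actually deletes.
    void free_matrix() {
        if (do_free) delete[] matrix;
        matrix = nullptr;
        do_free = false;
    }

    // Non-owning view of the same buffer; releasing it deletes nothing.
    MatrixHandleSketch view() const {
        return {matrix, num_rows, num_cols, false};
    }
};

// Builds a temporary n x n matrix (here |i - j|) and hands ownership to the
// caller by returning the handle by value with do_free set.
MatrixHandleSketch make_distance_matrix(std::size_t n) {
    assert(n > 0);
    MatrixHandleSketch m{new double[n * n], n, n, true};
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            m.matrix[i * n + j] = static_cast<double>(i > j ? i - j : j - i);
    return m;
}
```

The trade-off matches the concern raised in the chat: nothing is freed automatically, so every caller that receives a handle with do_free set has to remember to call free_matrix() exactly once.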
blackburnso some suggestions22:57
blackburnif you have time you could create tickets for our mid-term evaluations22:58
blackburnone student - one ticket22:58
blackburnand about the framework project you could sketch some scheme - I don't understand what you are planning to do and what you have done already - I think it will help both you and us22:59
@sonney2kI am dead sleepy sorry23:04
@sonney2kcu23:04
blackburnok, later23:04
blackburnsame thing, have been sleeping for less than 4 hours two days running23:06
blackburnnot even sure I speak very clearly :D23:06
--- Log closed Sat Jun 18 00:00:55 2011
