--- Log opened Fri Jun 17 00:00:53 2011 | ||
@sonney2k | f-x, could you please fix the copyright in the way john suggested so that I can merge your patch? | 00:08 |
---|---|---|
CIA-32 | shogun: Soeren Sonnenburg master * r5c71d64 / examples/undocumented/java_modular/classifier_libsvm_minimal_modular.java : polish minimal example to use jblas' signum / mean etc and simplify code slightly - http://bit.ly/lvOBau | 00:27 |
CIA-32 | shogun: Soeren Sonnenburg master * r40c6576 / (7 files in 4 dirs): | 00:27 |
CIA-32 | shogun: add missing functions to Distance/Kernel/LinearMachine such that | 00:27 |
CIA-32 | shogun: all examples run through at least. Minor fixes to example files. - http://bit.ly/mP5myl | 00:27 |
f-x | sonney2k: submitted the patch. though i forgot to change the (C) -- it should be for Berlin Institute of Technology and Max Planck Society, right? | 00:59 |
f-x | and thanks for the test - i have no idea why my laptop gives me totally opposite results... | 01:00 |
@sonney2k | f-x, yeah and what john said | 01:00 |
@sonney2k | f-x, I only see that both methods have the same speed | 01:00 |
@sonney2k | differences are totally negligible | 01:00 |
f-x | i did add the part john mentioned in the mail, i guess | 01:01 |
f-x | anything more to be added there with respect to that? | 01:01 |
@sonney2k | nope | 01:03 |
@sonney2k | then I guess only append the GPL header and the (C) line | 01:04 |
@sonney2k | (W) | 01:04 |
@sonney2k | too | 01:04 |
f-x | there. that should hopefully do it. | 01:06 |
f-x | btw the StreamingSimpleFeatures class seems to be working, but the parent StreamingDotFeatures class is currently purely abstract | 01:09 |
f-x | i've defined all the functions in StreamingSimpleFeatures | 01:09 |
f-x | i'll make a dummy pull request from that branch later just so you can see the changes easily | 01:11 |
@sonney2k | ok | 01:14 |
@sonney2k | but I guess it should be purely abstract anyways | 01:14 |
@sonney2k | I mean it is only to provide interfaces to StreamingSGD etc | 01:14 |
f-x | agreed.. DotFeatures are supposed to necessarily provide a dot, dense_dot, add_to_dense_vec() operation for float64_t* vectors, right? | 01:18 |
f-x | i mean those are the functions in DotFeatures.h | 01:18 |
CIA-32 | shogun: Shashwat Lal Das master * rfe6ab74 / (41 files in 11 dirs): Merge remote-tracking branch 'upstream/master' into streaming - http://bit.ly/iHPzmp | 01:20 |
CIA-32 | shogun: Shashwat Lal Das master * re14b29f / (src/libshogun/lib/IOBuffer.cpp src/libshogun/lib/IOBuffer.h): Changed copyrights in IOBuffer and made it derive from CSGObject. - http://bit.ly/iFvbpW | 01:20 |
CIA-32 | shogun: Shashwat Lal Das master * r6ce47b0 / (src/libshogun/lib/IOBuffer.cpp src/libshogun/lib/IOBuffer.h): More copyright changes to IOBuffer. - http://bit.ly/jEFy59 | 01:20 |
@sonney2k | f-x, yes though now the algorithm probably first needs to fetch an SGVector | 01:21 |
@sonney2k | and then call these helper functions | 01:21 |
@sonney2k | I am not so sure about how this can work well with sparse / strings | 01:22 |
@sonney2k | I think one needs to assume that the current (feature) object is in memory as a member variable somehow | 01:23 |
f-x | pull request made.. | 01:26 |
f-x | it was ok with dense features | 01:26 |
f-x | sparse seems to be the challenge | 01:27 |
f-x | i haven't seen properly how shogun handles sparse features yet | 01:27 |
f-x | but in the current implementation, the features object always works with a "current example" only | 01:27 |
f-x | and the dot(), dense_dot(), add_to_dense_vec() etc operate using that and any other specified vector | 01:28 |
@sonney2k | let me see | 01:30 |
@sonney2k | f-x, yeah makes sense | 01:31 |
@sonney2k | so there is no real challenge then | 01:31 |
@sonney2k | just operate on the fetched example | 01:31 |
@sonney2k | however you don't need to add the start/end parser stuff or? | 01:31 |
@sonney2k | I mean it now really makes sense to have a StreamingFeatures object in that hierarchy | 01:32 |
f-x | sonney2k: that can be avoided | 01:32 |
f-x | but then we should agree that start_parser should be called automatically sometime | 01:32 |
f-x | end_parser will automatically run on object destruction | 01:32 |
@sonney2k | I mean CFeatures -> CStreamingFeatures (with the get_next*) -> whatever StreamingFeatures | 01:32 |
f-x | sonney2k: problem with having a CStreamingFeatures is that anything having the parser as a member must be templated | 01:33 |
@sonney2k | f-x, one might want to manually start the parsing process | 01:33 |
@sonney2k | f-x, I see - so then again interfaces only! | 01:33 |
f-x | but get_next_feature_vector(type** vec) | 01:34 |
@sonney2k | f-x, I think there is no way other than in the learning algorithm start the parser | 01:34 |
f-x | sonney2k: yes.. i've done it explicitly in the gist i sent | 01:34 |
f-x | data->start_parser() | 01:34 |
@sonney2k | f-x, yes that is not possible | 01:34 |
@sonney2k | ok | 01:35 |
f-x | sonney2k: but it really would be convenient to be able to dump that generic stuff into a parent class | 01:35 |
@sonney2k | so if SGD is modified to only use the operations from StreamingDotFeatures then it means it will never call get_next_feature_vector explicitly | 01:36 |
f-x | exactly! | 01:36 |
@sonney2k | that will be done in the respective dotfeature class | 01:36 |
f-x | it only uses the operations it provides | 01:36 |
f-x | like dot, dense_dot, etc | 01:37 |
@sonney2k | and there only to compute the dense_dot etc | 01:37 |
f-x | get_vector is a specialized function of the StreamingSimpleFeatures class | 01:37 |
@sonney2k | so all it needs to call is parser start / end | 01:37 |
@sonney2k | and fetch_next_example() | 01:37 |
@sonney2k | that's it | 01:37 |
f-x | sonney2k: yes.. but is that sufficient for all algorithms? | 01:37 |
f-x | never having to call get_feature_vector? | 01:37 |
@sonney2k | f-x, surely not - but these will require special feature objects then | 01:39 |
@sonney2k | and thus can use specific get_vector() etc functions | 01:39 |
f-x | sonney2k: in the algorithms which currently work on DotFeatures, isn't this the case too? | 01:39 |
f-x | get_feature_vector() is defined only for float64_t vectors (in DotFeatures) | 01:40 |
f-x | for others, i guess it is up to the algorithm to do the conversion to CSimpleFeatures* and use the specialized functions | 01:40 |
f-x | sonney2k: oh - and i have some news.. i will be out of town (compulsorily) on Sunday and Monday.. So I'm sorry, don't think i can work then :( | 01:42 |
@sonney2k | f-x, np - just announce (like you do now) that you are / when you are away | 01:43 |
f-x | sonney2k: okay sure :) thanks | 01:43 |
@sonney2k | f-x, but the algorithms don't need the specialized get_feature_vector functions | 01:43 |
@bettyboo | :) | 01:43 |
@sonney2k | I mean it does not need to know if it is operating on strings / sparse vectors etc | 01:44 |
f-x | sonney2k: hmm.. i'm beginning to see it now... | 01:44 |
@sonney2k | ok | 01:45 |
f-x | sonney2k: what should be my priority now? i see (after running the online SGD example) that the parser can be improved speed-wise, and it's probably easier to do now.. should i do the parser optimization or more features/algorithms conversion? maybe even a proper clean version of StreamingSGD? | 01:47 |
@sonney2k | I don't know how much time it will take to get streamingsgd to work | 01:48 |
@sonney2k | but definitely you should do the streamingstring/sparse/simple features | 01:48 |
@sonney2k | as we figured out it is not a lot of work | 01:48 |
@sonney2k | if these get a CStreamingFile (from proper type) then you can do ascii / binary etc support | 01:49 |
@sonney2k | though I would say for now just ascii - fancy stuff later | 01:49 |
f-x | sonney2k: okay.. streamingsgd "works" minimally, and the code for that shouldn't change; only the code beneath it will | 01:49 |
f-x | so now - streamingstring/sparse features | 01:49 |
@sonney2k | you won't have an algorithm for strings yet | 01:50 |
@sonney2k | I guess at some point you need to rip off code from some DotFeatures based object that is string based | 01:50 |
f-x | for sgd? no. just inserted a few lines into whatever was there in the original SGD code | 01:50 |
@sonney2k | f-x, yeah well you have to do that properly rather soon | 01:51 |
f-x | sonney2k: before/after streamingstring/sparse features? | 01:52 |
@sonney2k | f-x, streamingfeatures should be very little effort | 01:54 |
@sonney2k | so I would rather do these proper first | 01:55 |
@sonney2k | then SGD / streaming sgd proper ( I guess you need to rip out some code to not have too much code duplication - e.g. loss functions in some CLoss class) | 01:55 |
@sonney2k | and then streaming ascii reliability / speed improvements and some real test data set from e.g. http://largescale.ml.tu-berlin.de/ | 01:56 |
f-x | ok.. sounds like a plan. first i'll concentrate on doing these streaming features properly | 01:57 |
f-x | sonney2k: guess it's time to sleep now.. will have those string and sparse features ready as soon as possible | 01:58 |
f-x | good night, see you! | 01:59 |
@sonney2k | just keep us updated - send an email before you leave what the status (this weekly email thingy) is | 01:59 |
@sonney2k | f-x, cu | 01:59 |
f-x | yeah, i haven't forgotten that :) | 01:59 |
f-x | bye | 01:59 |
-!- f-x [~user@117.192.198.36] has quit [Quit: ERC Version 5.3 (IRC client for Emacs)] | 01:59 | |
-!- f-x [~user@117.192.198.36] has joined #shogun | 02:05 | |
-!- f-x [~user@117.192.198.36] has quit [Client Quit] | 02:05 | |
-!- in3xes_ [~in3xes@180.149.49.227] has joined #shogun | 05:16 | |
-!- in3xes [~in3xes@59.163.196.121] has quit [Ping timeout: 240 seconds] | 05:19 | |
-!- in3xes_ is now known as in3xes | 06:12 | |
-!- in3xes_ [~in3xes@210.212.58.111] has joined #shogun | 06:16 | |
-!- in3xes [~in3xes@180.149.49.227] has quit [Ping timeout: 258 seconds] | 06:19 | |
-!- in3xes_ is now known as in3xes | 06:22 | |
-!- Netsplit *.net <-> *.split quits: @mlsec | 10:22 | |
-!- Netsplit over, joins: @mlsec | 10:26 | |
-!- Netsplit *.net <-> *.split quits: @mlsec | 10:29 | |
-!- heiko [~heiko@infole-06.uni-duisburg.de] has joined #shogun | 10:29 | |
-!- Netsplit over, joins: @mlsec | 10:34 | |
heiko | hello, anybody here? | 10:42 |
heiko | has someone a copy of lib/v_array.h ? I cannot compile without it | 10:42 |
heiko | and it is not in the repo | 10:43 |
CIA-32 | shogun: Baozeng Ding master * r54bb0a1 / (25 files in 2 dirs): add some distance examples and kernel examples, fix kerenl.i to support distance - http://bit.ly/lfqqpM | 12:06 |
CIA-32 | shogun: Baozeng Ding master * r0724fd9 / examples/undocumented/java_modular/kernel_auc_modular.java : add kernel_auc_modular example, this example crash jvm, please help check it - http://bit.ly/inTtDW | 12:06 |
@sonney2k | heiko, around? | 12:10 |
heiko | hi, yes | 12:10 |
@sonney2k | time to chat / talk? | 12:10 |
@sonney2k | if so I will call you | 12:11 |
heiko | yes, in 5 mins? | 12:11 |
@sonney2k | k | 12:13 |
heiko | k ready | 12:17 |
CIA-32 | shogun: Soeren Sonnenburg master * r3bbb4a5 / (2 files in 2 dirs): temporary fix for compiler errors - http://bit.ly/m6TuEM | 12:19 |
-!- heiko [~heiko@infole-06.uni-duisburg.de] has quit [Quit: Leaving.] | 15:36 | |
CIA-32 | shogun: Shashwat Lal Das master * r9a5e66e / (3 files in 2 dirs): Removed StreamingFeatures.*, added v_array.h - http://bit.ly/lKjqjr | 16:56 |
CIA-32 | shogun: Shashwat Lal Das master * r4c1d91d / src/libshogun/lib/v_array.h : Fixed license of v_array.h - http://bit.ly/irvJ5J | 16:56 |
CIA-32 | shogun: Shashwat Lal Das master * r48ea63c / (27 files in 3 dirs): Commit for fixing compile errors. - http://bit.ly/k63TTY | 16:56 |
CIA-32 | shogun: Shashwat Lal Das master * r8768749 / src/libshogun/lib/v_array.h : v_array.h fix. - http://bit.ly/jnuOed | 16:56 |
-!- f-x [~user@117.192.200.96] has joined #shogun | 17:42 | |
-!- blackburn [~blackburn@31.28.40.202] has joined #shogun | 19:02 | |
blackburn | wow how active is mailing list today! | 19:04 |
-!- Netsplit *.net <-> *.split quits: @mlsec | 19:07 | |
-!- Netsplit over, joins: @mlsec | 19:08 | |
-!- blackburn [~blackburn@31.28.40.202] has quit [Ping timeout: 255 seconds] | 19:12 | |
-!- Netsplit *.net <-> *.split quits: @mlsec | 19:14 | |
-!- Netsplit over, joins: @mlsec | 19:16 | |
-!- blackburn [~blackburn@31.28.40.202] has joined #shogun | 19:23 | |
-!- blackburn [~blackburn@31.28.40.202] has quit [Client Quit] | 19:28 | |
-!- blackburn [~blackburn@31.28.40.202] has joined #shogun | 19:28 | |
-!- f-x [~user@117.192.200.96] has quit [Remote host closed the connection] | 20:00 | |
-!- blackburn [~blackburn@31.28.40.202] has quit [Read error: No route to host] | 20:47 | |
-!- blackburn1 [~blackburn@31.28.40.202] has joined #shogun | 20:47 | |
blackburn1 | sonney2k: wondering do we really need apply_to_feature_vector in preprocessors? | 20:48 |
@sonney2k | blackburn1, sure | 20:52 |
@sonney2k | consider there is no feature matrix in memory | 20:52 |
@sonney2k | but you just have a single vector at a time | 20:53 |
-!- blackburn1 [~blackburn@31.28.40.202] has quit [Ping timeout: 255 seconds] | 20:56 | |
-!- blackburn [~blackburn@31.28.40.202] has joined #shogun | 21:31 | |
blackburn | bad bad bad | 21:31 |
blackburn | sonney2k: LLE is wrong :( | 21:31 |
blackburn | bad bad bad | 21:36 |
blackburn | SUCCESS! | 21:54 |
blackburn | sonney2k: http://imageshack.us/photo/my-images/26/image3d2d.png/ | 22:08 |
@sonney2k | the swiss roll :) | 22:33 |
blackburn | sonney2k: yes, now it is working *right* | 22:34 |
blackburn | I'm very disappointed that I did it wrong :( | 22:34 |
blackburn | sonney2k: so please merge pull request with fixes :) | 22:35 |
CIA-32 | shogun: Sergey Lisitsyn master * r7778ddf / src/libshogun/preprocessor/LocallyLinearEmbedding.cpp : Fixes for LLE - http://bit.ly/k3X4oW | 22:37 |
blackburn | sonney2k: as you can see - abs(id_vector[j]) instead of id_vector[j] caused everything to go wrong :D | 22:38 |
blackburn | sonney2k: do you have some new ideas about temporary matrices like the distance matrix? | 22:38 |
@sonney2k | blackburn, I think I should add a flag 'do_free' to these SGTypes | 22:39 |
@sonney2k | if true the caller has to delete[] the matrix | 22:40 |
blackburn | sonney2k: so if do_free then on SGMatrix deletion it will delete it? | 22:40 |
blackburn | nice. like it | 22:40 |
@sonney2k | blackburn, that is not so easy | 22:42 |
blackburn | why? | 22:42 |
@sonney2k | I mean currently we create SGMatrix object on the stack and then return a copy of it | 22:44 |
blackburn | ah | 22:44 |
blackburn | oh shit | 22:44 |
blackburn | :D | 22:44 |
@sonney2k | so when the object on stack is deleted and then the other one later kaboom | 22:44 |
blackburn | I see, yes | 22:44 |
@sonney2k | the only 'fix' would be to modify the source object in the copy constructor | 22:44 |
@sonney2k | I mean disable deletion in the object that is to be copied, e.g. A = SGMatrix(); B = A; then A won't have the delete flag set, only B will - but I don't like it | 22:46 |
blackburn | yes, don't like it too | 22:46 |
blackburn | I have two suggestions about our projects and your project with heiko | 22:47 |
@sonney2k | or we add a member function release_matrix() | 22:47 |
@sonney2k | and everyone has to call it | 22:47 |
blackburn | sonney2k: I am not very familiar with how swig works | 22:49 |
blackburn | what's with SGMatrices in say python? | 22:49 |
blackburn | I mean when I do get_some_matrix() what I use? | 22:49 |
blackburn | some pointer to SGMatrix or so? | 22:49 |
blackburn | saying it just because some idea is to place some warning in case we forgot to release it | 22:53 |
@sonney2k | then get_some_matrix() will return the SGMatrix object to some C code - there it is converted to a python numpy matrix | 22:53 |
@sonney2k | that's it | 22:53 |
blackburn | sonney2k: well but we anyway have to do 'release_matrix()' whenever we use float64_t** stuff or SGMatrix, right? | 22:54 |
blackburn | so I think it is ok | 22:54 |
@sonney2k | blackburn, well no | 22:55 |
@sonney2k | before we always did copy | 22:55 |
@sonney2k | so we always deleted | 22:55 |
blackburn | it looks like we are trying to add some garbage collector :D | 22:56 |
blackburn | so some suggestions | 22:57 |
blackburn | if you have time you could create tickets for our mid-term evaluations | 22:58 |
blackburn | one student - one ticket | 22:58 |
blackburn | and about the framework project you could sketch some scheme - I don't understand what you are planning to do and what you have already done - I think it will help both you and us | 22:59 |
@sonney2k | I am dead sleepy sorry | 23:04 |
@sonney2k | cu | 23:04 |
blackburn | ok, later | 23:04 |
blackburn | same thing, have been sleeping for less than 4 hours two days running | 23:06 |
blackburn | not even sure I'm speaking very clearly :D | 23:06 |
--- Log closed Sat Jun 18 00:00:55 2011 |
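
Editor's note: the 01:31 to 01:37 exchange settles on a learner that only starts the parser, fetches the next example, and then uses dot-style operations on that "current example", never calling get_next_feature_vector itself. The following is a minimal sketch of that shape, not Shogun's actual code. All class names, the `current_label()` accessor, and the in-memory stand-in for a parser are assumptions made for illustration, and the update rule is a plain perceptron step swapped in for SGD to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical interface modelled on the StreamingDotFeatures idea from the
// log. It only exposes operations on the currently fetched example.
class StreamingDotFeaturesSketch {
public:
    virtual ~StreamingDotFeaturesSketch() = default;
    virtual void start_parser() = 0;
    virtual void end_parser() = 0;
    // Advances to the next example; returns false once the stream is exhausted.
    virtual bool fetch_next_example() = 0;
    // Dot product of the current example with the dense vector w.
    virtual double dense_dot(const std::vector<double>& w) const = 0;
    // Performs w += alpha * (current example).
    virtual void add_to_dense_vec(double alpha, std::vector<double>& w) const = 0;
    // Assumed accessor for the label of the current example.
    virtual double current_label() const = 0;
};

// Dense, in-memory stand-in for a real streaming source (assumption: a real
// implementation would read examples from a file through a parser thread).
class DenseStreamSketch : public StreamingDotFeaturesSketch {
public:
    DenseStreamSketch(std::vector<std::vector<double>> x, std::vector<double> y)
        : examples_(std::move(x)), labels_(std::move(y)) {
        assert(examples_.size() == labels_.size());
    }
    void start_parser() override { next_ = 0; started_ = true; }
    void end_parser() override { started_ = false; }
    bool fetch_next_example() override {
        if (!started_ || next_ >= examples_.size()) return false;
        current_ = next_++;
        return true;
    }
    double dense_dot(const std::vector<double>& w) const override {
        const std::vector<double>& x = examples_[current_];
        double sum = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) sum += x[i] * w[i];
        return sum;
    }
    void add_to_dense_vec(double alpha, std::vector<double>& w) const override {
        const std::vector<double>& x = examples_[current_];
        for (std::size_t i = 0; i < x.size(); ++i) w[i] += alpha * x[i];
    }
    double current_label() const override { return labels_[current_]; }

private:
    std::vector<std::vector<double>> examples_;
    std::vector<double> labels_;
    std::size_t next_ = 0;
    std::size_t current_ = 0;
    bool started_ = false;
};

// One pass of a perceptron-style learner. It never sees the feature type:
// it only calls start/end, fetch_next_example and the dot-style helpers.
std::vector<double> train_one_pass(StreamingDotFeaturesSketch& data,
                                   std::size_t dim, double eta) {
    std::vector<double> w(dim, 0.0);
    data.start_parser();
    while (data.fetch_next_example()) {
        const double y = data.current_label();
        if (y * data.dense_dot(w) <= 0.0) {  // misclassified or on the boundary
            data.add_to_dense_vec(eta * y, w);
        }
    }
    data.end_parser();
    return w;
}
```

Because `train_one_pass` depends only on the abstract interface, a sparse or string-based stream could be substituted without touching the learner, which is the point made at 01:44.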
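
Editor's note: the 22:39 to 22:47 exchange is about a `do_free` flag on SGMatrix-like types and why a plain copy of a stack object would lead to the same buffer being deleted twice. The sketch below illustrates the two options raised there: clearing the flag in the source object when ownership is handed over, and an explicit `release_matrix()`. It is not Shogun's SGMatrix. The type name and helper function are hypothetical, and it uses C++11 move semantics, which the discussion does not mention, so that the "modify the source on copy" step is explicit rather than hidden in a copy constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for an SGMatrix-like struct with an ownership flag.
struct MatrixSketch {
    double* data;
    std::size_t rows;
    std::size_t cols;
    bool do_free;  // true: this object deletes `data` in its destructor

    MatrixSketch(std::size_t r, std::size_t c)
        : data(new double[r * c]()), rows(r), cols(c), do_free(true) {}

    // Plain copies are disabled: two objects with do_free == true would both
    // delete[] the same buffer, which is the "kaboom" case from the log.
    MatrixSketch(const MatrixSketch&) = delete;
    MatrixSketch& operator=(const MatrixSketch&) = delete;

    // Moving hands ownership to the new object and clears the flag in the
    // source, which stays behind as a non-owning view of the same buffer.
    MatrixSketch(MatrixSketch&& other) noexcept
        : data(other.data), rows(other.rows), cols(other.cols),
          do_free(other.do_free) {
        other.do_free = false;
    }

    ~MatrixSketch() {
        if (do_free) delete[] data;
    }

    // Explicit alternative: give up ownership; the caller must delete[] it.
    double* release_matrix() {
        do_free = false;
        return data;
    }
};

// Builds a matrix on the stack and returns it by value; the move constructor
// (or copy elision) ensures exactly one object ends up owning the buffer.
MatrixSketch make_identity_sketch(std::size_t n) {
    MatrixSketch m(n, n);
    for (std::size_t i = 0; i < n; ++i) m.data[i * n + i] = 1.0;
    return m;
}
```

The trade-off matches the one voiced in the log: the flag-clearing route is automatic but leaves the source object pointing at memory it no longer owns, while `release_matrix()` is explicit but relies on every caller remembering to use it.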