--- Log opened Mon Jul 27 00:00:10 2015 | ||
-!- shaochuan [~shaochuan@c-50-184-81-180.hsd1.ca.comcast.net] has joined #shogun | 01:53 | |
-!- shaochuan [~shaochuan@c-50-184-81-180.hsd1.ca.comcast.net] has quit [Ping timeout: 252 seconds] | 01:58 | |
-!- HeikoS [~heiko@90.195.245.132] has joined #shogun | 02:01 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 02:01 | |
-!- HeikoS [~heiko@90.195.245.132] has quit [Quit: Leaving.] | 02:15 | |
-!- PirosB3 [~pirosb3@host116-44-dynamic.55-82-r.retail.telecomitalia.it] has quit [Quit: PirosB3] | 02:57 | |
-!- shaochuan [~shaochuan@2601:647:4600:fac5:e97f:35a0:bfa0:fc28] has joined #shogun | 06:19 | |
-!- shaochuan [~shaochuan@2601:647:4600:fac5:e97f:35a0:bfa0:fc28] has quit [Ping timeout: 246 seconds] | 06:26 | |
-!- PirosB3 [~pirosb3@host116-44-dynamic.55-82-r.retail.telecomitalia.it] has joined #shogun | 10:02 | |
-!- shaochuan [~shaochuan@c-50-184-81-180.hsd1.ca.comcast.net] has joined #shogun | 10:04 | |
-!- shaochuan [~shaochuan@c-50-184-81-180.hsd1.ca.comcast.net] has quit [Ping timeout: 264 seconds] | 10:09 | |
-!- PirosB3 [~pirosb3@host116-44-dynamic.55-82-r.retail.telecomitalia.it] has quit [Quit: PirosB3] | 10:36 | |
-!- HeikoS [~heiko@90.195.245.132] has joined #shogun | 11:06 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 11:06 | |
-!- ajph [~ajph@unaffiliated/ajph] has left #shogun ["Textual IRC Client: www.textualapp.com"] | 11:26 | |
lisitsyn | HeikoS: hey | 12:28 |
---|---|---|
lisitsyn | stammtisch today rye? | 12:28 |
-!- HeikoS [~heiko@90.195.245.132] has quit [Quit: Leaving.] | 12:39 | |
-!- shaochuan [~shaochuan@2601:647:4600:fac5:e97f:35a0:bfa0:fc28] has joined #shogun | 13:06 | |
-!- shaochuan [~shaochuan@2601:647:4600:fac5:e97f:35a0:bfa0:fc28] has quit [Ping timeout: 244 seconds] | 13:11 | |
-!- xwize [892c01ae@gateway/web/freenode/ip.137.44.1.174] has joined #shogun | 13:53 | |
xwize | why do I need svn to cmake this!?!?!? ngraargh | 13:57 |
@wiking | ? | 14:01 |
@wiking | u dont need svn at all to cmake anything :) | 14:01 |
xwize | I'm on windows and I need to build an x64 version, I've never used this before, I tried cmake . and after a while it complains about svn checkout of MSIntTypes | 14:03 |
@wiking | aaaah | 14:03 |
@wiking | noo | 14:03 |
@wiking | atm shogun will not compile on windows natively | 14:03 |
xwize | X_x | 14:03 |
@wiking | u r better off compiling that on linux | 14:04 |
xwize | but I'm doing all my dev on windows, which is my native OS, and I can't virtualise because I'm using the GPU and I need full performance | 14:05 |
@wiking | mmm sorry then :( | 14:05 |
@wiking | that svn stuff was a first attempt to get shogun compiled with visual studio... but it's nowhere near done | 14:06 |
xwize | darn | 14:07 |
xwize | I was looking forward to using this! I guess I should find something else | 14:07 |
@wiking | yeah or help in porting it to windows :) | 14:07 |
xwize | sorry, I'd like to help but I'm sort of in a hurry and my own project is giving me grey hairs | 14:11 |
@wiking | yeah sorry again windows natively isn't supported atm | 14:13 |
-!- HeikoS [~heiko@dab-yat1-h-1-2.dab.02.net] has joined #shogun | 14:33 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 14:33 | |
-!- HeikoS1 [~heiko@217.138.5.14] has joined #shogun | 15:32 | |
-!- HeikoS [~heiko@dab-yat1-h-1-2.dab.02.net] has quit [Ping timeout: 244 seconds] | 15:32 | |
-!- xwize [892c01ae@gateway/web/freenode/ip.137.44.1.174] has quit [Quit: Page closed] | 15:33 | |
-!- HeikoS1 [~heiko@217.138.5.14] has quit [Quit: Leaving.] | 15:40 | |
-!- HeikoS [~heiko@217.138.5.14] has joined #shogun | 15:40 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 15:40 | |
-!- HeikoS [~heiko@217.138.5.14] has quit [Ping timeout: 265 seconds] | 15:53 | |
-!- PirosB3 [~pirosb3@host116-44-dynamic.55-82-r.retail.telecomitalia.it] has joined #shogun | 16:40 | |
-!- HeikoS [~heiko@nat-187-200.internal.eduroam.ucl.ac.uk] has joined #shogun | 17:10 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 17:10 | |
-!- shogun-notifier- [~irker@7nn.de] has joined #shogun | 17:25 | |
shogun-notifier- | shogun: Sergey Lisitsyn :develop * f00e611 / CMakeLists.txt: https://github.com/shogun-toolbox/shogun/commit/f00e611c1d831e384235238f75db4cea64885b43 | 17:25 |
shogun-notifier- | shogun: Remove inline-functions option | 17:25 |
shogun-notifier- | shogun: Sergey Lisitsyn :develop * 5bd5d07 / CMakeLists.txt: https://github.com/shogun-toolbox/shogun/commit/5bd5d07fb4e14582da91b71608a73a35c9957bb9 | 17:25 |
shogun-notifier- | shogun: Merge pull request #2868 from lisitsyn/bugfix/yosemite_clang | 17:25 |
shogun-notifier- | shogun: | 17:25 |
shogun-notifier- | shogun: Remove inline-functions option | 17:25 |
-!- PirosB3 [~pirosb3@host116-44-dynamic.55-82-r.retail.telecomitalia.it] has quit [Quit: PirosB3] | 17:27 | |
lisitsyn | HeikoS: hey | 17:28 |
@HeikoS | lisitsyn: hey there | 17:28 |
lisitsyn | HeikoS: I glanced over optimizers PRs | 17:28 |
lisitsyn | I have no idea :D | 17:28 |
lisitsyn | I have a feeling it has too much pointer juggling | 17:29 |
lisitsyn | and code could be reduced | 17:29 |
@HeikoS | yeah | 17:29 |
@HeikoS | lisitsyn: I was asking you because you have a good idea of how to build elegant c++ code | 17:30 |
@HeikoS | what wu is doing is a bit crazy | 17:30 |
@HeikoS | like 5 new classes | 17:30 |
@HeikoS | for things like cost functions | 17:30 |
@HeikoS | etc | 17:30 |
lisitsyn | HeikoS: ok we need faster chat because I constantly forget commenting PRs | 17:30 |
lisitsyn | HeikoS: one approach I find quite good is | 17:30 |
lisitsyn | to write test/example first | 17:31 |
lisitsyn | and then write the api to satisfy it | 17:31 |
lisitsyn | we need a few use cases | 17:31 |
lisitsyn | say | 17:31 |
lisitsyn | we have a function blabla, want to minimize it | 17:32 |
@HeikoS | I see | 17:32 |
lisitsyn | I think this could be better | 17:32 |
@HeikoS | The thing is that Wu wants things to be modular | 17:32 |
@HeikoS | and I agree on that | 17:32 |
@HeikoS | so his approach is just taking the OOP way | 17:32 |
@HeikoS | but all these objects inherit from CSGObject | 17:32 |
@HeikoS | so we get like 800 lines of code per class | 17:33 |
lisitsyn | OOP is not the best way you know | 17:33 |
@HeikoS | yes | 17:33 |
lisitsyn | sometimes we just need a function | 17:33 |
@HeikoS | exactly | 17:33 |
@HeikoS | this is what I had in mind | 17:33 |
lisitsyn | there is no need for interface of the function we minimize | 17:33 |
@HeikoS | yep | 17:33 |
lisitsyn | and | 17:33 |
lisitsyn | actually | 17:33 |
@HeikoS | could you say a few things on this in the PR? | 17:33 |
lisitsyn | HeikoS: yeah but I am not sure how we should steer wu the right way | 17:34 |
lisitsyn | I mean we risk overcriticizing :) | 17:34 |
lisitsyn | HeikoS: do you have a good understanding of what the goal is? | 17:35 |
@HeikoS | yes | 17:35 |
@HeikoS | GPs have different objective functions | 17:35 |
@HeikoS | and there are different ways to optimise | 17:35 |
@HeikoS | so he wants to do that from a base class | 17:35 |
@HeikoS | but I feel that function pointers should do things here | 17:35 |
@HeikoS | lisitsyn: he can take criticism | 17:36 |
@HeikoS | lisitsyn: the thing is I cannot merge these giant PRs | 17:36 |
@HeikoS | I have no idea | 17:36 |
lisitsyn | I have a suggestion | 17:36 |
lisitsyn | to not use shogun things for these optimizers | 17:36 |
lisitsyn | at all | 17:36 |
lisitsyn | I mean use simple (maybe even C) api | 17:36 |
lisitsyn | what do you think? | 17:36 |
-!- travis-ci [~travis-ci@ec2-54-163-170-244.compute-1.amazonaws.com] has joined #shogun | 17:36 | |
travis-ci | it's Sergey Lisitsyn's turn to pay the next round of drinks for the massacre he caused in shogun-toolbox/shogun: https://travis-ci.org/shogun-toolbox/shogun/builds/72853431 | 17:36 |
-!- travis-ci [~travis-ci@ec2-54-163-170-244.compute-1.amazonaws.com] has left #shogun [] | 17:36 | |
lisitsyn | I fell in love with C api | 17:37 |
lisitsyn | it is much cleaner just always :D | 17:37 |
lisitsyn | I would even write tapkee in C today | 17:37 |
@HeikoS | maybe | 17:38 |
shogun-buildbot | build #30 of FC22 - libshogun is complete: Failure [failed test] Build details are at http://buildbot.shogun-toolbox.org/builders/FC22%20-%20libshogun/builds/30 blamelist: Sergey Lisitsyn <lisitsyn.s.o@gmail.com> | 17:38 |
@HeikoS | lisitsyn: depends on the details | 17:38 |
@HeikoS | lisitsyn: could you have a look in the PR? | 17:38 |
shogun-buildbot | build #1045 of FCRH - libshogun is complete: Failure [failed test] Build details are at http://buildbot.shogun-toolbox.org/builders/FCRH%20-%20libshogun/builds/1045 blamelist: Sergey Lisitsyn <lisitsyn.s.o@gmail.com> | 17:38 |
@HeikoS | I also feel Wu should probably draw a class diagram to discuss things | 17:38 |
lisitsyn | ok let me comment | 17:38 |
shogun-buildbot | build #2710 of bsd1 - libshogun is complete: Failure [failed test] Build details are at http://buildbot.shogun-toolbox.org/builders/bsd1%20-%20libshogun/builds/2710 blamelist: Sergey Lisitsyn <lisitsyn.s.o@gmail.com> | 17:40 |
shogun-buildbot | build #2667 of deb3 - modular_interfaces is complete: Failure [failed csharp modular] Build details are at http://buildbot.shogun-toolbox.org/builders/deb3%20-%20modular_interfaces/builds/2667 blamelist: Sergey Lisitsyn <lisitsyn.s.o@gmail.com> | 17:40 |
lisitsyn | HeikoS: ok commented | 17:43 |
shogun-buildbot | build #653 of deb4 - python3 is complete: Failure [failed test python modular] Build details are at http://buildbot.shogun-toolbox.org/builders/deb4%20-%20python3/builds/653 blamelist: Sergey Lisitsyn <lisitsyn.s.o@gmail.com> | 17:47 |
@wiking | boooo | 17:52 |
@wiking | fuck i wont be able to be awake again for stammtisch | 17:52 |
lisitsyn | wiking: ae | 17:59 |
@wiking | ye | 18:00 |
-!- PirosB3 [~pirosb3@host116-44-dynamic.55-82-r.retail.telecomitalia.it] has joined #shogun | 18:21 | |
-!- PirosB3 [~pirosb3@host116-44-dynamic.55-82-r.retail.telecomitalia.it] has quit [Client Quit] | 18:21 | |
-!- yorkerlin [b8af2f1e@gateway/web/freenode/ip.184.175.47.30] has joined #shogun | 18:31 | |
yorkerlin | hi | 18:31 |
yorkerlin | lisitsyn, please take a look at the comments I wrote at https://github.com/shogun-toolbox/shogun/pull/2876 | 18:33 |
lisitsyn | yorkerlin: hey | 18:33 |
lisitsyn | sure | 18:33 |
lisitsyn | yorkerlin: ok I have a suggestion how can we simplify design | 18:35 |
yorkerlin | ok | 18:35 |
lisitsyn | what if we write the code that uses that 'library' | 18:35 |
lisitsyn | not-yet-existing | 18:35 |
lisitsyn | I mean how do you see it being used | 18:35 |
yorkerlin | I think it is a tool for ML developers | 18:37 |
lisitsyn | do you see it as a standalone thing? | 18:37 |
yorkerlin | what do you mean standalone thing? | 18:37 |
lisitsyn | I mean if you write that thing so it can be compiled without shogun | 18:38 |
lisitsyn | it could be better | 18:38 |
lisitsyn | people could take it and use | 18:38 |
-!- PirosB3 [~pirosb3@host116-44-dynamic.55-82-r.retail.telecomitalia.it] has joined #shogun | 18:38 | |
lisitsyn | and you don't have to bother with our interface oriented things like our own map and sgvector | 18:38 |
yorkerlin | ic, should I use std lib for vector? | 18:39 |
yorkerlin | std::vector? | 18:40 |
lisitsyn | yorkerlin: no I mean you would be free to use whatever you want to | 18:40 |
yorkerlin | yes. if we do not use sgvector, the framework can be compiled without shogun | 18:41 |
yorkerlin | and cmap | 18:41 |
lisitsyn | yorkerlin: there is also a cool technique to write C++ code under C api | 18:42 |
lisitsyn | it speeds up compilation and makes it usable from say python | 18:42 |
lisitsyn | I'm not pushing this, just telling you the options | 18:42 |
yorkerlin | any reference | 18:42 |
lisitsyn | ok let me find | 18:42 |
yorkerlin | lisitsyn, did you see the comments about learning rate and descenupdater? (function pointers seem to be not enough) | 18:45 |
lisitsyn | yorkerlin: hmm didn't find it yet, ok let me send you one thing | 18:45 |
yorkerlin | ok | 18:45 |
lisitsyn | yorkerlin: do you feel like glancing over a book? | 18:45 |
yorkerlin | yes | 18:46 |
lisitsyn | I have been reading a cool book on C++ API | 18:46 |
lisitsyn | just a sec | 18:46 |
yorkerlin | API Design for C++? | 18:46 |
lisitsyn | yap | 18:47 |
lisitsyn | yeap* | 18:47 |
yorkerlin | ok, I have a copy of the book. | 18:47 |
lisitsyn | ah | 18:48 |
lisitsyn | ok :) | 18:48 |
lisitsyn | ok lets get back to this thing then | 18:48 |
yorkerlin | ok | 18:48 |
lisitsyn | I have got another idea meanwhile | 18:49 |
yorkerlin | :) | 18:49 |
lisitsyn | I have seen some pretty design here at my job | 18:49 |
lisitsyn | :) | 18:49 |
lisitsyn | so if we have some optimizer | 18:49 |
lisitsyn | it could be good idea to split mutable and immutable parts | 18:49 |
yorkerlin | hmm, good suggestion | 18:50 |
lisitsyn | yorkerlin: so you could create some class that contains all state of optimizer | 18:50 |
lisitsyn | current pointer, gradient | 18:50 |
lisitsyn | err sorry | 18:50 |
lisitsyn | pointer=point :) | 18:50 |
yorkerlin | state, u mean the mutable part? | 18:52 |
lisitsyn | yes | 18:52 |
lisitsyn | basically anything you need to stop and resume | 18:52 |
lisitsyn | so that other parts are just a matter of configuration | 18:52 |
yorkerlin | yes. one issue is learning rate, gradient update, penalty have many choices. | 18:54 |
lisitsyn | ok as I can see | 18:55 |
lisitsyn | LR, gradient, penalty just modify this mutable state | 18:55 |
yorkerlin | yes | 18:56 |
lisitsyn | so we can just make them functions over the mutable state | 18:56 |
lisitsyn | a set of functions | 18:56 |
lisitsyn | it would be less clutter I think | 18:56 |
lisitsyn | interfaces are ok but they bloat code a bit | 18:57 |
lisitsyn | in java they are the only way | 18:57 |
lisitsyn | but here we can make it shorter | 18:57 |
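
A minimal sketch of the split lisitsyn describes here, written in Python for brevity rather than in Shogun's C++. One small class holds everything needed to stop and resume, and the learning-rate step and penalty are plain functions that modify that state. All names (`OptimizerState`, `gradient_step`, `l2_penalty`, `minimize`) are hypothetical and are not taken from the PR or from the gist linked later in the log.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class OptimizerState:
    """Mutable part: everything needed to stop and resume."""
    point: List[float]
    gradient: List[float] = field(default_factory=list)
    iteration: int = 0


# A step is any function that modifies the state in place.
Step = Callable[[OptimizerState], None]


def gradient_step(learning_rate: float) -> Step:
    """Plain descent step; learning_rate is fixed configuration."""
    def apply(state: OptimizerState) -> None:
        state.point = [x - learning_rate * g
                       for x, g in zip(state.point, state.gradient)]
    return apply


def l2_penalty(strength: float) -> Step:
    """Adds the gradient of 0.5 * strength * ||x||^2 to the stored gradient."""
    def apply(state: OptimizerState) -> None:
        state.gradient = [g + strength * x
                          for g, x in zip(state.gradient, state.point)]
    return apply


def minimize(grad: Callable[[List[float]], List[float]],
             state: OptimizerState,
             steps: List[Step],
             iterations: int) -> OptimizerState:
    """Generic loop: compute the gradient, then apply each step in order."""
    for _ in range(iterations):
        state.iteration += 1
        state.gradient = grad(state.point)
        for step in steps:
            step(state)
    return state


# Usage: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
state = minimize(lambda p: [2.0 * (p[0] - 3.0)],
                 OptimizerState(point=[0.0]),
                 steps=[gradient_step(0.1)],
                 iterations=100)
```

Because the steps are applied in list order, a penalty has to come before the descent step for it to affect the update. That ordering hazard is one way such functions can be mixed up.
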
yorkerlin | so this set of functions should share the same function signature, right? | 18:58 |
lisitsyn | yeah | 18:58 |
yorkerlin | for example void fun1(int a,int b), void fun2(int a, int b) | 18:58 |
lisitsyn | just a sec | 18:59 |
lisitsyn | hmm I am not sure now | 19:00 |
lisitsyn | probably functions are not that good because we can mix them up | 19:00 |
yorkerlin | let me give an example | 19:00 |
yorkerlin | for inverse scaling learning rate, learning_rate = eta0 / pow(t, power_t), where t is the number of times we have called get_learning_rate(...); eta0 and power_t are given by the user or take default values. | 19:00 |
yorkerlin | for const learning rate, learning_rate=const_learning_rate, where const_learning_rate is given by the user or takes a default value | 19:01 |
lisitsyn | https://gist.github.com/lisitsyn/32bb2a13aacd9cbf9b1b | 19:02 |
lisitsyn | yorkerlin: something like that? | 19:02 |
yorkerlin | ok. I am looking at it | 19:02 |
lisitsyn | yorkerlin: I mean optimizer would be pretty generic | 19:04 |
yorkerlin | yes | 19:05 |
lisitsyn | I'll be around later, like in an hour | 19:05 |
yorkerlin | ic. | 19:05 |
lisitsyn | going home :) | 19:05 |
yorkerlin | ok see then | 19:05 |
lisitsyn | see you a bit later | 19:05 |
-!- shaochuan [~shaochuan@2601:647:4600:fac5:e97f:35a0:bfa0:fc28] has joined #shogun | 19:11 | |
-!- shaochuan [~shaochuan@2601:647:4600:fac5:e97f:35a0:bfa0:fc28] has quit [Ping timeout: 246 seconds] | 19:15 | |
@HeikoS | yorkerlin: hi! | 20:02 |
yorkerlin | hi | 20:02 |
@HeikoS | yorkerlin: good to catch you, how are things? | 20:04 |
yorkerlin | good. lisitsyn suggested another way to implement the framework. | 20:04 |
@HeikoS | yorkerlin: just reading through the logs | 20:05 |
@HeikoS | yorkerlin: do you see why I am a bit concerned about the current structure? | 20:05 |
yorkerlin | because of CSGObject ? | 20:06 |
@HeikoS | yorkerlin: yeah thats one thing | 20:06 |
@HeikoS | yorkerlin: whenever we add a new class that inherits from CSGObject, we get a few hundred lines of code to compile | 20:07 |
yorkerlin | ic | 20:07 |
@HeikoS | yorkerlin: and all the overhead from the base class that we dont need | 20:07 |
@HeikoS | yorkerlin: only objects that are directly exposed via the interfaces should inherit from this class | 20:07 |
yorkerlin | one issue is how to deal with serialization | 20:08 |
@HeikoS | yorkerlin: the other thing is the sheer number of classes. The OOP approach here is a bit convoluted in my eyes | 20:08 |
yorkerlin | if we do not use CSGObject as the base class | 20:08 |
@HeikoS | yorkerlin: how do you mean? | 20:08 |
@HeikoS | yorkerlin: can use CSGObject for the main class, and make it have pointers to the data where needed. | 20:08 |
@HeikoS | yorkerlin: but things like cost functions do not need to be serialised, or? | 20:09 |
yorkerlin | I am not sure how the serialization work in shogun. I think CSGObject take care of the serialization work. | 20:09 |
@HeikoS | yorkerlin: yes thats what it does | 20:10 |
yorkerlin | that is why I use the CSGObject as the base class | 20:10 |
@HeikoS | yorkerlin: btw I agree that it is bad to have global variables | 20:11 |
@HeikoS | yorkerlin: just want to avoid the number of classes exploding | 20:11 |
yorkerlin | I think lisitsyn suggested another way https://gist.github.com/lisitsyn/32bb2a13aacd9cbf9b1b/revisions | 20:11 |
@HeikoS | yorkerlin: yeah this seems cleaner, question is: would it work here? | 20:12 |
@HeikoS | It seems like this would allow to flatten out the code a bit | 20:13 |
yorkerlin | yes. I think so. however, there will be many classes (eg, l2penalty, const_learning_rate, gradient_updater) | 20:13 |
yorkerlin | if we want to combine them | 20:13 |
@HeikoS | yorkerlin: if they are not extending CSGObject, all this is fine | 20:14 |
@HeikoS | yorkerlin: they are only used internally right? | 20:14 |
yorkerlin | yes | 20:14 |
@HeikoS | not from say Python | 20:14 |
@HeikoS | its just for the GP code developers | 20:14 |
yorkerlin | in fact we can do this http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html | 20:15 |
yorkerlin | not just for gp | 20:15 |
@HeikoS | yorkerlin: agreed! | 20:15 |
@HeikoS | on my todo list for long time ;) | 20:15 |
@HeikoS | yorkerlin: so but back to the classes | 20:15 |
yorkerlin | ok | 20:15 |
@HeikoS | if you need a state, you can just define a simple class | 20:15 |
@HeikoS | no difference to c-structs in fact, these things almost look the same in memory | 20:16 |
@HeikoS | apart from v-tables if you have virtual methods | 20:16 |
@HeikoS | so you can do these things as long as they are not exposed via serialisation or Python/SWIG | 20:16 |
yorkerlin | about the serialization thing. maybe we can document how the serialization works | 20:17 |
@HeikoS | and you can use constructors and init methods of the CSGObject-based classes to set up the right combination of loss, etc | 20:17 |
@HeikoS | yorkerlin: sure | 20:17 |
@HeikoS | yorkerlin: Though I have the intention to drop this anyways | 20:17 |
@HeikoS | yorkerlin: but: it works like this: | 20:17 |
@HeikoS | yorkerlin: CSGObject has two methods for this: save_serializable, and load_serializable | 20:17 |
@HeikoS | All parameters that were added by SG_ADD macro are serialised | 20:18 |
@HeikoS | easy as that | 20:18 |
@HeikoS | thats why you have to register parameters | 20:18 |
@HeikoS | the methods themselves look horrible and contain a lot of code | 20:18 |
@HeikoS | that is why adding a new CSGObject class to Shogun increases compile time quite a bit | 20:18 |
yorkerlin | so if we do not use CSGObject as base class, when dserializable, we need to new the instance. right? | 20:19 |
yorkerlin | deserializable | 20:19 |
@HeikoS | yorkerlin: how do you mean? | 20:19 |
yorkerlin | how does the deSerialization work | 20:20 |
@HeikoS | we can only de-serialise CSGObject files | 20:20 |
@HeikoS | the way it works is: | 20:20 |
@HeikoS | empty instance is created (init method is called, so class knows all the pointers) | 20:20 |
@HeikoS | and load_serializable allocates and populates the memory of its parameters | 20:20 |
@HeikoS | yorkerlin: it is pretty horrible to be honest. I am for dropping serialisation | 20:21 |
yorkerlin | let me give an example: for inverse scaling learning rate, learning_rate = eta0 / pow(t, power_t), where t is the number of times we have called get_learning_rate(...); eta0 and power_t are given by the user or take default values. | 20:21 |
@HeikoS | yorkerlin: in my MCMC code (python) learning rates are functions of t | 20:21 |
@HeikoS | yorkerlin: and the main class calls them with increasing values of t | 20:21 |
yorkerlin | however, there are many learning rate methods. | 20:22 |
@HeikoS | but I see, you mean there are parameters given by users | 20:22 |
yorkerlin | const learning rate, line search methods, | 20:22 |
yorkerlin | inverse scaling learning rate | 20:22 |
@HeikoS | and they all have different parameters, so we cannot just use an enum | 20:22 |
@HeikoS | because then the main class would have to store all parameters | 20:23 |
@HeikoS | I see | 20:23 |
yorkerlin | the problem is if a new learning rate method is added, we may need to modify the main class :( | 20:23 |
@HeikoS | yorkerlin: agreed that is not good | 20:23 |
@HeikoS | yorkerlin: but CSGObject as base class is not a good idea either | 20:24 |
@HeikoS | yorkerlin: will users change the type of learning rate? | 20:24 |
@HeikoS | yorkerlin: how many learning rates will there be? | 20:24 |
@HeikoS | yorkerlin: in many places in Shogun, such situations are solved via an enum and a switch in the main class | 20:25 |
yorkerlin | users (I mean, developer) have to init a learning rate instance | 20:25 |
@HeikoS | see QuadraticTimeMMD | 20:25 |
-!- shogun-notifier- [~irker@7nn.de] has quit [Quit: transmission timeout] | 20:25 | |
yorkerlin | the issue is learning rate may contain mutable variables. | 20:26 |
@HeikoS | I see | 20:26 |
yorkerlin | if we do serialization and then deserialization | 20:26 |
@HeikoS | yorkerlin: question is: how many learning rates will there be? | 20:26 |
@HeikoS | yorkerlin: and do they have different numbers of parameters? | 20:26 |
@HeikoS | if there are just 3-4, and then it is unlikely to add another one, these can go into the main class | 20:27 |
@HeikoS | yorkerlin: its not like a kernel, where there are hundreds of possibilities | 20:27 |
yorkerlin | if we do not add line search methods (for batch minimizers), there are 4 learning rate methods | 20:28 |
@HeikoS | how many parameters do they have? | 20:28 |
@HeikoS | how many more could there be? | 20:28 |
yorkerlin | if we add line search methods, | 20:28 |
yorkerlin | at least 3 more methods | 20:28 |
@HeikoS | yorkerlin: but all of them are one-liners as functions, not? | 20:28 |
@HeikoS | yorkerlin: what are the mutable parts of the learning rate? | 20:29 |
@HeikoS | iteration number is not really part of learning rate for example | 20:29 |
yorkerlin | for inverse scaling learning rate, learning_rate = eta0 / pow(t, power_t), | 20:29 |
yorkerlin | r | 20:29 |
yorkerlin | t is the mutable | 20:29 |
yorkerlin | variable | 20:29 |
yorkerlin | eta0, power_t are also mutable I think | 20:30 |
@HeikoS | but can't t be provided by the caller? | 20:30 |
yorkerlin | you need store t in the caller | 20:31 |
@HeikoS | t is a parameter to any learning rate, isnt it? | 20:31 |
@HeikoS | so then why not a static function where the eta0 and the power_t parameters are fixed? | 20:31 |
@HeikoS | like | 20:32 |
@HeikoS | lambda t: eta0/t**power_t | 20:32 |
@HeikoS | and eta0 was defined before | 20:32 |
@HeikoS | curried | 20:32 |
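
A minimal sketch of the curried form HeikoS types above, in the same language as his lambda. Each learning rate is a closure over its fixed parameters and the caller owns the iteration counter `t`. The factory names (`inverse_scaling`, `constant`) and the default values are hypothetical and are not Shogun API.

```python
from typing import Callable

# Every learning rate has the same shape: iteration number in, step size out.
LearningRate = Callable[[int], float]


def inverse_scaling(eta0: float = 0.1, power_t: float = 0.5) -> LearningRate:
    """learning_rate = eta0 / pow(t, power_t), with eta0 and power_t fixed up front."""
    if eta0 <= 0:
        raise ValueError("eta0 must be positive")
    return lambda t: eta0 / (t ** power_t)


def constant(const_learning_rate: float = 0.01) -> LearningRate:
    """Ignores t but keeps the same call signature."""
    return lambda t: const_learning_rate


# The caller supplies t, starting at 1 so the division is well defined.
rate = inverse_scaling(eta0=1.0, power_t=0.5)
steps = [rate(t) for t in range(1, 5)]
```

Adding a new schedule means adding one more factory; the main loop only ever sees a callable of `t`. Schedules that need more than `t`, such as the line search raised next in the log, would need a wider signature.
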
yorkerlin | for SGD, it is possible. for GD, I am not sure yet | 20:33 |
@HeikoS | yorkerlin: what would be a problem there? | 20:33 |
yorkerlin | In line search methods for GD or batch minimizer, we may need to try to guess the learning rate | 20:35 |
yorkerlin | using https://en.wikipedia.org/wiki/Wolfe_conditions | 20:35 |
@HeikoS | yorkerlin: where would that guessing happen and based on which information? | 20:35 |
yorkerlin | gradient and variable from the cost function | 20:36 |
@HeikoS | but thats part of another class then | 20:36 |
@HeikoS | not part of the learning rate, so needs to be passed | 20:37 |
yorkerlin | yes | 20:37 |
@HeikoS | yorkerlin: but then again, one can go back to a static function as all information is passed by the caller | 20:37 |
@HeikoS | and the caller needs to know the parameters of the learning rate anyways (otherwise he cannot modify) | 20:38 |
yorkerlin | yes. again, if a new method is proposed, the static function may be changed. | 20:38 |
@HeikoS | so that breaks the modular structure anyways as two leaf classes need to know each others instance, no? | 20:38 |
yorkerlin | if we are fine with that | 20:38 |
-!- PirosB3 [~pirosb3@host116-44-dynamic.55-82-r.retail.telecomitalia.it] has quit [Quit: PirosB3] | 20:38 | |
@HeikoS | yorkerlin: my opinion is that this doesnt happen often, so I would go with that. | 20:39 |
@HeikoS | yorkerlin: but: | 20:39 |
@HeikoS | it's your code, I just wanted to have a discussion on this | 20:39 |
@HeikoS | lisitsyn should also give his opinion here | 20:39 |
@HeikoS | just to make sure these things are consciously decided | 20:40 |
yorkerlin | another thing is momentum update | 20:40 |
@HeikoS | yorkerlin: details? | 20:41 |
yorkerlin | for momentum update, we need to store a corrected_gradient. for plain update, we do not need to store the corrected_gradient | 20:41 |
yorkerlin | if we go further using inexact Hessian information, we may need to store the information | 20:42 |
yorkerlin | I mean the caller cannot store all this information | 20:43 |
@HeikoS | yorkerlin: if the learning rate class were to store that, where would it get it from? | 20:43 |
yorkerlin | gradientupdater should store the needed information | 20:43 |
yorkerlin | oh, momentum update is not learning rate | 20:44 |
@HeikoS | yorkerlin: I am confused | 20:44 |
yorkerlin | I agree with you that learning rate can use some static functions | 20:44 |
@HeikoS | yorkerlin: btw: do you have an idea how other strongly typed optimisation toolboxes solve these issues? | 20:45 |
yorkerlin | let me find some | 20:46 |
@HeikoS | yorkerlin: just asking, since I don't | 20:46 |
@HeikoS | the gradient information in the momentum are only helper variables right? | 20:47 |
@HeikoS | I mean they do not need to be visible to the outside | 20:47 |
@HeikoS | and they do not need to be serialised | 20:47 |
@HeikoS | so I guess we could just use classes that dont inherit from CSGObject | 20:47 |
yorkerlin | however, the gradient information in the momentum is mutable | 20:47 |
@HeikoS | yorkerlin: sure, but that is only read/written during the algorithm | 20:48 |
@HeikoS | so no serialisation happening then, right? | 20:48 |
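
A minimal sketch of the point being made about momentum: the updater can be a plain class with no framework base class that keeps `corrected_gradient` as an internal helper, while a plain updater keeps nothing. The class names and the classical momentum formula used here are assumptions for illustration, written in Python rather than Shogun's C++.

```python
from typing import List


class PlainUpdater:
    """Stateless update: nothing to store between calls."""

    def update(self, point: List[float], gradient: List[float],
               learning_rate: float) -> List[float]:
        return [x - learning_rate * g for x, g in zip(point, gradient)]


class MomentumUpdater:
    """Plain class that owns its helper state; not visible to the caller."""

    def __init__(self, momentum: float = 0.9) -> None:
        self.momentum = momentum
        self.corrected_gradient: List[float] = []  # internal helper variable

    def update(self, point: List[float], gradient: List[float],
               learning_rate: float) -> List[float]:
        if not self.corrected_gradient:
            self.corrected_gradient = [0.0] * len(point)
        # Classical momentum (an assumption): v <- momentum * v - lr * g
        self.corrected_gradient = [
            self.momentum * v - learning_rate * g
            for v, g in zip(self.corrected_gradient, gradient)
        ]
        return [x + v for x, v in zip(point, self.corrected_gradient)]


# Usage: two momentum steps with a constant gradient of 2.0.
updater = MomentumUpdater(momentum=0.5)
first = updater.update([1.0], [2.0], learning_rate=0.25)
second = updater.update(first, [2.0], learning_rate=0.25)
```

Both updaters expose the same `update` call, so the main class does not need to know which one stores extra information.
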
@HeikoS | yorkerlin: I gotta go now, good talking to you | 20:49 |
yorkerlin | for incremental inference, maybe not | 20:49 |
yorkerlin | ok | 20:49 |
@HeikoS | yorkerlin: maybe have another chat with sergey | 20:49 |
@HeikoS | and lambday | 20:49 |
yorkerlin | ok | 20:49 |
@HeikoS | he is really good in these things | 20:49 |
@HeikoS | maybe send him an email, he might not have seen the discussion in the PR | 20:49 |
@HeikoS | yorkerlin: Im back tomorrow, we could chat then again | 20:50 |
yorkerlin | the issues all come from "serialization and then deserialization" | 20:50 |
yorkerlin | ok | 20:50 |
@HeikoS | yorkerlin: keep in mind we might drop it | 20:50 |
@HeikoS | yorkerlin: but yeah you are right | 20:51 |
@HeikoS | yorkerlin: lets just avoid to inherit from CSGObject if possible. Only the main interfaced classes need to | 20:51 |
@HeikoS | yorkerlin: take care, bye | 20:51 |
yorkerlin | ok | 20:51 |
yorkerlin | 8 | 20:51 |
yorkerlin | for incremental inference, we first run the minimizer on the existing data. we then do serialization. at some point, we do deserialization and do an incremental update. | 20:52 |
yorkerlin | we want the mutable variables when we do the incremental update. | 20:53 |
yorkerlin | this may be what is called warm_start at http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html | 20:56 |
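
A minimal sketch of the stop-and-resume concern yorkerlin raises: if the mutable variables live in one small record, saving and restoring them for a later incremental update is a round trip over that record. JSON stands in here for whatever serialization is used and is not how CSGObject does it; `MinimizerState`, `save_state` and `load_state` are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class MinimizerState:
    """The mutable variables needed to continue an interrupted minimization."""
    point: List[float]
    corrected_gradient: List[float]
    iteration: int


def save_state(state: MinimizerState) -> str:
    """Serialize the state to a JSON string."""
    return json.dumps(asdict(state))


def load_state(text: str) -> MinimizerState:
    """Rebuild the state from a JSON string produced by save_state."""
    return MinimizerState(**json.loads(text))


# Usage: snapshot after the first run, restore before the incremental update.
snapshot = save_state(MinimizerState(point=[0.5, -1.0],
                                     corrected_gradient=[0.1, 0.0],
                                     iteration=42))
restored = load_state(snapshot)
```
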
-!- HeikoS [~heiko@nat-187-200.internal.eduroam.ucl.ac.uk] has quit [Ping timeout: 244 seconds] | 20:56 | |
yorkerlin | hi, I am leaving now. will send an email to you guys | 21:20 |
-!- yorkerlin [b8af2f1e@gateway/web/freenode/ip.184.175.47.30] has quit [Quit: Page closed] | 21:20 | |
-!- PirosB3 [~pirosb3@host116-44-dynamic.55-82-r.retail.telecomitalia.it] has joined #shogun | 22:32 | |
-!- PirosB3 [~pirosb3@host116-44-dynamic.55-82-r.retail.telecomitalia.it] has quit [Client Quit] | 22:36 | |
-!- thoralf [~thoralf@ip5b4223a1.dynamic.kabel-deutschland.de] has joined #shogun | 22:50 | |
-!- mode/#shogun [+o thoralf] by ChanServ | 22:50 | |
@thoralf | Hey! | 22:50 |
lisitsyn | hey | 22:51 |
-!- thoralf [~thoralf@ip5b4223a1.dynamic.kabel-deutschland.de] has quit [Quit: Konversation terminated!] | 23:02 | |
--- Log closed Tue Jul 28 00:00:11 2015 |