IRC logs of #shogun for Saturday, 2018-03-17

--- Log opened Sat Mar 17 00:00:44 2018
-!- Farouk [81617d04@gateway/web/freenode/ip.129.97.125.4] has quit [Quit: Page closed]00:05
-!- witness [uid10044@gateway/web/irccloud.com/x-xddnldgkqtduxlsc] has quit [Quit: Connection closed for inactivity]00:53
@sukey[https://github.com/shogun-toolbox/shogun] Pull Request https://github.com/shogun-toolbox/shogun/pull/4208 opened by FaroukY06:12
-!- Farouk [81617d04@gateway/web/freenode/ip.129.97.125.4] has joined #shogun06:13
-!- Farouk [81617d04@gateway/web/freenode/ip.129.97.125.4] has quit [Client Quit]06:18
-!- Jinquan [72d455ad@gateway/web/freenode/ip.114.212.85.173] has joined #shogun08:02
-!- Jinquan [72d455ad@gateway/web/freenode/ip.114.212.85.173] has quit [Client Quit]08:04
-!- tctara [~quassel@irc.redcorelinux.org] has quit [Ping timeout: 240 seconds]10:28
@wikinglisitsyn, ping?11:11
lisitsynwiking: hey11:12
@wikingi have some questions11:12
lisitsynsure11:12
@wikingso we were talking with Heiko about coreml and implementing an executor for that11:12
@wikingso that we can use coreml exported models easily in shogun11:12
lisitsynaha11:12
@wikingso i was wondering how you would optimize the generated model?11:13
@wikingmeaning, you know, that it can define a pipeline11:13
@wikingand of course we could simply do just a vanilla implementation11:13
@wikingwhere you implement all those modules in coreml11:14
@wikingand apply on the features 1-by-111:14
lisitsynoptimize like train with warm start?11:14
@wikingbut i guess the reason why CoreML api has compileModel11:14
@wikingis that they actually can do optimization over the pipeline11:14
@wikingthat way there are no temporary values etc11:14
@wikingsee what i mean?11:14
@wikinghttps://developer.apple.com/documentation/coreml/mlmodel/2921516-compilemodel11:15
lisitsynsounds like apple specific11:15
@wikingi mean as a beginning of course we can have the whole thing vanilla11:15
@wikingbut i guess later on it would be good to be able to do this11:15
lisitsynI think coreml is just a transport for models11:15
@wikingthat of course means that we have a sort of JIT capability11:16
lisitsynand we just convert it into our vanilla models11:16
@wikingyeye11:16
lisitsynI mean if it is a linear model we should just fill the weights11:16
@wikingi mean now i'm trying to finish up the mini example11:16
@wikingwhere i've trained scikit-learn LR11:16
@wikingon boston housing dataset11:16
@wikingit's a simple regression model11:16
@wikingit consists of 2 stages11:16
@wikinga) featurevectorizer11:16
@wikingb) glm regressor11:17
lisitsynah11:17
lisitsynthis thing11:17
-!- tctara [~quassel@irc.redcorelinux.org] has joined #shogun11:17
@wikinganyhow11:18
lisitsynI guess then we load(somefile)11:18
lisitsynand it returns a set of objects11:18
@wikingdo we want that actually?11:18
lisitsynpreprocessor, machine, blabla11:18
@wikingi mean do we actually wanna expose the model?11:18
lisitsynexpose like?11:18
@wikingi thought this would be the simple api11:18
@wikingnamespace nobunaga11:19
@wiking{11:19
@wiking    typedef std::shared_ptr<arrow::Table> Features;11:19
@wiking    class Model11:19
@wiking    {11:19
@wiking    public:11:19
@wiking        explicit Model(std::istream& is);11:19
@wiking        ~Model();11:19
@wiking        void predict(Features f) const;11:19
@wiking    private:11:19
@wiking        class Self;11:19
@wiking        std::unique_ptr<Self> m_self;11:19
@wiking    };11:19
@wiking}11:19
@wikingthis is all you can have11:19
@wikingnote that predict will need a return value11:19
@wikingbut that'll be handled later11:19
@wikingbut the idea is this11:19
@wikingand note that we use arrow for features11:19
@wiking(the reason for that is that this way we can easily solve the problem of GPU vs CPU operations.... arrow has a GPU/CUDA memory backend)11:20
lisitsynisn't it ArrowFeatures?11:20
lisitsynI mean ArrowFeatures is an instance of Features11:20
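For reference, here is the API sketch pasted above assembled into a self-contained header. This is a minimal sketch only: the includes and comments are added here, predict still returns void as noted in the discussion, and nothing beyond what was pasted is implied.

    // nobunaga.hpp -- the executor API sketch from the discussion above,
    // made self-contained (assumes Apache Arrow's C++ headers are installed)
    #include <istream>
    #include <memory>

    #include <arrow/table.h>  // arrow::Table

    namespace nobunaga
    {
        typedef std::shared_ptr<arrow::Table> Features;

        class Model
        {
        public:
            explicit Model(std::istream& is);  // deserialize a coreml model
            ~Model();                          // out-of-line, so unique_ptr<Self> can stay incomplete here
            void predict(Features f) const;    // return value to be added later, as noted above

        private:
            class Self;                        // pimpl: hides the internal representation
            std::unique_ptr<Self> m_self;
        };
    }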
lisitsynI think the most straightforward is to map coreml objects into shogun objects11:21
lisitsynI mean I'd treat coreml as just another transport for models11:23
@wikingmmm11:24
@wikingthe reason i dont want to11:24
@wikingtie it to shogun is because if you just wanna have an executor11:24
@wikingthen you dont need shogun11:24
@wikingwe have waaaaaay too much stuff there11:24
@wikingwhereas this is just some very simple linalg11:24
@wikingover arrays11:24
@wikingright?11:24
@wikingwe dont need the whole linalg abstraction11:24
@wikingwe dont need GMM etc etc11:24
@wikingfor being able to run a pretrained coreml model11:25
lisitsynok then it might be that we need both11:25
@wikingthis should be a lightweight package that can be fully used within shogun11:25
lisitsynI mean having 'executor' of models11:25
lisitsynand being able to load coreml model11:25
@wiking+ of course being able to export11:25
@wikingyeye11:25
@wikingthat's another story11:25
@wikingfirst it'd be great11:25
lisitsynit could be that11:25
@wikingif we could just use within shogun11:25
@wikingtf, xgboost, sklearn, catboost models11:26
lisitsynyeap11:26
@wiking(as all of them can be exported into coreml)11:26
@wikingi mean i'm happy to go and do the vanilla implementation11:26
@wikingthe question is of course11:26
@wikingi'll try to push the GLM11:26
@wikingpart that 'works'11:27
@wikingso we can iterate on the code11:27
@wikingand not just talk in the air11:27
lisitsynok sounds good11:27
@wikingbut i was wondering11:27
@wikinghow would you solve the problem11:27
@wikingof being able to "JIT" it11:27
@wikingmeaning converting it into a more optimized version11:27
@wikingonce you have the pipeline11:27
@wikingas the coreml pipeline is just a simple vanilla descriptor11:27
@wikingof the elements11:27
@wikingusing protobuf11:28
lisitsynnot sure if there is a lot of room to optimize11:28
lisitsynI mean it is obviously suboptimal to work over a protobuf instance11:28
@wikingyeah we wouldn't work over it11:28
lisitsynbut once you load into some internal representation11:28
lisitsynlike array<float> blabla11:28
@wikingwe just use it for our internal mapping11:28
@wikingbut i wouldn't expose it11:28
@wikingsee Self11:28
lisitsynnot sure what has to be optimized11:28
lisitsynyeah sure11:28
lisitsynI mean what're the things to be optimized?11:29
@wikingsay that you have a normalization11:29
@wikingbefore the glm11:29
@wikingok?11:29
@wikingthat is11:29
@wikingg(f(x))11:29
@wikingright11:29
@wiking?11:29
lisitsynso you mean combine them into one function?11:29
@wikingyes11:29
@wikingas you can do it11:29
lisitsynuh11:29
lisitsyndon't know11:29
lisitsynllvm yolo11:29
@wikingyes11:29
@wikingit is11:29
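For concreteness, a minimal sketch of the fusion being discussed: a per-feature normalization f(x)_i = (x_i - mu_i) / sigma_i followed by a GLM g(z) = w.z + b collapses into a single affine map, so a compiled pipeline evaluates one dot product instead of two stages. All names here are illustrative; this is not an existing shogun or coreml API.

    #include <cstddef>
    #include <vector>

    // Fuse g(f(x)) where f is a per-feature normalization and g is a GLM:
    //   g(f(x)) = sum_i w_i * (x_i - mu_i) / sigma_i + b
    //           = sum_i (w_i / sigma_i) * x_i + (b - sum_i w_i * mu_i / sigma_i)
    struct FusedGlm
    {
        std::vector<double> weights;  // w_i / sigma_i
        double bias;                  // b - sum_i w_i * mu_i / sigma_i
    };

    FusedGlm fuse(const std::vector<double>& w, double b,
                  const std::vector<double>& mu, const std::vector<double>& sigma)
    {
        FusedGlm fused;
        fused.weights.resize(w.size());
        fused.bias = b;
        for (std::size_t i = 0; i < w.size(); ++i)
        {
            fused.weights[i] = w[i] / sigma[i];
            fused.bias -= w[i] * mu[i] / sigma[i];
        }
        return fused;
    }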
@wikingi mean this is future work of course11:30
@wikingquestion is what are the constraints for this11:30
@wikingin a long run11:30
lisitsynI haven't touched this thing but heard some success stories11:30
@wikingso that we dont come up with a design11:30
lisitsynI see your concern11:30
@wikingthat totally doesn't let you do this11:30
@wikinglater11:30
@wikingi mean11:30
@wikinghonestly11:30
@wikingthere's the story of xtensor11:30
@wikingand eigen11:30
lisitsynyeah I see11:30
@wikingin both cases you can do11:30
@wikingg(f(a(...)))11:31
@wikingand then that is only executed11:31
@wikingonce there's a 'trigger'11:31
@wiking.execute()11:31
@wikingor whatever11:31
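As an illustration of the 'trigger' pattern: Eigen builds expression templates and only evaluates them when the result is materialized, e.g. by assignment. A minimal sketch using standard Eigen, nothing shogun-specific:

    #include <iostream>
    #include <Eigen/Dense>

    int main()
    {
        Eigen::MatrixXd a = Eigen::MatrixXd::Random(100, 100);
        Eigen::MatrixXd b = Eigen::MatrixXd::Random(100, 100);

        // builds an expression tree only -- no arithmetic has happened yet
        auto expr = a.cwiseProduct(b) + a;

        // the whole tree is evaluated in one pass when it is assigned;
        // the assignment is the 'trigger' mentioned above
        Eigen::MatrixXd result = expr;

        std::cout << result(0, 0) << std::endl;
        return 0;
    }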
@wikingi mean we dont need to go with this design right away11:31
@wikingi'm just wondering what are the constraints in this case11:31
@wikingso that we go with the right approach11:31
@wikingthat enables this later11:31
lisitsynok let me think11:32
@wikingi mean i guess11:32
@wikingthen we need to go header only internally :P11:32
lisitsynI think that has to be in pipeline11:32
lisitsynso once we have some Pipeline11:32
lisitsynwe could have some Pipeline::compile11:32
lisitsynor Pipeline::optimize11:32
lisitsynthat actually tries to optimize it11:33
@wikinghttps://github.com/QuantStack/xtensor/blob/master/include/xtensor/xexpression.hpp11:33
lisitsynonce you call it you don't have access to your models11:33
@wikingyeah indeed11:33
@wikingonce you compile it11:33
@wikingit's just a linalg stack11:33
@wikingthat does operations over the input11:33
lisitsynand I wouldn't even bother exporting them back11:33
@wikingwhat do you mean by exporting them back?11:33
lisitsynI mean it is one-way ticket11:34
@wikingah yeah11:34
@wikingindeed11:34
lisitsynyou compile and then you can't analyze them11:34
lisitsyncheck the weights11:34
@wikingyou cannot reverse it11:34
lisitsynno training anymore11:34
lisitsyn:)11:34
@wikingyeye totally11:34
lisitsynsuch a design would (probably) not restrict us11:35
@wikingbut i mean even in the case of a non-optimized version11:35
@wikingdo you want the user11:35
@wikingto have a reference11:35
@wikingto the models etc?11:35
@wiking(and their params?)11:35
lisitsynin case of coreml?11:35
@wikingsee apple's design is11:35
@wikingthat once you exported into the protobuf coreml format11:35
lisitsynI think it would be very nice thing to have11:35
@wikingyou can just deserialize11:36
@wikingand apply the model11:36
@wikingyou cannot 'observe' nor change anything11:36
lisitsynbut this doesn't have to be in the executor11:36
lisitsynso not for executor but for IO with coreml protos11:36
lisitsynbut in general11:36
lisitsynit would be a nice thing to have even in executor11:36
lisitsynI mean in production you actually want to monitor things11:37
lisitsynI mean I'd put a few graphs like weights norm, maximal weight, minimal weight etc11:37
@wikingmmm11:38
lisitsynand if you can't access that in shogun you'd need some code to access that somewhere else11:38
@wikingyeah we can add these options11:38
@wikingi'm just wondering11:38
@wikingwhether then we should actually11:38
@wikingexpose the protobuf objects?11:39
@wikingi wouldn't do it personally11:39
@wikingbut essentially you could do it :)11:39
lisitsynI am not sure it is protobuf objects11:39
lisitsynit is rather shogun objects or so11:40
@wikingwhat i meant11:40
@wikingthat the easiest way to expose any param11:40
@wikingis to expose the protobuf objects11:40
@wiking:)11:40
lisitsynyeah11:40
@wikingas they essentially contain all the info11:40
lisitsynok it might be11:41
lisitsynthat11:41
lisitsyncoreml protos -> (load) shogun objects -> (compile) optimized pipeline11:41
@wikingmmm11:42
lisitsynwould be the way11:42
@wikingyeah but11:42
@wikingi dont see why it's good to use shogun objects11:42
@wikingthey are super clunky11:42
@wikingand way too much overhead11:42
@wikingfor a simple executor11:42
@wikingright?11:42
lisitsynto let users re-train them etc11:42
@wikingyeah but you can do that11:42
@wikingin the place11:42
@wikingwhere you did it already11:42
@wiking:)11:42
lisitsynso you mean it should be two things11:42
lisitsyn1) coreml proto -> shogun objects11:43
lisitsyn2) coreml proto -> optimized pipeline11:43
lisitsynand third one is actually11:43
@wikingyeah11:43
lisitsynshogun objects -> optimized pipeline11:43
@wikingyeye11:43
@wikingfirst i would do 2)11:43
@wikingthen 3)11:43
@wikingand then 1)11:43
lisitsynsounds good also11:43
@wikingwith 2) we could integrate into shogun easily11:43
@wikingso we can use models11:43
@wikingfrom other libs11:43
@wiking3) we want to be able to serialize our models :P11:44
@wikingand 2) would actually be using all our infra11:44
@wikingfor how to make this multilang11:44
@wikingbecause we could make the whole thing like shogun11:44
@wikingthat you can actually use from all the swigable languages11:44
lisitsynsounds good11:45
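As a sketch, the three conversion paths just listed might look like the declarations below. All of these signatures are hypothetical (neither nobunaga nor these entry points exist); CoreML::Specification::Model is the protobuf type generated from Apple's Model.proto, and shogun::CMachine stands in for shogun's model objects.

    #include <memory>

    namespace CoreML { namespace Specification { class Model; } }  // protobuf type from Model.proto
    namespace shogun { class CMachine; }                           // shogun's base learning machine
    namespace nobunaga { class Model; }                            // the executor sketched earlier

    // 1) coreml proto -> shogun objects (re-trainable, introspectable)
    std::shared_ptr<shogun::CMachine> to_shogun(const CoreML::Specification::Model& proto);

    // 2) coreml proto -> optimized pipeline (one-way, execute-only)
    nobunaga::Model compile(const CoreML::Specification::Model& proto);

    // 3) shogun objects -> optimized pipeline (export/serialize our own models)
    nobunaga::Model compile(const shogun::CMachine& machine);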
@wikingwith arrow you have zero copy11:45
@wikingand then you could actually run a coreml11:45
@wikingmodel in jvm11:45
@wikingor python11:45
@wikingor ruby11:45
@wikingon non-apple systems as well11:45
@wikingimo that'd be good11:45
lisitsynyeah I agree11:46
lisitsynsounds reasonable11:46
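A minimal sketch of the zero-copy point with Arrow's C++ API: memory owned by the host language (numpy, the JVM, etc.) can be wrapped in an arrow::Buffer without copying. Illustrative only; the exact calls vary across Arrow versions.

    #include <memory>
    #include <vector>

    #include <arrow/api.h>

    int main()
    {
        // feature values owned by the caller, e.g. handed over from numpy or the JVM
        std::vector<double> values = {1.0, 2.0, 3.0, 4.0};

        // wrap the existing memory in an arrow::Buffer -- no copy is made
        std::shared_ptr<arrow::Buffer> buffer =
            arrow::Buffer::Wrap(values.data(), values.size());

        // view the buffer as a typed Arrow array, still zero-copy
        auto array = std::make_shared<arrow::DoubleArray>(values.size(), buffer);

        return array->Value(0) == 1.0 ? 0 : 1;
    }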
@wikingok so i'll try to finish this simple glm object11:46
@wikingand push it11:46
@wikingand see what you think about it11:46
@wikingand then slowly we can actually do this for all the things in the coreml definition11:46
lisitsynI wonder if google is going to come up with similar definitions for android11:50
lisitsynthey still haven't released anything like that11:51
@wikingi mean they have serving11:51
lisitsynyeah but it is a bit bs11:51
@wikingit is in a way a similar format11:51
lisitsynI mean tf is not the only way to serve models11:51
@wikingbut has a grpc stack over it11:51
lisitsynit doesn't solve the problem of exporting a model from your jupyter lab notebook11:52
lisitsynthat's why I wonder11:52
@wikingyeah11:52
@wikinggood question11:52
@wikinghttps://github.com/tf-coreml/tf-coreml11:52
@wiking:)11:52
@wikingbut there's already keras -> coreml11:53
lisitsynyeah true11:53
lisitsynbut no similar executor in android still!11:53
lisitsynor not?11:53
@wikingdunno11:53
@wikinghave not heard about it11:54
@wikinghttps://www.quora.com/What-is-the-counterpart-of-Apples-CoreML-in-Android11:54
@wiking:)11:54
lisitsynyeah some tf lite but that's bs11:54
lisitsyn:)11:54
lisitsyncrazy11:54
@wikinghttps://techcrunch.com/2017/05/17/googles-tensorflow-lite-brings-machine-learning-to-android-devices/11:54
@wiking:)11:54
lisitsynand nobody started a startup doing this?!11:55
lisitsyncrazy11:55
@wiking:))))))))))))))11:55
lisitsynthe idea is so obvious I could cry11:55
@wikingraise 10M!11:55
lisitsynI bet you get 100M from nowhere11:55
@wikingyeah let's raise it then11:55
@wikingnobunaga :)11:55
lisitsynbillions of android devices and no general solution11:56
@wikinganyhow lemme see how this works with arrow11:56
@wikingand a simple weight vector :)11:56
-!- baladinha_top [55da31af@gateway/web/freenode/ip.85.218.49.175] has joined #shogun12:08
-!- baladinha_top [55da31af@gateway/web/freenode/ip.85.218.49.175] has quit [Quit: Page closed]12:16
-!- Farouk [81617d04@gateway/web/freenode/ip.129.97.125.4] has joined #shogun16:17
FaroukHi everyone. So I just wanted to finalize my idea for the GSOC project. I want it to be a two-part project. The first part would be to add to the NN component of Shogun. Right now, Shogun only supports training with the mean squared error loss function, by hard-coding the actual derivatives and the backpropagation algorithm.16:25
FaroukI want to change that to use an automatic differentiation library like Adept, which would allow defining an arbitrary error function and then automatically differentiating it and backpropagating. We can also add some extra ready-made loss functions like categorical cross-entropy, etc.16:26
FaroukThe second part of the project would be to use the new, improved NN features to build a NN that plays an Atari game from scratch. That would be a notebook that can showcase the new features of the NN component.16:26
FaroukSo any feedback?16:26
@wikingFarouk sounds good16:38
@wikingmake sure that you have a detailed weekly schedule16:39
@wikingfor those weeks of gsoc16:40
@wikingand that till then you have a couple of prs already merged (by mid april)16:40
FaroukSure, no problem. I have 2 pull requests in review right now.16:41
@wikingregarding16:41
@wikingautodiff16:41
@wikingthere were a couple of discussions about this.... we were looking into stan16:41
FaroukIs that a new autodiff library? I am open to any one really; it's just that I worked a bit on Adept before, so I have more experience with it. But I think it shouldn't be too hard to switch to stan?16:43
@wikingwell16:43
@wikingthe problem is that autodiff in general would only make sense if we wrap the code16:43
@wikingbecause we wanna avoid16:44
@wikingdirect dependencies16:44
@wikingas what if one of the libs stops being developed16:44
FaroukAhh I see.16:44
FaroukSo any additional new library would need to be wrapped first16:45
@wikingideally :)16:45
FaroukHmm so how is the discussion on Stan going. Any updates on it?16:46
FaroukOhh i see that stan is a header-only library. That seems convenient. Had a look at some examples and they look similar to Adept.17:02
@wikingFarouk, yeah stan is a pretty cool library :)18:05
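For context, reverse-mode autodiff with Stan's math library looks roughly like this: the loss is written once in terms of stan::math::var and the derivative falls out of a reverse sweep, instead of hard-coding it per loss function. A minimal sketch, independent of how shogun would eventually wrap it:

    #include <iostream>

    #include <stan/math.hpp>

    int main()
    {
        using stan::math::var;

        // a toy squared-error loss: L(w) = (w * x - y)^2
        var w = 2.0;
        double x = 3.0, y = 5.0;
        var loss = (w * x - y) * (w * x - y);

        // reverse-mode sweep: propagates adjoints back to the leaves
        loss.grad();

        // dL/dw = 2 * (w*x - y) * x = 2 * 1 * 3 = 6
        std::cout << "dL/dw = " << w.adj() << std::endl;
        return 0;
    }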
FaroukSo is it okay to assume that I can use the stan library in my project, or is that still under discussion?18:15
-!- iglesias [~iglesias@f119189.upc-f.chello.nl] has joined #shogun18:16
-!- syashakash [0e8becd2@gateway/web/freenode/ip.14.139.236.210] has joined #shogun18:26
@wikingFarouk, in a way yes18:30
@wikingit's not an easy task :)18:30
@wikingbut yeah18:30
@wikinglet's have the assumption that it can be done18:30
FaroukOkay great then. Thanks for the feedback and the information :)18:31
-!- syashakash [0e8becd2@gateway/web/freenode/ip.14.139.236.210] has quit [Quit: Page closed]18:49
-!- syashakash [0e8becd2@gateway/web/freenode/ip.14.139.236.210] has joined #shogun18:50
-!- HeikoS [~heiko@host86-132-201-109.range86-132.btcentralplus.com] has joined #shogun20:01
-!- mode/#shogun [+o HeikoS] by ChanServ20:01
-!- HeikoS [~heiko@host86-132-201-109.range86-132.btcentralplus.com] has quit [Ping timeout: 260 seconds]20:27
-!- HeikoS [~heiko@host86-132-201-109.range86-132.btcentralplus.com] has joined #shogun20:29
-!- mode/#shogun [+o HeikoS] by ChanServ20:29
-!- Farouk [81617d04@gateway/web/freenode/ip.129.97.125.4] has quit [Quit: Page closed]21:43
-!- HeikoS [~heiko@host86-132-201-109.range86-132.btcentralplus.com] has quit [Ping timeout: 260 seconds]22:22
-!- iglesias [~iglesias@f119189.upc-f.chello.nl] has quit [Quit: leaving]23:09
-!- HeikoS [~heiko@host86-132-201-109.range86-132.btcentralplus.com] has joined #shogun23:25
-!- mode/#shogun [+o HeikoS] by ChanServ23:26
--- Log closed Sun Mar 18 00:00:46 2018

Generated by irclog2html.py 2.10.0 by Marius Gedminas - find it at mg.pov.lt!