IRC logs of #shogun for Sunday, 2013-05-26

--- Log opened Sun May 26 00:00:18 2013
-!- hushell [~hushell@8-92.ptpg.oregonstate.edu] has joined #shogun		00:05
-!- HeikoS [~heiko@176.248.212.176] has joined #shogun		00:27
-!- mode/#shogun [+o HeikoS] by ChanServ		00:27
@HeikoS	sonney2k: can we do multiple inheritance in shogun?	00:32
@HeikoS	I think I remember no, but I forgot	00:32
@HeikoS	whats the best way to do a thing similar to java's interface (which is inheriting a couple of pure virtual methods)	00:32
@HeikoS	lisitsyn: ^	00:33
@lisitsyn	HeikoS: no MI :)	00:34
@HeikoS	lisitsyn: how else?	00:34
@HeikoS	lisitsyn: also, I am thinking of adding a general ComputationTask class	00:34
@HeikoS	which can be registered in another class	00:35
@HeikoS	which then handles computation of all those	00:35
@HeikoS	and different implementations may do it differently (multicore, mpi, etc)	00:35
@lisitsyn	HeikoS: hmm	00:35
@lisitsyn	when did you come to that idea?	00:35
@lisitsyn	it reproduces something that is in my mind too	00:36
@HeikoS	lisitsyn: I need this for log-det project	00:36
@HeikoS	but would be better to have this in general	00:37
@HeikoS	every task implementation should have code how to solve it	00:37
@HeikoS	and the std implementation of CComputationPool	00:37
@HeikoS	just does everything sequentially	00:37
@HeikoS	then we program against that interface	00:37
@HeikoS	and people might come up with more fancy things	00:37
@HeikoS	without changing algorithm code	00:38
@lisitsyn	HeikoS: I see	00:38
@HeikoS	lisitsyn: so lets think about this a bit	00:38
@lisitsyn	but why do you need interfaces there?	00:38
@HeikoS	lisitsyn: nevermind about this	00:38
@HeikoS	lets talk about the computation	00:39
@HeikoS	:)	00:39
@lisitsyn	HeikoS: hah ok	00:39
@lisitsyn	HeikoS: what is computation pool?	00:39
@HeikoS	ok	00:39
@HeikoS	so CComputationPool	00:40
@HeikoS	is an abstract base where one can register tasks	00:40
@HeikoS	and one can call solve_all(), which gives a list of CComputationTaskResult instances	00:40
@HeikoS	register(CComputationTask task)	00:40
@lisitsyn	what is real instances of computation pool?	00:40
@HeikoS	one example:	00:40
@HeikoS	CSequentialComputationPool	00:41
@HeikoS	solve_all just loops over all Tasks and solves them	00:41
@HeikoS	each task knows how it gets solved	00:41
@HeikoS	another example:	00:41
@lisitsyn	okay sequential parallel what else?	00:41
@HeikoS	MPI	00:41
@HeikoS	group like structures	00:41
@lisitsyn	ohh hah	00:41
@HeikoS	problem dependent	00:41
@HeikoS	there are only few generic ones	00:41
@HeikoS	most of them will be problem specific	00:41
@HeikoS	still only one interface from main algorithm	00:42
@HeikoS	CIndependentParallelComputationPool	00:42
@HeikoS	I think of using external libraries for more structured stuff	00:43
@HeikoS	but for now, I am just interested in the interface	00:43
@lisitsyn	what libraries?	00:43
@HeikoS	something to schedule for example	00:43
@HeikoS	but doesnt matter now	00:43
@HeikoS	you could imagine that one class uses graphlab for example	00:43
@HeikoS	if there is a lot of structure	00:44
@HeikoS	but even multicore might be nice	00:44
@lisitsyn	but it looks like doing a task in multicore manner is a more frequent case	00:44
@lisitsyn	there is a strong reason to do tasks multicore	00:45
@HeikoS	yes	00:45
@HeikoS	definitely	00:45
@HeikoS	one class could for example do grid-search in a multicore way	00:45
@lisitsyn	when you do totally different things your context is switching like crazy	00:45
@HeikoS	but bad example since grid-search is already implemented, and not in terms of tasks that one registers	00:45
@lisitsyn	HeikoS: I'd rather call it Queue btw	00:46
@HeikoS	but new things could be written in terms of tasks that one first registers, and then solves	00:46
@lisitsyn	Pool is a different pattern	00:46
@HeikoS	lisitsyn: it is not a queue	00:46
@HeikoS	but agreed on pool is not good	00:46
@lisitsyn	it is not a pool too ;)	00:46
@HeikoS	Set? :)	00:46
@lisitsyn	pool is a set of prepared objects	00:46
@lisitsyn	set is so neutral that it doesn't tell anything	00:47
@HeikoS	Organizer ?=	00:47
@lisitsyn	engine may be	00:47
@HeikoS	Engine is good! :)	00:47
@lisitsyn	well it is engine in graphlab	00:47
@lisitsyn	:D	00:47
@lisitsyn	I've seen they have some fancy algorithms	00:48
@lisitsyn	for philosophers thing	00:48
@HeikoS	indeed	00:48
@HeikoS	this is not what I want to do	00:48
@lisitsyn	HeikoS: why do you need it btw?	00:49
@HeikoS	log-det estimates have to be parallelized	00:49
@HeikoS	can do up to factor few hundred	00:49
@lisitsyn	HeikoS: did you consider opencling it too btw?	00:49
@HeikoS	lisitsyn: I dont want to actually do this for now, but rather prepare it	00:50
@HeikoS	its an experiment	00:50
@HeikoS	other way would be to say:	00:50
@HeikoS	ah nevermind	00:50
@HeikoS	so I want to try it	00:50
@HeikoS	even computing 1 estimate can be parallelized massively	00:50
@lisitsyn	HeikoS: I like idea of formulating all operations as jobs/tasks	00:50
@HeikoS	usually one needs a few hundred of them	00:51
@HeikoS	lisitsyn: yes, thats the experiment, if we can make this work, things might be easier to parallelize	00:51
@HeikoS	which they should	00:51
@lisitsyn	I mean if we call train	00:51
@HeikoS	so many loops of independent things in our code	00:51
@lisitsyn	we should just enqueue some operation	00:51
@HeikoS	exactly	00:51
@HeikoS	also this would separate the code structure from the actual computation a bit more	00:52
@lisitsyn	HeikoS: as for pools - I hope we will get to them too	00:52
@lisitsyn	would be cool to have a thing that manages memory	00:53
@HeikoS	indeed	00:54
@HeikoS	lets experiment with those!	00:54
@lisitsyn	HeikoS: I personally have difficulties with experimenting in shogun	00:54
@lisitsyn	it is big and I have superstitions :D	00:55
@HeikoS	lisitsyn: the best point to do this is when the framework is extended	00:55
@HeikoS	which the log-det project does	00:56
@HeikoS	quite a few classes are necessary for this	00:56
@HeikoS	I wouldnt do it for GP for example	00:56
@HeikoS	there is already too much in single-thred logic	00:56
@HeikoS	thread	00:56
@lisitsyn	HeikoS: I would not go for generic design of that actually	00:57
@lisitsyn	so lets just gradually do that specifically for your task	00:57
@HeikoS	how do you mean that?	00:57
@HeikoS	yes thats my plan	00:57
@lisitsyn	and then generalize when we see a generalization point	00:57
@lisitsyn	HeikoS: I failed too many times with generic design :D	00:58
@HeikoS	haha :)	00:58
@HeikoS	I will send you the class diagram once lambday and I have worked this out	01:00
@HeikoS	he is a smart guy and probably can help a lot there...	01:00
@HeikoS	it makes no sense to do this stuff single-threaded btw	01:00
@HeikoS	lisitsyn: and we should have at least a general framework for multicore stuff with a unified interface	01:02
@HeikoS	since so many tasks are like that	01:03
@HeikoS	I mean independent loops	01:03
@lisitsyn	HeikoS: yes true	01:04
@lisitsyn	just avoid trying to do that general right now	01:04
@HeikoS	well, a little bit at least :)	01:05
@HeikoS	general enough to have multiple forms for the log-det stuff	01:05
@lisitsyn	HeikoS: it would be possible to design a general thing if we had experience	01:05
@lisitsyn	otherwise we have to do that evolutionary	01:06
@lisitsyn	HeikoS: I can design multiagent systems now - but soooo many mistakes have been fixed	01:07
@HeikoS	i see	01:08
@lisitsyn	so is that thing I am sure	01:09
@lisitsyn	:)	01:09
@HeikoS	one should never start coding too early :)	01:09
@HeikoS	want to spend some time planning this	01:09
@lisitsyn	we are just too inexperienced to foresee that	01:09
@lisitsyn	nahh that fails too	01:09
@lisitsyn	HeikoS: it depends on the experience again	01:10
@lisitsyn	in this case I'd rather plan something not really detailed then code it	01:10
@lisitsyn	then see everything is wrong	01:10
@lisitsyn	and refactor	01:10
@lisitsyn	then guess what :D	01:10
@lisitsyn	HeikoS: should a task have a separate object to store data? how to store dependencies? what are types of dependencies?	01:12
@HeikoS	lisitsyn: no dependencies	01:12
@HeikoS	as I said, this is not my goal	01:12
@HeikoS	independent loops	01:12
@lisitsyn	HeikoS: yeah I mean there are a lot of questions	01:12
@HeikoS	data is stored within task, or reference	01:13
@lisitsyn	and not all of them are answer-able design-time	01:13
@HeikoS	this depends on the implementation	01:13
@lisitsyn	HeikoS: I see a lot of possibilities there anyway	01:13
@HeikoS	lisitsyn: yes	01:13
@HeikoS	lisitsyn: I mean, I just want to have something for the log-dets	01:14
@lisitsyn	most of them are usually unforeseen so be ready to refactor and refactor ;)	01:14
@HeikoS	I have coded up all of this in Matlab, both seq and par, so I know what happens, but maybe you are right and I should not be so general	01:15
@lisitsyn	HeikoS: no I just warn you and lambday to not strive for generality from the very beginning	01:15
@HeikoS	lisitsyn: again, he is not meant to implement parallel things	01:16
@HeikoS	just write the sequential one against an interface that might be able to handle this	01:16
@lisitsyn	I see	01:16
@HeikoS	so and the interface I wanted to have btw is	01:16
@HeikoS	that a class can inherit a set of methods	01:17
@HeikoS	that are: register stuff, solve subproblem, etc	01:17
@HeikoS	so whats a good way to simulate interfaces?	01:17
@HeikoS	java doesnt have MI, thats why they have interfaces, but how do we do this?	01:17
@lisitsyn	HeikoS: we are tied to no MI so forget about java :)	01:17
@lisitsyn	I don't know	01:18
@HeikoS	no way?	01:18
@lisitsyn	it is problem dependent	01:18
@HeikoS	by hand	01:18
@lisitsyn	MI is totally troublesome	01:18
@lisitsyn	HeikoS: you mean they form a hierarchy of classes to share some methods	01:18
@lisitsyn	but all of them implement Task	01:19
@lisitsyn	?	01:19
@HeikoS	yes for example	01:19
@lisitsyn	HeikoS: well I see no problem putting Task to the very top of that hierarchy	01:20
@lisitsyn	so all depends..	01:21
@HeikoS	lisitsyn: Ill show you the class diagram :)	01:23
@HeikoS	when its more or less done	01:23
@lisitsyn	HeikoS: interfacing is java idiom so may be it just requires to change a point of view	01:23
@lisitsyn	we will see	01:23
@lisitsyn	HeikoS: alright will try to schlafen :)	01:26
@HeikoS	good night lisitsyn! :)	01:26
@lisitsyn	HeikoS: good night	01:27
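
Editor's sketch of the interface discussed above: an abstract task that knows how to solve itself, an abstract engine where tasks are registered, and a sequential engine whose solve_all() just loops over the tasks. This is a minimal Python illustration, not Shogun code; the names ComputationTask, ComputationEngine, SequentialEngine and SquareTask are hypothetical stand-ins for the C++ classes mentioned in the log, and the single abstract base at the top of the hierarchy follows lisitsyn's suggestion for getting by without multiple inheritance.

```python
from abc import ABC, abstractmethod


class ComputationTask(ABC):
    """Hypothetical stand-in for the task idea: one independent subproblem."""

    @abstractmethod
    def solve(self):
        """Solve the subproblem and return its result."""


class ComputationEngine(ABC):
    """Hypothetical engine interface: register tasks, then solve them all."""

    def __init__(self):
        self._tasks = []

    def register(self, task):
        # Only store the task here; how and when it runs is up to the engine.
        self._tasks.append(task)

    @abstractmethod
    def solve_all(self):
        """Return one result per registered task, in registration order."""


class SequentialEngine(ComputationEngine):
    """Default engine: solves every registered task one after another."""

    def solve_all(self):
        return [task.solve() for task in self._tasks]


class SquareTask(ComputationTask):
    """Toy task used only for illustration."""

    def __init__(self, x):
        self.x = x

    def solve(self):
        return self.x * self.x


engine = SequentialEngine()
for x in (1, 2, 3):
    engine.register(SquareTask(x))
results = engine.solve_all()
print(results)
```

Algorithm code written against ComputationEngine would not need to change if a different engine (multicore, MPI, and so on) were substituted later.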
-!- foulwall [~foulwall@2001:da8:215:6901:93a:5fb3:ab52:7a68] has joined #shogun		02:03
-!- HeikoS [~heiko@176.248.212.176] has quit [Quit: Leaving.]		03:06
-!- nube is now known as out		04:12
-!- out is now known as nube		04:12
shogun-buildbot	build #407 of nightly_default is complete: Failure [failed test] Build details are at http://www.shogun-toolbox.org/buildbot/builders/nightly_default/builds/407	04:17
-!- gsomix [~gsomix@83.149.21.63] has joined #shogun		05:02
gsomix	good morning	05:02
foulwall	gsomix: morning	05:02
-!- gsomix [~gsomix@83.149.21.63] has quit [Ping timeout: 264 seconds]		06:41
-!- nube [~rho@49.244.28.55] has quit [Ping timeout: 256 seconds]		07:36
-!- foulwall [~foulwall@2001:da8:215:6901:93a:5fb3:ab52:7a68] has quit [Remote host closed the connection]		07:40
-!- nube [~rho@49.244.116.16] has joined #shogun		07:51
-!- flxb_ [~flxb@master.ml.tu-berlin.de] has joined #shogun		07:53
-!- flxb [~flxb@master.ml.tu-berlin.de] has quit [Write error: Broken pipe]		07:54
@sonney2k	morning...	08:33
-!- sijin [~smuxi@144.214.222.109] has quit [Read error: Connection reset by peer]		08:57
@sonney2k	pickle27, any insights?	08:59
-!- iglesiasg [d58f32ac@gateway/web/freenode/ip.213.143.50.172] has joined #shogun		09:27
-!- mode/#shogun [+o iglesiasg] by ChanServ		09:28
-!- sijin [~smuxi@144.214.222.109] has joined #shogun		09:40
-!- hushell [~hushell@8-92.ptpg.oregonstate.edu] has quit [Ping timeout: 264 seconds]		10:05
-!- foulwall [~foulwall@2001:da8:215:503:d9a2:88ea:88e3:5e47] has joined #shogun		10:15
-!- hushell [~hushell@c-67-189-100-116.hsd1.or.comcast.net] has joined #shogun		10:23
-!- votjakovr [~votjakovr@host-46-241-3-209.bbcustomer.zsttk.net] has joined #shogun		10:30
-!- foulwall [~foulwall@2001:da8:215:503:d9a2:88ea:88e3:5e47] has quit [Remote host closed the connection]		10:49
-!- votjakovr [~votjakovr@host-46-241-3-209.bbcustomer.zsttk.net] has quit [Quit: Leaving]		11:18
-!- iglesiasg [d58f32ac@gateway/web/freenode/ip.213.143.50.172] has quit [Ping timeout: 250 seconds]		12:43
-!- vgorbati [5f8777f7@gateway/web/freenode/ip.95.135.119.247] has joined #shogun		13:16
-!- van51 [~van51@athedsl-320452.home.otenet.gr] has joined #shogun		13:18
-!- foulwall_ [~foulwall@2001:da8:215:503:746c:70bc:a9be:cac0] has joined #shogun		13:30
-!- lisitsyn [~blackburn@109-226-74-97.clients.tlt.100megabit.ru] has quit [Ping timeout: 246 seconds]		13:31
-!- vgorbati_ [5f85daa8@gateway/web/freenode/ip.95.133.218.168] has joined #shogun		14:01
-!- vgorbati [5f8777f7@gateway/web/freenode/ip.95.135.119.247] has quit [Ping timeout: 250 seconds]		14:03
-!- vgorbati_ is now known as vgorbati		14:05
-!- zxtx [~zv@ool-457e751d.dyn.optonline.net] has quit [Ping timeout: 246 seconds]		14:08
-!- zxtx [~zv@ool-457e751d.dyn.optonline.net] has joined #shogun		14:10
-!- vgorbati [5f85daa8@gateway/web/freenode/ip.95.133.218.168] has quit [Ping timeout: 250 seconds]		14:12
-!- vgorbati [5f85daa8@gateway/web/freenode/ip.95.133.218.168] has joined #shogun		14:24
-!- gsomix [~gsomix@188.168.2.227] has joined #shogun		14:34
gsomix	hi	14:34
gsomix	sonney2k, sent PR.	14:35
-!- van51 [~van51@athedsl-320452.home.otenet.gr] has left #shogun ["JOIN #shogun"]		14:38
gsomix	sonney2k, I hope it's readable now. :)	14:38
-!- van51 [~van51@athedsl-320452.home.otenet.gr] has joined #shogun		14:38
-!- vgorbati [5f85daa8@gateway/web/freenode/ip.95.133.218.168] has quit [Ping timeout: 250 seconds]		14:50
-!- foulwall_ [~foulwall@2001:da8:215:503:746c:70bc:a9be:cac0] has quit [Remote host closed the connection]		15:27
-!- foulwall [~foulwall@2001:da8:215:c252:4b2:f64d:b662:b135] has joined #shogun		16:50
gsomix	cu later, guys	16:52
-!- sanyam [uid10602@gateway/web/irccloud.com/x-myercfhnlmkikdyu] has quit [Ping timeout: 252 seconds]		17:39
-!- foulwall [~foulwall@2001:da8:215:c252:4b2:f64d:b662:b135] has quit [Ping timeout: 240 seconds]		17:45
-!- nube [~rho@49.244.116.16] has quit [Ping timeout: 264 seconds]		18:23
-!- nube [~rho@49.126.16.146] has joined #shogun		18:26
-!- nube [~rho@49.126.16.146] has quit [Ping timeout: 256 seconds]		18:54
-!- nube [~rho@49.244.8.172] has joined #shogun		19:17
-!- gsomix [~gsomix@188.168.2.227] has quit [Ping timeout: 245 seconds]		19:25
-!- gsomix [~gsomix@188.168.2.227] has joined #shogun		19:26
-!- van51 [~van51@athedsl-320452.home.otenet.gr] has left #shogun ["PING 1369589370"]		19:29
-!- sanyam [uid10602@gateway/web/irccloud.com/x-nsunvhvtlukipqcp] has joined #shogun		19:36
-!- katia_ [5f43c1c3@gateway/web/freenode/ip.95.67.193.195] has joined #shogun		19:48
-!- deerishi [c649b206@gateway/web/freenode/ip.198.73.178.6] has joined #shogun		19:56
-!- vgorbati [5f6ff438@gateway/web/freenode/ip.95.111.244.56] has joined #shogun		20:00
pickle27	sonney2k: sorry haven't had a chance to work on that yet	20:10
-!- deerishi [c649b206@gateway/web/freenode/ip.198.73.178.6] has quit [Ping timeout: 250 seconds]		20:18
-!- katia_ [5f43c1c3@gateway/web/freenode/ip.95.67.193.195] has quit [Ping timeout: 250 seconds]		20:22
-!- katia_ [5f43c1c3@gateway/web/freenode/ip.95.67.193.195] has joined #shogun		20:38
gsomix	good evening	20:46
-!- vgorbati [5f6ff438@gateway/web/freenode/ip.95.111.244.56] has quit [Ping timeout: 250 seconds]		21:21
-!- vgorbati [5f6ff438@gateway/web/freenode/ip.95.111.244.56] has joined #shogun		21:30
pickle27	sonney2k: valgrind didn't complain about qda	21:32
pickle27	sonney2k: paste is here http://pastebin.com/xc3SERUR	21:34
-!- vgorbati [5f6ff438@gateway/web/freenode/ip.95.111.244.56] has quit [Ping timeout: 250 seconds]		21:35
@sonney2k	pickle27, well yeah it is no memory leak but something else	21:42
@sonney2k	pickle27, how about you pickle.dump all the input that the function gets when you run tester.py	21:43
@sonney2k	and then load that to reproduce/debug the issue	21:43
* sonney2k off		21:43
pickle27	sonney2k: I thought valgrind might complain because I thought it might be a result that is bigger than its return allocation if that makes sense	21:44
pickle27	sonney2k: okay	21:44
pickle27	sonney2k: the function doesn't get any input from tester.py it just runs the example	22:05
pickle27	sonney2k: at least thats what it looks like to me	22:05
gsomix	nite	22:07
pickle27	sonney2k: if I run the modular example on my own it runs fine	22:07
pickle27	sonney2k: I'll try the same data in the c++ example	22:07
@sonney2k	pickle27, no	22:15
@sonney2k	pickle27, did you pickle dump?	22:15
pickle27	sonney2k: I was just looking through to see what tester actually did	22:15
pickle27	sonney2k: doesn't it just run classifier_qda_modular.py?	22:16
@sonney2k	pickle27, yes but did you dump the data it gets?	22:16
pickle27	what do you main it doesn't get data the data is loaded in the example itself	22:17
pickle27	*mean	22:17
@sonney2k	so did you dump it or not?	22:18
pickle27	theres no need to dump it, its the data/fm_train_real data	22:18
@sonney2k	ok then let me do it	22:18
@sonney2k	pickle27, alright so the reason is that m_store_covs is True in one test	22:22
@sonney2k	pickle27, so just put a true as last argument and you can reproduce the crash in the example	22:23
pickle27	sonney2k: I thought that might be the problem but it still runs for me when I do that	22:23
pickle27	sonney2k: ahh got the bug now	22:24
@sonney2k	pickle27, parameter_list = [[traindat, testdat, label_traindat, 1e-4, True], \	22:24
@sonney2k	then it will crash	22:24
pickle27	sonney2k: and I see sort of whats happening with the tester	22:24
pickle27	okay I'll work on fixing this now	22:25
@sonney2k	thanks	22:26
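
Editor's sketch of the "pickle.dump all the input" debugging approach suggested above: record the arguments a function receives in one run, then load them back to replay the call in isolation. The decorator dump_inputs, the function run_example and the file name qda_inputs.pkl are hypothetical and only illustrate the idea; they are not part of tester.py or the Shogun examples.

```python
import functools
import os
import pickle
import tempfile


def dump_inputs(path):
    """Decorator: pickle the arguments of each call to `path` before running it."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with open(path, "wb") as fh:
                pickle.dump((args, kwargs), fh)
            return func(*args, **kwargs)
        return wrapper
    return decorator


dump_path = os.path.join(tempfile.mkdtemp(), "qda_inputs.pkl")


@dump_inputs(dump_path)
def run_example(train, test, labels, tolerance, store_covs):
    # Placeholder body; a real example would train and apply the classifier.
    return len(train), store_covs


# First run: behaves as usual, but also records what it was called with.
run_example([[1.0, 2.0]], [[3.0]], [0, 1], 1e-4, True)

# Later: load the recorded inputs and replay the call without the harness.
with open(dump_path, "rb") as fh:
    args, kwargs = pickle.load(fh)
replayed = run_example.__wrapped__(*args, **kwargs)
```

Replaying from the dump makes visible any argument that differs between the test harness and a manual run, such as the final True flag in this case.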
pickle27	sonney2k: got it now, I just switched to ozansener's covar calc instead	22:39
pickle27	theres a lot of room for better use of Eigen3 in QDA but it'll work for now	22:40
@sonney2k	pickle27, heh feel free to do it - ohh and benchmarks welcome too!	22:46
pickle27	sonney2k: yeah I'd like to give it a try in the next bit!	22:48
-!- katia_ [5f43c1c3@gateway/web/freenode/ip.95.67.193.195] has quit [Quit: Page closed]		22:49
pickle27	sonney2k: the test runs now but the result is different in 2 places (unsure why, slight numerical differences?) should I make a PR with the fix now and continue investigating?	22:53
@sonney2k	gsomix, yes readable finally :-)	22:54
-!- HeikoS [~heiko@176.248.212.176] has joined #shogun		23:02
-!- mode/#shogun [+o HeikoS] by ChanServ		23:02
@sonney2k	HeikoS, hey there!	23:03
@HeikoS	sonney2k: hi!	23:03
@HeikoS	how is it going?	23:04
@sonney2k	tomorrow is the day students will be notified	23:04
@HeikoS	I know	23:06
@HeikoS	sonney2k btw, discussing something with lambday	23:08
@HeikoS	which is basically a class CIndependentComputationEngine	23:08
@HeikoS	which can take instances of a CIndependentComputationTask	23:08
@HeikoS	and run all of them in parallel	23:09
@HeikoS	or sequentially	23:09
@HeikoS	or whatever	23:09
@HeikoS	we need this for the log-det stuff	23:09
@sonney2k	HeikoS, ohh I think sergey had some thoughts on that too	23:09
@HeikoS	and maybe it might be worth to think about generalising it for other things	23:09
@HeikoS	yes we already discussed	23:09
@sonney2k	and wiking would need this for his bagging machine and you for your xval stuff	23:09
@HeikoS	so algorithms just produce a set of tasks instead of doing computations	23:10
@HeikoS	those are given to the computation class	23:10
@HeikoS	it returns results	23:10
@HeikoS	results are being passed to algorithm which aggregates	23:10
@HeikoS	but only for independent/trivially parallelizable stuff	23:10
@HeikoS	otherwise it will be too complicated	23:10
@HeikoS	but this way, many things might benefit	23:10
@HeikoS	grid-search for example	23:11
@sonney2k	HeikoS, I am not sure how exactly this would work	23:11
@HeikoS	we could have one class which does things in a multicore way	23:11
@sonney2k	yeah multi core / multiple machines	23:11
@sonney2k	machines == computers	23:11
@HeikoS	yes	23:11
@HeikoS	and future implementations might be coded against this	23:11
@HeikoS	sergey had some doubts however	23:12
@HeikoS	and he is right, its not easy to do this in general	23:12
@sonney2k	how would it work in case of say bagging?	23:12
@sonney2k	how do you tell which stuff is to be transferred to the remote machine and which not?	23:13
@HeikoS	so the way I would do the abstraction is this	23:13
@sonney2k	I currently can see this work with threads and just a couple of parameters	23:13
@HeikoS	one has a class for independent tasks	23:14
@HeikoS	which has abstract method solve	23:14
@sonney2k	(beware already here - you have to set obj->parallel->set_num_threads(1) then)	23:14
@HeikoS	The task itself knows everything it needs to know	23:14
@sonney2k	better compute()	23:14
@HeikoS	so one can just call compute/solve	23:14
@HeikoS	and the implementation of the task does everything and returns an instance of an abstract base for result	23:14
@HeikoS	so then your algorithm just produces a set of those tasks	23:15
@HeikoS	these may share data for now (as long as its not modified)	23:15
@HeikoS	but the point is that they hold a complete representation of the subproblem	23:15
@HeikoS	you pass them to computation engine class	23:15
@HeikoS	basic case: sequential: just a loop over all task.compute()	23:16
@HeikoS	returns a set with result instances	23:16
@HeikoS	one passes them to the algorithm which knows how to aggregate the results if it has produced the tasks	23:16
@HeikoS	multicore implementation would run things at once	23:16
@HeikoS	for this, one needs to clone stuff which is modified	23:17
@HeikoS	read-only things can stay in shared memory	23:17
@HeikoS	distributed implementation might serialize objects and send them to computer	23:17
@HeikoS	s	23:17
@HeikoS	since we are only considering independent stuff, we dont have scheduling problems	23:18
@sonney2k	HeikoS, so to get it right the task creates all required objects that are passed to the compute engine	23:18
@HeikoS	yes	23:18
@HeikoS	computation engine just calls compute() method in some way	23:18
@sonney2k	how can one do that efficiently? I mean you don't want to create 10 copies of a data set in memory?	23:18
@HeikoS	sonney2k, indeed	23:18
@HeikoS	the thing is: if data is modified, there is no way around that	23:19
@sonney2k	so you pass references only	23:19
@HeikoS	anyway	23:19
@sonney2k	and they are copied if needed	23:19
@HeikoS	exactly	23:19
@HeikoS	so multicore works on the same objects	23:19
@sonney2k	yes for single machines	23:19
@HeikoS	for multiple machines, data needs to be transfered anyway	23:19
@HeikoS	no way around that	23:19
@sonney2k	but for clusters we would just serialize	23:20
@HeikoS	yes	23:20
@HeikoS	to a byte stream or so	23:20
@sonney2k	there is the issue with crashing parts	23:20
@HeikoS	what do you mean with that?	23:20
@sonney2k	(I get the picture and it should be OK)	23:20
@sonney2k	say a cluster node crashes	23:21
@HeikoS	I see	23:21
@sonney2k	how do you fail over	23:21
@sonney2k	or a thread cannot be created etc	23:21
@HeikoS	well thats all to be handled by the computation engine implementation	23:21
@HeikoS	so we can do this later	23:21
@HeikoS	no problem, just try, if it doesnt work, try another machine	23:21
@sonney2k	we have to somehow be able to 'resume' or to restart failed stuff or, as $BIGCOMPANY does it, start say 30% more jobs	23:22
@sonney2k	to anticipate failures	23:22
@HeikoS	sonney2k, I wouldnt do that	23:22
@HeikoS	rather make tasks smaller	23:22
@HeikoS	subtasks	23:22
@HeikoS	an algorithm can even produce a set of different tasks	23:22
@HeikoS	as long as it knows to aggregate the results	23:22
@HeikoS	resuming is very difficult	23:23
@HeikoS	(I think at least)	23:23
@HeikoS	sonney2k: so I dont want to get involved in too much technical distributed programming, but rather start thinking about a framework that could be extended to this	23:23
@HeikoS	for now, just multicore	23:24
@HeikoS	but formulate algorithm in terms of that task-based framework	23:24
@HeikoS	for independent stuff	23:24
@HeikoS	so quite simple actually	23:24
@sonney2k	HeikoS, I see, but IIRC you have a cluster @work?	23:25
@HeikoS	yes, can do	23:25
@sonney2k	qsub based stuff?	23:25
@HeikoS	yes	23:25
@HeikoS	so I have in mind to create a computation engine that submits qsub jobs	23:25
@HeikoS	at some point	23:25
@sonney2k	so IMHO it would be worth to do that	23:26
@sonney2k	qsub and (just ssh based)	23:26
@HeikoS	yes definitely, we have many independent subproblems in shogun	23:26
@sonney2k	do your nodes share a common file system?	23:26
@HeikoS	I am currently running a thing on 100 nodes, thats quite a big speedup factor.	23:26
@HeikoS	yes	23:26
@HeikoS	so one could indeed serialize	23:27
@HeikoS	to a file	23:27
@sonney2k	yes data to one big file and then load only the modified parameters from different files	23:27
@sonney2k	I recall very much the limits we hit with a shared file system	23:28
@HeikoS	this can even be handled by the tasks - give a filename for the main data, and just store the parameters in local variables	23:28
@HeikoS	sonney2k, well one usually doesnt get ones hands on more than a few hundred nodes	23:28
@sonney2k	back then I used bittorrent to cache data on all cluster nodes - all copying would otherwise have taken a week	23:29
@HeikoS	sonney2k: haha :) when was that?	23:29
@sonney2k	hmmhh 2007 or 8?	23:29
@HeikoS	sonney2k: I mean these are all details on how the tasks are implemented, but it all works under the interface	23:29
@sonney2k	looong time ago	23:29
@HeikoS	so if one invests a lot of brainpower, one gets good results	23:30
@HeikoS	if one does not, it might not scale	23:30
@sonney2k	yes	23:30
@HeikoS	point now would be more the general structure	23:30
@sonney2k	the standard map-reduce scheme would work as well with this	23:30
@HeikoS	true	23:31
@sonney2k	issue is still no loops possible	23:31
@HeikoS	I would rather go for this independent task based stuff	23:31
@HeikoS	more intuitive	23:31
@HeikoS	also, shogun is not really parallel based	23:31
@HeikoS	I mean its focus is not on this	23:31
@HeikoS	but these independent things are just so easy and so useful	23:32
@HeikoS	that we could focus on just them	23:32
@sonney2k	true	23:32
@sonney2k	shogun is meant to run on single machines	23:32
@sonney2k	with lots of cores	23:32
@HeikoS	I would say this stuff that one would parallelize on qsub clusters would be very useful though	23:33
@HeikoS	parameter sweeps etc	23:33
@HeikoS	and exactly, first engine would be one with a shared memory model	23:33
@HeikoS	then we could start modifying existing algorithms	23:34
@HeikoS	and once this is more or less stable	23:34
@HeikoS	one could try adding distributed things	23:34
@HeikoS	step by step	23:34
@sonney2k	HeikoS, the problem really is that you hardly get speedups by just switching multi-core -> multi-machine	23:35
@sonney2k	the algorithm needs to be designed for that usually	23:35
@HeikoS	lets see how it goes, will start with the log-det stuff, which is already a bit of a challenge under this framework. Many linear systems that share a lot of stuff	23:35
@sonney2k	so yes only the big independent jobs will benefit	23:35
@HeikoS	sonney2k: yes	23:35
@sonney2k	but that is what you have in mind	23:35
@HeikoS	the rest is too complicated anyway	23:36
@HeikoS	grid-search is the best example	23:36
@sonney2k	so bagging/ms/etc	23:36
@HeikoS	and random forests etc	23:36
@sonney2k	when I had to parallelize I did mostly ms	23:36
@HeikoS	yes same, usually only independent stuff	23:36
@sonney2k	sometimes data was too big	23:36
@sonney2k	so I trained or applied on chunks	23:37
@HeikoS	I see	23:37
@HeikoS	sonney2k: gotta go now, dinner is ready :) be back later	23:37
@sonney2k	cu	23:38
@sonney2k	nice talking to you as always :D	23:38
-!- HeikoS [~heiko@176.248.212.176] has quit [Quit: Leaving.]		23:39
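
Editor's sketch of the scheme discussed above: the algorithm produces a set of independent tasks that hold references to shared read-only data, an engine calls compute() on each of them either sequentially or on several threads, and the algorithm aggregates the returned results. The names ChunkSumTask, SequentialEngine, ThreadedEngine and total are hypothetical, and this is plain Python rather than Shogun code. Python threads are used here only to show the structure; for CPU-bound pure-Python work they should not be expected to give a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


class ChunkSumTask:
    """Hypothetical independent task: shared read-only data plus its own parameters."""

    def __init__(self, data, start, stop):
        self.data = data    # passed by reference, never modified by the task
        self.start = start
        self.stop = stop

    def compute(self):
        return sum(self.data[self.start:self.stop])


class SequentialEngine:
    """Basic case: just a loop over all task.compute() calls."""

    def solve_all(self, tasks):
        return [task.compute() for task in tasks]


class ThreadedEngine:
    """Shared-memory case: runs the same tasks on a pool of threads."""

    def __init__(self, num_threads=4):
        self.num_threads = num_threads

    def solve_all(self, tasks):
        with ThreadPoolExecutor(max_workers=self.num_threads) as pool:
            # map() returns results in the same order as the tasks.
            return list(pool.map(lambda task: task.compute(), tasks))


def total(data, engine, chunk=250):
    """Algorithm side: produce tasks, hand them to the engine, aggregate results."""
    tasks = [ChunkSumTask(data, i, min(i + chunk, len(data)))
             for i in range(0, len(data), chunk)]
    partial_sums = engine.solve_all(tasks)
    return sum(partial_sums)


data = list(range(1000))
seq_total = total(data, SequentialEngine())
par_total = total(data, ThreadedEngine())
```

Because total() only talks to solve_all(), swapping the engine does not touch the algorithm; a cluster engine along the lines of the qsub idea would additionally have to serialize each task before sending it off.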
--- Log closed Mon May 27 00:00:19 2013
