--- Log opened Fri Oct 12 00:00:17 2012 | ||
wiking | lol primal objective: -1000000000000000019884624838656.000000 | 00:29 |
---|---|---|
wiking | :> | 00:29 |
wiking | mosek mosek | 00:29 |
shogun-buildbot | build #130 of nightly_default is complete: Failure [failed test] Build details are at http://www.shogun-toolbox.org/buildbot/builders/nightly_default/builds/130 | 03:31 |
-!- adoniscik [~emre@c-67-180-103-118.hsd1.ca.comcast.net] has joined #shogun | 07:00 | |
-!- sonne|work [~sonnenbu@194.78.35.195] has quit [Ping timeout: 246 seconds] | 09:30 | |
-!- sonne|work [~sonnenbu@194.78.35.195] has joined #shogun | 09:45 | |
-!- adoniscik [~emre@c-67-180-103-118.hsd1.ca.comcast.net] has quit [Ping timeout: 240 seconds] | 10:16 | |
-!- Netsplit *.net <-> *.split quits: romi_, sonne|work | 12:41 | |
-!- Netsplit over, joins: sonne|work, romi_ | 12:45 | |
-!- blackburn [~blackburn@188.168.4.211] has joined #shogun | 13:16 | |
blackburn | wiking: you've minimized it quite a lot already! | 15:24 |
-!- too [2eda6d52@gateway/web/freenode/ip.46.218.109.82] has joined #shogun | 15:28 | |
too | hi there | 15:28 |
sonne|work | hi there too :D | 15:30 |
too | blackburn: hi there, was it you who wrote the CAlphabet::translate_from_single_order function? | 15:30 |
too | sonne|work: looking for some explanation about CAlphabet::translate_from_single_order | 15:30 |
sonne|work | nope, that was me | 15:30 |
sonne|work | and gunnar IIRC | 15:31 |
too | bitwise operations make me crazy :p | 15:31 |
sonne|work | too efficient :D | 15:31 |
too | indeed | 15:32 |
sonne|work | yeah. idea is basically to squeeze e.g. 2 characters into one byte etc | 15:32 |
sonne|work | so for DNA you need just 2 bits for A,C,G,T | 15:32 |
sonne|work | so you can have 4 characters encoded in 1 byte | 15:33 |
too | I see. And the max value seems to be 2^8, so an alphabet of size > 256 is not possible right now, right? | 15:33 |
sonne|work | yes it is for byte alphabets max | 15:34 |
sonne|work | if you have bigger alphabets you shouldn't use StringByteFeatures but StringWordFeatures etc anyways | 15:35 |
sonne|work | and then not do this kind of encoding but use the hashing trick | 15:35 |
too | the hashing trick ? | 15:35 |
sonne|work | yeah, compute a hash of your n-characters | 15:36 |
sonne|work | and store just that | 15:36 |
sonne|work | it is good enough for any real world app and very fast | 15:36 |
too | is there any example of this proc with shogun ? | 15:37 |
sonne|work | too: use murmurhash2 | 15:38 |
sonne|work | in lib/Hash | 15:38 |
too | just to be sure: StringWordFeatures = CStringFeatures<uint64_t> and StringByteFeatures = CStringFeatures<char> ? | 15:39 |
too | thanks for the advice | 15:43 |
sonne|work | not uint64_t but uint16_t | 15:43 |
sonne|work | but you can use whatever is appropriate for your alphabet | 15:44 |
too | all right, then in practice I can make shogun's common string kernels work with string features from a bigger alphabet (?) | 15:48 |
sonne|work | difficulty depends on kernel you need though | 15:52 |
too | spectrum kernel for example | 15:53 |
sonne|work | I would suggest implementing DotFeatures for your feature type - that is the fastest possible way, and you can then use all linear SVMs (which then effectively train with the spectrum kernel) | 15:54 |
sonne|work | there are a couple of examples for that already: the Hashed* features | 15:55 |
too | thanks | 15:57 |
-!- too [2eda6d52@gateway/web/freenode/ip.46.218.109.82] has quit [Quit: Page closed] | 16:11 | |
-!- sonne|work [~sonnenbu@194.78.35.195] has quit [Quit: Leaving.] | 17:03 | |
-!- adoniscik [~emre@c-67-180-103-118.hsd1.ca.comcast.net] has joined #shogun | 19:11 | |
-!- heiko [~heiko@host86-177-112-51.range86-177.btcentralplus.com] has joined #shogun | 21:03 | |
-!- romi_ [~mizobe@187.57.2.253] has quit [Remote host closed the connection] | 22:22 | |
-!- heiko [~heiko@host86-177-112-51.range86-177.btcentralplus.com] has left #shogun [] | 22:32 | |
--- Log closed Sat Oct 13 00:00:17 2012 |
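
The packing idea sonne|work describes at 15:32 (2 bits per DNA character, so 4 characters fit in 1 byte) can be sketched roughly as below. This is only an illustration in Python, not Shogun's `CAlphabet::translate_from_single_order`; the names `DNA_CODES`, `pack_kmer` and `unpack_kmer` are hypothetical, and the symbol order A, C, G, T is an assumption.

```python
# Illustrative sketch only; helper names and symbol order are assumptions,
# not taken from Shogun.
DNA_CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
DNA_SYMBOLS = "ACGT"
BITS_PER_SYMBOL = 2  # 4 symbols need log2(4) = 2 bits each


def pack_kmer(kmer):
    """Pack a string over A, C, G, T into one integer, 2 bits per character."""
    value = 0
    for ch in kmer:
        # Shift the characters packed so far to the left and append the new code.
        value = (value << BITS_PER_SYMBOL) | DNA_CODES[ch]
    return value


def unpack_kmer(value, length):
    """Invert pack_kmer for a k-mer of known length."""
    chars = []
    for _ in range(length):
        chars.append(DNA_SYMBOLS[value & 0b11])  # lowest 2 bits = last character
        value >>= BITS_PER_SYMBOL
    return "".join(reversed(chars))


packed = pack_kmer("GATC")
print(packed, unpack_kmer(packed, 4))
```

Four characters use 8 bits, so the packed value always fits in one byte. That is also why this scheme runs out of room once the alphabet or the k-mer length grows, as discussed at 15:33.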
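
The hashing trick mentioned at 15:35 (hash each run of n characters and store only the hash) could look roughly like the sketch below. It is a minimal Python illustration, not Shogun's Hashed* features: `zlib.crc32` is swapped in for the murmurhash2 that sonne|work recommends, and `hashed_spectrum`, `dot`, `k` and `num_bits` are hypothetical names chosen for this example.

```python
import zlib


def hashed_spectrum(sequence, k=3, num_bits=8):
    """Count k-mers of `sequence` in a fixed-size vector indexed by a hash.

    zlib.crc32 is a stand-in hash for this sketch; the log suggests murmurhash2.
    """
    dim = 1 << num_bits  # vector length is 2**num_bits, independent of alphabet size
    counts = [0] * dim
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        # Keep only the low num_bits bits of the hash as the feature index.
        index = zlib.crc32(kmer.encode("utf-8")) & (dim - 1)
        counts[index] += 1
    return counts


def dot(u, v):
    """Plain dot product; on hashed k-mer counts it approximates a spectrum kernel."""
    return sum(a * b for a, b in zip(u, v))


x = hashed_spectrum("GATTACA")
y = hashed_spectrum("TTACAGA")
print(dot(x, y))
```

Because the feature vector has a fixed length, the alphabet size no longer matters; the price is that distinct k-mers can collide in the same slot, which is usually tolerable when `num_bits` is large enough.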