verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
keonkim has quit [Ping timeout: 255 seconds]
keonkim has joined #mlpack
govg has joined #mlpack
rcurtin has joined #mlpack
gurupunskill has joined #mlpack
< gurupunskill> Hey! I'm new, but I'd like to contribute to your organization. Can someone help me get started?
gurupunskill has quit [Ping timeout: 260 seconds]
gurupunskill has joined #mlpack
< gurupunskill> Okay, so I checked the channel logs and found out how you helped another guy xD
< gurupunskill> I built mlpack from source, and I'm running through the tutorials to understand how it works. Thanks :)
gurupunskill has quit [Client Quit]
vivekp has quit [Ping timeout: 240 seconds]
vivekp has joined #mlpack
kaushik_ has joined #mlpack
< zoq> gurupunskill: Welcome, if you have any further questions please don't hesitate to ask here or via the mailing list.
the_neo_ has joined #mlpack
the_neo_ has left #mlpack []
the_neo_ has joined #mlpack
< the_neo_> I am a Mathematics and Computing sophomore. I want to contribute. I know C/C++, Python (intermediate) and basic ML. I don't know how to proceed further. Can anyone help me? Thanks in advance!
< rcurtin> hi there the_neo_, I would suggest taking a look at http://www.mlpack.org/involved.html
< rcurtin> (let me know if those pages do not work... the system mlpack.org was suffering some routing issues this morning, but I think the hosting company has fixed them now...)
ans has joined #mlpack
ans has quit [Client Quit]
vivekp has quit [Ping timeout: 248 seconds]
the_neo_ has quit [Quit: Page closed]
vivekp has joined #mlpack
govg has quit [Ping timeout: 240 seconds]
govg has joined #mlpack
govg has quit [Ping timeout: 260 seconds]
kaushik_ has quit [Quit: Connection closed for inactivity]
gopala has joined #mlpack
< gopala> @rcurtin checked your message in the logs..
< gopala> 08:17 < rcurtin> or if you have multiple sequences you are training on, with a single sequence consisting of an observations file and a labels file, then you can pass the list of observation files (one per line) to --input_file
< gopala> 08:17 < rcurtin> and the list of labels files to --labels_file
< gopala> to clarify using an analogy, if I were to build, say, speech phoneme HMMs, I'd have my speech feature files at some sampling rate.. and the label files would have the phoneme label of each frame.. correct ?
< rcurtin> gopala: yep, that is correct
< gopala> so where does the number of states fit in? will each phoneme have n states?
< rcurtin> you would specify the number of states to the mlpack_hmm_train program as the --states option (or '-n' for short)
< rcurtin> ah sorry hang on, let me clarify
< rcurtin> the number of states should be equal to the number of different label values that your HMM can have
< rcurtin> I think that in speech, typically a single phoneme is allowed to have multiple hidden states, which is not really the "textbook" way HMMs are done nor the way they are done in mlpack
< rcurtin> there's not really a way to say "train this HMM, but if you see label <x>, then make sure the hidden state is one of N states"
< rcurtin> however... what you could do, and I think this is what HTK does (not 100% sure on this), is train one HMM for each phoneme
< rcurtin> and when you train these HMMs, don't use labeled training; just specify the number of states for that phoneme, and only pass in training sequences for that particular phoneme
< rcurtin> so at the end of that process, you'll have ~30-70 HMMs (depending on language I guess), and then you can do prediction for a single phoneme by taking the max likelihood over all these models
< rcurtin> I'm not sure if that's all helpful, I hope I did not give way too much answer for a simple question :)
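The per-phoneme selection rcurtin describes ("take the max likelihood over all these models") can be sketched as below. Single 1-D Gaussians stand in for trained per-phoneme HMMs here; the `models` dictionary, `classify` helper, and all parameter values are illustrative, not part of mlpack:

```python
import math

# Toy per-phoneme "models": (mean, variance) of a single 1-D Gaussian.
# In practice each entry would be a trained HMM scoring the whole sequence.
models = {
    "a": (0.0, 1.0),
    "b": (5.0, 1.0),
}

def gaussian_loglik(frames, mean, var):
    """Sum of log N(x; mean, var) over all frames."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        for x in frames
    )

def classify(frames):
    """Pick the phoneme whose model assigns the highest log-likelihood."""
    scores = {p: gaussian_loglik(frames, m, v) for p, (m, v) in models.items()}
    return max(scores, key=scores.get)

print(classify([4.8, 5.1, 5.3]))  # frames near 5 -> "b"
```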
< gopala> ok it helps
< rcurtin> http://mi.eng.cam.ac.uk/~sjy/papers/gayo07.pdf might have some more clarity on how HMMs are typically used for speech recognition, and it may make it clear how to use mlpack for that
< gopala> so in supervised training, is there any Baum-Welch.. or does it just compute GMM params?
< rcurtin> if you are doing supervised training, no need for Baum-Welch; you can just estimate directly
< rcurtin> the GMMs still have to be fit with EM though
< rcurtin> but at least the transition probabilities matrix and initial state probabilities matrix can be estimated directly
< rcurtin> now, if you happen to have phonemes that are labeled not just, e.g., "a" but more like { "a0", "a1", "a2", ... } where you have one label for each hidden state of the phoneme, then you can use direct supervised estimation with mlpack_hmm_train, no need for the complex setup I just described
< rcurtin> but I don't think most speech data is typically labeled at that level
< gopala> ok thanks, this helps.. so if i start with some rough (uniform) segmentation of phones into, say, 3 states.. I can loop over iterations of train and Viterbi.. over and over again to end up with better labels.. correct ?
< rcurtin> hmm, I dunno if you could get better labels like that
< rcurtin> but I did think of a way you can get labels for individual phoneme states like you want
< rcurtin> train an HMM on only one phoneme with N hidden states
< rcurtin> so this is unlabeled baum-welch training
< rcurtin> once you get an HMM, use it to predict hidden states for all of your training data for that particular phoneme
< gopala> the problem is I'd have to cut individual segments of this phoneme and generate training data, right? or maybe this is better dealt with at the API level
< rcurtin> if you do this for all phonemes, now you have hidden states for all phonemes, and you can "merge" all of this into a single HMM that can predict for an entire sequence by training with the output "labels" from each of the phoneme-level HMMs
< rcurtin> I think if you train an HMM for an individual phoneme, there is no need to segment the data itself any further than the phoneme-level labels that you already have
< gopala> are you suggesting the idea of a merge, or is there something in mlpack to do this -- combine several HMMs into one giant HMM?
< gopala> I guess I'll start using it and complain as I go... thanks ryan
< rcurtin> sure, happy to help
< rcurtin> what I'm suggesting in the second idea is to just retrain one big HMM directly on labeled data that you've generated with several HMMs
< rcurtin> i.e. you first train an HMM on each phoneme with baum-welch (i.e. without labels) with N states, and then you use that HMM to produce "sub-phoneme" labels for each phoneme
< gopala> yea ... it involves cutting each segment of each phoneme into different files, right?
< rcurtin> where a "sub-phoneme label" is a term I'm using to refer to the particular internal state of a single phoneme
< rcurtin> right, but if you already have labels at the phoneme level it should be easy
< rcurtin> so if you have, say, 10000 labels of the "a" phoneme, you can just split those out of your data, train an "a" HMM on that "a" data without labels, then use that trained HMM to get "sub-phoneme labels" for all the "a" phoneme data
< gopala> got it.. yes..
< rcurtin> once you've done that for all phonemes (and changed the numbers of the sub-phoneme labels so they don't collide), you can train a "big" HMM model on the data and the "sub-phoneme labels" directly
< rcurtin> and then you can get predictions for a speech sequence directly
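The split-then-relabel bookkeeping from the last few messages can be sketched like this. The `predict()` stub stands in for Viterbi decoding with a trained per-phoneme HMM, and the frame data, `n_sub_states`, and offset scheme are all illustrative:

```python
# Split frames by phoneme, get per-phoneme sub-phoneme states from each
# phoneme's HMM, then offset those states so they don't collide when
# training the merged "big" HMM.
n_sub_states = 3
phonemes = ["a", "b"]

# Frame-level data: (feature, phoneme label) pairs (made up).
frames = [(0.1, "a"), (0.2, "a"), (5.0, "b"), (5.1, "b"), (0.3, "a")]

def predict(phoneme, features):
    # Placeholder: a real per-phoneme HMM would Viterbi-decode here.
    return [i % n_sub_states for i in range(len(features))]

offsets = {p: i * n_sub_states for i, p in enumerate(phonemes)}  # a->0, b->3

merged_labels = {}
for p in phonemes:
    feats = [f for f, lab in frames if lab == p]
    sub = predict(p, feats)
    merged_labels[p] = [s + offsets[p] for s in sub]

print(merged_labels)  # 'a' states land in 0..2, 'b' states in 3..5
```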
< gopala> i have to look at how you save the HMM model.. each state as a number ?
< gopala> and sequentially ?
< gopala> as in there shouldn't be any transitions between say a1 and b1, right.. but there should be between a1 and a2
< gopala> I'm using a1 and b1 to mean the first hidden states of a and b
< rcurtin> hmm, I am not sure I understand the question completely
< rcurtin> do you mean how I save the predictions from the HMM model? that would be as a number from 0 to the number of states
< gopala> so in doing the merge, I will change the number of states and something in the transition matrix.. correct ?
< rcurtin> yeah, when you create the merge model, then you will set the number of states to (number of phonemes * number of states for each phoneme)
< rcurtin> and you'll also have to change the labels for the states
< rcurtin> so i.e. if you had 'a' and 'b', and each of those phonemes had three states (so 'a0', 'a1', 'a2', 'b0', 'b1', 'b2')
< gopala> yep.. got it..
< rcurtin> then the 'a' HMM would produce class labels 0, 1, 2, and so would the 'b' HMM
< rcurtin> so you'd want to map the 'b' HMM output to, I guess, 3, 4, 5
< rcurtin> and so forth for the other models
< gopala> and will you also have a state transition matrix ?
< rcurtin> no, the state transition matrix will be estimated from the training data directly
< rcurtin> it will be a block diagonal matrix, since i.e. state 'b0' never transitions to 'a2'
< gopala> got it.. i guess that was what i was asking about
< gopala> but i guess it has to have some non-zero probability if it is to decode a long speech file, correct?
< rcurtin> the probability from 'b2' to 'a0' shouldn't be zero
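To visualize the (mostly) block-diagonal structure rcurtin describes, here is a toy assembly of two per-phoneme transition matrices into one merged matrix. The values are made up, and in the real workflow the merged matrix is re-estimated from full labeled utterances, which is what fills in small cross-phoneme entries such as b2 -> a0:

```python
# Toy within-phoneme transition matrices (values made up), 3 states each.
A_a = [[0.8, 0.2, 0.0],
       [0.0, 0.7, 0.3],
       [0.0, 0.0, 1.0]]
A_b = [[0.9, 0.1, 0.0],
       [0.0, 0.6, 0.4],
       [0.0, 0.0, 1.0]]

n = 3
merged = [[0.0] * (2 * n) for _ in range(2 * n)]
for i in range(n):
    for j in range(n):
        merged[i][j] = A_a[i][j]          # 'a' block: states 0..2
        merged[n + i][n + j] = A_b[i][j]  # 'b' block: states 3..5

# Off-diagonal blocks are zero: 'b0' (state 3) never reaches 'a2' (state 2).
print(merged[3][2])  # 0.0
```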
< gopala> yep
< rcurtin> anyway I have to run for now, I'll be back later
< gopala> ok
< rcurtin> feel free to ask any other questions if you have any problems
< gopala> thanks, i got all i needed..
< rcurtin> great, let me know if it works :)
< gopala> will do.. thanks
gopala has quit [Ping timeout: 260 seconds]