Code used to convert the Switchboard Dialog Act (SWDA) corpus into feature vectors for an SVM classifier
* read the SWDA corpus
* use the Porter stemmer to reduce words to their stems
* remove stop words listed in "corpus/stopword"
* remove words with a frequency count below 10
* prepare a dictionary using gensim library and the preprocessed data
* create a BOW corpus using the dictionary created before
* transform the BOW corpus into another representation using LSI or LDA
* append the class label to each vector
* remove unnecessary characters and prepare the final corpus for LIBSVM
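
The filtering, dictionary, BOW and LIBSVM-formatting steps above can be sketched roughly as follows. This is a minimal standard-library illustration, not the repository's actual code: the function names (`preprocess`, `build_dictionary`, `to_bow`, `to_libsvm_line`) are hypothetical, the `stem` argument stands in for a Porter stemmer, the plain dictionary stands in for the gensim one, and the LSI/LDA step is left out. It also assumes the dialog act labels have already been mapped to integers.

```python
from collections import Counter


def preprocess(utterances, stop_words, min_count=10, stem=lambda w: w):
    """Lowercase, drop stop words, stem, then drop rare stems.

    `stem` is a placeholder (identity by default); a Porter stemmer's
    stem function could be passed in instead.
    """
    docs = [
        [stem(w) for w in text.lower().split() if w not in stop_words]
        for text in utterances
    ]
    # Count stems over the whole corpus and keep those seen at least min_count times.
    freq = Counter(tok for doc in docs for tok in doc)
    return [[tok for tok in doc if freq[tok] >= min_count] for doc in docs]


def build_dictionary(docs):
    """Map each distinct token to an integer id, in order of first appearance."""
    token2id = {}
    for doc in docs:
        for tok in doc:
            token2id.setdefault(tok, len(token2id))
    return token2id


def to_bow(doc, token2id):
    """Return a sorted list of (token_id, count) pairs for one document."""
    counts = Counter(token2id[tok] for tok in doc if tok in token2id)
    return sorted(counts.items())


def to_libsvm_line(label, vector):
    """Format one labelled sparse vector as '<label> <index>:<value> ...'.

    Token ids are 0-based here, so they are shifted by one to give
    1-based, ascending feature indices.
    """
    feats = " ".join(f"{idx + 1}:{val:g}" for idx, val in vector)
    return f"{label} {feats}".rstrip()


# Toy usage with a lowered threshold so the tiny example keeps some words.
utterances = ["yeah I think so", "do you think so", "yeah yeah"]
labels = [1, 2, 1]  # assumed integer-coded dialog act classes
docs = preprocess(utterances, stop_words={"i", "do", "you"}, min_count=2)
token2id = build_dictionary(docs)
lines = [to_libsvm_line(lbl, to_bow(doc, token2id))
         for lbl, doc in zip(labels, docs)]
```

In the described pipeline the vectors written out would be the LSI or LDA topic weights rather than raw counts; `to_libsvm_line` accepts any sorted list of `(index, value)` pairs, so the same formatting step would apply.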