The training dataset folder contains two .csv training data files. Data_1_train.csv contains a header plus 2313 lines (training examples), and Data_2_train.csv contains a header plus 3602 lines. You can open the .csv files in Excel to view them. However, depending on its settings, Excel may fail to display some lines because of special characters in the review text. To view the full content, use a text editor (such as Sublime Text or Notepad++).
A "," in the review sentence (column B) is written as "[comma]" to distinguish it from the column delimiter (",") of the .csv file. While parsing (reading the data file), you can split each line on "," to get the field (column) values.
Build the best possible sentiment predictor to solve the task.
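As a starting point for the task, the parsing rule described above can be sketched in Python roughly as follows. This is a minimal sketch, not the notebook's own code: the helper names parse_line and read_rows are hypothetical, and the UTF-8 encoding is an assumption about the files. After splitting on ",", it also turns each "[comma]" back into a literal comma so the review text reads naturally.

```python
from typing import List

COMMA_TOKEN = "[comma]"  # placeholder used for literal commas in the review text


def parse_line(line: str) -> List[str]:
    """Split one raw line on "," and restore literal commas in each field."""
    fields = line.rstrip("\r\n").split(",")
    return [field.replace(COMMA_TOKEN, ",") for field in fields]


def read_rows(path: str, encoding: str = "utf-8") -> List[List[str]]:
    """Read a data file, skip the header line, and return the parsed rows.

    The encoding default is an assumption; change it if the files differ.
    """
    with open(path, encoding=encoding) as handle:
        next(handle, None)  # skip the header line
        return [parse_line(line) for line in handle if line.strip()]


# Hypothetical usage, assuming the file sits in the working directory:
# rows = read_rows("Data_1_train.csv")
```

If the file matches the description above and no review spans more than one line, len(rows) should come out as 2313 for Data_1_train.csv and 3602 for Data_2_train.csv, which gives a quick check that no lines were dropped while reading.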
Run the DMTM-Project-2.ipynb Jupyter notebook.