Automated doubt identification from informal reflections through hybrid sentic patterns and machine learning approach

Research and Practice in Technology Enhanced Learning

Table 3 Performance metrics of LR models using various features

Model	Features^*	Precision	Recall	F1 score
1	All data unigram features with resampling	0.61	0.46	0.52
2	All data unigram and bigram features with resampling	0.64	0.38	0.47
3	QM and 5W1H with resampling	0.29	0.38	0.33
4	QM, 5W1H and QP with resampling	0.29	0.38	0.33
5	TextBlob polarity score	0.24	0.51	0.33
6	Selected data with unigram features	0.70	0.67	0.68
7	Selected data with unigram, bigram features	0.83	0.62	0.71
8	Selected data with doc2vec embedding	0.76	0.75	0.75

^*Results for Models 1–7 use term frequency (TF) as the feature vector space. Results with TF-IDF are omitted because they were consistently lower than those reported above.
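
As a reading aid, the F1 column of Table 3 can be cross-checked against the precision and recall columns. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall, which this section does not state explicitly. The names `f1_score` and `TABLE_3` are illustrative and do not come from the paper. Because the reported precision and recall are already rounded to two decimals, the recomputed values can differ from the reported F1 in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 3, keyed by model number.
TABLE_3 = {
    1: (0.61, 0.46, 0.52),
    2: (0.64, 0.38, 0.47),
    3: (0.29, 0.38, 0.33),
    4: (0.29, 0.38, 0.33),
    5: (0.24, 0.51, 0.33),
    6: (0.70, 0.67, 0.68),
    7: (0.83, 0.62, 0.71),
    8: (0.76, 0.75, 0.75),
}


def f1_gaps() -> dict:
    """Absolute gap between recomputed and reported F1 for each model."""
    return {
        model: abs(f1_score(p, r) - reported)
        for model, (p, r, reported) in TABLE_3.items()
    }


if __name__ == "__main__":
    for model, (p, r, reported) in TABLE_3.items():
        print(f"Model {model}: recomputed F1 = {f1_score(p, r):.3f}, "
              f"reported = {reported:.2f}")
```

Under this assumption, every recomputed value stays within 0.01 of the reported F1, which is consistent with rounding of the underlying precision and recall.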