A Novel Fused Feature Space for English Poetry Classification

  • Kazipeta Praveen Kumar, Dr. T. Maruthi Padmaja

Abstract

English poets across the world draw on a similar vocabulary, which makes English poetry hard to classify. Mining poetry documents is also more challenging than mining prose, because poetry carries style features, such as orthographic and phonemic features, in addition to text semantics. This work therefore presents a new Hybrid Style and Semantic (HSSe) feature space for classifying English poetry. The proposed HSSe is built from relevant style and semantic features. In HSSe, the style features are identified using supervised and unsupervised feature selection methods, whereas the semantic features are computed with N-grams. The performance of HSSe is compared with that of the individual style and semantic features over ten classifiers on the task of classifying inter-country poetry, namely Indian and Western poetry. Prior to classification, the difference in the distribution of English word usage across the two corpora is identified using the Spearman rank correlation and the Chi-square test. The experimental results reveal that the Bayesian Net classifier with HSSe built from unsupervised style features and 5-gram features yields a better Friedman's mean rank in terms of classification performance (81.2%) than style or N-gram features alone and than the other hybrid feature spaces.
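The idea of fusing style and N-gram features into one feature space can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not list the actual style features or the feature selection step, so the three orthographic features below, the function names (`word_ngrams`, `style_features`, `fused_features`), and the `style:`/`ngram:` key prefixes are all assumptions made for illustration. The sketch uses bigrams on a toy poem, whereas the abstract reports 5-grams, and it does not strip punctuation.

```python
from collections import Counter


def word_ngrams(text, n):
    """Count word n-grams (as tuples) in the lowercased text."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def style_features(poem):
    """Hypothetical orthographic style features (assumed, not from the paper)."""
    lines = [ln for ln in poem.splitlines() if ln.strip()]
    words = poem.split()
    return {
        "style:num_lines": float(len(lines)),
        "style:avg_words_per_line": len(words) / len(lines) if lines else 0.0,
        "style:avg_word_length": (
            sum(len(w) for w in words) / len(words) if words else 0.0
        ),
    }


def fused_features(poem, n=2):
    """Concatenate style features and n-gram counts into one feature dict."""
    features = style_features(poem)
    for gram, count in word_ngrams(poem, n).items():
        features["ngram:" + " ".join(gram)] = float(count)
    return features


# Toy example: a two-line poem fused into a single sparse feature vector.
poem = "the rose is red\nthe rose is sweet"
vec = fused_features(poem, n=2)
```

A dictionary of this kind could then be turned into a fixed-length vector and passed to any of the classifiers being compared; the key prefixes keep the two feature families distinguishable after fusion.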

Published
2021-06-12
How to Cite
Kazipeta Praveen Kumar, Dr. T. Maruthi Padmaja. (2021). A Novel Fused Feature Space for English Poetry Classification. Design Engineering, 341 - 359. Retrieved from http://www.thedesignengineering.com/index.php/DE/article/view/1993
Section
Articles