DSpace Repository

Gender classification of microblog text based on authorial style

dc.contributor.author Mukherjee, Shubhadeep.
dc.contributor.author Bala, Pradip Kumar.
dc.date.accessioned 2018-04-04T09:11:23Z
dc.date.available 2018-04-04T09:11:23Z
dc.date.issued 2017-02
dc.identifier.citation Mukherjee, S., & Bala, P.K. (2017). Gender classification of microblog text based on authorial style. Information Systems and e-Business Management, 15(1), 117-138. doi: https://doi.org/10.1007/s10257-016-0312-0. en_US
dc.identifier.issn 1617-9846
dc.identifier.uri https://doi.org/10.1007/s10257-016-0312-0
dc.identifier.uri http://10.10.16.56:8080/xmlui/handle/123456789/235
dc.description.abstract Gender profiling of unstructured text data has several applications in areas such as marketing, advertising, legal investigation, and recommender systems. The automatic detection of gender in microblogs, such as Twitter, is a difficult task. It requires a system that can use knowledge to interpret the linguistic styles used by each gender. In this paper, we try to provide this knowledge by considering different sets of features that are relatively independent of the text, such as function words and part-of-speech n-grams. We test a range of feature sets using two classifiers, namely Naïve Bayes and maximum entropy. Our results show that the gender detection task benefits from the inclusion of features that capture the authorial style of microblog authors. We achieve an accuracy of approximately 71%, which outperforms the classification accuracy of commercially available gender detection software such as Gender Genie and Gender Guesser. en_US
dc.language.iso en en_US
dc.publisher Springer en_US
dc.subject Text mining en_US
dc.subject Twitter en_US
dc.subject Natural language processing en_US
dc.subject Gender classification en_US
dc.subject Knowledge discovery en_US
dc.subject Supervised learning en_US
dc.subject Artificial intelligence en_US
dc.subject Business intelligence en_US
dc.subject IIM Ranchi en_US
dc.title Gender classification of microblog text based on authorial style en_US
dc.type Article en_US
dc.volume 15 en_US
dc.issue 1 en_US
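
Illustrative note: the abstract describes classifying short texts from style features such as function-word usage with a Naïve Bayes classifier. The following is a minimal sketch of that general idea only, not the authors' implementation. The function-word list, the toy texts, the neutral labels class_a and class_b, and the names function_word_counts and MultinomialNB are all hypothetical and chosen for illustration. Part-of-speech n-grams and the maximum entropy classifier are omitted because they would need a tagger and an optimizer.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for
# illustration; the list used in the paper is not given in this record).
FUNCTION_WORDS = ("the", "a", "of", "and", "i", "you", "to", "in", "is", "it")


def function_word_counts(text):
    """Return one count per entry of FUNCTION_WORDS for the given text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t in FUNCTION_WORDS)
    return [counts[w] for w in FUNCTION_WORDS]


class MultinomialNB:
    """Small multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.log_prior = {}
        self.log_likelihood = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            # Class prior: fraction of training examples with this label.
            self.log_prior[c] = math.log(len(rows) / len(X))
            # Smoothed per-feature totals, normalized into probabilities.
            totals = [sum(col) + self.alpha for col in zip(*rows)]
            denom = sum(totals)
            self.log_likelihood[c] = [math.log(t / denom) for t in totals]
        return self

    def predict(self, x):
        def score(c):
            return self.log_prior[c] + sum(
                n * lp for n, lp in zip(x, self.log_likelihood[c])
            )
        return max(self.classes, key=score)


# Toy, made-up training texts with neutral placeholder labels.
train_texts = ["the end of the day", "i think you and i agree"]
train_labels = ["class_a", "class_b"]

model = MultinomialNB().fit(
    [function_word_counts(t) for t in train_texts], train_labels
)
prediction = model.predict(function_word_counts("you and i"))
```

Because the features are counts of function words rather than topic words, the classifier depends on how something is written more than on what it is about, which is the property the abstract attributes to style features.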

