
International Journal of Computer Engineering in Research Trends (IJCERT)

Scholarly, Peer-Reviewed, Platinum Open Access and Multidisciplinary





ISSN (Online): 2349-7084


A Survey on various Stemming Algorithms

Sundar Singh, R K Pateriya
Affiliations
Computer Science & Engineering Department, Maulana Azad National Institute of Technology, Bhopal, India, 462003


Abstract
Stemming is a technique that reduces words to their root form, called the stem, by removing derivational and inflectional affixes. Most existing stemming algorithms use an affix-stripping technique. Stemming is widely applied in NLP, text mining, and information retrieval. It improves the performance of information retrieval systems by decreasing the index size. Many stemming algorithms have been implemented for the English language, and many of them work successfully in information retrieval systems. However, these algorithms have many drawbacks, since they cannot fully describe English morphology. In this paper, different stemming algorithms are discussed and compared in terms of their usefulness and their limitations.
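
The affix-stripping idea and its effect on index size can be illustrated with a minimal sketch. It assumes a tiny, hypothetical suffix list and a minimum stem length of three characters; it is not the rule set of Porter, Lovins, Paice, or any other stemmer covered by the survey, and the names SUFFIXES, strip_suffix, and build_index are illustrative only.

```python
# Illustrative suffix list (an assumption, not taken from any published stemmer).
SUFFIXES = ("ions", "ion", "ing", "ed", "es", "s")


def strip_suffix(word, min_stem_len=3):
    """Remove the longest matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    # Try longer suffixes first so "ions" is preferred over "s".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def build_index(tokens, normalize=None):
    """Return the set of distinct index terms, optionally normalized by a stemmer."""
    if normalize is None:
        return set(tokens)
    return {normalize(token) for token in tokens}


tokens = ["connect", "connected", "connecting", "connects", "connection", "connections"]
raw_index = build_index(tokens)
stemmed_index = build_index(tokens, normalize=strip_suffix)
print(len(raw_index), len(stemmed_index))
```

In this sketch the six surface forms collapse to a single index term, which is the index-size reduction the abstract refers to. The same rules also show the limitation the abstract mentions: strip_suffix("news") yields "new", an unrelated word, while "ran" and "running" are not conflated at all, because purely suffix-based rules do not capture English morphology.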


Citation
Sundar Singh, R K Pateriya. "A Survey on various Stemming Algorithms". International Journal of Computer Engineering in Research Trends (IJCERT), ISSN: 2349-7084, Vol. 2, Issue 05, pp. 310-315, May 2015. URL: https://ijcert.org/ems/ijcert_papers/V2I57.pdf


Keywords : Stemming, stop word, recall, precision, Text mining, NLP, IR.



