
Text Mining

Text mining, also called text data mining and roughly equivalent to text analytics, is the process of deriving high-quality information from text. High-quality information is typically derived by identifying patterns and trends through means such as statistical pattern learning. Text mining usually involves structuring the input text (usually parsing, along with the addition of some derived linguistic features and the removal of others, and subsequent insertion into a database), deriving patterns within the structured data, and finally evaluating and interpreting the output. 'High quality' in text mining usually refers to some combination of relevance, novelty, and interestingness. Typical text mining tasks include text categorization, text clustering, concept/entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity relation modeling.
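The three stages described above (structuring the input, deriving patterns, and evaluating the output) can be illustrated with a minimal Python sketch. It is only a toy under stated assumptions: the stop-word list, the function names, the sample documents, and the 50% threshold are all invented for illustration, and a real system would use proper parsing and statistical learning rather than simple counting.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "from"}

def structure(text):
    """Structure raw text: lowercase, tokenize, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def derive_patterns(documents):
    """Derive a simple pattern: in how many documents each term occurs."""
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(structure(doc)))
    return doc_freq

def interpret(doc_freq, n_docs, min_share=0.5):
    """Keep only terms that occur in at least min_share of the documents."""
    return sorted(t for t, c in doc_freq.items() if c / n_docs >= min_share)

# Made-up sample documents.
docs = [
    "Text mining derives information from text.",
    "Data mining finds patterns in structured data.",
    "Sentiment analysis is a text mining task.",
]
freq = derive_patterns(docs)
common_terms = interpret(freq, len(docs))
print(common_terms)
```

With these sample documents, only "mining" and "text" pass the threshold, which is the kind of crude pattern a later interpretation step would then assess for relevance and novelty.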


Labor-intensive manual text mining approaches first surfaced in the mid-1980s, but technological advances have allowed the field to progress considerably during the past decade. Text mining is an interdisciplinary field that draws on information retrieval, data mining, machine learning, statistics, and computational linguistics. As most information (common estimates say over 80%) is currently stored as text, text mining is believed to have high commercial potential. Increasing interest is being paid to multilingual data mining: the ability to gain information across languages and to cluster similar items from different linguistic sources according to their meaning.


Recently, text mining has received attention in many areas.

Security applications

Many text mining software packages are marketed for security applications, especially the analysis of plain text sources such as Internet news. Text mining is also involved in the study of text encryption.

Biomedical applications

A range of text mining applications in the biomedical literature has been described.
One of the more important online text mining applications in the biomedical literature is GoPubMed, which was the first semantic search engine on the Web. Another example is PubGene, which combines biomedical text mining with network visualization as an Internet service.
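One simple way to connect text mining with network visualization is to count how often two named entities appear in the same sentence and treat each count as a weighted edge. The sketch below illustrates that general idea only; it is not a description of how PubGene or GoPubMed works. The entity names, the sample sentences, and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical entity vocabulary; a real system would use a curated list.
ENTITIES = {"genea", "geneb", "genec"}

def cooccurrence_edges(sentences):
    """Count how often each pair of known entities shares a sentence."""
    edges = Counter()
    for sentence in sentences:
        # Strip simple punctuation and lowercase each word.
        words = {w.strip(".,;()").lower() for w in sentence.split()}
        found = sorted(words & ENTITIES)
        edges.update(combinations(found, 2))
    return edges

# Made-up sentences standing in for biomedical abstracts.
sentences = [
    "GeneA interacts with GeneB.",
    "GeneB and GeneC were both reported.",
    "GeneA and GeneB appear together again.",
]
edges = cooccurrence_edges(sentences)
for (left, right), weight in sorted(edges.items()):
    print(left, right, weight)
```

The resulting pair counts could be passed to any graph drawing tool, with the count used as the edge weight.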

Software and applications

Text mining methods and software are also being researched and developed by major firms, including IBM and Microsoft, to further automate the mining and analysis processes, and by other firms working in search and indexing more generally as a way to improve their results. Within the public sector, much effort has been concentrated on creating software for tracking and monitoring terrorist activities.

Online media applications

Text mining is being used by large media companies, such as the Tribune Company, to disambiguate information and to provide readers with better search experiences, which in turn increases site "stickiness" and revenue. Additionally, on the back end, editors benefit from being able to share, associate, and package news across properties, significantly increasing opportunities to monetize content.

Marketing applications

Text mining is starting to be used in marketing as well, more specifically in analytical customer relationship management. Coussement and Van den Poel (2008) apply it to improve predictive analytics models for customer churn (customer attrition).

Sentiment analysis

Sentiment analysis may involve analyzing movie reviews to estimate how favorable a review is toward a movie. Such an analysis may need a labeled data set or a labeling of the affectivity of words. Resources for the affectivity of words and concepts have been made for WordNet and ConceptNet, respectively.
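A labeling of word affectivity can be turned into a rough favorability estimate by averaging the scores of the words a review contains. The following Python sketch shows this under loud assumptions: the `AFFECT` lexicon and its scores are invented for illustration and do not come from WordNet, ConceptNet, or any published resource, and the function names are hypothetical.

```python
import re

# Hypothetical word-level affectivity scores, invented for illustration.
AFFECT = {"great": 1.0, "enjoyable": 0.8, "boring": -0.7, "awful": -1.0}

def review_score(review):
    """Average the affectivity of known words; 0.0 if none are found."""
    words = re.findall(r"[a-z']+", review.lower())
    scores = [AFFECT[w] for w in words if w in AFFECT]
    return sum(scores) / len(scores) if scores else 0.0

def label(review):
    """Map the averaged score to a coarse favorability label."""
    score = review_score(review)
    if score > 0:
        return "favorable"
    if score < 0:
        return "unfavorable"
    return "neutral"

print(label("A great and enjoyable film"))
print(label("Boring plot, awful acting"))
```

A lexicon-only approach like this ignores negation and context, which is one reason a labeled data set is often used to train a classifier instead.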

Text has been used to detect emotions in the related area of affective computing. Text-based approaches to affective computing have been used on multiple corpora, such as student evaluations, children's stories, and news stories.

Academic applications

The issue of text mining is of importance to publishers who hold large databases of information needing indexing for retrieval. This is especially true in scientific disciplines, in which highly specific information is often contained within written text. Therefore, initiatives have been taken such as Nature's proposal for an Open Text Mining Interface (OTMI) and the National Institutes of Health's common Journal Publishing Document Type Definition (DTD) that would provide semantic cues to machines to answer specific queries contained within text without removing publisher barriers to public access.
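Indexing a collection of articles for retrieval is commonly done with an inverted index, which maps each term to the documents that contain it. The Python sketch below is a minimal illustration of that general technique; it is not based on OTMI or the Journal Publishing DTD, and the article ids, titles, and function names are made up.

```python
import re
from collections import defaultdict

def build_index(articles):
    """Map each term to the set of article ids that contain it."""
    index = defaultdict(set)
    for article_id, text in articles.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(article_id)
    return index

def search(index, query):
    """Return the ids of articles containing every term in the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never modified.
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Made-up article titles standing in for a publisher's database.
articles = {
    "a1": "Protein folding in yeast",
    "a2": "Yeast genome sequencing",
    "a3": "Protein interaction networks",
}
index = build_index(articles)
print(sorted(search(index, "yeast protein")))
```

Queries are treated as a conjunction of terms here; ranking, stemming, and phrase matching are left out to keep the sketch short.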

Academic institutions have also become involved in the text mining initiative:

  • The National Centre for Text Mining (NaCTeM) is the first publicly funded text mining centre in the world. NaCTeM is operated by the University of Manchester[12] in close collaboration with the Tsujii Lab,[13] University of Tokyo.[14] NaCTeM provides customised tools and research facilities and offers advice to the academic community. It is funded by the Joint Information Systems Committee (JISC) and two of the UK Research Councils (EPSRC & BBSRC). With an initial focus on text mining in the biological and biomedical sciences, research has since expanded into the social sciences.
  • In the United States, the School of Information at the University of California, Berkeley is developing a program called BioText to assist biology researchers in text mining and analysis.

Notable software and applications

Text mining computer programs are available from many commercial and open source providers.


Contact Us

For More Information Please Contact Us on info@kposervice.com

© 2007 KPO Services. All Rights Reserved.