
Text Analytics

Related products


At your request, we will design a solution tailored to your needs.


Discover the potential of this application from Clementine staff!


Clementine experts can even provide on-site training to help you make the most of your applications.


Clementine events present the latest trends and developments in data science and natural language processing.

Text Analytics


An add-on that extracts information from transcribed text, using dictionaries and morphological tools to handle the specific features of the Hungarian language.
Our CLEMTEXT solution, developed specifically for the Hungarian language, is a toolkit for pre-processing text content and extracting basic linguistic information. It can simultaneously perform:

  • word processing tasks, and
  • morphological and syntactic analysis.

The stemmed list of the words that make up our textual data source, together with information on conjugation and substitution, is an essential prerequisite for dictionary building, which is the heart of text analytics. This is a pre-processing step for unstructured data, during which word-frequency, synonym and stop-word lists can be created for the documents.

Morphological and syntactic analysis, i.e. determining the part of speech of each word in the text and its role in the sentence structure, makes it possible to identify stylistic elements based on the specific features of Hungarian language use. Analysing these complex speech-style characteristics helps answer the biggest question in sales: with whom, and in what style, is it worth communicating and making an offer?
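The word-frequency and stop-word step described above can be sketched in a few lines of Python. This is a minimal illustration and not the CLEMTEXT implementation: the stop-word list, the sample sentences and the function names `tokenize` and `word_frequencies` are all assumptions, and the sketch does no Hungarian stemming or morphological analysis.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only; a real list
# would be built from the document collection itself.
STOP_WORDS = {"a", "az", "és", "is", "hogy"}


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def word_frequencies(documents, stop_words=STOP_WORDS):
    """Count token occurrences across all documents, skipping stop words."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in stop_words)
    return counts


# Invented sample sentences standing in for transcribed calls.
docs = [
    "A biztosítás díja és a kár összege",
    "Az ügyfél a biztosítás díja miatt telefonált",
]
freq = word_frequencies(docs)
```

Calling `freq.most_common(10)` then lists the most frequent remaining tokens, which is the kind of list a dictionary builder would review. In a language as heavily inflected as Hungarian, the counts would only be meaningful after the tokens have been reduced to their stems.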

Domain- and industry-specific dictionaries
Thanks to our robust dictionaries, our solution can be used effectively across customer service areas in a wide variety of industries. The CLEMVOICE methodology is more than traditional keyword recognition. The flexible interface of our analysis tool places no quantitative limit on expanding and customising the dictionaries. Our solution processes the full text of each document and examines the context in which each phrase and sentence is used. Because of this, it provides demonstrably more accurate results than monitoring the occurrence of a single word in isolation.

Our reference dictionaries for different industries:

  • Insurance (~4500 items): Insurance-specific words, phrases and patterns, e.g. products, services, damages, etc.
  • Banking (~3200 items): Banking-specific services, products and issues
  • Telecommunications (~4000 items): Telecommunications-specific vocabulary
  • Opinion (~1000 items): Positive and negative adjectives
  • Telesales (~1000 items): Telesales scripts, typical phrases and idioms.
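To illustrate the difference between plain keyword spotting and looking at the context of a dictionary hit, here is a rough Python sketch. Everything in it is hypothetical: the two miniature dictionaries, their entries and the function `find_terms_in_context` are invented for the example and do not reflect the actual reference dictionaries or the product's interface. It matches exact single-word tokens only, with no stemming or multi-word phrases.

```python
import re

# Hypothetical miniature dictionaries; the entries are invented for
# illustration and are not taken from the reference dictionaries.
DICTIONARIES = {
    "insurance": ["claim", "policy", "damage"],
    "banking": ["account", "loan", "card"],
}


def find_terms_in_context(text, dictionaries, window=3):
    """Return (domain, term, context) for each dictionary term found.

    The context is the matched token plus up to `window` tokens
    on either side of it.
    """
    tokens = re.findall(r"\w+", text.lower())
    hits = []
    for i, token in enumerate(tokens):
        for domain, terms in dictionaries.items():
            if token in terms:
                start = max(0, i - window)
                context = " ".join(tokens[start:i + window + 1])
                hits.append((domain, token, context))
    return hits


hits = find_terms_in_context(
    "I want to report damage to my car under my policy",
    DICTIONARIES,
    window=2,
)
```

Each hit carries the surrounding words, so a later step can tell, for instance, whether "damage" is being reported or merely mentioned. A production system for Hungarian would match on stems and multi-word patterns, not on raw tokens.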