April 12, 2020

Similarity Measures

Semantic similarity measures play an important role in many areas of computing. Previous work on applications such as data integration, query expansion, tag refactoring, automatic question answering, and text clustering has relied on semantic similarity measures as part of the solution. Despite their usefulness in these applications, measuring the similarity between two text expressions remains a key challenge, and the research community continues to look for satisfactory solutions.

Semantic similarity measures are a class of text-based metrics that produce a similarity or dissimilarity score between two text strings, for use in approximate matching, comparison, and fuzzy string searching. For example, the strings "Sam" and "Samuel" are not identical, but they can be considered similar to a degree. A string metric returns a floating-point number whose scale and interpretation depend on the algorithm that produced it.
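
To make the "Sam" and "Samuel" example concrete, here is a minimal sketch in Python of one possible string metric: Levenshtein edit distance, normalized to a score between 0 and 1. This is only an illustration. The function names and the choice to normalize by the longer string's length are assumptions made for this example, and edit distance is a purely character-level measure, so it is just one of many metrics that could fill this role.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn one string into the other."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Score in [0, 1]: 1.0 for identical strings, 0.0 when every
    position of the longer string would have to change."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    print(similarity("Sam", "Samuel"))  # 0.5
    print(similarity("Sam", "Sam"))     # 1.0
```

Turning "Sam" into "Samuel" takes three insertions, and the longer string has six characters, so this metric scores the pair at 0.5. A different metric would generally give a different number for the same pair, which is why the score is only meaningful relative to the algorithm that produced it.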