This article advocates the use of lexical knowledge and semantics to improve
the accuracy of information retrieval. A system of measuring text similarity is
developed, which attempts to integrate the meaning of texts into the similarity
measure. The work hinges on the organization of the synsets in WordNet
according to the semantic relations of hypernymy/hyponymy, meronymy/holonymy
and antonymy. A unique measure based on the link distance between words in a
subgraph of WordNet has been developed. This requires word sense disambiguation,
which is itself a complex problem. We have developed an algorithm for word
sense disambiguation that again exploits the structure of WordNet. The results
support our intuition that including semantics in the measurement of similarity
has great promise.
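
To make the notion of link distance concrete, the following is a minimal sketch, not the measure developed in this article. It assumes a small hand-built graph whose nodes stand in for word senses and whose undirected edges stand in for WordNet relations; the node names, the `TOY_LINKS` list, and the functions `build_graph` and `link_distance` are all hypothetical and chosen only for illustration. Link distance is taken here to be the number of edges on the shortest path between two nodes, found by breadth-first search.

```python
from collections import deque

# Hypothetical toy subgraph. Each pair is one undirected link standing in
# for a WordNet relation (e.g. hypernymy/hyponymy). Not real WordNet data.
TOY_LINKS = [
    ("dog", "canine"),
    ("canine", "carnivore"),
    ("cat", "feline"),
    ("feline", "carnivore"),
    ("carnivore", "animal"),
    ("wheel", "car"),
]


def build_graph(links):
    """Build an undirected adjacency map from a list of node pairs."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def link_distance(graph, source, target):
    """Return the number of links on the shortest path, or None if the
    two nodes are unknown or not connected."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


GRAPH = build_graph(TOY_LINKS)
print(link_distance(GRAPH, "dog", "cat"))
print(link_distance(GRAPH, "dog", "wheel"))
```

In this sketch a smaller distance is read as greater relatedness, and a missing path is reported as `None` rather than as a large number. The sketch also assumes that each word has already been mapped to a single sense, which is the step that word sense disambiguation would have to supply.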