In this episode we introduce a very powerful approach for computing semantic similarity between documents. Here, the terminology “document” could refer to a web-page, a word document, a paragraph of text, an essay, a sentence, or even just a single word. Two semantically similar documents, therefore, will discuss many of the same topics while two semantically different documents will not have many topics in common. Machine learning methods are described which can take as input large collectio...
view more