Learn Before
Vectors and documents
Term-document Matrix: In a term-document matrix, each row represents a word in the vocabulary and each column represents a document from some collection of documents. The figure below shows a small selection from a term-document matrix, giving the number of occurrences of four words in four plays by Shakespeare.
Vector Space Model: The term-document matrix in the figure below was first defined as part of the vector space model of information retrieval. In this model, a document is represented as a count vector, that is, as one column of the matrix.
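A minimal sketch of how such a matrix could be built, assuming whitespace tokenization and two made-up toy documents (the names `term_document_matrix`, `document_vector`, `doc_a`, and `doc_b` are hypothetical and are not part of the original material):

```python
from collections import Counter


def term_document_matrix(docs, vocab):
    """Build a matrix with one row per vocabulary word and one column per document."""
    # Count word occurrences per document (assumption: simple whitespace tokenization).
    counts = {name: Counter(text.lower().split()) for name, text in docs.items()}
    names = list(docs)
    return [[counts[name][word] for name in names] for word in vocab]


def document_vector(matrix, column):
    """Read one column of the matrix: the count vector for a single document."""
    return [row[column] for row in matrix]


# Toy documents invented for illustration; they are not Shakespeare's text.
docs = {
    "doc_a": "the fool spoke of battle and the good fool laughed",
    "doc_b": "good wit and good cheer",
}
vocab = ["battle", "good", "fool", "wit"]

matrix = term_document_matrix(docs, vocab)
vector_a = document_vector(matrix, 0)
vector_b = document_vector(matrix, 1)
```

Each row of `matrix` follows the order of `vocab`, and each column follows the order of `docs`, so `vector_a` and `vector_b` are the count vectors for the two toy documents.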
A vector is just a list or array of numbers. A vector space is a collection of vectors, characterized by their dimension. The position of each number in a vector indicates a different meaningful dimension along which documents vary. Thus the first dimension of every document vector corresponds to the number of times the word battle occurs, and we can compare vectors one dimension at a time, noting for example that the vectors for As You Like It and Twelfth Night have similar values (1 and 0, respectively) for the first dimension.
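The dimension-by-dimension comparison can be sketched as follows. Only the battle counts (1 and 0) come from the text above; the remaining counts, the vocabulary order, and the helper name `compare_dimensions` are assumptions made for illustration.

```python
def compare_dimensions(vocab, vec_a, vec_b):
    """Pair each word with its count in two documents and the absolute difference."""
    return [(word, a, b, abs(a - b)) for word, a, b in zip(vocab, vec_a, vec_b)]


vocab = ["battle", "good", "fool", "wit"]

# First entries (battle: 1 and 0) are quoted in the text; the other
# entries are assumed values used only to make the example concrete.
as_you_like_it = [1, 114, 36, 20]
twelfth_night = [0, 80, 58, 15]

comparison = compare_dimensions(vocab, as_you_like_it, twelfth_night)
for word, a, b, diff in comparison:
    print(f"{word}: {a} vs {b} (difference {diff})")
```

Because both vectors share the same ordering of dimensions, position 0 always refers to battle, which is what makes this pairwise comparison meaningful.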

Tags
Data Science