COMP 336 Information Retrieval Quiz 2, Spring 2000 Date: April 19, 2000
Name: Student ID:
1. [30] For each of the three document spaces below, write i), ii), or iii) in the space provided to indicate the method that might enable the system to return more relevant documents, or write iv) if you think there is no way to improve the result. The four choices are:
i) reformulate the original query into a new query based on the relevant documents identified by the user
ii) document space modification
iii) query splitting
iv) no way to get more relevant documents by relevance feedback
[Figure: three document spaces. Legend: documents identified as relevant by the user; non-relevant documents; relevant documents not yet retrieved; original query.]
Answer, in the order the document spaces are shown: method i, method ii, method iii (iv is acceptable)
2. [20] A corpus contains the following words of some unknown language:
doxpuving
doxpuver
doxpuved
doxxon
doxing
doxplexenv
Give the successor variety of each prefix (d, do, dox, etc.) of the word doxpuved and then segment the word using the peak and plateau method.
Prefix     Successor variety   Letters
d          1                   o
do         1                   x
dox        3                   p, x, i
doxp       2                   u, l
doxpu      1                   v
doxpuv     2                   e, i
doxpuve    2                   r, d
Segment: dox | puved
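The table and the segmentation above can be checked mechanically. The following is a minimal sketch, not the course's reference solution: the function names `successor_variety` and `peak_segments` are made up for illustration, and only the strict peak rule is implemented (cut after a prefix whose successor variety exceeds that of both its neighbours). Plateau handling is omitted, and the full word is assumed to have a successor variety of 0.

```python
def successor_variety(prefix, corpus):
    """Return the sorted distinct letters that follow `prefix` in the corpus words."""
    return sorted({w[len(prefix)] for w in corpus
                   if w.startswith(prefix) and len(w) > len(prefix)})


def peak_segments(word, corpus):
    """Split `word` after every prefix whose successor variety is a strict peak."""
    # counts[i] is the successor variety of the prefix of length i + 1.
    counts = [len(successor_variety(word[:i], corpus))
              for i in range(1, len(word) + 1)]
    # A cut position is a prefix length; the first and last prefixes cannot be peaks.
    cuts = [i + 1 for i in range(1, len(counts) - 1)
            if counts[i - 1] < counts[i] > counts[i + 1]]
    bounds = [0] + cuts + [len(word)]
    return [word[a:b] for a, b in zip(bounds, bounds[1:])]


corpus = ["doxpuving", "doxpuver", "doxpuved", "doxxon", "doxing", "doxplexenv"]
word = "doxpuved"

# Print one row per proper prefix: prefix, successor variety, letters.
for i in range(1, len(word)):
    letters = successor_variety(word[:i], corpus)
    print(word[:i], len(letters), ", ".join(letters))

print(" | ".join(peak_segments(word, corpus)))
```

On this corpus the only strict peak is at dox (variety 3, between 1 and 2), so the sketch reproduces the segmentation dox | puved.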
3. [30] After indexing a set of documents, you find that the documents are very close together according to the similarity measure you use. Explain the effect of having documents too close to each other on precision and recall.
Answer: Both precision and recall will drop.
What is the effect on the average distance between the documents in the collection if you:
i) remove a common word from the index? Give a brief explanation of your answer.
Answer: Removing a common word will increase the average distance.
ii) remove a very rare word from the index? Give a brief explanation of your answer.
Answer: Removing a rare word will increase the average distance very little; we may regard the distance as unchanged.
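The two answers above can be illustrated on a toy collection. The sketch below makes several assumptions that the quiz does not state: documents are binary term sets, similarity is the unnormalized count of shared index terms (so lower average similarity means greater average distance), and the five documents and the helper `average_similarity` are invented for illustration. A length-normalized measure such as cosine would give different numbers.

```python
from itertools import combinations


def average_similarity(docs, vocabulary):
    """Mean number of shared index terms over all document pairs."""
    pairs = list(combinations(docs, 2))
    # Only terms still in the index (vocabulary) count toward similarity.
    return sum(len(a & b & vocabulary) for a, b in pairs) / len(pairs)


# Hypothetical collection: "the" occurs in every document, "zymurgy" in two.
docs = [
    {"the", "query", "index"},
    {"the", "index", "stem"},
    {"the", "stem", "feedback"},
    {"the", "feedback", "zymurgy"},
    {"the", "query", "zymurgy"},
]

vocabulary = set().union(*docs)
base = average_similarity(docs, vocabulary)
without_common = average_similarity(docs, vocabulary - {"the"})
without_rare = average_similarity(docs, vocabulary - {"zymurgy"})

print(base, without_common, without_rare)
```

Under these assumptions the average similarity falls from 1.5 to 0.5 when the common word is dropped, because every pair loses a shared term, but only from 1.5 to 1.4 when the rare word is dropped, because a single pair is affected.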
4. [20] Circle the correct answer(s) below:
stemming
thesaurus
term phrase formation
relevance feedback