Relevance feedback (contd.) - Pseudo-relevance feedback
- D+ and D- generated automatically
- E.g.: Cornell SMART system
- The top 10 documents returned by the first round of query execution are placed in D+
- The weight on D- (γ in Rocchio's formula) is typically set to 0, so D- is not used
- Not a commonly available feature
- Web users want instant gratification
- System complexity
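A minimal sketch of the pseudo-relevance feedback step described above, assuming the usual Rocchio update with the D- weight set to 0. The function name, the sparse-dictionary vector representation, the default weights `alpha` and `beta`, and the averaging over D+ are illustrative assumptions, not details of the SMART system.

```python
def pseudo_relevance_feedback(query_vec, ranked_docs, k=10, alpha=1.0, beta=0.75):
    """Expand a query using the top-k first-round results as D+.

    query_vec and each document are sparse term -> weight dictionaries.
    The weight on D- (gamma) is taken to be 0, so no negative set is built.
    """
    d_plus = ranked_docs[:k]  # D+ is generated automatically, no user input
    new_query = {term: alpha * w for term, w in query_vec.items()}
    for doc in d_plus:
        for term, w in doc.items():
            # Add beta times the centroid of D+ (assumed normalization).
            new_query[term] = new_query.get(term, 0.0) + beta * w / len(d_plus)
    return new_query


if __name__ == "__main__":
    q = {"jaguar": 1.0}
    first_round = [{"jaguar": 1.0, "car": 2.0}, {"jaguar": 1.0, "cat": 2.0}]
    print(pseudo_relevance_feedback(q, first_round, k=1, beta=0.5))
```

The expanded query would then be executed as a second round, which is the extra cost the bullets above refer to.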
Ranking by odds ratio - R: a Boolean random variable that represents the relevance of document d w.r.t. query q.
- Ranking documents by their odds ratio for relevance
- Approximating probability of d by product of the probabilities of individual terms in d
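The two bullets above can be written out as follows. This is a sketch of the standard derivation under a term-independence assumption; the symbols \bar{R} (the event "not relevant") and x_t (the Boolean indicator that term t occurs in d) are notation introduced here for illustration.

```latex
% Odds of relevance, rewritten with Bayes' rule
\frac{\Pr(R \mid q, d)}{\Pr(\bar{R} \mid q, d)}
  = \frac{\Pr(R \mid q)}{\Pr(\bar{R} \mid q)}
    \cdot
    \frac{\Pr(d \mid R, q)}{\Pr(d \mid \bar{R}, q)}

% Approximating the probability of d by a product over its terms
\Pr(d \mid R, q) \approx \prod_{t} \Pr(x_t \mid R, q),
\qquad
\Pr(d \mid \bar{R}, q) \approx \prod_{t} \Pr(x_t \mid \bar{R}, q)

% The first factor does not depend on d, so ranking uses
\frac{\Pr(R \mid q, d)}{\Pr(\bar{R} \mid q, d)}
  \;\propto\;
  \prod_{t} \frac{\Pr(x_t \mid R, q)}{\Pr(x_t \mid \bar{R}, q)}
```

Because the prior odds Pr(R | q) / Pr(\bar{R} | q) are the same for every document, only the product of per-term ratios affects the ranking.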
Bayesian Inferencing - Bayesian inference network for relevance ranking. A document is relevant to the extent that setting its corresponding belief node to true lets us assign a high degree of belief in the node corresponding to the query.
- Mappings between terms are specified manually to approximate concepts.
Bayesian Inferencing (contd.) - Four layers
- Document layer
- Representation layer
- Query concept layer
- Query
- Each node is associated with a random Boolean variable, reflecting belief
- Directed arcs signify that the belief of a node is a function of the beliefs of its immediate parents (which in turn depend on their own parents, and so on)
Bayesian Inferencing systems - Layers 2 and 3 (representation and query concept) are the same in basic vector-space IR systems
- Verity's Search97
- Allows administrators and users to define hierarchies of concepts in files
- Estimation of relevance of a document d w.r.t. the query q
- Set the belief of the corresponding node to 1
- Set all other document beliefs to 0
- Compute the belief of the query
- Rank documents in decreasing order of belief that they induce in the query
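The estimation procedure above can be sketched as follows. This is an illustration only: the noisy-OR rule used to combine parent beliefs, the link weights, and all function and node names are assumptions, since the slides do not specify how a node's belief is computed from its parents.

```python
def noisy_or(parent_beliefs, weights):
    """Belief of a node from its parents, using a noisy-OR rule (assumed)."""
    p_off = 1.0
    for parent, w in weights.items():
        p_off *= 1.0 - w * parent_beliefs.get(parent, 0.0)
    return 1.0 - p_off


def query_belief(doc_id, doc_to_rep, rep_to_concept, concept_to_query, query_node="q"):
    """Belief induced in the query node when only doc_id is 'true'.

    Each link table maps child node -> {parent node: link weight}.
    """
    # Set the belief of the chosen document to 1; all others default to 0.
    layer = {doc_id: 1.0}
    # Propagate: document -> representation -> query concept -> query.
    for links in (doc_to_rep, rep_to_concept, concept_to_query):
        layer = {node: noisy_or(layer, parents) for node, parents in links.items()}
    return layer[query_node]


def rank(doc_ids, doc_to_rep, rep_to_concept, concept_to_query):
    """Rank documents in decreasing order of the belief they induce in the query."""
    def score(d):
        return query_belief(d, doc_to_rep, rep_to_concept, concept_to_query)
    return sorted(doc_ids, key=score, reverse=True)


if __name__ == "__main__":
    # Hypothetical four-layer network with two documents.
    doc_to_rep = {"car": {"d1": 0.8}, "cat": {"d2": 0.8}}
    rep_to_concept = {"vehicle": {"car": 1.0}, "animal": {"cat": 1.0}}
    concept_to_query = {"q": {"vehicle": 0.9, "animal": 0.1}}
    print(rank(["d2", "d1"], doc_to_rep, rep_to_concept, concept_to_query))
```

Evaluating one document at a time, with every other document belief at 0, keeps each score independent of the rest of the collection.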