Information retrieval is the process of obtaining information system resources that are relevant to an information need from a collection of those resources. It involves searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds. An information retrieval process begins when a user or searcher enters a query into the system. Queries are formal statements of information needs, for example, search strings in web search engines. In information retrieval, a query does not uniquely identify a single object in the collection. Instead, several objects may match the query, perhaps with different degrees of relevance.
An information retrieval system can be defined as a software program that deals with the organization, storage, retrieval, and evaluation of information from document repositories, particularly textual information. The system helps users find the information they require but does not explicitly return answers to their questions. Instead, it reports the existence and location of documents that might contain the required information. Documents that satisfy the user's requirement are called relevant documents. A perfect IR system would retrieve only relevant documents.
Some aspects of ad-hoc retrieval that are addressed in IR research include how users can improve the original formulation of a query with the help of relevance feedback, how to implement database merging, how to handle partly corrupted data, and which models are appropriate for each of these tasks. Mathematically, a retrieval model consists of a representation for documents, a representation for queries, a modeling framework for documents and queries together with the relationship between them, and a similarity function that orders the documents with respect to the query.
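The components of a retrieval model can be illustrated with a minimal sketch. It assumes a bag-of-words representation for both documents and queries and uses cosine similarity as the ordering function. These are illustrative choices, not the only possible ones, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def represent(text):
    """Represent a text as a bag of words: lowercase term -> count."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (doc_id, score) pairs ordered by decreasing similarity."""
    query_vector = represent(query)
    scored = [
        (doc_id, cosine_similarity(query_vector, represent(text)))
        for doc_id, text in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy collection, used only for illustration.
documents = {
    "d1": "information retrieval systems rank documents",
    "d2": "boolean queries match documents exactly",
    "d3": "cooking pasta at home",
}

ranking = rank("rank documents", documents)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

Here several documents match the query to different degrees: `d1` shares both query terms, `d2` shares one, and `d3` shares none, so they are ordered in that sequence.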
Types of information retrieval (IR) models include:
- Boolean Model
- Vector Space Model
- Probabilistic Model
- Language Model
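As an illustration of the first model in the list, the following sketch shows one simple way Boolean retrieval could be implemented. It assumes an inverted index that maps each term to the set of documents containing it, and it supports only AND and OR over whitespace-separated terms. The helper names and the sample collection are hypothetical.

```python
from collections import defaultdict


def build_index(documents):
    """Build an inverted index: term -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, terms):
    """Documents that contain every one of the given terms."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


def boolean_or(index, terms):
    """Documents that contain at least one of the given terms."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))


# Hypothetical toy collection, used only for illustration.
documents = {
    "d1": "information retrieval systems rank documents",
    "d2": "boolean retrieval matches documents exactly",
    "d3": "cooking pasta at home",
}

index = build_index(documents)
both = boolean_and(index, ["retrieval", "documents"])
either = boolean_or(index, ["boolean", "pasta"])
print(sorted(both), sorted(either))
```

Unlike a ranked model, this approach returns an unordered set: a document either satisfies the Boolean expression or it does not.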
The evaluation of an information retrieval system is the process of assessing how well a system meets the information needs of its users. In general, measurement considers a collection of documents to be searched and a search query.
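One common way to make such a measurement concrete, although the text above does not name it, is to compute precision and recall for a single query. The sketch below assumes that the set of relevant documents for the query is already known. The document identifiers and relevance judgments are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    precision = fraction of retrieved documents that are relevant
    recall    = fraction of relevant documents that were retrieved
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical system output and relevance judgments for one query.
retrieved = ["d1", "d2", "d4"]
relevant = ["d1", "d2", "d3", "d5"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under these definitions, a system that retrieves only relevant documents has a precision of 1, whatever its recall.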