Friday, September 7, 2007

What is Information Retrieval?

In broad terms, information retrieval (IR) refers to the retrieval of unstructured data. In practical terms, it means retrieving text documents from a repository such as a library collection. The term IR was coined at a time when text was the only or primary mode of information storage: documents, books, and papers, to name a few. For this reason, most of the literature available today is concerned with the retrieval of documents, although this is changing.

The nature and scope of the IR domain have changed in recent times because of the wide variety of media in which data can now be stored. Sophisticated computing devices allow users to create and store many different media formats, and a growing number of users have access to such devices. In addition, with the increasing digitization of every kind of media, such as photos and music, it is imperative for the IR domain to develop newer approaches for searching non-textual items such as pictures.

Consequently, new specialized domains such as video retrieval, music retrieval, and image retrieval have emerged, and systems for them are being developed and deployed widely.

Most importantly, the size, scope, and nature of the ever-expanding web have posed a new challenge for the IR community. The uncontrolled and open nature of the web, together with the innumerable formats of data available on it, challenges traditional IR approaches. This has led to a new domain called Web information retrieval (WIR). The well-known search engines are examples of WIR.

In summary, the IR community has new and great challenges ahead. Mobile messages and emails are also becoming part of a larger definition of data. Such a corpus of multimedia data is the next frontier waiting for the IR community.
