INDEXER:
A Learning Companion For Document Browsing.



Abstract


When people access the web, their activities fall into two broad categories: they are either searching for specific information, or they are browsing (Marko & Yoav 1995). There have been several efforts to support both browsing and searching. Searching, though narrower than browsing, can still be time-consuming given the current exponential growth in the volume of information accessible over the Internet. This is a proposal to develop a system that helps narrow down the domain of the search. The system "learns" to identify words that the user will find interesting. It first presents the user with a list of keywords taken from the document in the master window. The user evaluates each keyword as RELEVANT or NOT-RELEVANT, and these ratings serve as the standard for the learning system, which maintains a "user preferences" file. The system then creates a clickable list of "interesting words" in a slave window. The user can open documents one after the other and read the context "relevant" to him simply by clicking on a keyword (chosen by the trained system) in the slave window. The goal of this system, with its summarizer and indexer, is to improve browsing capabilities within a document that is itself the result of a search. It thus increases the value of the HTML document as a whole.
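
The feedback loop described above can be illustrated with a minimal sketch. It is not the proposed implementation: the class `UserPreferences`, the functions `interesting_words` and `contexts`, and the simple vote-counting rule are all assumptions made for illustration, standing in for the learning system and its "user preferences" file.

```python
import re
from collections import defaultdict


class UserPreferences:
    """Hypothetical in-memory stand-in for the "user preferences" file."""

    def __init__(self):
        # word -> net score (+1 per RELEVANT rating, -1 per NOT-RELEVANT)
        self.votes = defaultdict(int)

    def record(self, word, relevant):
        """Store one RELEVANT / NOT-RELEVANT rating given by the user."""
        self.votes[word.lower()] += 1 if relevant else -1

    def is_interesting(self, word):
        """Assumed rule: a word is interesting if its net score is positive."""
        return self.votes.get(word.lower(), 0) > 0


def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[A-Za-z]+", text.lower())


def interesting_words(text, prefs):
    """Return the interesting words of a document, in order of first use.

    This is the list that would be shown as clickable entries
    in the slave window.
    """
    found = []
    for word in tokenize(text):
        if prefs.is_interesting(word) and word not in found:
            found.append(word)
    return found


def contexts(text, keyword):
    """Return the sentences that contain the clicked keyword."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s for s in sentences if keyword.lower() in tokenize(s)]
```

A possible use: after the user rates "indexer" as RELEVANT and "window" as NOT-RELEVANT, `interesting_words(document, prefs)` lists only "indexer", and `contexts(document, "indexer")` returns the sentences to display when that keyword is clicked. A real learner would need to generalize beyond the rated words; the vote count here only shows where the ratings enter the system.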

Solution Approach:

Future Work

    1. Integrating learning capacity into INDEXER.
    2. "Stemming of words" will be included as part of INDEXER and will be based on the heuristics used by the 'SMART' system.
    3. It would be nice to have a pop-up window to set up selection heuristics and display learning, but for now this will be done transparently.
    4. More advanced visual techniques for choosing keywords (TAU system, Swaminathan 1993) can be added as a further development, but may not be included in this project.
    5. "Synonym analysis" can also be added as a feature of the system.
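
To make the stemming item concrete, here is a minimal sketch of suffix stripping. It does not reproduce the heuristics of the 'SMART' system; the suffix list, the minimum stem length, and the names `naive_stem` and `group_by_stem` are assumptions chosen only to show how word variants could be merged under one keyword.

```python
# Assumed suffix list, checked in order; not taken from the SMART system.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip one common suffix, keeping at least three characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def group_by_stem(words):
    """Group word variants under their shared stem, preserving input order."""
    groups = {}
    for word in words:
        groups.setdefault(naive_stem(word), []).append(word)
    return groups
```

With this rule, "indexing", "indexed", and "indexes" all reduce to "index", so a single rating from the user could apply to every variant. Such naive stripping makes mistakes on many English words, which is one reason to rely on established heuristics instead.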