Visualization of database structures for information retrieval

This paper describes the Book House system, which is designed to support children's information retrieval in libraries as part of their education. It is a shareware program available on CD-ROM or floppy disks, and comprises functionality for database searching as well as for classifying and storing book information in the database. The system concept is based on an understanding of children's domain structures and their capabilities for categorization of information needs in connection with their activities in schools, in school libraries or in public libraries. These structures are visualized in the interface by using metaphors and multimedia technology. Through the use of text, images and animation, the Book House encourages children even at a very early age to learn by doing in an enjoyable way, which plays on their previous experiences with computer games. Both words and pictures can be used for searching; this makes the system suitable for all age groups. Even children who have not yet learned to read properly can, by selecting pictures, search for and find those books they would like to have read aloud. Thus, at the very beginning of their school life, they can learn to search for books on their own. For the library community, such a system will provide an extended service which will increase the number of children's own searches and also improve the relevance, quality and utilization of the book collections in the libraries. A market research report on the need for an annual indexing service for books in the Book House format is in preparation by the Danish Library Centre A/S.


Introduction
It is generally agreed that current information retrieval systems in both public and school libraries present major problems for child users. These difficulties are particularly perplexing today when information-retrieval skills play a major role both in education and, increasingly, in society as a whole. One of the barriers against providing children with information retrieval skills is the lack of programs for casual and novice users; other barriers exist between public libraries, school libraries and education in schools. The purpose of this paper is to describe how a new library system -the Book House -based

School library use in education
Due to the many developments and innovations in teaching methods, the tasks to be performed by the school library, and expectations about its support of pedagogical activities, have gained considerable attention. Project-related teaching, and work in groups around shifting specific themes and subjects, have forced students to change their way of working, and the school library has had to do likewise in order to support this change. Projects and theme work are often cross-disciplinary in their subject matter, and involve several school subjects. In the higher grades and levels, themes are sometimes chosen by the pupils themselves. Thus information retrieval has become a necessary work activity for students as part of this expanded project and theme work. They must learn to use the school library in the classroom, and gain easy and efficient access to up-to-date information. In many countries, school libraries have become the central place for pupils' information-seeking tasks. In keeping with the times, the school library must be able to provide students with their own easy access to electronic information-retrieval (IR) systems with comprehensive database coverage. The current emphasis on key phrases such as 'self-action', 'teach and use' and 'learn by doing' reflects the approach under development in all modern pedagogical activities having to do with school libraries. Yet despite the widespread introduction of computers in modern education programmes, school libraries have until now lacked a tool which could be used by the students themselves.

Teaching computer use and skills
Recently, it has become mandatory in many European schools to integrate education in computing into the curriculum of the different disciplines taught. A shift in emphasis has taken place from teaching computing as a single isolated domain towards teaching how computers can be used to support problem-solving in different kinds of tasks and settings. Several implications arise from this shift:
(a) Integration of teaching the use of computers into traditional subjects, such as reading and learning about the mother tongue, social and natural sciences, etc., leads to their use as a natural tool for problem-solving activities related to learning the subject content of a domain.
(b) Teaching computing to provide knowledge about the application of computers in specific task-functions and in actual work-domains implies crossing the borderlines between the subjects taught in the classroom context and actual work places outside the school.
Programs are therefore needed which can serve both as the student's own learning tool for solving problems, writing papers and exercises, and as a professional support vehicle in a work-domain familiar to students or closely related to children's work in the classroom.

Children's problems with catalogues
Several problems are encountered by children in connection with their comprehension and full exploitation of a computer system for retrieving information. Such systems were originally developed to meet librarians' needs for keeping records of their book-stock by providing computerized access to automated card catalogues. Today, they are used as a collection of access tools supporting the complete range of disciplines relevant to a library, as well as self-service information-seeking support tools for untrained and inexperienced casual library users. Despite continuing re-designs, including the incorporation of online user-assistance in the form of menus, help displays, informative error messages and browsing facilities, etc., it is generally agreed that there still exist major problems for users who are not experts in a field or experienced searchers, or are not generally familiar with how libraries work (Hancock-Beaulieu, 1989;Hildreth, 1989;Markey, 1990, Pejtersen, 1992 a and. These problems are mainly related to the large number of mapping processes required of the user during the search process in order to convert and translate his/her knowledge structures and language in accordance with the constraints imposed by the search language and classification schemes used both to classify and to retrieve information in IR systems. The only stable structure typically available in IR systems to guide users is the hierarchical structure of a thesaurus and/or classification scheme, which will give most youngsters problems. Search languages require advanced verbal skills, and classification schemes require advanced categorizing skills and/or domain knowledge obtainable only through exhaustive training. The use of IR systems requires capabilities for making appropriate deductions and inferences about the hierarchical category memberships of the user's own need, as well as formulating and modifying systematic combinatorial searches in generic terms. 
Although some work indicates that, at the age of four, children's language capability includes some categorical structures (Borgman et al., 1989), and that, at an early age, they do have some ability to process hierarchically structured information (Keil, 1979), they do not possess skills for making hypotheses and predictions in order to plan a search successfully in conventional information systems.
Piaget's work (Piaget, 1958) suggests that children's capabilities for structural thinking and abstract reasoning can be divided into several developmental stages. Three of these have been of relevance in the design context of the Book House for young users from age six to sixteen. Independently of discussions of specific details of Piaget's work (Boden, 1980; Carey, 1985), it is obvious that the capability for making appropriate deductive inferences based on category membership, the understanding of formal logic structures and the use of well-defined categories will not be available skills in young children to the extent needed in order to operate within a comprehensive, formal library-classification scheme with its associated artificial and standardized retrieval language. Even for older children who have well-developed abilities for making deductive inferences from hierarchical library schemes, operational use of these schemes will require domain knowledge beyond their mental capacity.
In short, online catalogues require different types of knowledge and skills that children do not yet possess, in spite of their growing experience with personal computers, and familiarity with other modern media -all of which have contributed to their motivation for using a compatible form for computerized IR. As a result, usable IR systems must not only be competitive with existing, familiar computer programs and systems accepted by children, but must also minimize the need for excessively demanding mental juggling among incompatible languages and categories.

Mapping database structures to children's categorization skills
Given these user-characteristics, many conversion and mapping problems can be avoided by using a classification scheme and a database structure which are compatible with children's ways of categorizing information. As opposed to hierarchical library classification models, which are too cognitively demanding to be learned and understood well enough by children, such a system will make it possible to make database structures and representations of book content correspond directly to children's intuitive ways of categorizing their needs. An analysis of 200 protocols reflecting children's search behaviour and ways of categorizing the domain of books, which were recorded in ten different types of libraries, revealed structures and other aspects different from traditional library classification schemes (Pejtersen, 1986 a and b, 1992 a and. These were used as the basis for the classification scheme and database structure, mentioned in paragraph 7.6 below, in the Book House system.
Since a great variety of cognitive skills in the categorization of information is likely to be found among children, the Book House system was designed to:
(a) match children's different developmental stages and problem-solving skills by supporting both young children's trial-and-error and intuitive perception-based searches, and older children's more or less systematic use of abstract relations and classes, in searching in a stable database structure compatible with their ways of categorizing information (this also guides children's learning of domain knowledge);
(b) extend their categorization skills by several different interface representations of the classification structure used in the database, each placing different requirements on users' intellectual skills and domain knowledge.
Each representation needs an interface content of classification categories and book description that is understandable by children at varying levels of perceptual and cognitive skills and training and, in addition, an interface form that is familiar enough to children to speed up their learning to search in abstract classification categories across the above-mentioned developmental stages.
In sections 7.3 to 7.6 below, four different interface representations of the same database content are shown, ranging from a graphical display of the complete classification structure to displays of a structured textual description and unstructured displays of icons.

Visualization of database with icons and metaphors
The degree of difficulty involved in problem-solving activities will not only depend on sufficiently well-developed innate structures needed for operational thinking in Piaget's terms, but also on the characteristics of the task, its requirements for precision, degree of detail and, of course, on the user's familiarity with the task situation. It is well known that everyday thinking is more likely to rely on analogies with past examples and rules of thumb than on logical abstract reasoning within clear and well-formulated categories (Rasmussen, 1986). This implies that interface displays should take advantage of these abilities by using icons or pictures with reference to familiar, preferably concrete, objects, which tap familiar situations and prompt common-sense reasoning.
First, icons used as metaphors can enable the user to transfer intuition and skill from a familiar situation in another context to the functionality of the IR system. In order fully to support users' ability to make analogies to familiar objects and events, icons should represent the database system, the system's retrieval functionality including options for user actions, and the database content. Usually, the interface menus of IR systems do not reveal much of the basic semantic structure of the database content. Even if they do, users will often still be in trouble since the structure used for organizing the semantics of the database content will most likely not match their own way of structuring the concepts used to reflect their reading need. Genuine support is best given by providing a visualization of the structure imposed on the database content through several types of icon reflecting the categories and principles actually used intuitively by children. The Book House interface is tailored to children's abilities to categorize and understand concepts, as determined from their associative responses to pictures and words in a controlled, multiple-choice association test (Pejtersen, 1990).
Secondly, it is well known that pictures are easier and faster to learn than text. They can also be perceived independently of the users' language capabilities. They are faster to 'read' than text since one icon can communicate very complex messages that would take much language to explain -a picture can say much more than a thousand words. Pictures can thus efficiently express subject information and conceptual relationships. They are easier to remember and to recognize once learned, which is advantageous since recognition of information is easier than recall of information. Visually, we can process a far larger amount of information than in any other form. We can, for instance, recognize a well-known face in a crowd, while we might have problems in recalling and verbally describing this face. In the same way, it is easier to use a map of a town than process a verbal instruction.

Book House Search: database retrieval
There are two entries into the Book House system: Book House Search for searching information from the database, and Book House Write for indexing and storing new information in the database.
Book House Search supports several repetitive decision tasks: formulation of a user's problem and need, choice of search strategy, execution of search, evaluation of relevance of retrieved books compared with user-need, revision of search, etc. The Book House assists children's accomplishment of retrieval and indexing tasks without assistance from adults. Several types of help are available, as well as facilities for experimenting and generally playing around with the system in a trial-and-error mode.

Book House metaphor
As they learn, children are likely to try to build a mental model of their problem-space and its representation in the database system in order to understand and predict the content and functionality of the system. Spavold (Spavold, 1990) found that children would spontaneously make images of databases during their use in the classroom. Our evaluation studies offer ample evidence that the choice of a Book House metaphor enables children to make easy analogies between their everyday life experiences and corresponding objects and tools in a house, and their functionality (Goodstein and Pejtersen, 1989). Even very young children have been able to use the system and achieve successful results.
When the user explores the screen with the mouse, a text line at the bottom follows the mouse movements and gives two kinds of messages. If the mouse is not in an area with selectable objects, the text 'Move the mouse and see what you can do' appears. When the mouse points at a selectable object, a square is drawn around the icon to evoke the user's attention and to indicate the mouse-sensitive limits of the object. A message appears telling the user to press the mouse button, with information on what will happen next. These help texts support interactive, self-activated 'learning by doing' during the user's exploration of information on the screen. They also support first-time users in understanding the meaning of each icon: prototype testing with users has shown that not all children will be able to interpret the meaning of icons and predict a unique retrieval action associated with a particular icon.

Figure 2
The Book House system contains a database with children's fiction, a database with children's non-fiction, and a database that combines fiction and non-fiction.

Figure 3
The four strategies are (Figure 3, from the left): (a) the associative search by analogy, based on prototypes for books similar to the book the user has in mind; (b) the intuitive strategy of browsing through icons, for recognizing a perhaps unidentified need by means of pictures on a 'flip-over' projector; (c) the analytical subject search strategy, represented as objects on a work table; and (d) browsing shelves, which gives access to browsing through the book stock as a means of getting inspiration on what to read.

Find books similar to a known book
When a search by analogy is chosen from the left side of the strategy room, a model book is displayed with two options on the front cover: Author and Title. Based on the user's specification of his/her model book, the system uses the corresponding model-book description on the screen as the basis for finding other similar books in the database by means of probabilistic techniques.
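The idea of ranking books by their similarity to a model book can be illustrated with a minimal sketch. The paper does not specify the probabilistic technique used, so the sketch substitutes a simple keyword-overlap (Jaccard) score as a stand-in; the class `BookDescription`, the function names and the sample titles are all hypothetical and are not taken from the Book House implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BookDescription:
    """Hypothetical record: a book and the index terms describing it."""
    title: str
    author: str
    keywords: frozenset


def overlap(a, b):
    """Jaccard overlap of two keyword sets, between 0.0 and 1.0.

    Assumption: a stand-in for the unspecified probabilistic technique.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_books(model, collection, limit=5):
    """Rank the other books in the collection by overlap with the model book."""
    scored = []
    for book in collection:
        if book is model:
            continue  # the model book itself is not a search result
        score = overlap(model.keywords, book.keywords)
        if score > 0:
            scored.append((score, book))
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1].title))
    return [book for _, book in scored[:limit]]


# Hypothetical sample data, for illustration only.
WHALE_SONG = BookDescription(
    "Whale Song", "A. Author",
    frozenset({"whales", "pollution", "sea", "exciting"}))
LAST_WHALE = BookDescription(
    "The Last Whale", "B. Author",
    frozenset({"whales", "sea", "exciting", "extinction"}))
CITY_FOOTBALL = BookDescription(
    "City Football", "C. Author",
    frozenset({"sport", "city", "family"}))
OCEAN_RESCUE = BookDescription(
    "Ocean Rescue", "D. Author",
    frozenset({"sea", "pollution"}))
COLLECTION = [WHALE_SONG, LAST_WHALE, CITY_FOOTBALL, OCEAN_RESCUE]

if __name__ == "__main__":
    for book in similar_books(WHALE_SONG, COLLECTION):
        print(book.title)
```

In this sketch the user's model book plays the role of the query: books sharing no index terms with it are dropped, and the rest are offered in order of decreasing overlap.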

Figure 5
Children often use pictures on the front cover in their searches, as well as in their assessment of the relevance and value of content of books. In a significant extension of this, users can browse and search for books in the Picture Association Thesaurus by selecting among icons symbolizing the subject content of books. When searching via an icon, book meanings normally mediated by many keywords become available. Such a pictorial representation produces only associatively related concepts displayed at levels corresponding to a broad range of more or less well-defined categorization skills. No verbal skills are needed to search by selection of an icon with a mouse.

Figure 6
Requiring knowledge about the names of fields in database records when formulating a precise search query can be avoided by making the structure of the database visible using appropriate icons (Figure 6).
The poster on the wall gives access to subject-matter keywords: books about family life, quarrels, anorexia nervosa, happiness, etc. The world globe on the table and the clock on the wall give access to keywords on time and geographical places such as books about teenagers' life in Los Angeles in the nineties. The view out of the window gives access to keywords about the environment, social and professional settings, for example books about sport among Jewish children living in busy families in big cities. The masks on the wall give access to keywords related to emotional experiences, such as curiosity or excitement, evoked in the reader during the reading process. The icon showing the author sitting with his typewriter gives access to keywords concerning the author's intention to put forward ideas and/or opinions. The glasses on the table give access to keywords on the accessibility of books, their readability, language level, the age for being read aloud or read by the child him/herself, the size of the typography, etc. The card-catalogue drawers give access to keywords on traditional genres such as war stories or suspense stories.

Figure 7
When the icon browsing through book shelves is chosen in the right side of the strategy room, the system pops up with randomly chosen book descriptions and information on book content. Each time this strategy is activated, a new set of book descriptions is used (see book description immediately below in 7.8).
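The correspondence between the work-table icons (Figure 6) and the fields of a book description can be sketched as a simple lookup table. This is only an illustration: the field names, the function names and the sample records are assumptions, and the pairing of the globe with place and the clock with time is our reading of the description above rather than a documented detail.

```python
# Hypothetical field names; the icon-to-category pairs follow the text above.
WORK_TABLE_ICONS = {
    "poster": "subject matter",
    "globe": "place",                      # assumed: globe stands for place
    "clock": "time",                       # assumed: clock stands for time
    "window": "setting",
    "masks": "emotional experience",
    "author at typewriter": "author's intention",
    "glasses": "accessibility",
    "card-catalogue drawers": "genre",
}


def field_for_icon(icon):
    """Return the description field that an icon gives access to."""
    try:
        return WORK_TABLE_ICONS[icon]
    except KeyError:
        raise ValueError(f"no such icon on the work table: {icon!r}") from None


def keywords_behind_icon(icon, descriptions):
    """Collect the keywords a user could choose from after selecting an icon."""
    field = field_for_icon(icon)
    found = set()
    for description in descriptions:
        found.update(description.get(field, []))
    return sorted(found)


# Hypothetical book descriptions, for illustration only.
SAMPLE_DESCRIPTIONS = [
    {
        "title": "Book A",
        "subject matter": ["family life"],
        "emotional experience": ["exciting"],
        "genre": ["suspense stories"],
    },
    {
        "title": "Book B",
        "emotional experience": ["curiosity", "exciting"],
        "genre": ["war stories"],
    },
]

if __name__ == "__main__":
    print(keywords_behind_icon("masks", SAMPLE_DESCRIPTIONS))
```

The point of the design is visible in the sketch: the user never types a field name, because selecting an icon supplies it.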

Open book with book description
Book descriptions are structured according to the classification system to enable the user to see the correspondence between his/her selected category of terms and the contents of each book. The user can then browse and turn pages one after the other in both directions by clicking on the lower red corners of the book showing one arrow. The user can modify the current need by adding search terms through a selection of red keywords from the book description, or removing terms with the eraser icon. When keywords are selected with the mouse, they are automatically combined with Boolean operators. The book case shows the number of retrieved books.
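How a need built up from selected keywords narrows the set of retrieved books can be sketched as follows. The paper says only that selected terms are combined with Boolean operators, so this sketch assumes a plain AND over all selected terms; the class `CurrentNeed`, its methods and the sample books are hypothetical.

```python
class CurrentNeed:
    """Hypothetical model of the user's current need as a list of search terms."""

    def __init__(self, descriptions):
        # descriptions maps a book title to the set of keywords describing it.
        self._descriptions = descriptions
        self.terms = []

    def add_term(self, term):
        """Add a keyword, as when a red keyword is selected with the mouse."""
        if term not in self.terms:
            self.terms.append(term)

    def erase_term(self, term):
        """Remove a keyword, as with the eraser icon."""
        if term in self.terms:
            self.terms.remove(term)

    def retrieved(self):
        """Titles whose keywords include every selected term (assumed AND)."""
        wanted = set(self.terms)
        return sorted(
            title
            for title, keywords in self._descriptions.items()
            if wanted <= keywords
        )

    def count(self):
        """Number of retrieved books, as displayed by the book case."""
        return len(self.retrieved())


# Hypothetical sample data, for illustration only.
SAMPLE_BOOKS = {
    "Book A": {"exciting", "blue", "moon", "whales", "pollution"},
    "Book B": {"exciting", "war stories"},
    "Book C": {"blue", "whales"},
}

if __name__ == "__main__":
    need = CurrentNeed(SAMPLE_BOOKS)
    need.add_term("exciting")
    need.add_term("whales")
    print(need.count(), need.retrieved())
```

With no terms selected, every book matches in this sketch; each added term can only shrink the result, and erasing a term widens it again.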

Search example
A realistic search query culled from our user studies illustrates a typical use of the system. A child of twelve needs to find books for her Natural Science class; she wants to choose a topic for group work on 'pollution'. She wants to find 'an exciting book with a blue cover and with a moon and whales on the front cover'. She does not remember the title or the author's name, since the source of information about the book is a friend from another class who showed it to her. Her motivation for looking for this particular item is the recommendation by her friend, who had been told by her teacher to read this book on pollution for a mother-tongue lesson.
She selects the door action icon and enters the Book House. She selects the room in the middle to search in the database with fiction and non-fiction for children. Having a fairly good idea about the characteristics of the desired book, she chooses the work table with classification access. First, she selects the masks on the wall to search for exciting books, then the book on the table to search for colours and pictures on the front cover. She looks at the result of the search shown as book descriptions. Having checked the emotional experience 'exciting' to be gained by reading the books and the cover designs 'blue, moon and whales', she realizes that the subject is on 'whales at the point of extinction due to pollution of the sea'. She decides to take a print-out and uses the printer icon. She continues to browse retrieved books and uses the book in hand icon to put aside interesting candidates to be printed at the end of the search.

Book House Write: classification and editing the database
New books can be added to the database and existing books can be edited. A typical task of storing information about a book will involve indexing, i.e. the analysis and representation of the book to be searched for and displayed in the Book House Search system. This activity includes: check whether the book is already in the database; if not, select a card to fill in a book description; skim the book, possible reviews and other material; formulate a new description using standard categories; save the book description in the database, which is then updated.
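The storing steps listed above (check for an existing entry, fill in a card, save) can be sketched in a few lines. The class `BookDatabase`, the card fields and the sample entry are hypothetical; the only rule taken from the paper is that author and title are the mandatory fields of a card, as stated later in the text.

```python
class BookDatabase:
    """Hypothetical in-memory store of book descriptions ('cards')."""

    def __init__(self):
        self._cards = {}

    @staticmethod
    def _key(author, title):
        # Case-insensitive key so that the same book is not stored twice.
        return (author.strip().casefold(), title.strip().casefold())

    def contains(self, author, title):
        """Step 1: check whether the book is already in the database."""
        return self._key(author, title) in self._cards

    def save_new(self, card):
        """Final step: save a filled-in card as a new book description."""
        author = card.get("author", "").strip()
        title = card.get("title", "").strip()
        if not author or not title:
            raise ValueError("author and title are mandatory")
        key = self._key(author, title)
        if key in self._cards:
            raise KeyError(f"already in the database: {title!r}")
        self._cards[key] = dict(card)  # store a copy of the card

    def __len__(self):
        return len(self._cards)


if __name__ == "__main__":
    database = BookDatabase()
    # Hypothetical card; all fields other than author and title are optional.
    card = {
        "author": "A. Author",
        "title": "Whale Song",
        "emotional experience": ["exciting"],
    }
    if not database.contains(card["author"], card["title"]):
        database.save_new(card)
    print(len(database))
```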

Blank card for a new book
When adding new descriptions to the database, the blue New Book card catalogue is chosen, with blank cards and labels for empty categories ready to be filled in. When editing existing book descriptions, the yellow Titles catalogue or the green Authors catalogue gives access to a known item. The red All catalogue gives access to browsing through all the books in the database for examples of book descriptions. This is a useful option relevant in unfamiliar situations since it provides inspiration on how to solve a classification problem.
Figure 8
The basic principle is that a book description can be made with very little effort, and practically no knowledge about classification or indexing is needed. The intention has been to make children progress gradually in their learning about the storage of information in the database.

Figure 9
Changing descriptions of books already stored in the database will require the user to fetch the book from the database by selecting the action icons of Author or Title card catalogues at the top of the screen. A yellow card filled in with book descriptions already saved in the database is then displayed, and the user can now work with this card in exactly the same way as with the white card.

Classification example
A school class in the 6th grade (children of about 11 years old) at the Peder Lykke School in Copenhagen had a one-month project in which the mother-tongue teacher co-operated with the school librarian. The overall purpose of the project was to teach children to review books and to prepare their reviews by means of a computer for subsequent database retrieval. Children chose books from the school library according to their own interests, and their task was to make a book review of their own books for the benefit of the other children in the class. They prepared their review on a paper form corresponding to the blank card for new books, and later used the card in the Book House to store their reviews. This took about 15 minutes. Teaching in advance of and during the project involved an introduction to literary text analysis by explaining the meaning of the categories in the form. They were also taught about how to prepare data for information retrieval through the selection of the keywords necessary to communicate the most important aspects of a book. The class created a database with 45 books, and enjoyed retrieving each other's books as well as accessing another database, which they found, created by a 9th grade class (children of about 14 years old) (Jensen and Acker, 1992 a, b and c).

Integrating IR and computer use in libraries and schools
Recently, many efforts have been made to integrate the traditional concept of a library as a place for information retrieval of books with the role of the school in the training of young people to be skilled and knowledgeable in the use of computers. One approach is the extension of the library concept into a media centre in public and school libraries through the development of systems comprising all types of media: multimedia technologies, computer programs, video and tape recordings, etc., and building a collection of programs for loan in the media library.
Several experiments have been carried out to try to integrate school-teaching of traditional subjects with teaching about work in a library, i.e. its book collection and its use. For instance, this has been done by training young people in book-acquisition tasks, and having them take over the actual task of buying books for the library for a couple of months during mother-tongue classes. Through the review process, they learned to generate literary-quality assessments, and to formulate their reading experiences. Children actually learned the trade-off involved in carrying out real-life tasks, and they enjoyed for once not being encumbered by the usual 'let's pretend' type of project. Activating co-operation between school children, teachers and librarians in serious, real-life tasks had positive effects on children's reading habits and their use of the library (Weinrich, 1991; Hussmann, 1992). Such initiatives can continue throughout life-long educational activities, and can carry with them an active contact with the local community and its organizations involved in communicating knowledge (Kinnell, 1992). Working with the Book House in integrated activities in both a library and a school setting has given many valuable experiences about the effects of the system on children. Apart from the obvious purpose of teaching computer skills, there is evidence that the system has other advantageous effects:

• It encourages children to do their own searches in public and school libraries, and to do them more effectively, in direct response to their daily information needs for the preparation of class-work and leisure activities. The system has much in common with computer games and is perceived by children to be as exciting a challenge. It is easy to use for the first-time user, but is not tedious or too simple for expert users.
With the current database of children's books, it is suitable for many different types of subject searches. The browse-through-icons option is particularly useful for children who do not master the reading of lists of keywords. That this is feasible has been demonstrated by the Book House evaluation process, where children proved to be more than enthusiastic about doing their own searches on computers in libraries (Goodstein and Pejtersen, 1989).
• It increases the number of books read, and widens reading tastes. Evaluation of the Book House system in libraries, and results from the experiments mentioned here, clearly indicate that use of the Book House in school libraries and in class work will increase the number of books read and the number of visits to the library, change children's reading patterns, and widen their taste as they become aware of hitherto neglected literature. Special care was devoted to the development of a database for fiction subject-retrieval in order to increase the use of the fiction stock in public and school libraries. Indexing of specific subjects in fiction is usually not a service offered to libraries, mainly due to the many problems that have been encountered in earlier attempts to develop classification schemes for fiction. The retrieval and promotion of fiction have become even more difficult because of the high number of new published books each year, since (ideally speaking) these have to be read and remembered by the librarians who are the traditional mediators of borrowed fiction. Not all books in the library stock can be read, nor remembered when read, during actual dialogues with users.
• It teaches how to formulate and solve an information-retrieval problem, improving the education and training of children in storing and retrieving information via a uniform platform, i.e., a system common for both the teaching situation in the classroom and the retrieval situation in the library. Book House Write can be an important facility for linking language from text analysis and evaluation to language used for database development and information retrieval. Dealing with terminology problems between school class-work and database language has been reported by Borgman (1991) as an important problem.
Children from the 4th grade (nine years old) can make their own database, and search on their own book descriptions with automatically converted red keywords, without having to go through all the categories and fields of a blank card. Only author and title are mandatory. Children's subjective experiences and principles adopted for the classification of a book can be communicated to other children in a special field for personal comments. Older children can try to develop a controlled vocabulary and begin to understand the differences between free, natural-language text and a controlled vocabulary, as well as the important roles of precision and consistency in handling language in connection with database construction. This experience will not only increase knowledge about the library; it can be generalized to the use of computers for IR and indexing in many other fields than the library. Experiments by Spavold (1990) involving the use of a database in classroom teaching confirmed that children who used it both for data input and for output/retrieval understood the system's functionality and search procedures better, and could formulate more diverse queries, than those children who used the database only for searching. It helped in the learning of interrogation and formulation of good queries. Most children enjoyed both coding and data entry, as well as searching for information. There were no preferences for more problem-solving activity (involved in retrieval) as contrasted with the more repetitive, routine and yet high-precision task involved in the preparation and coding of data. This has been confirmed in experiments with the Book House system.
• It improves skills for text analysis and evaluation. Two traditional goals which teachers and school librarians have in common can control the use of the Book House: to motivate children to read books of a quality not typical for their age group, and to teach children literary analysis as a basis for generating book reviews. Work then takes on a practical goal, and has an effect that is physically visible and useful for the whole class.
In a Book House Search and Write project for the 9th grade (children of about 14 years old), the teacher and school librarian chose 20 normally not very popular books, a mixture of adult books that might be read by teenagers and books of high quality for teenagers. They asked the students to classify these books for the Book House database.
The purpose was not only to draw their attention to these books, but also to create a database for retrieval in the school library that would draw the attention of other children of the same age to these particular types of book. Thus the database serves as a communication tool which widens children's reading perspectives, and broadens their view from purely personal reading experiences to more of a critical-analysis approach. As a tool for the analysis of books from about the 4th to 10th grade (children from 9 to about 16), the classification categories can be used to prompt a number of questions about a book.
• It supports cross-disciplinary teaching, and the existing custom in education of integrating fiction and non-fiction in those subject fields where such an integration is appropriate -for instance, history, social science, natural science -by allowing similar and specific subject accesses to both fiction and non-fiction.

Present and future
The Book House is now available as two shareware programs. One is for the retrieval and indexing of fiction for children and adults in public libraries, with a database content of 3,500 novels from different countries which have been translated into Danish. The second is for school libraries, and covers the retrieval and indexing of both fiction and non-fiction with a database containing both children's fiction and non-fiction. It was released in the summer of 1993 by Apple on a CD-ROM, and by RISØ on floppy disks, and will be distributed to Scandinavian school libraries in the near future. The Danish Library Centre A/S, which provides Danish libraries with bibliographical data, will conduct market research this Autumn, 1994, on the number of libraries that will subscribe to a centralized indexing service which supplies books with new data in a format that matches the classification of books in the Book House database. If the response from libraries is sufficient in number, the delivery of data to the Book House system will become a service similar to other data deliverables to Danish libraries. The project is currently being continued by extending the use of multimedia, and by the development of an adaptive, multimedia interface including interactive user-modelling. In addition, the Book House project is being expanded to a Nordic Book House System with participation from the other Nordic countries.