On the WWW there are no commitments whatsoever regarding the persistency or classification of content. This makes it very easy to make information public, but harder to ensure that the information will be found by the targeted audience or that it can be found again. This commitment-less philosophy has probably been of profound importance in making the WWW grow quickly, but it makes it very difficult to build services that use the information presented on the WWW.
Usually, a library will also have a scope of interest, i.e. it maintains a collection of items for a particular purpose. The reason for this need not primarily be to reduce costs. Maintaining a collection and making it searchable means that one has to maintain some dictionary or more elaborate structure, e.g. some topic hierarchy. The larger this structure gets, the harder it will be to keep it consistent and to make sure the maintainers use it consistently (e.g. how do we decide which keywords appropriately describe a document?).
If we see the library as a collection of items for a particular purpose (e.g. a department's book collection or even a broader scope, such as "contemporary Swedish literature"), one can talk about the relevance of adding an item to the collection of a certain library. This means that one can also talk about the collection as being more or less complete with regard to the topic, and the indexing as being more or less sufficient for that library's purpose. Basically, this is how libraries already tend to be organized: different libraries have different fields of expertise, but they can search the collections of other libraries. There are often efforts to merge the collections of the libraries, but this work is done in parallel with the expansion and evolution of the local collections, i.e. one does not want to sacrifice the usefulness of the current collection to the utopian goal of one single, all-encompassing collection.
Searchable structures encompassing the entire WWW (Yahoo etc.) will be incomplete, outdated and often inconsistent because of the sheer size of the subject they try to cover. (If the WWW, as a whole, is seen as a library then that scope is really very wide...). It is also difficult to assess the quality (w.r.t. accuracy of classification and "updatedness") of such broad collections. Within interest communities, smaller, topic-centred collections of resources are often maintained. This can be seen as a parallel to topic-centred libraries, although without any of the commitments usually associated with a library. Collections are usually also maintained by individual users (e.g. bookmark lists and home pages), so there is really no lower bound to what could be seen as a collection.
Is it possible to merge the two concepts, the openness of the Internet and the stability of the library, to gain advantages from both of them? Can we combine a library's characteristics of persistency and searchability of its collection with the WWW idea that anyone can contribute information? Our approach is to provide personal libraries to a community of users, all maintaining information relevant to themselves, individually or in coalitions. By exchanging information between the libraries, more information than was registered in the personal library will be available to the individual user. We call this combined search space a Virtual Community Library (VCL).
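The idea of a combined search space can be illustrated with a minimal sketch. It is not the VCL implementation: the names `Item`, `PersonalLibrary` and `search_community`, the keyword-only matching, and the assumption that two libraries use the same reference string for the same document are all hypothetical simplifications made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Item:
    """A registered document: a stable reference plus owner-chosen keywords."""
    ref: str
    title: str
    keywords: frozenset


@dataclass
class PersonalLibrary:
    """One user's collection; only the owner registers items in it."""
    owner: str
    items: list = field(default_factory=list)

    def register(self, ref, title, keywords):
        # Keywords are lower-cased so that matching is case-insensitive.
        self.items.append(Item(ref, title, frozenset(k.lower() for k in keywords)))

    def search(self, keyword):
        return [item for item in self.items if keyword.lower() in item.keywords]


def search_community(libraries, keyword):
    """Search every library; map each matching reference to the owners holding it."""
    hits = {}
    for library in libraries:
        for item in library.search(keyword):
            hits.setdefault(item.ref, []).append(library.owner)
    return hits


# Hypothetical users and documents, for illustration only.
alice = PersonalLibrary("alice")
alice.register("doc-1", "Agents and Libraries", ["agents", "libraries"])
bob = PersonalLibrary("bob")
bob.register("doc-1", "Agents and Libraries", ["agents"])
bob.register("doc-2", "Social Filtering", ["filtering", "agents"])

result = search_community([alice, bob], "agents")
```

Here a search for "filtering" in alice's own library finds nothing, while the community search also reaches bob's collection. Each library stays under its owner's control; only search results are exchanged.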
By requiring that the information is relevant to the individual users, we address the problem that users will rarely bother to register information they themselves have no interest in. Classification of information by end-users usually introduces the problem of having non-librarians doing classification according to the best of their own knowledge. This makes it harder to develop use-conventions in classifications (e.g. to mark an item with a keyword that is not even mentioned in the text, but still is relevant). We will still need librarians who classify larger collections to get homogeneous use-conventions. Such professionally maintained collections can also be seen as personal libraries (they cannot claim to be "complete" either, or to reflect more than the classifiers' opinions), but might be used more frequently by others than a library representing only one individual (and, in that sense, be more influential).
The VCL can be seen as a community of agents in the sense that the participants are all, basically, self-interested and have incomplete knowledge. We want the agents to be able to make use of other agents' work or (at least) to know when work can be shared. The agents might be committed to things such as maintaining collections of information or informing others of changes in their knowledge bases, but would typically not allow manipulation of their own knowledge by others.
Each agent's internal knowledge representation may be as detailed as it needs to be, but the agents might be constrained in what they can communicate by the communication language or by what knowledge is shared. For example, it might not be possible to communicate that two books are written by the same person if the other agent has no ability to understand the concept of a "person". Since we want as many information sources as possible to be regarded as agents, we cannot demand that they all commit to using a single ontology.
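The constraint on communication can be sketched as follows. This is an illustrative assumption about how such a check might look, not a description of an actual agent communication language: `Statement`, `Agent` and `communicable_to` are hypothetical names, and an "ontology" is reduced here to a plain set of concept names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """A piece of knowledge and the concepts needed to interpret it."""
    text: str
    concepts: frozenset


@dataclass
class Agent:
    name: str
    concepts: frozenset    # concepts this agent can interpret
    knowledge: tuple = ()  # statements held internally

    def communicable_to(self, other):
        """Return the statements whose concepts the other agent understands."""
        return [s for s in self.knowledge if s.concepts <= other.concepts]


# Hypothetical agents: one knows about persons, the other does not.
same_author = Statement("book A and book B are written by the same person",
                        frozenset({"book", "person"}))
has_title = Statement("book A has the title 'X'",
                      frozenset({"book", "title"}))

librarian = Agent("librarian", frozenset({"book", "title", "person"}),
                  (has_title, same_author))
web_index = Agent("web_index", frozenset({"book", "title"}))

sent = librarian.communicable_to(web_index)
```

Only the title statement can be passed to `web_index`; the authorship statement stays in the librarian's internal representation because the receiver has no concept of a "person".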
[Tail of this section is not edited]
*goals: share work, manage different classification schemes, generate
representations according to different standards*
* hard: finding "same" papers/items ... *
* SO: What is relevant to a user. His/her papers, references,
projects, reading list, personal info, interests..
Since more can be said and guaranteed about the information in
a library, quite possibly, a library has useful properties when experimenting
with techniques such as Social Filtering etc. [see Recommending
And Evaluating Choices In A Virtual Community].
*examples here*
Also, a personal digital library can probably help when collecting
information about a particular user's interests and further guide the filtering.
| | The Web | A Library | A Personal Library |
| --- | --- | --- | --- |
| Goal | To make it easy to make documents public | To have a complete collection in its knowledge area. | To have a complete collection in its interest area. |
| Commitments | No commitments. | Issues references to the documents in its collection to be used for later retrieval of the document. Issues bibliographic information to the documents in its collection to be used for later searching for the document. | Same commitments as for a library. |
| How one adds new documents | Anyone may publish documents. No need to register documents. Immediate access to new documents. | Certain people are allowed to add and register documents. There is an active choice of what documents to add according to the goal. | Same way to add documents as for a library. |
| The information organization | No centralized control of the flow of information. Users can organize their information with uni-directed links. | Organizes information about the internal and external documents. Attaches meta-information to the items in the collection. | Same organization as for a library. + Possibilities to make personal annotations (e.g. tags and notes). |
| Durability of documents | One finds a document as long as the publishing user wants. | Different users should be able to find exactly the same documents over and over again. | I have to be able to find exactly the same document over and over again. |
Note: If the bibliographic information is standardized in some
way (e.g. title or ISBN), the same information can be used by another
library which maintains a collection with a copy of the item. For this
reason, collections of standardized bibliographic information become a
resource that is interesting to maintain in its own right.
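As a rough illustration of the note above, the sketch below groups catalogue records from different libraries by a normalized ISBN, so that copies of the same item can be recognized. The function names, the record layout (a dict with an "isbn" key) and the sample values are assumptions made for this example; the normalization only strips hyphens and spaces and does not validate check digits.

```python
from collections import defaultdict


def normalize_isbn(raw):
    """Drop hyphens and spaces and upper-case a trailing 'x' check character."""
    return raw.replace("-", "").replace(" ", "").upper()


def holdings_by_isbn(catalogues):
    """Map each normalized ISBN to the sorted names of the libraries holding it.

    `catalogues` maps a library name to a list of records (dicts with
    an "isbn" key).
    """
    holdings = defaultdict(set)
    for library, records in catalogues.items():
        for record in records:
            holdings[normalize_isbn(record["isbn"])].add(library)
    return {isbn: sorted(names) for isbn, names in holdings.items()}


# Sample records for illustration only; titles and identifiers are placeholders.
catalogues = {
    "department": [{"isbn": "0-306-40615-2", "title": "Example Title"}],
    "personal": [
        {"isbn": "0306406152", "title": "Example title (my copy)"},
        {"isbn": "0 306 40616 0", "title": "Another Example"},
    ],
}
shared = holdings_by_isbn(catalogues)
```

The two differently written ISBNs for the first item collapse to one key held by both libraries, even though the titles were entered differently. Items without a standardized identifier are not covered by this sketch.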
Note: A library will usually also keep collections of references
and bibliographic information contained in other libraries. This can be
thought of as keeping a reference to the other library.
Note: Libraries traditionally maintain collections of "frozen"
documents. Maintaining references to living documents (and handling the
consequences of changing content) is still a research topic and not part of
traditional library cataloguing work. The characterization of a library above
is meant for a traditional library.