This essay does not propose any specific original solution to the problems of formal knowledge representation (KR). Instead, it attempts to draw on the current wisdom of logics and knowledge representation to build comprehensive, flexible, and extensible conceptual schemes of KR, and to develop a server that stores, modifies, and distributes such schemes across the Internet. The main objective of the server is to publish any knowledge representation scheme, with or without constraints, in an object oriented distributed database.
The initial motivation for making GNOWSYS was to develop a community portal (gnowledge.org) that meets the following three objectives: (a) to draw concept graphs and inferences from a knowledge base; (b) to represent the conceptual scheme of an expert's knowledge in a given area, and to report the matches and mismatches of a learner's conceptual schemes in the process of acquiring knowledge; (c) to let each node in the expert's conceptual scheme refer to one or more learnable resources such as lessons, images, videos, figures, and other digitally encoded resources accessible on the Internet. An obvious application of such a tool is a sophisticated knowledge base for elearning. It may be noted that objectives (a) and (c) are far easier than objective (b), since the problem is not merely the static representation of the knowledge of an expert and a novice, but the modeling of the dynamics of conceptual change in the process of learning and discovery, which are known to be complex. Fulfilling objective (b) requires, to begin with, a framework for storing an expert's knowledge.
Some aspects of these problems were recognized in the early days of AI by Herbert Simon in his often cited work The Sciences of the Artificial[1]. Simon's pioneering contributions to the theory and applications of AI, and Marvin Minsky's idea of frames[2], continue to play a pivotal role in solving knowledge representation problems even today. This work grows out of and draws from their wisdom. Recently, several researchers have used concept maps and semantic networks to enhance conceptual learning in the context of education[3,4,5,6]. Most of the tools suggested in these works are essentially drawing tools, and the maps drawn by students or experts could not be stored in an accessible knowledge base. Graphs were stored as separate files, which made it difficult to reuse a component of a graph. Since the graphs were encoded in formats internal to each application, they could not be shared, and it was difficult to compare two concept graphs made by different applications. The objective of finding matches and mismatches between the concept graphs of two or more agents could not be achieved without a sharable encoding. These problems were kept in mind while designing GNOWSYS, so that graphs generated by other applications could be shared and published by the system.
The epistemological presuppositions (the working hypotheses) of this undertaking are: (1) a cognitive agent understands a new concept when relations are established between the preexisting concepts and the new concept[7,8,3]; (2) to educate a person is therefore to facilitate the process of establishing the relevant relations between concepts so as to match those of an expert; (3) learning therefore involves restructuring of conceptual schemes; and (4) misunderstandings are due to mismatches between the conceptual schemes of the agents. According to this approach, no concept gains any meaning independent of its relations with other concepts. Thus, the meaning of a concept is the network it forms with others.
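The hypotheses above can be illustrated with a minimal sketch, assuming that a conceptual scheme is reduced to a set of labelled relations between concept names. The names `Relation`, `meaning_of`, and `compare_schemes`, as well as the sample data, are hypothetical and are not part of GNOWSYS; a real comparison would also have to deal with differing vocabularies, which this sketch ignores.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Relation(NamedTuple):
    """One explicit relation between two concepts (a labelled edge)."""
    subject: str
    predicate: str
    obj: str


def meaning_of(concept: str, scheme: Iterable[Relation]) -> Set[Relation]:
    """Return every relation the concept takes part in.

    Under hypothesis (1), this neighbourhood is all the 'meaning'
    the concept has within the given scheme.
    """
    return {r for r in scheme if concept in (r.subject, r.obj)}


def compare_schemes(expert: Iterable[Relation],
                    learner: Iterable[Relation]) -> Dict[str, Set[Relation]]:
    """Report matches and mismatches between two conceptual schemes."""
    expert_set, learner_set = set(expert), set(learner)
    return {
        # Relations both agents hold.
        "matches": expert_set & learner_set,
        # Relations the expert holds that the learner has not yet formed.
        "missing": expert_set - learner_set,
        # Relations the learner holds that the expert does not share.
        "unshared": learner_set - expert_set,
    }


# Hypothetical sample schemes, for illustration only.
expert = [
    Relation("whale", "is_a", "mammal"),
    Relation("mammal", "is_a", "animal"),
    Relation("whale", "lives_in", "ocean"),
]
learner = [
    Relation("whale", "is_a", "fish"),
    Relation("whale", "lives_in", "ocean"),
]

report = compare_schemes(expert, learner)
```

In this toy case the learner's relation between "whale" and "fish" appears under `unshared`, while the expert's two `is_a` relations appear under `missing`; restructuring the learner's scheme, in the sense of hypothesis (3), would amount to moving both sets towards empty.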
The sense of understanding used here is a strong one, since we require that the relations between the concepts be made explicit. For example, when we look at a tree and recognize that it is indeed a tree, that is also understanding of a sort, but it is implicit. The term 'education' is also used here in a strong sense. It does not cover the various forms of behavioral mastery, such as skills, that children learn and execute without any explicit understanding. One of the challenges in education, particularly in the exact sciences, is to gradually train learners towards more and more explicit forms of representation. Formal sciences like theoretical physics, mathematics and logic, for example, are domains of discourse where procedural knowledge is declaratively stated and declarative knowledge is procedurally stated, reaching the highest degree of explicit knowledge representation[9]. If a system is to model the process of learning science, beginning from folklore and ending in formal knowledge, it must capture several entities in the knowledge base implicitly, with entities held loosely, and sometimes inconsistently, in relation to other entities, and with several degrees of modality for expressing a proposition.
Thomas Kuhn's The Structure of Scientific Revolutions[10] had a major impact on researchers studying cognitive development, and several research programs studying conceptual change during ontogeny are guided by this work[11]. The conceptual scheme, by which Kuhn also meant the taxonomy (ontology) of a scientific theory, became a very important mode of analysis for modeling conceptual change and inter-theoretic semantic relations (see e.g., [12]). These problems are not new, but the recent rise of interest in the semantic web within the computer science community[13,14] has given new life to the computational, semantic and logic oriented problems of ontologies.
The context of learning (cognitive development) and discovery (history of science) is possibly the most challenging problem for AI, because a learner or a scientist, in the course of development, not only changes from one conceptual scheme to another, but also often harbors contradictory systems of beliefs, often due to implicit and unknown parameters of knowledge. Real cognitive agents are not like artificial systems, which maintain rigid consistency and require explicit frameworks in order to function. Thus the problem is to model the transformation from loosely and often inconsistently structured, implicit epistemologies and ontologies to tightly integrated, explicitly consistent forms.
Further, a model developed for, say, biology may not be appropriate for mathematics. A model chosen for representing common sense may not be good for science, and a model for science may not be good for formal sciences like logic and mathematics. The problem becomes even more difficult when we think of mapping the different folklores of divergent cultures across the globe. This suggests that we need a system that can express multiple epistemologies, ontologies and logics.
Even for the exact sciences, it is not an easy task to assemble the whole pool of predicates required for even a single domain of knowledge. Added to this is the problem that an ontology made by one school of thought will surely be contested by another. Controversies on issues related to epistemology, ontological commitments, etc., may prevent any of the models from being successful or complete. Awareness of how difficult, even seemingly impossible, it is to model the process of learning, and to use such a model to build elearning applications with the current wisdom of cognitive and computer science, suggests that the project carries the seeds of failure at its very core and is therefore doomed to fail.
Is there a way out? I think so, and it is the objective of this essay to suggest a proposal in the form of GNOWSYS. This proposal is not described in a formal language, essentially because its shape is not final, and without a final shape it is difficult to conduct a discourse in a formal language. I thought it wise to seek comments from experts at this very stage, before freezing the final architecture. It is likely that the model outlined here for GNOWSYS has inherent limitations, known or unknown.
A comment on the ACT-R system developed by J.R. Anderson and his colleagues is appropriate here[15]. ACT-R is a strikingly comprehensive architecture of cognition: it simulates the processes of cognition and problem solving at various levels, and it serves as a test bed for evaluating theories of cognitive development, with a comprehensive theory of memory. GNOWSYS, unlike ACT-R, is not based on the current information processing model of cognition. I have argued elsewhere that semantic memory, as against episodic memory, is not stored in the body of a cognitive agent, and that semantics is a function of inter-agent communication[9]. The approach of GNOWSYS is to create a collaborative and communicating knowledge base capable of storing and exchanging propositions, conceptual schemes, behavioral schemes and belief systems with other agents, both human and artificial.
The popular Cyc project, along with the recently released OpenCyc (a free software version of Cyc technology), was built as a huge knowledge base containing 2.5 million facts and rules capturing common sense knowledge[16]. GNOWSYS supports such knowledge representation, with the difference that the knowledge is stored in a hybrid form (as frames, objects, agents, structures and processes). The GNOWSYS kernel does not contain any inference engine. By design, GNOWSYS has provision for expressing executable external libraries implemented in any computer language, and can thus support, in principle, any inference implementation through its procedural objects and RPC interface. This leaves the kernel free to focus on being a high performance knowledge base.
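The separation between a store without inference and externally supplied inference procedures can be sketched as follows. This is only an illustration under stated assumptions: the class `KnowledgeBase`, its methods `add`, `register_procedure` and `call`, and the function `transitive_closure` are hypothetical names, not the GNOWSYS API, and an in-process function call stands in for the RPC interface.

```python
from typing import Callable, Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class KnowledgeBase:
    """Hypothetical store that keeps facts but performs no inference itself."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()
        self.procedures: Dict[str, Callable[..., object]] = {}

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.add((subject, predicate, obj))

    def register_procedure(self, name: str,
                           procedure: Callable[..., object]) -> None:
        """Attach an external procedure (assumed here to be a local callable)."""
        self.procedures[name] = procedure

    def call(self, name: str, *args: object) -> object:
        """Delegate to the named procedure, passing it the stored facts."""
        return self.procedures[name](self.triples, *args)


def transitive_closure(triples: Set[Triple],
                       predicate: str) -> Set[Tuple[str, str]]:
    """External inference: all pairs linked by chains of one predicate."""
    derived = {(s, o) for s, p, o in triples if p == predicate}
    changed = True
    while changed:
        changed = False
        for a, b in list(derived):
            for c, d in list(derived):
                if b == c and (a, d) not in derived:
                    derived.add((a, d))
                    changed = True
    return derived


# Hypothetical sample data, for illustration only.
kb = KnowledgeBase()
kb.add("whale", "is_a", "mammal")
kb.add("mammal", "is_a", "animal")
kb.register_procedure("closure", transitive_closure)

inferred = kb.call("closure", "is_a")
```

The store itself still holds only the two asserted facts; the pair linking "whale" to "animal" exists only in the result returned by the external procedure, which could be replaced by any other inference implementation without changing the store.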
The following section is a non-formal description of the architecture of GNOWSYS. The later sections explain how the system can be used for semantic computing applications.