2.2 Networking knowledge

So, what could a different method for organizing the knowledge pool of VW Bib look like, one that makes use of the processing power of digital technology and potentially escapes the reductionist predicament to create higher degrees of complexity? Let’s recap: librarians had to resort to reductionist strategies because the sheer amount of knowledge that entered their libraries in the form of books exceeded the processing capacity of the system. Humans, with the help of their limited technological facilitators, could only go through them on a superficial level, resulting in the labelling of their contents. Digital technologies are no longer subject to limitations like these. Not only can algorithms running on potent hardware process far larger amounts of data in much shorter time, they can also analyze it in a very profound, multidimensional manner. The way I use the term multidimensional in this context originates from how natural language processing [NLP] algorithms work, which are commonly used to extract the semantic meaning of text documents. These software packages are trained through machine learning to translate words and sentences into mathematical vectors, which are then embedded in a geometrical space of several hundred dimensions. The higher the number of dimensions, the more linguistic features are analyzed and the better textual nuances are captured. Suppose we feed all the data from the existing library system into these algorithms: the publications could be arranged solely according to their semantic similarity to each other, without the need for a static framework. The more similar their contents are, the closer the elements are to each other within the multidimensional space. Because every entity’s position is defined by its relationships to all the other entities of the system, the resulting configuration can be called a network. Whenever one of the singular parts is added or changed, the entire network changes its form accordingly. In this sense the tree-logic is inverted: not the whole vertically defines its parts, but all the parts horizontally determine the whole. And through the constant modification of the knowledge pool this whole is ever-changing and never has a stable form. This approach also doesn’t rely on reproducing the reductionist categorization into scientific disciplines. Instead of a binary logic (in which an entity is either part of a category or not), it introduces a faceted distribution (in which every entity has graduated degrees of similarity to all other entities). Against this backdrop, transdisciplinarity becomes a mandatory requirement.
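To make this embedding step concrete, here is a minimal sketch. It assumes the sentence-transformers and scikit-learn Python packages and a handful of placeholder abstracts standing in for the VW Bib catalogue; the model name and the data are illustrative assumptions, not the system actually used.

```python
# Minimal sketch: embedding publication abstracts as high-dimensional vectors
# and relating them purely by semantic similarity (no fixed category tree).
# Assumes the sentence-transformers and scikit-learn packages; the abstracts
# are placeholders standing in for the VW Bib catalogue data.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

abstracts = [
    "A study of soil erosion in alpine regions.",
    "Machine learning methods for climate modelling.",
    "An architectural history of public libraries.",
]

# Each text becomes a vector with several hundred dimensions
# (384 for this particular model).
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(abstracts)
print(embeddings.shape)  # (3, 384)

# The pairwise similarity matrix already behaves like the network described
# above: every entity's position is defined only by its graded relations to
# all other entities, and adding or changing one text shifts every relation.
similarity = cosine_similarity(embeddings)
print(similarity.round(2))
```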

Similar approaches have already been implemented by others, and as the two examples on the right demonstrate, visually compelling results can be achieved. They show very well how much these representations differ from conventional kinds of knowledge structures. But apart from the fact that both are still images that don’t really depict the dynamic flux every network is characterized by, there is a more fundamental concern to be raised. It addresses the issue of dimensionality reduction. In the explanations above regarding the NLP algorithms, I mentioned that they operate in mathematical spaces of several hundred dimensions. A graph (like the two examples) obviously only has two dimensions. How, then, do you get from, say, 500 dimensions to 2? This issue is trickier and more significant than it might appear at first, so let’s have a look. Our point of departure is the high-dimensional space that the NLP algorithms leave us with. In all these dimensions, the full relational richness of the publications is stored. Now, there are different ways to access this highly complicated data. The most common approach is to reduce the dimensions so that we can graphically represent the space in two or three dimensions. The simplest way to do that is the geometrical operation of projection. To make this clearer, we can use the example of projecting 3D data onto a 2D plane, as the graph (V1) below shows. The dimensionality is successfully reduced from 3 to 2, but in the process distortions of the original higher-dimensional relations occur. Consequently, the outcome loses a good part of its validity. This circumstance is almost unavoidable for any method of dimensionality reduction. Of course, there are more sophisticated stochastic methods than simple projection (e.g., t-SNE) that try to preserve more relational information by clustering the data. But even then, much of the original complexity and textual equivocality is lost.
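The loss can be made tangible with a small, hedged experiment: the sketch below compares a naive projection (simply dropping all but two coordinates) with t-SNE on randomly generated stand-in vectors and checks how well each preserves the original pairwise distances. The data, the 500 dimensions and the rank-correlation measure are illustrative assumptions, not part of the project itself.

```python
# Minimal sketch of the dimensionality-reduction problem: naive projection
# versus t-SNE, and how much both distort the original pairwise distances.
# The vectors are random placeholders, not real publication embeddings.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 500))      # 200 "publications" in 500 dimensions

original = pdist(X)                  # pairwise distances in 500-D

projected = X[:, :2]                 # projection: keep only the first 2 axes
tsne = TSNE(n_components=2, random_state=0).fit_transform(X)

# Rank correlation with the original distances: 1.0 would mean no distortion.
rho_proj, _ = spearmanr(original, pdist(projected))
rho_tsne, _ = spearmanr(original, pdist(tsne))
print("projection:", rho_proj)
print("t-SNE:     ", rho_tsne)
```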

The urge to boil vast collections of publications or other knowledge representations down to a 2D plane is very much understandable (and has a long tradition). We want to make sense of these sorts of epistemic collections rationally and individually (and therefore reductively) because we are used to doing so. But in my opinion, intelligent algorithms introduce something that goes beyond the single brain. Let’s illustrate this with a different way of dealing with the high-dimensional dataset. It proposes another mode of relating to knowledge that corresponds to the postulate of complexity and manages to escape reductionism: let’s call it the situational approach.

Suppose we select a point on the sphere and try to determine its nearest neighbors (V2). Using a simple mathematical operation (cosine similarity), we can calculate how close the other points on the sphere are to the selected point. These distance relations can then be represented in two (or three) dimensions. The same operation can be performed for other points, resulting in a multitude of singular, individual representations, each of which views the system from its own "perspective". Within these situational constellations, however, the considered facets of the original complexity remain intact. The result is a kind of scan of the 3D sphere: summed up, its individual, situational representations approximate the total, higher-dimensional complexity. This method, however, refuses to be represented in one totalizing map as in the examples before; it requires a new situational configuration and a new computation for each query.
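As an illustration of this situational logic, the following sketch computes one such local constellation: cosine similarities from a single selected point to all others, returning only its nearest neighbours. The random embeddings, the function name and the parameter k are hypothetical placeholders for the actual NLP output and query interface.

```python
# Minimal sketch of the "situational" approach: instead of one global 2-D map,
# compute a fresh local constellation around a selected query point using
# cosine similarity. Embeddings are random placeholders for the NLP output.
import numpy as np

rng = np.random.default_rng(1)
embeddings = rng.normal(size=(1000, 384))        # 1000 publications, 384-D

def situational_view(embeddings, query_index, k=10):
    """Return the k nearest neighbours of one publication, seen from its
    own 'perspective', as (index, cosine similarity) pairs."""
    q = embeddings[query_index]
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(q)
    sims = embeddings @ q / norms                # cosine similarity to the query
    order = np.argsort(-sims)                    # most similar first
    neighbours = [i for i in order if i != query_index][:k]
    return [(i, float(sims[i])) for i in neighbours]

# Each query produces its own constellation; the collection is never rendered
# as a single totalizing map but recomputed for every point of view.
for idx, sim in situational_view(embeddings, query_index=42, k=5):
    print(idx, round(sim, 3))
```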