Many intricate maps of science have recently been created from citation data to visualize the structure of scientific activity. However, most scientific publications are now accessed online. Scholarly web portals record detailed log data at a scale that exceeds the total number of existing citations. Such log data is recorded immediately upon publication and keeps track of the sequences of user requests (clickstreams) issued by a variety of users across many different domains. Given these advantages of log datasets over citation data, the authors of this study investigated whether log data can produce higher-resolution, more current maps of science.

Over the course of 2007 and 2008, the authors collected nearly 1 billion user interactions recorded by the scholarly web portals of some of the most significant publishers, aggregators and institutional consortia. The resulting reference data set covers a significant part of worldwide use of scholarly web portals in 2006 and provides balanced coverage of the humanities, social sciences, and natural sciences. A clickstream model derived from these interactions was validated by comparing it to the Getty Research Institute's Art & Architecture Thesaurus (AAT). The model was then visualized as a journal network that outlines the relationships between various scientific domains and clarifies the connection of the social sciences and humanities to the natural sciences.
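To make the idea of a clickstream model more concrete, the following is a minimal sketch of one way journal-to-journal edges could be derived from sequences of user requests. It assumes that each session is an ordered list of the journals a user requested and that an edge is counted whenever two different journals appear consecutively; the text above does not spell out the study's actual procedure, so this is an illustration rather than the authors' method. The function names and journal names are hypothetical.

```python
from collections import Counter


def journal_transitions(sessions):
    """Count consecutive journal-to-journal requests across all sessions.

    Assumption: each session is an ordered list of journal names.
    Repeated requests to the same journal are not counted as edges.
    """
    counts = Counter()
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            if src != dst:
                counts[(src, dst)] += 1
    return counts


def transition_probabilities(counts):
    """Normalize the counts per source journal into edge weights."""
    totals = Counter()
    for (src, _dst), n in counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}


# Hypothetical toy sessions, for illustration only.
sessions = [
    ["Journal A", "Journal B", "Journal A"],
    ["Journal B", "Journal C"],
    ["Journal A", "Journal B"],
]

counts = journal_transitions(sessions)
probs = transition_probabilities(counts)

for (src, dst), p in sorted(probs.items()):
    print(f"{src} -> {dst}: {p:.2f}")
```

In this toy data, every transition out of Journal A leads to Journal B, while transitions out of Journal B split evenly between Journal A and Journal C. Weighted edges of this kind are what a network layout would then draw as lines between journals.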
In the first image, circles represent individual journals. The lines that connect journals are the edges of the clickstream model, and colors correspond to each journal's AAT classification. Labels have been assigned to local clusters of journals that correspond to particular scientific disciplines.
The second map colors journals according to whether the AAT classifies them as social sciences and humanities journals (yellow) or natural science journals (blue). Highly connected clusters corresponding to biology and psychology contain a mix of journals classified in both the social and natural sciences.