Knowledge Synthesis
3 Structures
Structures of Knowledge
Drawing a big picture of scientific knowledge has always been
desirable for various reasons
Science mapping attempts to find representations of intellectual
connections within the dynamically changing system of scientific
knowledge (Small, 1997)
Science mapping aims at displaying the structural and dynamic
aspects of scientific research (Börner et al., 2003; Morris et al., 2008)
Science mapping mainly uses the structures of knowledge
Three Structures of Knowledge
Discovering hidden patterns
SOCIAL STRUCTURE: how authors, institutions and countries interact with each other
CONCEPTUAL STRUCTURE: what science talks about, the main themes and trends
INTELLECTUAL STRUCTURE: how the work of an author influences a given scientific community
Science mapping allows investigating scientific knowledge from a statistical point of view
Each community of scientists would have a complete overview of the main findings related
to their specific field, following the evolution of theories and techniques
Knowledge Synthesis
Conceptual Structure
Conceptual Structure Tab
Conceptual structure represents relations among concepts or words in a set of
publications:
Words that appear together in a document are linked in a network, also
known as the co-word network. This structure is used to understand the
topics covered by a research field and to identify the most important and
the most recent issues (the so-called research front). It can also help in
studying the evolution of subjects over time.
Similarly to network analysis, factorial analysis (a family of data reduction
techniques) is helpful in identifying subfields. Various dimensionality
reduction techniques can be applied, such as correspondence analysis (CA),
multiple correspondence analysis (MCA), multidimensional scaling (MDS), and
principal component analysis (PCA). Clustering algorithms can be used with
both network and factorial analysis.
Mixed approach. Starting from a conceptual network, thematic networks are
identified and plotted on a bi-dimensional map whose axes are functions of
the centrality and density of each thematic network. By dividing the timespan
into time slices, it is possible to represent the thematic evolution within a
specific research field through an alluvial graph.
Mapping conceptual structure
The scope of a science mapping study can be a scientific discipline, a field of
research, or topic areas concerning specific research questions
Factorial approach (keywords by documents):
Correspondence Analysis
Multiple Correspondence Analysis
Multidimensional Scaling
Network approach (keywords by keywords):
Co-word analysis
Factorial Approaches
The basic idea behind factorial approaches is to reduce the
dimensionality of the data and represent it in a low-dimensional space.
Three alternative methodologies:
Correspondence Analysis (CA)
Multiple Correspondence Analysis (MCA)
Multidimensional Scaling (MDS)
Factorial Approach: Interpretation
The proximity between words corresponds to shared substance:
keywords are close to each other when a large proportion of articles treat
them together;
they are distant from each other when only a small fraction of articles use
them together.
The origin of the map represents the average position of all column
profiles and therefore represents the center of the research field
(meaning common and widely shared topics) (Cuccurullo et al., 2016)
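The two statements above (distance reflects how rarely words are used together, and the origin is the average column profile) can be checked numerically on a toy table. The following is a minimal sketch, assuming a hypothetical 4-document by 3-keyword table and the usual correspondence-analysis definitions of column profiles and chi-square distance; it does not reproduce the computations of any specific tool.

```python
# Hypothetical toy data: rows are documents, columns are keywords
# (1 = the document uses the keyword). Not taken from the slides.
KEYWORDS = ["bibliometrics", "citation", "innovation"]
TABLE = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]


def column_profiles(table):
    """One profile per keyword: the column divided by its total."""
    totals = [sum(col) for col in zip(*table)]
    return [[row[j] / totals[j] for row in table] for j in range(len(totals))]


def row_masses(table):
    """Share of the grand total carried by each document (row)."""
    grand_total = sum(sum(row) for row in table)
    return [sum(row) / grand_total for row in table]


def average_profile(table):
    """Average of the column profiles, weighted by the column masses."""
    grand_total = sum(sum(row) for row in table)
    masses = [sum(col) / grand_total for col in zip(*table)]
    profiles = column_profiles(table)
    return [sum(m * p[i] for m, p in zip(masses, profiles))
            for i in range(len(table))]


def chi2_distance(table, j, k):
    """Chi-square distance between keywords j and k (assumes no empty row)."""
    profiles = column_profiles(table)
    weights = row_masses(table)
    squared = sum((a - b) ** 2 / w
                  for a, b, w in zip(profiles[j], profiles[k], weights))
    return squared ** 0.5


if __name__ == "__main__":
    # Keywords sharing many documents end up closer than keywords sharing none.
    for j, k in [(0, 1), (0, 2), (1, 2)]:
        print(KEYWORDS[j], KEYWORDS[k], round(chi2_distance(TABLE, j, k), 3))
    # The weighted average profile coincides with the row masses (the centroid).
    print(average_profile(TABLE), row_masses(TABLE))
```

In this toy table "citation" and "innovation" never appear in the same document, so they are the most distant pair, while the weighted average of the profiles is the centroid that the map places at the origin.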
Correspondence Analysis and Clustering
Map of words
Each color represents a cluster of words (a “topic”)
Clusters are identified by hierarchical clustering
Correspondence Analysis and Clustering
Dendrogram of words
The height measures the distance among words or clusters of words
“Similar words” explain a similar concept or “topic”
“Distant words” define different concepts or “topics”
The height helps to choose where to cut the dendrogram, defining the partition
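Cutting a dendrogram at a given height can be illustrated with a small agglomerative procedure. This is a minimal sketch, assuming single linkage and Euclidean distance between hypothetical word coordinates on the factorial map; the slides do not state which linkage the tool uses.

```python
import math

# Hypothetical word coordinates on a two-dimensional factorial map.
WORDS = {
    "citation": (0.0, 0.0),
    "co-citation": (0.1, 0.0),
    "innovation": (2.0, 2.0),
    "technology": (2.1, 2.0),
}


def cut_dendrogram(coordinates, height):
    """Merge the closest clusters until the next merge would exceed `height`."""
    labels = sorted(coordinates)
    clusters = [[label] for label in labels]

    def linkage(c1, c2):
        # Single linkage: distance between the two closest members.
        return min(math.dist(coordinates[a], coordinates[b])
                   for a in c1 for b in c2)

    while len(clusters) > 1:
        distance, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if distance > height:
            break  # cutting here defines the partition
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(cluster) for cluster in clusters)


if __name__ == "__main__":
    print(cut_dendrogram(WORDS, height=1.0))  # a low cut keeps topics apart
    print(cut_dendrogram(WORDS, height=5.0))  # a high cut merges everything
```

A lower cut yields more, tighter topics; a higher cut yields fewer, broader ones.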
Correspondence Analysis and Clustering
Most contributing documents
The map plots the documents associated with the highest absolute contributions
The absolute contributions measure the weight of each document in the information summarized by the two axes
The colors represent the clusters to which each document belongs
This graph makes it possible to identify the links between topics and documents
Correspondence Analysis and Clustering
Most cited documents
Network Analysis
Co-occurrence Network
Graph theory is the study of graphs, which are mathematical
structures used to model pairwise relations between objects
A graph is made up of vertices (also called nodes or points)
which are connected by edges (also called links or lines)
A distinction is made between undirected graphs, where edges
link two vertices symmetrically, and directed graphs, where
edges, then called arrows, link two vertices asymmetrically
Network Analysis and Science Mapping
In Science mapping, a network graph is used to represent co-occurrences
among bibliographic meta-data
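The starting point is a co-occurrence matrix
\[
\begin{array}{c|ccccccc}
 & I_1 & I_2 & I_3 & \cdots & I_p & \cdots & I_P \\
\hline
I_1 & n_{11} & & & & & & \\
I_2 & & n_{22} & & & & & \\
I_3 & & n_{32} & & & & & \\
\vdots & & & & \ddots & & & \\
I_p & & & & & n_{pp} & & \\
\vdots & & & & & & \ddots & \\
I_P & & & & & & & n_{PP}
\end{array}
\]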
Diagonal elements are the occurrences of each item in the collection
e.g., in co-word analysis: each item is a word, and an occurrence is the number of
appearances of a particular word in the document collection
Non-diagonal elements are the co-occurrences of two items in the collection
e.g., in co-word analysis: $n_{32}$ measures how many times the words $I_3$ and $I_2$
appear together in the same text (keyword list, title, abstract, etc.)
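The construction described above can be sketched in a few lines. This is an illustrative example, assuming hypothetical keyword lists and counting each word at most once per document; the function names are not part of any library.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count occurrences (diagonal) and pairwise co-occurrences (off-diagonal)."""
    occurrences = Counter()
    pairs = Counter()
    for keywords in documents:
        unique = sorted(set(keywords))  # one count per document
        occurrences.update(unique)
        pairs.update(combinations(unique, 2))
    return occurrences, pairs


def as_matrix(occurrences, pairs):
    """Arrange the counts into a symmetric matrix with items in sorted order."""
    items = sorted(occurrences)
    index = {item: k for k, item in enumerate(items)}
    matrix = [[0] * len(items) for _ in items]
    for item, n in occurrences.items():
        matrix[index[item]][index[item]] = n
    for (a, b), n in pairs.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n
    return items, matrix


if __name__ == "__main__":
    # Hypothetical keyword lists, one per document.
    docs = [
        ["science", "innovation"],
        ["science", "impact", "innovation"],
        ["impact"],
    ]
    items, matrix = as_matrix(*cooccurrence_counts(docs))
    print(items)
    for row in matrix:
        print(row)
```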
                         science     know.    manag.     perf.    impact  bibliom.    innov.     tech. cit.anal.  int.str.
science                      100        17        16        21        20        25        40        26        10         0
knowledge                     17        32         0         0         0         0         0         0         0         0
management                    16         0        34         0         0         0         0         0         0         0
performance                   21         0         0        70        15         0         0         0         0         0
impact                        20         0         0        15        60         0        14         0         0         0
bibliometrics                 25         0         0         0         0        38        13         0         0         0
innovation                    40         0         0         0        14        13        80         0         0         0
technology                    26         0         0         0         0         0         0        28         0         0
citation analysis             10         0         0         0         0         0         0         0        40        14
intellectual structure         0         0         0         0         0         0         0         0        14        36
Each vertex represents an item (in this example, each vertex is a word)
The vertex size is proportional to the item occurrence (diagonal elements)
The edge size is proportional to item co-occurrences (non-diagonal elements)
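The mapping from matrix to graph can be sketched as follows. The numbers are the science, performance, impact and innovation entries of the example matrix; the function names and the "strength" helper (sum of the weights of the edges touching a vertex) are illustrative assumptions.

```python
ITEMS = ["science", "performance", "impact", "innovation"]
MATRIX = [
    [100, 21, 20, 40],
    [21, 70, 15, 0],
    [20, 15, 60, 14],
    [40, 0, 14, 80],
]


def matrix_to_graph(items, matrix):
    """Diagonal entries become vertex sizes, positive off-diagonal entries edges."""
    vertices = {item: matrix[k][k] for k, item in enumerate(items)}
    edges = {}
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if matrix[i][j] > 0:  # no co-occurrence, no edge
                edges[(items[i], items[j])] = matrix[i][j]
    return vertices, edges


def strength(vertex, edges):
    """Total weight of the edges incident to `vertex`."""
    return sum(w for pair, w in edges.items() if vertex in pair)


if __name__ == "__main__":
    vertices, edges = matrix_to_graph(ITEMS, MATRIX)
    print(vertices)
    print(edges)
    print(strength("science", edges))
```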
Network Analysis and Science Mapping
from the co-occurrence matrix to the network graph
Vertex centrality vs vertex farness
Strength of edges
Vertex clustering (community detection)
Network Analysis and Science Mapping
How to read a network graph
Each cluster can be seen as a “topic”
The colors represent the clusters to which each word belongs
Network options
Layout
It is possible to choose among several network layouts (Holten et al., 2009).
Automatic Layout automatically chooses the best layout in terms of graph readability.
Normalization
Co-occurrences can be normalized by using similarity measures such as Salton's Cosine,
Jaccard's Index, Equivalence Index, and Association Strength (van Eck et al., 2009).
Clustering
Several clustering algorithms are available. The recommended one is Walktrap
(Lancichinetti et al., 2009).
Repulsion Force
Repulsion force varies between 0 and 1, where 1 represents the maximum separation
between the groups.
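The normalization measures listed above have simple closed forms. The sketch below uses their usual textbook definitions, where c is the co-occurrence of two items and s_i, s_j are their occurrences; it is an illustration, not the implementation used by the tool. The sample values are the "science" and "innovation" entries of the example matrix.

```python
import math


def association_strength(c, s_i, s_j):
    """Co-occurrence divided by the product of the occurrences."""
    return c / (s_i * s_j)


def salton_cosine(c, s_i, s_j):
    """Co-occurrence divided by the geometric mean of the occurrences."""
    return c / math.sqrt(s_i * s_j)


def jaccard_index(c, s_i, s_j):
    """Co-occurrence divided by the number of documents containing either item."""
    return c / (s_i + s_j - c)


def equivalence_index(c, s_i, s_j):
    """Squared co-occurrence divided by the product of the occurrences."""
    return c ** 2 / (s_i * s_j)


if __name__ == "__main__":
    # "science" occurs 100 times, "innovation" 80 times, together 40 times.
    c, s_i, s_j = 40, 100, 80
    for measure in (association_strength, salton_cosine,
                    jaccard_index, equivalence_index):
        print(measure.__name__, round(measure(c, s_i, s_j), 4))
```

Note that the equivalence index is the square of Salton's cosine, so the two measures rank pairs identically and differ only in scale.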
Thematic Map
By applying a clustering algorithm on the keyword network, it is possible to highlight the
different themes of a given domain
Each cluster/theme can be represented on a particular plot
known as Strategic or Thematic map (Cobo et al., 2011):
Centrality is a measure of the theme’s relevance
Density is a measure of the theme’s development
Limitations:
Each keyword is associated only with one theme
It is not possible to use themes for document categorization
It is not possible to jointly analyse meta-information
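Centrality and density can be sketched from the weights of the keyword network. The example below assumes the definitions usually attributed to Cobo et al. (2011): centrality is 10 times the sum of the link weights between a theme and the other themes, and density is 100 times the sum of the internal link weights divided by the number of keywords in the theme. The scaling constants, the toy clusters and the weights are assumptions for illustration only.

```python
def callon_measures(clusters, weights):
    """Return {theme: {"centrality": ..., "density": ...}}.

    clusters: dict mapping a theme name to the set of its keywords.
    weights:  dict mapping frozenset({word_a, word_b}) to a link weight
              (for example an equivalence index).
    """
    measures = {}
    for theme, words in clusters.items():
        internal = 0.0
        external = 0.0
        for pair, weight in weights.items():
            inside = len(pair & words)
            if inside == 2:      # both keywords belong to the theme
                internal += weight
            elif inside == 1:    # link towards another theme
                external += weight
        measures[theme] = {
            "centrality": 10 * external,
            "density": 100 * internal / len(words),
        }
    return measures


if __name__ == "__main__":
    # Hypothetical clusters and link weights.
    clusters = {"A": {"a1", "a2"}, "B": {"b1", "b2", "b3"}}
    weights = {
        frozenset({"a1", "a2"}): 0.5,
        frozenset({"b1", "b2"}): 0.3,
        frozenset({"b2", "b3"}): 0.3,
        frozenset({"a1", "b1"}): 0.1,
    }
    for theme, values in callon_measures(clusters, weights).items():
        print(theme, values)
```

Because every keyword sits in exactly one cluster here, the sketch also reflects the first limitation listed above.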
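[Figure: strategic diagram. Horizontal axis: Callon centrality; vertical axis: Callon density. Quadrants: MOTOR THEMES (high centrality, high density); HIGHLY DEVELOPED AND ISOLATED THEMES (low centrality, high density); EMERGING OR DECLINING THEMES (low centrality, low density); BASIC AND TRANSVERSAL THEMES (high centrality, low density)]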
Thematic Map
from a network to a bivariate map
[Figure: the word co-occurrence network mapped onto the strategic diagram (Callon centrality vs. Callon density)]
Each bubble represents a network cluster
The bubble names are the words belonging to the cluster with the highest occurrence values
The bubble size is proportional to the cluster word occurrences
The bubble position is set according to the cluster Callon centrality and density
Example: topics in business and management literature using bibliometrics
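Since the bubble position determines the quadrant, each theme can be labelled from its two coordinates. This is a minimal sketch, assuming that the quadrant boundaries are the medians of centrality and density across themes (the slides do not state where the axes cross) and using hypothetical theme values.

```python
from statistics import median


def classify_themes(themes):
    """Label each theme by its quadrant on the thematic map.

    themes: dict mapping a theme name to (centrality, density).
    Assumption: the axes cross at the median centrality and median density.
    """
    c_cut = median(c for c, _ in themes.values())
    d_cut = median(d for _, d in themes.values())
    labels = {}
    for name, (centrality, density) in themes.items():
        if centrality >= c_cut and density >= d_cut:
            labels[name] = "motor theme"
        elif centrality >= c_cut:
            labels[name] = "basic and transversal theme"
        elif density >= d_cut:
            labels[name] = "highly developed and isolated theme"
        else:
            labels[name] = "emerging or declining theme"
    return labels


if __name__ == "__main__":
    # Hypothetical (centrality, density) pairs.
    themes = {"t1": (4, 4), "t2": (4, 1), "t3": (1, 4), "t4": (1, 1)}
    for name, label in classify_themes(themes).items():
        print(name, label)
```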
Thematic Map
Cluster composition
Thematic Map
Probability of belonging to a cluster
Papers with a probability of 0.8 or higher almost certainly belong to the cluster k.
Papers with a probability between 0.4 and 0.8 have a high probability of belonging to the cluster k.
A paper can deal with several topics, so it can also belong to more than one cluster.
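The reading rule above can be written as a small helper. This is an illustrative sketch: the thresholds 0.8 and 0.4 come from the slide, while the function name, the labels and the decision to leave clusters below 0.4 unassigned are assumptions.

```python
def cluster_membership(probabilities, almost_certain=0.8, likely=0.4):
    """Map each cluster to a membership label for one paper.

    probabilities: dict mapping a cluster name to the paper's probability
    of belonging to it. Clusters below `likely` are left out (assumption).
    """
    membership = {}
    for cluster, p in probabilities.items():
        if p >= almost_certain:
            membership[cluster] = "almost certain"
        elif p >= likely:
            membership[cluster] = "high probability"
    return membership


if __name__ == "__main__":
    # Hypothetical probabilities for a single paper: it belongs to two clusters.
    print(cluster_membership({"k1": 0.85, "k2": 0.45, "k3": 0.10}))
```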
Thematic Evolution
A longitudinal thematic map analysis
By dividing the time span into different time slices, it is possible to study and
plot the topic evolution (in terms of trajectory over time)
In this graph, the bubble represents an emerging topic that moves towards the
mainstream themes area
[Figure: thematic maps (Callon centrality vs. Callon density) for Time slice 1, Time slice 2, ..., Time slice t]
Thematic Evolution
Setting time slices
Looking at the distribution of publications per year, we decided to split
our collection into 3 time slices by setting 2 cutting points, 2006 and 2014:
1985-2006
2007-2014
2015-2021
Annotations on the plot: Van Raan AFJ, 2006; burst in publications
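Turning cutting points into time slices is a matter of interval bookkeeping. The sketch below assumes that each cutting point is the last year of its slice, which matches the three slices listed above; the function names are illustrative.

```python
from bisect import bisect_left


def slice_ranges(first_year, last_year, cut_points):
    """Return (start, end) pairs; each cutting point closes a slice."""
    starts = [first_year] + [cut + 1 for cut in cut_points]
    ends = list(cut_points) + [last_year]
    return list(zip(starts, ends))


def assign_slice(year, cut_points):
    """Index of the slice a publication year falls into (0-based)."""
    return bisect_left(cut_points, year)


if __name__ == "__main__":
    cuts = [2006, 2014]
    print(slice_ranges(1985, 2021, cuts))
    for year in (1985, 2006, 2007, 2014, 2015, 2021):
        print(year, "-> slice", assign_slice(year, cuts) + 1)
```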
Thematic Evolution
A longitudinal thematic map analysis
Time slice 1985-2006
Time slice 2007-2014
Time slice 2015-2021
It is also possible to highlight the tendency of some topics to merge
together, or of a topic to split into several themes
[Figure: thematic maps (Callon centrality vs. Callon density) for Time slice 1, Time slice 2, ..., Time slice t, with topics merging and splitting across slices]
Knowledge Synthesis
Intellectual Structure
Intellectual Structure Tab
Intellectual structure shows relationships between nodes which
represent references
Network edges can have different interpretations depending on the
citation type (co-citation or direct citation)
Citation analysis, in the form of co-citation between authors or documents,
is the most common analysis in bibliometrics.
Co-citation analysis, when examined over time, is helpful in detecting
a shift in paradigms and schools of thought.
Two types of analysis:
Co-citation network (Small, 1973)
Historiographic mapping (Garfield, 2004)
Co-citation analysis
Two documents are co-cited when both are cited by a third document
Documents “A” and “B” are references co-cited by documents “C”, “D” and “E”
Co-citations can be represented in a co-occurrence matrix, just like co-word analysis
Diagonal elements: local citations of $I_p$
Non-diagonal elements: co-citations between $I_2$ and $I_3$
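Co-citation counts follow directly from the reference lists of the citing documents. The sketch below reuses the example of documents A and B cited together by C, D and E; the extra reference F and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count local citations and co-citations.

    reference_lists: dict mapping a citing document to its cited references.
    Returns (local_citations, cocitations); co-citation keys are sorted pairs.
    """
    local_citations = Counter()
    cocitations = Counter()
    for references in reference_lists.values():
        unique = sorted(set(references))
        local_citations.update(unique)               # diagonal elements
        cocitations.update(combinations(unique, 2))  # non-diagonal elements
    return local_citations, cocitations


if __name__ == "__main__":
    reference_lists = {
        "C": ["A", "B"],
        "D": ["A", "B"],
        "E": ["A", "B", "F"],  # F is a hypothetical extra reference
    }
    local_citations, cocitations = cocitation_counts(reference_lists)
    print(local_citations)
    print(cocitations)
```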
Co-citation network
Historiographic mapping
[Figure: historiograph with documents A to I placed along a time axis from past to present, forming a first path and a second path]
Each historical path identifies a research topic and its core
authors/documents
Some basic concepts about the historiograph:
Each node represents a document (included in the analyzed
collection) cited by other documents
Each edge represents a direct citation (e.g. D cited A; G cited D; etc.)
Nodes and Edges are plotted on an oriented graph where the
horizontal axis represents the publication years
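The three concepts above translate into a small data structure. This is a minimal sketch, assuming hypothetical publication years and a hypothetical citing document H; it keeps as nodes only the documents of the collection that are cited by other documents of the collection, and as edges only the direct citations between those nodes.

```python
def build_historiograph(years, citations):
    """Return (nodes, edges, local_citations) for a direct-citation graph.

    years:     dict mapping each document in the collection to its year.
    citations: list of (citing, cited) pairs.
    """
    inside = [(citing, cited) for citing, cited in citations
              if citing in years and cited in years]
    local_citations = {doc: 0 for doc in years}
    for _, cited in inside:
        local_citations[cited] += 1
    # Nodes are cited documents, ordered along the time axis.
    nodes = sorted((doc for doc in years if local_citations[doc] > 0),
                   key=lambda doc: (years[doc], doc))
    kept = set(nodes)
    edges = [(citing, cited) for citing, cited in inside
             if citing in kept and cited in kept]
    return nodes, edges, local_citations


if __name__ == "__main__":
    # Hypothetical years; "D cited A" and "G cited D" come from the example.
    years = {"A": 1990, "D": 2000, "G": 2010, "H": 2015}
    citations = [("D", "A"), ("G", "D"), ("H", "G")]
    nodes, edges, local_citations = build_historiograph(years, citations)
    print(nodes)
    print(edges)
    print(local_citations)
```

Following the edges backwards in time from G to D to A traces one historical path.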
[Figure: historiograph, only one path]
[Figure: historiograph labelled with Authors’ keywords and with Keywords Plus]
[Figure: historiograph showing local citations and global citations]
Knowledge Synthesis
Social Structure
Social Structures
Social structure shows how authors or institutions relate to others in
the field of scientific research.
The most common kind of social structure is the co-authorship network
(Peters et al., 1991)
Co-authorship networks make it possible to discover, for example, groups
of regular authors, influential authors, hidden communities of authors,
relevant institutions in a specific research field, etc.
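A co-authorship network can be sketched from the author lists of the papers. The example below uses hypothetical author names and treats connected components as a crude stand-in for groups of collaborating authors; it is not one of the community detection algorithms mentioned for the conceptual structure.

```python
from collections import Counter
from itertools import combinations


def coauthorship_edges(papers):
    """Count how many papers each pair of authors has signed together."""
    edges = Counter()
    for authors in papers:
        edges.update(combinations(sorted(set(authors)), 2))
    return edges


def collaboration_groups(edges):
    """Connected components of the co-authorship network, in sorted order."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            author = stack.pop()
            group.append(author)
            for other in neighbours[author] - seen:
                seen.add(other)
                stack.append(other)
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    # Hypothetical author lists, one per paper.
    papers = [
        ["Rossi", "Bianchi"],
        ["Rossi", "Bianchi", "Verdi"],
        ["Smith", "Jones"],
    ]
    edges = coauthorship_edges(papers)
    print(edges)
    print(collaboration_groups(edges))
```

Edge weights point to regular co-authors, while the components separate groups that never publish together.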
Social Structure Tab
Author collaboration: Collaboration Network
Institutions collaboration: Collaboration Network
Country collaboration: Collaboration Network
Country collaboration: Collaboration World Map
References (1)
Aria, M., & Cuccurullo, C. (2017). bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics, 11(4), 959-975.
Aria, M., Misuraca, M., & Spano, M. (2020). Mapping the evolution of social research and data science on 30 years of Social Indicators Research. Social Indicators Research. https://doi.org/10.1007/s11205-020-02281-3
Aria, M., Cuccurullo, C., D’Aniello, L., Misuraca, M., & Spano, M. (2022). Thematic Analysis as a New Culturomic Tool: The Social Media Coverage on COVID-19 Pandemic in Italy. Sustainability, 14(6), 3643. https://doi.org/10.3390/su14063643
Aria, M., Alterisio, A., Scandurra, A., Pinelli, C., & D’Aniello, B. (2021). The scholar’s best friend: research trends in dog cognitive and behavioural studies. Animal Cognition. https://doi.org/10.1007/s10071-020-01448-2
Cuccurullo, C., Aria, M., & Sarto, F. (2016). Foundations and trends in performance management. A twenty-five years bibliometric analysis in business and public administration domains. Scientometrics. https://doi.org/10.1007/s11192-016-1948-8
Cuccurullo, C., Aria, M., & Sarto, F. (2015). Twenty years of research on performance management in business and public administration domains. Presentation at the Correspondence Analysis and Related Methods conference (CARME 2015), September 2015. https://www.bibliometrix.org/documents/2015Carme_cuccurulloetal.pdf
Sarto, F., Cuccurullo, C., & Aria, M. (2014). Exploring healthcare governance literature: systematic review and paths for future research. Mecosan. https://www.francoangeli.it/Riviste/Scheda_Rivista.aspx?IDarticolo=52780&lingua=en
Cuccurullo, C., Aria, M., & Sarto, F. (2013). Twenty years of research on performance management in business and public administration domains. In Academy of Management Proceedings (Vol. 2013, No. 1, p. 14270). Academy of Management. https://doi.org/10.5465/AMBPP.2013.14270abstract
Belfiore, A., Salatino, A., & Osborne, F. (2022). Characterising Research Areas in the field of AI. arXiv preprint arXiv:2205.13471. https://doi.org/10.48550/arXiv.2205.13471
Belfiore, A., Cuccurullo, C., & Aria, M. (2022). IoT in healthcare: A scientometric analysis. Technological Forecasting and Social Change, 184, 122001. https://doi.org/10.1016/j.techfore.2022.122001
References (2)
Börner, K., Chen, C., & Boyack, K. (2003). Visualizing knowledge domains. Annual Review of Information Science and Technology, 37, 179-255.
Cobo, M. J., López-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2011). An approach for detecting, quantifying, and visualizing the evolution of a research field: A practical application to the fuzzy sets theory field. Journal of Informetrics, 5(1), 146-166.
Garfield, E., & Sher, I. H. (1993). Keywords Plus™ algorithmic derivative indexing. Journal of the American Society for Information Science, 44(5), 298-299.
Garfield, E. (2004). Historiographic mapping of knowledge domains literature. Journal of Information Science, 30(2), 119-145.
Holten, D., & Van Wijk, J. J. (2009, June). Force-directed edge bundling for graph visualization. In Computer Graphics Forum (Vol. 28, No. 3, pp. 983-990). Oxford, UK: Blackwell Publishing Ltd.
Lancichinetti, A., & Fortunato, S. (2009). Community detection algorithms: a comparative analysis. Physical Review E, 80(5), 056117.
McDonald, J. H. (2009). Handbook of biological statistics (Vol. 2, pp. 173-181). Baltimore, MD: Sparky House Publishing.
Morris, S., & Van Der Veer Martens, B. (2008). Mapping research specialties. Annual Review of Information Science and Technology, 42(1), 213-295.
Peters, H., & Van Raan, A. (1991). Structuring scientific activities by co-author analysis: An exercise on a university faculty level. Scientometrics, 20(1), 235-255.
Small, H. (1973). Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science, 24(4), 265-269.
Small, H. (1997). Update on science mapping: Creating large document spaces. Scientometrics, 38(2), 275-293.
Van Eck, N. J., & Waltman, L. (2009). How to normalize cooccurrence data? An analysis of some well-known similarity measures. Journal of the American Society for Information Science and Technology, 60(8), 1635-1651.
Zhang, J., Yu, Q., Zheng, F., Long, C., Lu, Z., & Duan, Z. (2016). Comparing keywords plus of WOS and author keywords: A case study of patient adherence research. Journal of the Association for Information Science and Technology, 67(4), 967-972.
Remember to cite:
Aria, M., & Cuccurullo, C. (2017). bibliometrix:
An R-tool for comprehensive science mapping analysis.
Journal of informetrics, 11 (4), 959-975.
Visit our website to stay updated and to receive tips and suggestions
https://www.bibliometrix.org/
Contact us if you need
info@bibliometrix.org
Support bibliometrix with donations or coding
https://www.bibliometrix.org/home/index.php/donation/