Social Network Analysis (SNA) is the study of network structures using graph theory. Applied to human interactions, SNA lets us model the relationships between people as a social network. By systematically analysing the data on these relationships, we can gain deeper insight into people's behaviour according to their role in the network. SNA provides crucial metrics for analysing the risk associated with the spread of Covid-19 among individuals and communities in a geographical location.
The Vantage visualization below represents an SNA network. Each node in the graph represents a person or a tracking device, while each edge represents a connection between two nodes in the network.
Figure 1: Vantage Visualization – SNA Network
Risk of Introduction (Betweenness)
Betweenness is a measure of the centrality of a node in a network. It counts how often the node lies on the shortest paths between other nodes, and so represents the degree to which the node stands between communities in the network. High-betweenness individuals are often found at the intersections of more densely connected network communities and are hence critically positioned to act as “Agents of Introduction” across the entire network.
These people sit on the periphery of several large communities and have the potential to “introduce” the virus to the entire network of communities.
Example: An asymptomatic person arrives in a city via the airport and takes the metro home. They have not travelled extensively in the city; however, they have introduced the virus at key places such as metro stations.
Figure 2: Vantage Visualization – Betweenness
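As a rough illustration of how betweenness singles out an “Agent of Introduction”, the sketch below computes unnormalised betweenness for a small contact graph in plain Python using Brandes' algorithm. The graph, the node names and the `betweenness` helper are assumptions made for this example; they are not the Vantage function or its output.

```python
from collections import deque

def betweenness(edges):
    """Unnormalised betweenness for an undirected, unweighted graph (Brandes' algorithm)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    score = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search that counts shortest paths from `source`.
        dist = {source: 0}
        paths = {v: 0 for v in adj}
        paths[source] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        dep = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += paths[v] / paths[w] * (1 + dep[w])
            if w != source:
                score[w] += dep[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}

# Hypothetical data: two tightly knit groups linked only by one traveller.
edges = [
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
    ("traveller", "a1"), ("traveller", "b1"),
]
scores = betweenness(edges)
top = max(scores, key=scores.get)
print(top, scores[top])
```

In this toy graph the traveller has only two contacts, yet scores highest because every shortest path between the two groups passes through them.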
Risk of Spread (Closeness)
Closeness is a distance-based centrality metric used in network analysis. It measures the average social distance from an individual to every other individual in the network: the shorter that average distance, the higher the closeness. These individuals may not be leaders of the community; however, they tend to have short paths for infection spread within the cluster.
High closeness centrality individuals tend to be “Agents of Spread” within their local network community.
Example: A pizza delivery person travelling extensively within a small community of people.
Figure 3: Vantage Visualization – Closeness
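The idea of closeness can be sketched with a breadth-first search from each person. The example below assumes an unweighted contact graph and defines closeness as the number of reachable people divided by the sum of shortest-path distances to them; the data and the `closeness` helper are hypothetical and only meant to show the calculation.

```python
from collections import deque

def closeness(edges):
    """Closeness = (reachable nodes) / (sum of shortest-path distances to them)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    result = {}
    for source in adj:
        # Breadth-first search gives hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result

# Hypothetical data: a courier visiting four households, two of which also meet.
edges = [
    ("courier", "h1"), ("courier", "h2"),
    ("courier", "h3"), ("courier", "h4"),
    ("h1", "h2"),
]
scores = closeness(edges)
for person in sorted(scores, key=scores.get, reverse=True):
    print(person, round(scores[person], 3))
```

The courier is one step from everyone, so they get the highest score even though they are not part of any household.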
Risk of Danger (Eigenvector Centrality)
Eigenvector Centrality is a measure of the influence of a node in a network. It assigns relative scores to all nodes in the network based on the concept that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes. A high eigenvector score means that a node is connected to many nodes that themselves have high scores.
Individuals with high eigenvector centrality scores are more likely to be “Agents of High Vulnerability” to the spread of infection.
Example: Front-line medical workers who are exposed to people with the infection.
Figure 4: Vantage Visualization – Eigenvector Centrality
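To make the "connections to high-scoring nodes count for more" idea concrete, here is a minimal power-iteration sketch in plain Python. It is an illustration under assumptions: the contact graph is invented, the `eigenvector_centrality` helper is hypothetical, and iterating with the adjacency matrix plus the identity is simply one common way to keep the iteration stable.

```python
import math

def eigenvector_centrality(edges, iterations=200, tol=1e-10):
    """Power iteration on (A + I) for an undirected, unweighted graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # A node's new score is its own score plus the scores of its contacts.
        new = {v: score[v] + sum(score[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(x * x for x in new.values()))
        new = {v: x / norm for v, x in new.items()}
        converged = sum(abs(new[v] - score[v]) for v in adj) < tol
        score = new
        if converged:
            break
    return score

# Hypothetical data: a nurse and a clerk each have exactly three contacts,
# but the nurse's contacts are themselves heavily connected.
edges = [
    ("nurse", "patient1"), ("nurse", "patient2"), ("nurse", "patient3"),
    ("patient1", "patient2"), ("patient2", "patient3"), ("patient1", "patient3"),
    ("patient3", "clerk"), ("clerk", "c1"), ("clerk", "c2"),
]
scores = eigenvector_centrality(edges)
print(round(scores["nurse"], 3), round(scores["clerk"], 3))
```

Both people have the same number of contacts, but the nurse scores higher because those contacts are well connected themselves.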
Individual Risk Scores
Figure 5: Individual Risk Scores
Community detection and identification play an important role in understanding the spread of infection in a given network.
Figure 6: Vantage Visualization - Modularity
The Modularity function uses a clustering algorithm to detect communities in networks (graphs). A community is a subset of a network whose nodes are more highly interconnected with each other than they are with nodes that are not part of the community.
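The source does not say which clustering algorithm the Modularity function uses, so the sketch below does not try to reproduce it. It only evaluates Newman's modularity score Q for a given assignment of nodes to communities, which is the quantity that modularity-based community detection generally tries to maximise. The graph, the partitions and the `modularity` helper are assumptions for illustration.

```python
def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    if m == 0:
        return 0.0
    internal = {}   # edges with both ends inside a community
    degree = {}     # total degree of the nodes in a community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    # Q = sum over communities of (internal edge share - expected share).
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())

# Hypothetical data: two triangles joined by a single edge.
edges = [
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
    ("a3", "b1"),
]
two_groups = {n: n[0] for n in ("a1", "a2", "a3", "b1", "b2", "b3")}
one_group = {n: "all" for n in two_groups}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(round(q_two, 3), round(q_one, 3))
```

Splitting the graph into its two triangles gives a higher Q than treating everyone as a single community, which matches the definition of a community given above.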
Clustering Coefficients give us an idea about the structure of the graph rather than the importance of the nodes themselves.
Figure 7: Global Clustering Coefficient
The global clustering coefficient indicates the degree of clustering in the network as a whole. It is the ratio of the number of closed triplets to the total number of triplets (open and closed) that exist in the network.
The global clustering coefficient helps us compare the risk associated with a community of people in one locality with that of other localities.
Example: We can compare the risk associated with a particular floor with the other floors in the building.
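The triplet ratio can be sketched directly: for every person, look at each pair of their contacts and check whether that pair is also in contact. The two "floors" below are invented data, and `global_clustering` is a hypothetical helper rather than a Vantage function.

```python
from itertools import combinations

def global_clustering(edges):
    """Closed triplets divided by all triplets in an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    closed = total = 0
    for centre, neighbours in adj.items():
        # Every pair of neighbours forms a triplet centred on this node.
        for a, b in combinations(neighbours, 2):
            total += 1
            if b in adj[a]:
                closed += 1
    return closed / total if total else 0.0

# Hypothetical data: on floor A everyone meets everyone,
# on floor B people only meet one receptionist.
floor_a = [("p1", "p2"), ("p1", "p3"), ("p1", "p4"),
           ("p2", "p3"), ("p2", "p4"), ("p3", "p4")]
floor_b = [("desk", "q1"), ("desk", "q2"), ("desk", "q3"), ("desk", "q4")]

print(global_clustering(floor_a), global_clustering(floor_b))
```

A value near 1 means contacts of the same person usually know each other; a value near 0 means they rarely do.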
The local clustering coefficient measures how embedded an individual is in the cluster. The average of the local clustering coefficients gives us an indication of the behavior of the individual nodes in the community.
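A per-person version of the same idea, assuming the usual definition of the local coefficient as the share of a person's contact pairs that are themselves in contact. The data and the `local_clustering` helper are hypothetical.

```python
from itertools import combinations

def local_clustering(edges):
    """Local clustering coefficient of every node in an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    coeff = {}
    for node, neighbours in adj.items():
        k = len(neighbours)
        if k < 2:
            # With fewer than two contacts there are no pairs to check.
            coeff[node] = 0.0
            continue
        links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
        coeff[node] = links / (k * (k - 1) / 2)
    return coeff

# Hypothetical data: a triangle of colleagues plus one outside contact.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
coeff = local_clustering(edges)
average = sum(coeff.values()) / len(coeff)
print(coeff, round(average, 3))
```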
Figure 7: Community risk score
Enhanced Contact Tracing
Traditional contact tracing helps us track the number of people infected over time in a given locality. We can also keep track of susceptible people in the same locality.
Figure 8: Traditional Contact Tracing
SNA helps us trace the point of exposure back both within and outside a given locality. This gives us additional metrics to understand the source of infection.
Figure 9: Enhanced Contact Tracing
Reverse Contact Tracing
We can perform multi-level reverse contact tracing to understand the various touch points through which the infection was introduced into a locality.
Figure 10: Reverse Contact Tracing
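One way to picture multi-level reverse tracing is as a walk backwards through a time-stamped contact log. The sketch below is an assumption about how such a trace could be computed, not a description of the Vantage implementation; the log format `(time, person_a, person_b)`, the names and the `reverse_trace` helper are all hypothetical.

```python
def reverse_trace(contacts, case, detected_at, max_levels=2):
    """Walk a contact log backwards in time from a known case.

    contacts: list of (time, person_a, person_b) tuples.
    Returns one sorted list of people per upstream level.
    """
    frontier = {case: detected_at}   # person -> latest time still of interest
    seen = {case}
    levels = []
    for _ in range(max_levels):
        upstream = {}
        for time, a, b in contacts:
            for person, other in ((a, b), (b, a)):
                # Only contacts that happened before the reference time count.
                if person in frontier and time < frontier[person] and other not in seen:
                    upstream[other] = max(upstream.get(other, time), time)
        if not upstream:
            break
        levels.append(sorted(upstream))
        seen.update(upstream)
        frontier = upstream
    return levels

# Hypothetical contact log with integer time steps.
contacts = [
    (1, "traveller", "driver"),
    (2, "driver", "resident"),
    (3, "resident", "neighbour"),
]
levels = reverse_trace(contacts, "resident", detected_at=4)
print(levels)
```

Here the first level back from the resident is the driver and the neighbour, and the second level reaches the traveller, who met the driver before the driver met the resident.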
Drill Down Tracing
We can narrow down the analysis to list the people who had direct contact with an infected person (Level 1) and, in turn, their direct contacts (Level 2). This helps us identify and isolate susceptible individuals quickly and contain the spread of infection.
Figure 11: Drill Down Tracing
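The Level 1 and Level 2 lists correspond to a breadth-first search that stops two hops from the infected person. The sketch below assumes an unweighted contact graph; the names and the `contacts_by_level` helper are invented for this example.

```python
from collections import deque

def contacts_by_level(edges, infected, max_level=2):
    """Group people by their hop distance from an infected person."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    level = {infected: 0}
    queue = deque([infected])
    while queue:
        person = queue.popleft()
        if level[person] == max_level:
            continue  # do not expand beyond the requested depth
        for other in adj.get(person, ()):
            if other not in level:
                level[other] = level[person] + 1
                queue.append(other)

    result = {n: [] for n in range(1, max_level + 1)}
    for person, n in level.items():
        if n > 0:
            result[n].append(person)
    return {n: sorted(people) for n, people in result.items()}

# Hypothetical contact data.
edges = [
    ("patient", "spouse"), ("patient", "colleague"),
    ("spouse", "colleague"), ("colleague", "client"),
    ("client", "supplier"),
]
levels = contacts_by_level(edges, "patient")
print(levels)
```

The supplier is three hops away and is therefore left out of the two-level drill-down.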
Popular Pattern of Movement
We can compare the popular patterns of movement in a locality. The popular pattern for a youngster is between “Residence and School”, while that of a family man or woman is between “Residence and Office”.
Youngster Vs Family man/woman
In conclusion, with the help of Teradata Vantage’s advanced analytic functions, the SNA framework gives us useful insights to detect patterns of life and identify various risk factors associated with the spread of Covid-19.