Consensus embedding for multiple networks: Computation and applications
Abstract
Machine learning applications on large-scale network-structured data commonly encode network information in the form of node embeddings. Network embedding algorithms map the nodes into a low-dimensional space such that nodes that are “similar” with respect to network topology are also close
to each other in the embedding space. Real-world networks often have multiple versions or can be “multiplex” with multiple types of edges with different semantics. For such networks, computation of Consensus
Embeddings based on the node embeddings of individual versions can be useful for various reasons, including privacy, efficiency, and effectiveness of analyses. Here, we systematically investigate the performance
of three dimensionality reduction methods in computing consensus embeddings on networks with multiple versions: singular value decomposition, variational auto-encoders, and canonical correlation analysis
(CCA). Our results show that (i) CCA outperforms other dimensionality reduction methods in computing
consensus embeddings, (ii) in the context of link prediction, consensus embeddings can be used to make
predictions with accuracy close to that provided by embeddings of integrated networks, and (iii) consensus embeddings can be used to improve the efficiency of combinatorial link prediction queries on multiple
networks by multiple orders of magnitude.
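
As an illustration of the general idea, the following is a minimal sketch of how a consensus embedding could be computed from the node embeddings of individual network versions using singular value decomposition. It assumes that every version embeds the same nodes in the same row order, and that the per-version embeddings are concatenated column-wise before reduction. The function name `consensus_embedding_svd`, the centering step, and the synthetic data are assumptions made for this example and are not taken from the method evaluated here.

```python
import numpy as np


def consensus_embedding_svd(embeddings, dim):
    """Hypothetical helper: reduce concatenated per-version embeddings to `dim` columns.

    embeddings: list of (n_nodes x d_i) arrays, rows aligned to the same nodes.
    dim: target dimensionality of the consensus embedding.
    """
    # Center each version so no single version's offset dominates the decomposition
    # (an assumption of this sketch).
    centered = [E - E.mean(axis=0, keepdims=True) for E in embeddings]
    # Concatenate column-wise: one row per node, one column block per version.
    stacked = np.hstack(centered)
    # Thin SVD; singular values are returned in descending order.
    U, S, _ = np.linalg.svd(stacked, full_matrices=False)
    # Keep the top `dim` left singular vectors, scaled by their singular values.
    return U[:, :dim] * S[:dim]


# Synthetic data: three noisy "versions" of one underlying embedding.
rng = np.random.default_rng(0)
n_nodes, d, n_versions = 50, 8, 3
base = rng.normal(size=(n_nodes, d))
versions = [base + 0.1 * rng.normal(size=(n_nodes, d)) for _ in range(n_versions)]

consensus = consensus_embedding_svd(versions, dim=d)
```

In this sketch the consensus embedding has one row per node and `dim` columns, so it can be passed to a downstream task such as link prediction in place of any single version's embedding.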