Scholarly Paper Recommendation Datasets


We have released a basic version of a new dataset (``Dataset 2'', without link information) that was used in the following JCDL2013 and IJDL2015 work. With it, you can run basic experiments on content-based scholarly paper recommendation. The dataset is hosted on a cloud service. If you would like to try our dataset, we can send you the shared link. We are still organizing the citation and reference information (``LinkInfo'') for the candidate papers to recommend into a user-friendly format, and we plan to provide the complete version of ``Dataset 2'' in the future.


Much of the world's new knowledge is now captured in digital form and archived within digital library systems. However, this trend leads to information overload: users find an overwhelmingly large number of publications that match their search queries but are largely irrelevant to their latent information needs. We address this problem by generating recommendations from the latent information about a user's research interests that is contained in their publication list (see the papers in ``Publications'' below for further details). We have released the experimental datasets used in our papers. If you are interested in scholarly paper recommendation, please try our datasets in your experiments!
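
To make the idea of recommending from a publication list concrete, here is a minimal sketch of a generic content-based recommender. It is not the method from our papers and does not reflect the actual file format of the datasets: the function names, the plain term-frequency weighting, and the toy texts are all assumptions made for illustration. It builds a user profile by summing the term vectors of the user's own papers and ranks candidate papers by cosine similarity to that profile.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector for one paper's text."""
    return Counter(text.lower().split())


def build_user_profile(publications):
    """Sum the term vectors of a researcher's own papers."""
    profile = Counter()
    for text in publications:
        profile.update(term_vector(text))
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(publications, candidates, top_k=2):
    """Rank candidate papers (id -> text) by similarity to the user profile."""
    profile = build_user_profile(publications)
    scored = [(cosine(profile, term_vector(text)), paper_id)
              for paper_id, text in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [paper_id for _, paper_id in scored[:top_k]]


# Toy data, invented purely for illustration (not taken from the datasets).
publications = [
    "scholarly paper recommendation with user profiles",
    "citation context for paper recommendation",
]
candidates = {
    "c1": "recommendation of scholarly paper using citation networks",
    "c2": "protein folding simulation on graphics hardware",
    "c3": "user interfaces for news browsing",
}
ranked = recommend(publications, candidates)
```

In this toy run the candidate that shares the most vocabulary with the user's publications is ranked first. A real experiment would replace the whitespace tokenizer and raw term counts with the feature vectors supplied in the dataset.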

You can also use our datasets (especially the candidate papers to recommend) for other purposes, such as classification, clustering, and trend analysis.




Group Members

Kazunari Sugiyama (Kyoto University)
Min-Yen Kan (National University of Singapore)