Abstract
Tag genome is widely used in recommender systems research, for example to measure item similarity, make recommendations and generate recommendation explanations. Applying tag genome to problems in cross-domain recommendation, however, is complicated by the limited item overlap between cross-domain recommendation data sets and the available tag genomes. Furthermore, existing tag prediction models rely on content-based features that are not readily available in a majority of recommendation data sets. To address these issues, we generated tag genomes for both movies and books based on the Amazon data set, which is widely used in cross-domain recommendation research. These new tag genomes are over 200× larger than the previous versions and can support comparative evaluation of tag-based and collaborative methods, facilitate the development of new cross-domain recommendation algorithms and provide a foundation for studying phenomena, such as serendipity and diversity, across multiple domains. Both data sets and the data generation pipeline are freely available at https://github.com/Bionic1251/Expanded-Tag-Genomes.
| Original language | English |
|---|---|
| Title of host publication | CHIIR '26: Proceedings of the 2026 Conference on Human Information Interaction and Retrieval |
| Editors | Chirag Shah, Ryen W. White, Adam Fourney, Carla Teixeira Lopes, Johanne Trippas |
| Publisher | Association for Computing Machinery (ACM) |
| Pages | 84-88 |
| Number of pages | 5 |
| ISBN (Print) | 979-8-4007-2414-5 |
| Publication status | Published - 22 Mar 2026 |
Keywords
- tag genome
- tagging
- recommender systems
- cross-domain recommendation