https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/Head https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY http://www.nanopub.org/nschema#hasAssertion https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/assertion https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY http://www.nanopub.org/nschema#hasProvenance https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/provenance https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY http://www.nanopub.org/nschema#hasPublicationInfo https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/pubinfo https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.nanopub.org/nschema#Nanopublication https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/assertion http://id.crossref.org/issn/1868-1158 http://purl.org/dc/terms/title Studies on the Semantic Web https://doi.org/10.3233/SSW240006 http://purl.org/dc/terms/abstract Traditional dataset retrieval systems rely on metadata for indexing, rather than on the underlying data values. However, high-quality metadata creation and enrichment often require manual annotations, which is a labour-intensive and challenging process to automate. In this study, we propose a method to support metadata enrichment using topic annotations generated by three Large Language Models (LLMs): ChatGPT-3.5, GoogleBard, and GoogleGemini. Our analysis focuses on classifying column headers based on domain-specific topics from the Consortium of European Social Science Data Archives (CESSDA), a Linked Data controlled vocabulary. Our approach operates in a zero-shot setting, integrating the controlled topic vocabulary directly within the input prompt. This integration serves as a Large Context Windows approach, with the aim of improving the results of the topic classification task. We evaluated the performance of the LLMs in terms of internal consistency, inter-machine alignment, and agreement with human classification. Additionally, we investigate the impact of contextual information (i.e., dataset description) on the classification outcomes. Our findings suggest that ChatGPT and GoogleGemini outperform GoogleBard in terms of internal consistency as well as LLM-human-agreement. Interestingly, we found that contextual information had no significant impact on LLM performance. This work proposes a novel approach that leverages LLMs for topic classification of column headers using a controlled vocabulary, presenting a practical application of LLMs and Large Context Windows within the Semantic Web domain. This approach has the potential to facilitate automated metadata enrichment, thereby enhancing dataset retrieval and the Findability, Accessibility, Interoperability, and Reusability (FAIR) of research data on the Web. https://doi.org/10.3233/SSW240006 http://purl.org/dc/terms/date 2024-09-11 https://doi.org/10.3233/SSW240006 http://purl.org/dc/terms/isPartOf http://id.crossref.org/issn/1868-1158 https://doi.org/10.3233/SSW240006 http://purl.org/dc/terms/title Zero-Shot Topic Classification of Column Headers: Leveraging LLMs for Metadata Enrichment https://doi.org/10.3233/SSW240006 http://purl.org/ontology/bibo/authorList https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/author-list https://doi.org/10.3233/SSW240006 http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://purl.org/spar/fabio/Article https://orcid.org/0000-0001-8004-0464 http://schema.org/affiliation https://ror.org/008xxew50 https://orcid.org/0000-0001-8004-0464 http://xmlns.com/foaf/0.1/name Margherita Martorana https://orcid.org/0000-0002-1267-0234 http://schema.org/affiliation https://ror.org/008xxew50 https://orcid.org/0000-0002-1267-0234 http://xmlns.com/foaf/0.1/name Tobias Kuhn https://orcid.org/0000-0002-2146-4803 http://schema.org/affiliation https://ror.org/008xxew50 https://orcid.org/0000-0002-2146-4803 http://xmlns.com/foaf/0.1/name Lise Stork https://orcid.org/0000-0002-7748-4715 http://schema.org/affiliation https://ror.org/008xxew50 https://orcid.org/0000-0002-7748-4715 http://xmlns.com/foaf/0.1/name Jacco van Ossenbruggen https://ror.org/008xxew50 http://xmlns.com/foaf/0.1/name Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, Netherlands https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/author-list http://www.w3.org/1999/02/22-rdf-syntax-ns#_1 https://orcid.org/0000-0001-8004-0464 https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/author-list http://www.w3.org/1999/02/22-rdf-syntax-ns#_2 https://orcid.org/0000-0002-1267-0234 https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/author-list http://www.w3.org/1999/02/22-rdf-syntax-ns#_3 https://orcid.org/0000-0002-2146-4803 https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/author-list http://www.w3.org/1999/02/22-rdf-syntax-ns#_4 https://orcid.org/0000-0002-7748-4715 https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/provenance https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/assertion http://www.w3.org/ns/prov#wasAttributedTo https://orcid.org/0000-0001-8004-0464 https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/assertion http://www.w3.org/ns/prov#wasAttributedTo https://orcid.org/0000-0002-1267-0234 https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/assertion http://www.w3.org/ns/prov#wasAttributedTo https://orcid.org/0000-0002-2146-4803 https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/assertion http://www.w3.org/ns/prov#wasAttributedTo https://orcid.org/0000-0002-7748-4715 https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/assertion http://www.w3.org/ns/prov#wasDerivedFrom https://doi.org/10.3233/SSW240006 https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/pubinfo https://orcid.org/0000-0001-8004-0464 http://xmlns.com/foaf/0.1/name Margherita Martorana https://orcid.org/0000-0002-1267-0234 http://xmlns.com/foaf/0.1/name Tobias Kuhn https://orcid.org/0000-0002-2146-4803 http://xmlns.com/foaf/0.1/name Lise Stork https://orcid.org/0000-0002-7748-4715 http://xmlns.com/foaf/0.1/name Jacco van Ossenbruggen https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY http://purl.org/dc/terms/created 2026-02-22T17:36:53.000+01:00 https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY http://purl.org/dc/terms/creator https://w3id.org/np/RAkkUz7qBJ-BIOCHV_4WCTgHCdTyI25_bnRuw166SXjwM/DOI-bot https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY http://purl.org/dc/terms/license https://creativecommons.org/publicdomain/zero/1.0/ https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY http://purl.org/nanopub/x/hasNanopubType http://purl.org/spar/fabio/ScholarlyWork https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY http://purl.org/nanopub/x/introduces https://doi.org/10.3233/SSW240006 https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY http://www.w3.org/2000/01/rdf-schema#label Zero-Shot Topic Classification of Column Headers: Leveraging LLMs for Metadata Enrichment https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/sig http://purl.org/nanopub/x/hasAlgorithm RSA https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/sig http://purl.org/nanopub/x/hasPublicKey MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArRL5MjH1KfuE89dpKsIiscF/THrJ4uSvhl0NgaC8x3TdTDrL00kCnlH+2g7PMYhaUQIGWq27TTXHAGp7ehO8yLjRNeDCc8zjUCQJqLbzay3DB51PCiz50OsMgxiZC1+e0bVdk/CAQV4oVo+VgI+awHI1bTT4Yp7pR2I67imf1PIcwczGVhn8EQwtNdWQOZ63wDgUCY+6IubHBQzjLfbYh0828UETEyIV28T7fvf5+y4A5M590InmgkLGpJbRXoL0pnCm1BtFOoxeAVqfivbxIZWPYN2Yd0cSfqwIIUYyaLFpjDrBwc4iJdOus4UQ9OYqkeZDMpU3opU8jWKDIm77jwIDAQAB https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/sig http://purl.org/nanopub/x/hasSignature PKv1MVA/4qMw7Pemeg1xGGWWB9QTFjx2dSM9Ac7Az82ZzS1xGY3GAGyyKELrEnwAIk7tlZZNrTeAurar045rIYnU6cg6EV8E37ZfdV+LgC5FHj+wsHT9h86VjBgToDkio69gicCP+KWr8vQnTHPmo1lTx6zgkPiTxLuIE/UGyQ6acgkVgQFg4M0+c60qdnXPLGKU331tJW60IxRa1dZQo7c54dKJSE+Xk6HVHoI8MCA4s4e6xw6U42qUiouLHLY5yOe+Pw1haAAo1URhftNDhE+5huycBlKEVjOKfsvInPhJ9HentE9l9Tt8LnzTpjZgAXBxlYB2igi41dcjVfH3nQ== https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/sig http://purl.org/nanopub/x/hasSignatureTarget https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY https://w3id.org/np/RAjg_CiZk1KUas15hpkYrxZRoro-umA0aTl_l2IEIjplY/sig http://purl.org/nanopub/x/signedBy https://w3id.org/np/RAkkUz7qBJ-BIOCHV_4WCTgHCdTyI25_bnRuw166SXjwM/DOI-bot