The cybersecurity landscape evolves rapidly, continually posing new threats to organizations. Enhancing resilience requires tracking the latest developments and trends in the domain. To this end, we use large language models (LLMs) to extract relevant knowledge entities from cybersecurity-related texts. Using a subset of arXiv preprints on cybersecurity as our data, we compare different LLMs in terms of entity recognition (ER) and relevance. The results suggest that LLMs do not produce knowledge entities that adequately reflect the cybersecurity context.
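The extraction-and-comparison pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt wording, the JSON output convention, and the vocabulary-overlap relevance score are all illustrative assumptions, and the model call itself is left abstract.

```python
import json

# Hypothetical prompt template; the actual prompts used in the paper
# are not given in the abstract.
PROMPT = (
    "Extract the cybersecurity-related knowledge entities from the text "
    "below. Answer with a JSON list of strings.\n\nText: {text}"
)


def build_prompt(text: str) -> str:
    """Fill the extraction prompt with one input document."""
    return PROMPT.format(text=text)


def parse_entities(llm_response: str) -> list[str]:
    """Parse the model's reply, tolerating non-JSON output."""
    try:
        entities = json.loads(llm_response)
    except json.JSONDecodeError:
        return []
    if not isinstance(entities, list):
        return []
    return [e.strip() for e in entities if isinstance(e, str) and e.strip()]


def relevance(entities: list[str], vocabulary: set[str]) -> float:
    """Fraction of extracted entities found in a domain vocabulary --
    a simple stand-in for the paper's relevance comparison."""
    if not entities:
        return 0.0
    hits = sum(1 for e in entities if e.lower() in vocabulary)
    return hits / len(entities)
```

Running the same prompt through several LLMs and scoring each model's parsed output against a cybersecurity vocabulary gives one (simplified) way to compare them on relevance.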


Source: CEUR Workshop Proceedings


@inproceedings{wursch_llm-based_2023,
    series = {{CEUR} {Workshop} {Proceedings}},
    title = {{LLM}-{Based} {Entity} {Extraction} {Is} {Not} for {Cybersecurity}},
    volume = {3451},
    url = {https://ceur-ws.org/Vol-3451/#paper5},
    language = {en},
    urldate = {2023-08-16},
    booktitle = {Proceedings of {Joint} {Workshop} of the 4th {Extraction} and {Evaluation} of {Knowledge} {Entities} from {Scientific} {Documents} ({EEKE2023}) and the 3rd {AI} + {Informetrics} ({AII2023})},
    publisher = {CEUR},
    author = {W{\"u}rsch, Maxime and Kucharavy, Andrei and Percia-David, Dimitri and Mermoud, Alain},
    editor = {Zhang, Chengzhi and Zhang, Yi and Mayr, Philipp and Lu, Wei and Suominen, Arho and Chen, Haihua and Ding, Ying},
    month = jun,
    year = {2023},
    note = {ISSN: 1613-0073},
    pages = {26--32},
}