publications

publications by category in reverse chronological order. generated by jekyll-scholar.

2024

  1. StarCoder 2 and The Stack v2: The next generation
    Anton Lozhkov, Raymond Li, Loubna Ben Allal, and 8 more authors
    arXiv preprint arXiv:2402.19173, 2024
  2. Resources for Combining Teaching and Research in Information Retrieval Coursework
    Maik Fröbe, Harrisen Scells, Theresa Elstner, and 7 more authors
    In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024
  3. Teaching Information Retrieval with a Shared Task Across Universities: First Steps and Findings
    Maik Fröbe, Christopher Akiki, Timo Breuer, and 8 more authors
    2024

2023

  1. Towards openness beyond open access: User journeys through 3 Open AI Collaboratives
    Jennifer Ding, Christopher Akiki, Yacine Jernite, and 2 more authors
    arXiv preprint arXiv:2301.08488, 2023
  2. SantaCoder: don’t reach for the stars!
    Loubna Ben Allal, Raymond Li, Denis Kocetkov, and 8 more authors
    arXiv preprint arXiv:2301.03988, 2023
  3. Shared tasks as tutorials: A methodical approach
    Theresa Elstner, Frank Loebe, Yamen Ajjour, and 8 more authors
    In Proceedings of the AAAI Conference on Artificial Intelligence, 2023
  4. The ROOTS search tool: Data transparency for LLMs
    Aleksandra Piktus, Christopher Akiki, Paulo Villegas, and 5 more authors
    arXiv preprint arXiv:2302.14035, 2023
  5. Spacerini: Plug-and-play search engines with Pyserini and Hugging Face
    Christopher Akiki, Odunayo Ogundepo, Aleksandra Piktus, and 4 more authors
    arXiv preprint arXiv:2302.14534, 2023
  6. Exploring hyperparameter usage and tuning in machine learning research
    Sebastian Simon, Nikolay Kolyada, Christopher Akiki, and 3 more authors
    Papers on tuning, 2023
  7. StarCoder: may the source be with you!
    Raymond Li, Loubna Ben Allal, Yangtian Zi, and 8 more authors
    arXiv preprint arXiv:2305.06161, 2023
  8. GAIA Search: Hugging Face and Pyserini interoperability for NLP training data exploration
    Aleksandra Piktus, Odunayo Ogundepo, Christopher Akiki, and 6 more authors
    arXiv preprint arXiv:2306.01481, 2023
  9. Stable bias: Evaluating societal representations in diffusion models
    Sasha Luccioni, Christopher Akiki, Margaret Mitchell, and 1 more author
    Advances in Neural Information Processing Systems, 2023

2022

  1. Tracking discourse influence in darknet forums
    Christopher Akiki, Lukas Gienapp, and Martin Potthast
    arXiv preprint arXiv:2202.02081, 2022
  2. Entities, dates, and languages: Zero-shot on historical texts with T0
    Francesco De Toni, Christopher Akiki, Javier De La Rosa, and 4 more authors
    arXiv preprint arXiv:2204.05211, 2022
  3. How Train–Test Leakage Affects Zero-Shot Retrieval
    Maik Fröbe, Christopher Akiki, Martin Potthast, and 1 more author
    In International Symposium on String Processing and Information Retrieval, 2022
  4. Noise-reduction for automatically transferred relevance judgments
    Maik Fröbe, Christopher Akiki, Martin Potthast, and 1 more author
    In International Conference of the Cross-Language Evaluation Forum for European Languages, 2022
  5. The BigScience ROOTS corpus: A 1.6TB composite multilingual dataset
    Hugo Laurençon, Lucile Saulnier, Thomas Wang, and 8 more authors
    Advances in Neural Information Processing Systems, 2022
  6. BigScience: A case study in the social construction of a multilingual large language model
    Christopher Akiki, Giada Pistilli, Margot Mieskes, and 4 more authors
    arXiv preprint arXiv:2212.04960, 2022
  7. BLOOM: A 176B-parameter open-access multilingual language model
    BigScience Workshop, Teven Le Scao, Angela Fan, and 8 more authors
    arXiv preprint arXiv:2211.05100, 2022
  8. How Train–Test Leakage Affects Zero-Shot Retrieval
    Maik Fröbe, Christopher Akiki, Martin Potthast, and 1 more author
    In String Processing and Information Retrieval: 29th International Symposium, SPIRE 2022, Concepción, Chile, November 8–10, 2022, Proceedings, 2022

2021

  1. Learning to Rank Arguments with Feature Selection
    Christopher Akiki, Maik Fröbe, Matthias Hagen, and 1 more author
    In CLEF (Working Notes), 2021
  2. MuSe: The musical sentiment dataset
    Christopher Akiki and Manuel Burghardt
    Journal of Open Humanities Data, 2021
  3. BERTian Poetics: Constrained Composition with Masked LMs
    Christopher Akiki and Martin Potthast
    arXiv preprint arXiv:2110.15181, 2021

2020

  1. Exploring Argument Retrieval with Transformers
    Christopher Akiki and Martin Potthast
    In CLEF (Working Notes), 2020
  2. Toward a Musical Sentiment (MuSe) Dataset for Affective Distant Hearing
    Christopher Akiki and Manuel Burghardt
    In Workshop on Computational Humanities Research (CHR 2020), 2020