CAIML #31

Following our visit in July 2023, we are thrilled to return to TH Köln for CAIML #31 on July 2, 2024. Thanks to TH Köln, Prof. Gernot Heisenberg, and KölnBusiness for their support!

Agenda

18h30 Doors open

19h00 Welcome & Intro

19h15 Fabian Haak - Research Associate at TH Köln: Quantifying Subjective Phenomena: Simulation, Detection, and Synthetic Training Data with LLMs

Many aspects of media are worth measuring but inherently subjective. Is a text easy to read? Is a comment offensive? Is a document relevant? Is a decision morally just? The inherent problem with measuring these aspects is that different people judge them differently, based on a range of often opaque item properties as well as each person’s experiences, attitudes, and prior knowledge. Nevertheless, effective means of measuring these aspects and of creating training data representative of a typical user or a specific target audience are essential for ensuring quality, fairness, and relevance in media consumption and production. At CIR, the Cologne Information Retrieval Group at TH Köln, we use a multi-stage approach that leverages large language models to quantify subjective phenomena and construct synthetic training data that is on par with human-annotated datasets.

19h50 Marcel Kurovski - Senior Data Scientist at Wolt: Personalizing Carousel Ranking on Wolt’s Discovery Page: A Hierarchical Multi-Armed Bandit Approach

Personalized carousel ranking presents a major recommendation challenge across many domains such as content streaming, e-commerce, and quick commerce. We present a hierarchical multi-armed bandit (MAB) solution for personalizing the ranking of carousels on Wolt’s Discovery page. The Discovery page serves as the primary gateway for millions of weekly users exploring diverse cuisines and products. First, we illustrate the specific challenges of an (almost) everything online delivery platform and our goals for Wolt’s Discovery page. Second, we summarize what the problem of page personalization looks like in other domains and how others tackle it. We then present the approach and architecture of our Discovery Vertical Content Ranker (DVCR). Our approach hierarchically combines city-, context-, and user-specific implicit feedback data to rank carousels on Wolt’s Discovery page. We illustrate the architecture that makes this solution resilient, scalable, and adaptive. We leverage MLflow for tracking and lineage, Flyte for ML workflows, Redis for serving features, and Seldon Core for serving online user requests quickly and reliably. We will wrap up with our learnings and an outlook.
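
For readers new to the idea, here is a minimal sketch of how city-, context-, and user-level feedback could be combined hierarchically in a bandit. It is not Wolt’s DVCR: the abstract does not name the sampling scheme, so Beta-Bernoulli Thompson sampling is an assumption, and the class name, the level weights, and the click/skip feedback model are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical discount weights: broader levels count less than user-level evidence.
LEVEL_WEIGHTS = {"city": 0.1, "context": 0.3, "user": 1.0}


class HierarchicalBandit:
    """Beta-Bernoulli Thompson sampling over pooled city/context/user counts."""

    def __init__(self, carousels, seed=None):
        self.carousels = list(carousels)
        self.rng = random.Random(seed)
        # counts[(level, key, carousel)] -> [clicks, skips]
        self.counts = defaultdict(lambda: [0, 0])

    @staticmethod
    def _keys(city, context, user):
        # Context is nested inside the city; the user key stands on its own.
        return (("city", city), ("context", (city, context)), ("user", user))

    def update(self, city, context, user, carousel, clicked):
        """Record one impression at all three levels of the hierarchy."""
        idx = 0 if clicked else 1
        for level, key in self._keys(city, context, user):
            self.counts[(level, key, carousel)][idx] += 1

    def posterior(self, city, context, user, carousel):
        """Return Beta(alpha, beta) parameters from discounted pooled counts."""
        alpha, beta = 1.0, 1.0  # uniform prior before any feedback
        for level, key in self._keys(city, context, user):
            clicks, skips = self.counts.get((level, key, carousel), (0, 0))
            alpha += LEVEL_WEIGHTS[level] * clicks
            beta += LEVEL_WEIGHTS[level] * skips
        return alpha, beta

    def rank(self, city, context, user):
        """Sample a click rate per carousel and sort by the samples, best first."""
        samples = {
            c: self.rng.betavariate(*self.posterior(city, context, user, c))
            for c in self.carousels
        }
        return sorted(self.carousels, key=samples.get, reverse=True)


bandit = HierarchicalBandit(["burgers", "sushi", "groceries"], seed=0)
bandit.update("Cologne", "lunch", "u1", "burgers", clicked=True)
bandit.update("Cologne", "lunch", "u1", "sushi", clicked=False)
ranking_for_new_user = bandit.rank("Cologne", "lunch", "u2")
```

In this sketch a new user such as `u2` has no user-level counts, so the ranking falls back on the discounted city and context evidence, which is one simple way a hierarchy can soften the cold-start problem.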

20h20 Networking with food and drinks provided by KölnBusiness
