Scalable Trustworthy AI

Creating scalable and trustworthy AI with human guidance

Overview

AI is no longer a research curiosity. It is reshaping how we live and work. To fully exploit its benefits, we must address critical gaps in trustworthiness.

Current foundation models such as LLMs have critical trustworthiness problems: they hallucinate false information, fail at continual learning, resist knowledge editing (making GDPR compliance impractical), leak private information embedded in their parameters, and require prohibitive compute for training and personalisation. These issues block the widespread adoption of AI and the productivity revolution it promises.

Our approach: Knowledge-Intelligence Separation. Just as the separation of code and data in the 1960s enabled the modern software industry, we believe this separation is the key to unlocking AI’s full potential. When knowledge is stored in interpretable, editable external modules while intelligence (reasoning, generalisation) remains in the model, we enable faster customisation, training data attribution by design, and knowledge editing and unlearning.
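
To make the idea concrete, below is a minimal toy sketch of what this separation buys. The KnowledgeStore class and its token-overlap retrieval are hypothetical stand-ins for illustration, not our actual method: knowledge lives in an external, editable store, so unlearning a fact is a deletion rather than a retraining run, and every answer carries its source as attribution by construction.

```python
# Illustrative toy sketch of knowledge-intelligence separation -- not our
# actual method. "Knowledge" lives in an external, editable store; the
# "intelligence" here is a trivial token-overlap retriever standing in for
# a learned model. Editing or unlearning is a plain store operation, and
# every answer is attributable to its source by design.

class KnowledgeStore:  # hypothetical name, for illustration only
    def __init__(self):
        self.entries = []  # list of (fact, source) pairs

    def add(self, fact: str, source: str) -> None:
        self.entries.append((fact, source))

    def unlearn(self, source: str) -> None:
        # GDPR-style deletion: drop every fact from a given source,
        # with no retraining of the underlying model.
        self.entries = [(f, s) for f, s in self.entries if s != source]

    def answer(self, query: str) -> tuple[str, str]:
        # Toy retrieval: Jaccard overlap between token sets. A real system
        # would use a learned encoder; the separation principle is the same.
        q = set(query.lower().split())

        def score(fact: str) -> float:
            f = set(fact.lower().split())
            return len(q & f) / len(q | f)

        fact, source = max(self.entries, key=lambda e: score(e[0]))
        return fact, source  # the source doubles as attribution


store = KnowledgeStore()
store.add("The capital of France is Paris.", source="doc-A")
store.add("Water boils at 100 degrees Celsius.", source="doc-B")
print(store.answer("What is the capital of France?"))  # -> (fact, 'doc-A')
store.unlearn("doc-A")  # knowledge edit: the Paris fact is gone instantly
```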

Our work spans a range of interconnected research areas.

We are not alone in this effort. Many research labs worldwide contribute to Trustworthy AI. Our group distinguishes itself by striving for working solutions that are widely applicable and can be deployed at scale. We thus name our group Scalable Trustworthy AI. For impact at scale, we commit ourselves to a set of guiding principles.

For prospective students: You might be interested in our internal curriculum and guidelines for a PhD program: Principles for a PhD Program.

Members

Seong Joon Oh

Associate Professor

Elisa Nguyen

PhD Student

Arnas Uselis

PhD Student

Stefano Woerner

PhD Student

Ankit Sonthalia

PhD Student

Bryan Truong

PhD Student

Lennart Bramlage

Collaborating PhD Student

Jihyeok Jung

MSc Student

Bora Kargi

MSc Student

Philipp Davydov

MSc Student

Seokwon Jung

MSc Student

Luca Füger

MSc Student

Fabian Morelli

MSc Student

Alumni

Elif Akata

PhD Student

Michael Kirchhof

Collaborating PhD Student

Evgenii Kortukov

MSc Student

Johannes Bertram

Research Assistant

Publications

DISCO: Diversifying Sample Condensation for Efficient Model Evaluation

arXiv

Un-Attributability: Computing Novelty From Retrieval & Semantic Similarity

arXiv

Diffusion Classifiers Understand Compositionality, but Conditions Apply

NeurIPS D&B

On the Rankability of Visual Embeddings

NeurIPS

Overcoming Domain Limitations in Open-vocabulary Segmentation

NeurIPS

Does Data Scaling Lead to Visual Compositional Generalization?

ICML

Do Deep Neural Network Solutions Form a Star Domain?

ICLR

Intermediate Layer Classifiers for OOD Generalization

ICLR

Decoupled Finetuning for Domain Generalizable Semantic Segmentation

ICLR

Are We Done with Object-Centric Learning?

SCSL @ ICLR

DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation

arXiv

Mitigating Shortcut Learning with Diffusion Counterfactuals and Diverse Ensembles

SCSL @ ICLR

Playing repeated games with Large Language Models

Nature Human Behaviour

CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally

arXiv

Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks

NeurIPS D&B (Spotlight)

Studying Large Language Model Behaviors Under Realistic Knowledge Conflicts

CoLM

Towards User-Focused Research in Training Data Attribution for Human-Centered Explainable AI

arXiv

Scalable Ensemble Diversification for OOD Generalization and Detection

arXiv

Pretrained Visual Uncertainties

arXiv

A Bayesian Perspective On Training Data Attribution

NeurIPS

Exploring Practitioner Perspectives On Training Data Attribution Explanations

NeurIPS XAI in Action Workshop

ID and OOD Performance Are Sometimes Inversely Correlated on Real-world Datasets

NeurIPS

URL: A Representation Learning Benchmark for Transferable Uncertainty Estimates

NeurIPS D&B

Neglected Free Lunch -- Learning Image Classifiers Using Annotation Byproducts

ICCV

Scratching Visual Transformer's Back with Uniform Attention

ICCV

Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguous Inputs

ICML

URL: A Representation Learning Benchmark for Transferable Uncertainty Estimates

UAI-EAI Best Student Paper

ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO

ECCV

Dataset Condensation via Efficient Synthetic-Data Parameterization

ICML

Weakly Supervised Semantic Segmentation Using Out-of-Distribution Data

CVPR

Which Shortcut Cues Will DNNs Choose? A Study from the Parameter-Space Perspective

ICLR

Openings

PhD & MSc

  1. Send an email to Seong Joon Oh with your CV and research statement attached
  2. Coffee chat with Seong Joon & lab members to figure out fit
  3. Interview: 60 minutes in total, split into two 30-minute halves.
    • First half: You present your prior work. Applicants typically give a 10-minute talk covering one or two pieces of relevant prior work, which leaves ample time (20 minutes) for discussion.
    • Second half: A freer conversation about future research ideas, where we look for a topic at the intersection of your expertise and our vision and interests. It helps to prepare one slide with a few ideas in that intersection. Available current members join the interview.
  4. Apply to the grad school with Seong Joon Oh’s supervision intent via KAIST Graduate Admissions
  5. Offer

Internship (Pre-MSc)

  1. Send an email to the relevant PhD student (cc: Seong Joon Oh) with your CV and research statement attached
  2. Coffee chat with the PhD student
  3. Interview: Same structure as for MSc and PhD applicants. Available current members join the interview.
  4. Offer