Model Integrity
We propose ‘model integrity’ as an overlooked challenge in aligning LLM agents.
New Paper: “What are human values, and how do we align AI to them?”
In this paper, we clarify what is meant by human values and how AI can be aligned to them.
OpenAI x DFT: The First Moral Graph
Beyond Constitutional AI: our first trial with 500 Americans, and how democratic processes can generate an LLM we can trust.
Introducing Democratic Fine-Tuning
An alternative to Constitutional AI and simple RLHF-based approaches: fine-tuning LLMs on moral information from diverse populations.