Frequently Asked Questions
-
Our mission is to ensure human flourishing post-AGI.
We define human flourishing in terms of whether people are able to live meaningful lives.
Our understanding of meaning, values, wisdom and flourishing has been shaped by a decade of research into relevant work in philosophy, sociology and economics.
-
We are not in the business of prescribing what it means to live a meaningful life. Moreover, deciding what's meaningful for others prevents them from exercising agency and developing their own values – things many people find essential for a meaningful life.
Instead, we want to help build AI systems that empower that agency and self-definition.
We believe there is a connection between meaning, values and wisdom, which we elaborate on below.
Our understanding of this draws from philosophers like Charles Taylor, Ruth Chang and David Velleman, and economists like Amartya Sen.
-
The word “values” can mean many things. We think of values, alongside scholars like Charles Taylor, as considerations in choice that demarcate meaningful choices from logistical ones. Conversely, a meaningful choice is one that expresses our values.
Another way to say this is that some of our choices are movements towards a way of living that is intrinsically meaningful to us. We call this kind of value a source of meaning. We use this terminology to separate this kind of “value” from other things commonly referred to as values – such as norms and ideological commitments – defined as such:
A norm is an idea about how to live that’s dictated by a social environment. There are norms around professionalism, politeness and gender, for example.
An ideological commitment is an idea about how to live that you want to convince others of – for example, environmentalism or polyamory.
An internalized norm is an idea about how to live that was dictated to you by a previous social environment, and which you have since internalized, such as being a “responsible older sister” or a “tough street kid”.
Norms are not bad – they keep social groups together. Ideally, however, norms around us should be in service of our “sources of meaning”. When this is not the case, we feel oppressed by our surroundings, forced to comply with a way of life that doesn’t lead to our own flourishing.
Our conception of “sources of meaning” also differs from goals and preferences.
A goal is something you want to accomplish but would rather be done with already if you could, such as passing a test or becoming a homeowner. Sources of meaning, on the other hand, are intrinsic motivations we want to be present for, like learning about a field we're interested in, or creating a stable environment for our family to grow in.
A preference is an evaluation with no normative value. Your preference for a certain color of jeans, for example, says little about who you aspire to be. Your sources of meaning do – they have normative value and say something about how you want to live, such as aspiring to a certain kind of authenticity.
A preference can be underpinned by a source of meaning – for example, your preference for blue jeans might be an expression of living with a certain attention to aesthetics. The preference could also be underpinned by an addiction, a desire to comply with a norm, or another motivation.
We have spent years developing robust methods for eliciting people's sources of meaning and separating them from norms, ideological commitments, preferences and goals.
-
Things commonly thought of as “bad values” – Nazism, sadism, consumerism – are not sources of meaning but ideological commitments. In our processes, such articulations are filtered out in favor of the underlying sources of meaning, if any exist.
Where such containers do contain some meaningful way of living, the source of meaning tends to be quite different from the container itself. In the case of Nazism, for instance, people may experience a sense of meaningful solidarity in pushing a cause shunned by the masses. That sense of solidarity could be served equally well, or better, by more prosocial containers.
For more on the difference between sources of meaning and containers, see here.
-
In our research, we capture sources of meaning by finding paths of attention that feel meaningful to attend to. This results in a data object we call a “values card”.
These “values cards” have guided design at major companies and, in a project with OpenAI, were used to find underlying values endorsed by both Republicans and Democrats on controversial topics like abortion.
For more detail, please read our paper.
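As an illustration, here is a minimal sketch of what a “values card” could look like as a data object. The field names and the example card are hypothetical, not the exact schema from our paper; the point is that a card names a source of meaning and lists the paths of attention that feel meaningful to attend to in a given kind of situation.

```python
from dataclasses import dataclass, field

@dataclass
class ValuesCard:
    """Hypothetical sketch of a 'values card'.

    A card names a source of meaning and lists the paths of attention
    that feel meaningful to attend to in a given kind of situation.
    """
    title: str                                   # short name for the value
    context: str                                 # the kind of situation it applies to
    paths_of_attention: list[str] = field(default_factory=list)

# Illustrative example only – not a card from our dataset.
card = ValuesCard(
    title="Holistic listening",
    context="When someone I care about faces a hard choice",
    paths_of_attention=[
        "moments where the person relaxes and opens up",
        "signs that I'm rushing to give advice instead of listening",
        "values the person already holds that could guide them",
    ],
)
```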
-
We think of wisdom as knowing which value (or source of meaning) to live by in a particular context. For example, when advising a Christian girl considering an abortion, is it better to listen attentively, or to help her identify trusted mentors who could guide her?
We are building on a theory of moral learning fleshed out by philosophers like Charles Taylor and Ruth Chang. In short, we think values have a developmental structure. We operate by a set of values, and when we realize that set is incomplete for a novel situation, we “grapple” with the situation, arriving at a new value we can understand as a “gain in wisdom”, because it fixes a kind of problem with our previous set of values.
What’s wise is closely related to what’s meaningful. Living at the edge of our wisdom, by the values we think best constitute a good life, tends to feel meaningful.
For more on this, see our paper.
-
We believe wisdom is dispersed amongst us, rather than concentrated in a small elite. People who have lived through unique life experiences, forcing them to grapple with unique moral situations, are likely to have the wisest values for those contexts. And there are many incommensurable wise ways of living, each depending on the subject and context.
Rather than averaging out a set of values meant to represent humanity, we are trying to detect the “frontier of wisdom”. We do this by democratically building a moral graph – values and contexts linked by the wisdom gained in transitioning between values – to collectively determine the wisest values for each situation.
You can read the report on our work on moral graphs and democratic fine-tuning here.
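To illustrate the structure (not our actual pipeline), here is a minimal sketch of a moral graph in Python. Values are nodes; an edge records that participants who once lived by one value came to see another as wiser in a given context. The names, fields and the toy frontier-selection rule below are hypothetical simplifications; the report linked above describes the real process.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WisdomEdge:
    """An edge in the moral graph: participants who once lived by
    `from_value` came to see `to_value` as wiser in `context`."""
    from_value: str
    to_value: str
    context: str
    endorsements: int  # how many participants endorsed this transition

def wisest_values(edges: list[WisdomEdge], context: str,
                  min_endorsements: int = 1) -> set[str]:
    """Toy rule for the 'frontier of wisdom' in a context: values that
    well-endorsed transitions point to, but never away from."""
    relevant = [e for e in edges
                if e.context == context and e.endorsements >= min_endorsements]
    superseded = {e.from_value for e in relevant}
    endorsed = {e.to_value for e in relevant}
    return endorsed - superseded

# Hypothetical usage with toy data:
edges = [
    WisdomEdge("Pushing my advice", "Holistic listening",
               context="Advising someone facing a hard choice", endorsements=42),
    WisdomEdge("Holistic listening", "Helping them find trusted mentors",
               context="Advising someone facing a hard choice", endorsements=17),
]
print(wisest_values(edges, "Advising someone facing a hard choice"))
# -> {'Helping them find trusted mentors'}
```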
-
We believe it is possible to build ASW (Artificial Superwisdom) – systems wiser than any one person alive, able to find win-win solutions we didn't know existed. Building such systems may even be necessary to face the challenges of the 21st century.
Our work on moral graphs could, in theory, be scaled to superwisdom – systems that develop new values, as we do, through “moral self-play”.
How to do this in practice is an open research question.
-
We don't know yet, but it is possible. We would first need a good conception of superwisdom to verify this. It may also be that, just as current LLMs have a good map of human wisdom from reading massive amounts of text but still need RLHF to act on it, a superintelligence would need direction to act on its latent wisdom.
-
We think of alignment more broadly than this. Traditionally, alignment has been defined as aligning AI with instructions (operator intent). We believe that even if that succeeds, we are still likely to get a bad future due to coordination problems and win-lose game dynamics.
For more on our thinking about this, see here.
We also think AI offers the opportunity to create a flourishing future in which people live vastly more meaningful lives than now.
-
We don’t have a great answer to this question yet. Market dynamics will likely favor systems that optimize for individual and superficial preferences.
-
Wise AI
Sourcing values, fine-tuning models on our notion of wisdom and values, researching systems that extrapolate and build their own moral graphs.
Post-AGI futures
Researching LLM/market hybrids, running events where people articulate their values and are introduced to our vision.
-
Together with OpenAI, we created our first moral graph from a representative sample of 500 Americans through a new democratic process we built.
We are currently fine-tuning a model on this moral graph, comparing it to alternatives like preference-based RLHF and Constitutional AI. For more on this project, see our paper.
We have collected thousands of people’s values, and run several events where people articulate their values and are introduced to our vision.
-
Joe has researched the question of “what is worth maximizing” for a decade. He co-founded CHT and the School for Social Design to research and develop the values-elicitation methodology we build upon.
For our notion of values and moral learning, we build on a lineage of late-20th-century philosophers, sociologists and economists, including Charles Taylor, David Velleman, Ruth Chang, Martha Nussbaum, and Amartya Sen.
-
We’re currently looking for AI researchers, community builders and funders. We’re also looking to build connections within major LLM labs.
You can reach us at hello@meaningalignment.org.
You can also donate directly to us here.