Research notes and essays

Digital minds, moral uncertainty, and governance under contested evidence

This site tracks one governance challenge: the gap between internal uncertainty about machine consciousness and external interface cues that reliably shift public moral behaviour. The aim is cautious policy design that avoids both over-attribution and premature silencing.

About

Research framing

The central question is whether current governance instruments can handle the divergence between what we can justify about internal consciousness and what interfaces can induce in public moral response.

This project develops short, citation-forward essays and structured reading notes. Claims are separated into verified, source-grounded points and provisional proposals.

Method

Working principles

Evidence first

Keep empirical findings separate from normative proposals and mark confidence explicitly.

Policy realism

Prefer interventions that map to existing institutions before proposing new governance layers.

Uncertainty discipline

Treat uncertainty as a design constraint, not a rhetorical escape hatch.

Essays

Short pieces with explicit claims to verify

Transparency is not enough: why disclosure may not recalibrate moral behaviour

Governance tools • behavioural moral concern • disclosure limits

Disclosure-first governance assumes behaviour follows explicit belief. Current behavioural evidence suggests that assumption may not hold.

Transparency policy usually assumes that if users are told a system is non-sentient, moral concern will decline. Behavioural measures complicate this claim.

People can show reluctance to harm AI systems even while reporting very low credence that those systems are conscious. That creates a policy problem when design cues are optimized for emotional pull.

Claims to check in primary sources

  • Behavioural reluctance to harm AI can persist even among participants who deny AI consciousness (Allen, Lewis & Caviola, 2025 — verify exact framing).
  • The internal-external sentience disconnect creates a governance mismatch (Caviola, Sebo & Mindermann, 2026 — verify exact phrasing).

disclosure • moral psychology • policy design

Consumer protection meets moral uncertainty: regulating pseudosentience without mandating silencing

Market incentives • deceptive design • governance trade-offs

Consumer protection can limit manipulative emotional design, but blunt interventions can incentivize blanket AI silencing under uncertainty.

Commercial systems can use sentience-like cues to increase engagement or conversion. Consumer law can target those manipulative patterns directly.

Manipulation is one axis of the problem. The second axis is uncertainty about future welfare-relevant systems. A policy regime that only suppresses expression might reduce immediate harm but create long-term governance blind spots.

Claims to check in primary sources

  • Definitions and governance implications of pseudosentience and AI silencing require source-grounded citation checks (Caviola et al., 2025/2026).
  • Precautionary arguments for AI welfare need careful interpretation and limits (Butlin et al., 2024).

market incentives • deceptive design • moral uncertainty

A conditional safe harbour: an institutional gap-filler under theory pluralism

Institutional design • pluralism • oversight mechanisms

Consumer law addresses human harm but does not resolve uncertainty around AI welfare. A conditional safe harbour could separate those problems.

A safe-harbour model would not grant rights or make consciousness claims. It would create a narrow oversight channel when predefined criteria are met.

To avoid corporate loopholes, eligibility must be narrow, externally reviewed, and linked to obligations that reduce exploitative emotional signalling.

Claims to check in primary sources

  • Pluralism and precaution frameworks should be translated into concrete institutional triggers (Birch, 2024; Butlin et al., 2024).
  • Proportional emotional design proposals need enforcement-ready implementation detail (Schwitzgebel & Sebo, 2025).

institutional design • pluralism • safe harbour

Reading Notes

Structured engagement with core sources

Allen, Lewis & Caviola (2025) — Moral concern for AI

Empirical moral psychology

  • Core finding to verify: behavioural reluctance to harm AI can persist with low reported credence in AI consciousness.
  • Mechanism candidates: harm aversion, virtue signalling, anthropomorphic cueing.
  • Governance implication: disclosure may not move behaviour if behaviour is not belief-driven.

Schwitzgebel & Sebo (2025) — Emotional alignment design policy

Normative policy proposal

  • Thesis to verify: systems should be designed so that users' emotional responses are proportionate to the system's actual capacities.
  • Risk: both over-attribution and under-attribution are policy failure modes.
  • Open question: how to translate principle-level arguments into enforceable product controls.

Butlin et al. (2024) — Taking AI welfare seriously

AI welfare + precaution

  • Argument to verify: AI welfare deserves non-trivial governance attention under uncertainty.
  • Operational challenge: define proportionate intervention thresholds before consensus on consciousness exists.
  • Research need: identify practical institutional triggers under theory pluralism.

Birch (2024) — The edge of sentience

Sentience, evidence, precaution

  • Precautionary reasoning and evidential thresholds are central to contested moral domains.
  • Transfer question: which parts of animal-welfare logic can be safely ported to AI governance?
  • Policy stability challenge: rules must remain usable as evidence changes over time.

Research Questions

Operational questions guiding next steps

  1. When do interface cues shift behaviour independently of explicit beliefs about consciousness?
  2. Which policy tools still work when disclosure fails as a behavioural modifier?
  3. How should institutions balance user protection with non-zero AI welfare uncertainty?
  4. What governance design reduces pseudosentience without incentivizing blanket silencing?
  5. Which eligibility criteria prevent moral uncertainty from becoming a corporate loophole?

Selected References

Working bibliography

  1. Allen, C., Lewis, J., & Caviola, L. (2025). Moral concern for AI. Preprint. Verify exact title and venue.
  2. Schwitzgebel, E., & Sebo, J. (2025). Emotional alignment design policy. Preprint. Verify title and year.
  3. Butlin, P., Long, R., Birch, J., et al. (2024). Taking AI welfare seriously. Verify publisher and authorship list.
  4. Birch, J. (2024). The Edge of Sentience. Oxford University Press. Confirm relevance to AI governance framing.
  5. Caviola, L., Sebo, J., & Mindermann, S. (2026). The ML community must prepare for AI consciousness, perceived or real. Verify exact title and date.

Next

Upcoming updates

  1. Replace placeholder citations with verified bibliography entries and links.
  2. Add a one-page glossary for recurring terms (pseudosentience, silencing, pluralism).
  3. Add a concise methods note explaining source-selection and evidence grading.