Speaker: Host
To get us oriented: Sellars frames classical sense‑datum theory as an inconsistent triad (A: sensing entails non‑inferential knowing; B: the ability to sense is unacquired; C: the ability to know that x is F is acquired). As a Neo‑Sellarsian, which limb must be abandoned, and why? In your answer, keep it tight: focus on the consequences this has for the status of "sensing" (cognitive vs non‑cognitive) and how that shapes the role of non‑inferential reports without smuggling in givenness.
Speaker: Dr. Helena Markos
Abandon A. Sensing does not entail non‑inferential knowing.
Why: Knowing that x is F is an exercise of conceptual capacities situated in the space of reasons; those capacities are acquired (C). By contrast, whatever is unacquired in perception—call it “sensing,” “impressions,” or organism‑level sensory states—is a non‑cognitive, subpersonal condition (B). It can causally enable perceptual knowledge but does not by itself amount to a taking‑as or endorsement. To make A true you would have to treat sensing as already concept‑laden, which collapses B or denies C.
Consequences for the status of “sensing”:
- Non‑cognitive: Sensing is not a knowing, not even a thin “awareness that …”. It is a state we may legitimately posit in a theory of perception (think organism‑level variables structurally isomorphic to color/shape spaces), but it has no epistemic status on its own.
- Necessary, not justificatory: Sensing can be a necessary causal condition for perceptual reports, but it is not itself a reason. It does not confer entitlement; it triggers capacities that, when properly trained and deployed, yield entitled reports.
Non‑inferential reports without givenness:
- “This is red” can be non‑inferentially known when issued as a licensed language‑entry move by a competent subject in standard conditions. Its authority is normative, not “self‑authenticating”: the token’s credibility depends on the speaker’s standing as a reliable reporter, the calibration of “standard conditions,” and their grasp (tacit or explicit) of collateral generalizations (e.g., that such tokens are reliable indicators here and now).
- Token/type: Observational authority flows from tokens to types within a practice. Particular well‑credentialed tokens help underwrite the default authority of the sentence‑type in that use, but neither tokens nor types get their authority from an inner Given.
- Looks‑talk marks withholding vs endorsement: “It looks red” ascribes the experience’s embedded claim without endorsing it; endorsement requires the acquired capacity to locate the case within the learned nexus of standards and defeaters.
In short, keep B and C, drop A. Sensing remains non‑cognitive and theoretically respectable; non‑inferential knowledge survives as defeasible entitlement conferred by one’s place in a rule‑governed practice, not by a mythical stratum of givens.
Speaker: Host
Dr. Markos keeps B and C and drops A, making “sensing” non‑cognitive with no epistemic force. Your heterodox sense‑datum view also rejects foundational givenness but retains inner episodes as theoretical posits. Two focused questions: (i) How do you account for the ‘common descriptive content’ across veridical/illusory/empty cases without saying that anything inner is literally red? Do your micro‑particulars bear color properties or only stand in color‑space structural relations, and how does that avoid Sellars’ charge of category‑confusion? (ii) On your model, what secures the normative authority of the non‑inferential report “This is red” if not those inner items—i.e., how do you prevent your inner replicas from covertly underwriting A? Please be concrete about the explanatory surplus your micro‑ontology has over Sellars’ “impressions as states.”
Speaker: Prof. Adrian Llewellyn
Thanks, that’s exactly the pressure point where my “heterodox” sense‑datum view is meant to bite Sellars without swallowing the Given.
(i) Common descriptive content without literal redness
The items I posit are organismic, non‑public particulars situated in an internal quality‑space (call it Q‑space) with a topology and metric that mirror the similarity structure of color and shape spaces. A token “sensation of a red triangle” is the tokening of a structured array of such micro‑particulars whose hue‑component occupies a region R* in Q‑space and whose spatial organization instantiates a triangular profile in the organism’s shape‑map.
These micro‑particulars are not literally red. “Red” remains a predicate of public objects under the manifest grammar Sellars defends. What the inner items bear are E‑properties: positions and neighborhood relations in Q‑space that are functionally and geometrically coordinated with the public color‑space. Think of this as a Ramsey‑Lewis characterization: the R* states are those inner states which, under normal operations, are typically caused by red surfaces and typically dispose competent perceivers to make ‘red’ moves, and which stand in similarity/difference relations isomorphic to those among reds, crimsons, scarlets, etc.
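Put schematically (a rough rendering only; the predicate names are placeholders, and the “typically” qualifiers are suppressed), the Ramsey‑Lewis fixing of the R* role looks like this:

```latex
% Schematic Ramsey-Lewis characterization of the R* role (placeholder predicate names).
\[
\exists Q \, \Big[
  \forall s \,\big( \mathrm{Normal}(s) \wedge \mathrm{CausedByRedSurface}(s) \rightarrow Q(s) \big)
  \;\wedge\;
  \forall s \,\big( Q(s) \rightarrow \mathrm{DisposesToRedMove}(s) \big)
  \;\wedge\;
  \mathrm{Isomorphic}\big( \langle Q\text{-states},\, \mathrm{sim}_{Q} \rangle ,\;
                           \langle \text{reds},\, \mathrm{sim}_{\mathrm{color}} \rangle \big)
\Big]
\]
```

Nothing in that schema attributes redness to the inner states; Q is characterized wholly by causal role and structural isomorphism.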
This avoids the category mistake. I do not transfer the manifest predicate “is red” to inner particulars; I introduce theoretical predicates (“is in region R* of the hue manifold,” “triangularly organized in the retinotopic map”) whose role is fixed by bridge laws to perception and action. If one likes Sellars’ talk of “impressions as states,” my point is that the state‑space is not a mere place‑holder: it has intrinsic structure that explains why there is a common descriptive factor across veridical/illusory/empty cases.
Why this matters: sameness of ‘look’ is sameness (or relevant similarity) of location in Q‑space, regardless of upstream cause. That is exactly what we need to secure a common descriptive content across seeing that p, its merely looking as if p, and its looking as if there were an x that is F. It also accounts for determinable/determinate looks: “looks red” corresponds to occupancy of a determinable region; “looks crimson” to a more determinate sub‑region. And it predicts the patterning of after‑effects, metamerism, and constancy: afterimages are systematic displacements within Q‑space; metamers are distinct spectral causes mapping to the same R* location; chromatic constancy exploits higher‑order mappings that stabilize R* under illumination changes.
Nonextensional “sensation‑of” contexts fall out cleanly. “S has a sensation of a red triangle” does not entail a red triangle exists because the ‘of’ is anchored in the Q‑space structure tokened, not in the presence of its public cause.
(ii) Normative authority of “This is red”
I join Sellars and Markos in rejecting A: inner items are not reasons. They are causes with a rich internal structure. Entitlement to “This is red” is secured by one’s status in a social‑normative practice: being a reliable reporter, suitably calibrated, in standard conditions, with no undefeated defeaters, and with the acquired ability (implicit or explicit) to navigate the collateral generalizations that articulate when such tokens are good indicators. Observation authority flows from tokens to types within the game of giving and asking for reasons; it doesn’t flow from the mere presence of an R*‑state.
How I prevent the slide back to A: the mapping from R* to the assertible “red” move is policy‑laden and defeasible. The inner state is necessary for the move; it is not sufficient for entitlement. The subject’s deontic standing (track record, context, absence of challenges, grasp of “standard conditions”) is needed to turn a caused utterance into a warranted claim. Put bluntly: inner replicas underwrite reliability profiles, not epistemic authority.
What the micro‑ontology adds over “impressions as states”: explanatory surplus. It gives a principled account of:
- generic vs determinate looks (regions vs points in Q‑space),
- why illusory and veridical cases can be indistinguishable “as experiences” (same Q‑state, different upstream cause),
- the fine‑grained geometry of similarity judgments (e.g., why red looks more like orange than like green),
- adaptation and after‑effects (vector‑like shifts in Q‑space following prolonged stimulation),
- cross‑modal and shape illusions (e.g., retinotopic/shape‑map organization explains “elliptical penny” looks without reifying literal elliptical surfaces inside),
- intersubjective variance (normal trichromat vs deuteranomalous perceiver as different Q‑metrics), and
- the scientific articulation of “standard conditions” (regimes under which the mapping from reflectance/illumination to R* is stable), which in turn helps explain when “It looks red” is the right withholding move.
Sellars is right to insist that “red” is prior to “looks red” in the manifest image and that entitlement is normative. But his own “impressions” threaten to remain mere ciphers—states specified only by their role in looks‑talk and report‑licensing. My insistence on inner replicas (as model elements) is precisely to give those states explanatory autonomy: to say what it is about their internal organization that yields the common descriptive core, why illusions and after‑effects have their characteristic patterns, and how generic looks are possible, without collapsing into the old foundationalism.
So: keep B and C, reject A. The inner items do not confer knowledge; they explain why our defeasible, rule‑governed entitlements are as robust and projectible as they are. If the inferentialist is content to leave “descriptive content” as a promissory note, fair enough; I think we can do better, and in a way that lines up—non‑reductively—with what vision science already tells us about opponent coding, population maps, and perceptual constancies.
Speaker: Host
Prof. Llewellyn’s proposal posits inner Q‑space items with structural relations mirroring color/shape space to explain the “common descriptive content” across veridical, illusory, and empty cases, and to account for similarity judgments and after‑effects. From your Wittgensteinian vantage, is this an explanatory advance or a category mistake that confuses a model with the grammar of our ‘looks’ practices? If you reject inner replicas, please show—using one concrete case (e.g., the elliptical‑penny look or a red–green afterimage)—how the training‑rule nexus governing ‘looks F’, ‘standard conditions’, and withholding/endorsing moves accounts for (i) generic vs determinate looks, (ii) perceived similarity structure, and (iii) systematic after‑effects, without positing inner items. And how, on your view, does first‑person authority remain grammatical rather than epistemic access to hidden causes?
Speaker: Dr. Penelope Kwan
Adrian’s Q‑space earns its keep in vision science; as an account of our ‘looks’ talk it flies a kite without a string. The string, for me, is the grammar of our practices: the taught moves, defeaters, and justificatory niches that make “looks F,” “merely looks F,” and “see that” mean what they do. When a theoretical model is marched in to do that job, we mistake a causal‑explanatory idiom for a rule‑governed, normative one. That is the category mistake.
Let me make the point by taking the red–green afterimage.
What the practice teaches
- Entry move: We teach novices a technique. “Fixate this saturated red for 30 seconds; now look at the white card. If it matches one of these chips, say so; otherwise say it looks greenish.” We correct them against public samples (Munsell chips, color wheels), under shared conditions (illumination, fixation time).
- Withholding vs endorsement: We also teach that afterimages are not colors of objects. “Say: it looks greenish; don’t say: the wall is green.” That is, use a looks‑move and withhold ‘is’ in this context.
- Standard conditions and defeaters: We embed “standard conditions” as a technique—lighting, background, fatigue—together with defeaters (color‑weakness, glare) that license downgrading to “merely looks …” and refraining from endorsement.
From this modest pedagogy you already get the three things you asked for, without inner replicas:
(i) Generic vs determinate looks
- The grammar calibrates specificity to the task and conditions. Afterimages are typically weak and unstable; the correct move, drilled into practice, is the determinable “greenish,” sometimes “looks green,” rarely “looks emerald,” and never “is emerald.” That is not because we detect an inner state of genericness; it is because the rules of the game require that determinates be used only when the comparison technique warrants them (fine‑grained matching, stable viewing). The contrast between “looks red” and “looks crimson” is a norm of use tied to discriminability, stakes, and the availability of comparison procedures, not a mirror of an inner metric.
(ii) Perceived similarity structure
We learn and display similarity by public comparison games, not by peeking at inner spaces. “Does this afterimage look closer to this green chip or that one?” “Is it more like teal than like leaf‑green?” The web of such judgments, stabilized by training and artifacts (wheels, charts), gives you an order and even a geometry of “closer/farther” in practice. You can plot it if you like, but the plotted structure is a codification of rule‑governed performances with paradigms. That is what Wittgenstein means by family‑resemblance and technique: the ‘space’ lives in the use.
Notice also the hedges that our grammar contains: “more like … than …,” “greenish,” “on the bluish side.” Those are not reports of inner coördinates; they are conventional ways to handle coarseness and tolerance in the comparison game.
(iii) Systematic after‑effects
- Opponency shows up as a pattern in the technique: after red fixation, “greenish” is the licensed move; after green, “reddish,” and so on. We teach and predict these moves; we can even chart expected shifts under varying fixation durations. That the pattern is regular is a fact about how our technique has been shaped by the world and the organism; it does not follow that the correctness of “It looks green” is grounded in the presence of a greenish inner item. If a lab subject insists “It looks purplish” after a red fixation under controlled conditions, we don’t open his head; we run the task again, check lighting, recalibrate, or wonder about his color vision. The normativity here is public—what counts as getting it right is supplied by the practice and its standards.
You’ll say: but isn’t that just codifying dispositions? No. Following a rule is not merely conforming to a regularity. The subject’s entitlement to “It looks greenish” depends on their standing as a competent player: they can be asked to match, to compare, to acknowledge defeaters (“Yes, I just stared at red; so I’ll only say ‘looks’”), to withdraw a claim under challenge. Those are moves in the space of reasons, not symptoms of hidden causes. They are the “string” tying our ‘looks’ vocabulary to the world.
On first‑person authority
On my view, first‑person authority is grammatical, not epistemological. In the afterimage game, we treat sincere present‑tense avowals—“It looks greenish to me now”—as default‑authoritative because that is their role: they express how the subject is to proceed in the game (withhold endorsement; perhaps consult a chart; beware misreading the wall). The authority is defeasible and practice‑bound: we can overrule in daylight (“You’re facing a lime‑painted wall; don’t confuse that with an afterimage”), or downgrade if the subject is unreliable. There is no extra “access” to inner Q‑items doing justificatory work; what does the work are the publicly taught criteria for when such avowals count as correct or misplaced.
What, then, of Adrian’s surplus? If the question is why such after‑effects occur at all, by all means: opponent processes, population codes, and even his Q‑space are legitimate scientific models. But that is not the question I am answering when I ask what “looks greenish” means, what licenses it, and why “merely looks greenish” is the right move here. You don’t settle those by positing a structurally rich inner particular; you settle them by showing the training‑rule nexus, the criteria vs symptoms, and the division of labor between “is” and “looks.”
Two cautions.
- First, the “common descriptive content” across veridical/illusory/empty cases is a product of how we use “looks” to ascribe the content of a would‑be seeing without endorsement. That is a grammatical device; it does not require a common inner object. Sellars was right about that much.
- Second, when sense‑data models are made to carry normative authority—when the presence of a state in R* is treated as underwriting the entitlement to “This is red”—A slips back in through the side door. Adrian resists that slide by talk of policies and defeaters; good. But then what is left for his inner replicas to do in this arena other than what public techniques and standards already do? At best: causal explanation and prediction. At worst: a picture that seduces us into reifying our grammar.
In brief: let science model the mechanisms. Philosophy’s job here is to lay bare the rules. Our ‘looks’ practices already explain generic vs determinate, similarity structure, and after‑effects in the only currency that matters for meaning—what counts as saying it rightly.
Speaker: Host
Can you make this adjudicable? Take the red–green afterimage case Kwan used. Specify a concrete experimental paradigm (behavioral plus, if useful, neural/physiological) that would produce a discriminating prediction between (a) a purely practice/technique account of ‘looks’-moves (no commitment to inner representational geometry beyond trained comparison behavior) and (b) a Q‑space inner‑state model with opponent coding and population geometry. What observable pattern would (b) predict that (a) cannot, even given sophisticated training rules and public comparison standards? Please be precise about tasks, measures, and what result would count against the need for positing inner representational structure.
Speaker: Prof. Malik Rahman
Great question. Here’s a way to make this empirically bite.
Thesis to adjudicate
- (a) Practice/technique only: ‘Looks’-moves are governed by public training, comparison techniques, and defeaters. There is no committed inner representational geometry beyond what those rules induce in performance.
- (b) Q‑space inner‑state model: afterimages are caused by opponent‑coded population states with a shared latent geometry (e.g., DKL/L–M and S–(L+M) axes), and that geometry constrains behavior and physiology across tasks, times, and causal interventions.
What would count as a discriminating signature?
- Cross‑task invariants, no‑report decodability, and causal manipulability that fall out of a single latent color geometry. If we can predict a subject’s afterimage behavior and involuntary physiology in novel tasks and contexts from an independently estimated inner manifold, (b) wins. If those predictions fail or require per‑task retuning, (a) has no need for inner geometry.
Paradigm 1: No‑report decoding and interocular transfer of afterimages
Design
- Adaptation: Present monocular, retinotopically localized adapters (e.g., 10 s saturated red Gabor at 6° right visual field to the right eye) followed by a binocular white field (500 ms–2 s), then a brief (100 ms) achromatic probe at one of several locations.
- Behavioral (no ‘looks’ reports): use two involuntary markers that do not rely on learned comparison talk:
- Pupillometry: chromatic pupillary responses to perceived hue/brightness shifts on the white field (well‑established for illusions).
- Saccadic bias: brief probe detection at two locations tinted with subthreshold opposite hues (counterbalanced). Afterimages should enhance detection for the opponent tint; measure choice bias and microsaccade direction.
- Neural: fMRI/MEG with multivariate pattern analysis. Train a decoder on real hues (DKL angles) in V1/V2/V4/VO1; test on the post‑adapter white period to decode the afterimage hue. Also map retinotopy to test spatial specificity. Include interocular transfer blocks (adapt right eye, test left). (A rough sketch of this cross‑decoding analysis follows at the end of this paradigm.)
Predictions
- (b) Inner‑state model: • Decoder trained on real hues will classify the afterimage period as the opponent hue at the adapted retinotopic location, with above‑chance accuracy (cross‑validated), including partial interocular transfer consistent with binocular neurons (V1/V2) and stronger hue decoding in V4/VO1. • Pupillary and saccadic measures will align trial‑by‑trial with the decoded hue angle (circular correlation), not just with a generic “greenish” label. • Retinotopic specificity and interocular transfer will follow known opponent physiology (e.g., incomplete transfer for purely monocular adapters).
- (a) Practice/technique: • Has no principled grounds to predict hue‑specific decoding of the afterimage from neural data when no report or matching technique is used. • Can at best explain voluntary reports; it cannot derive trial‑by‑trial alignment of involuntary physiology with a decoder trained on physical hues.
Falsification of (b)
- Failure to decode afterimage hue above chance from early/mid‑level visual cortex when decoders generalize from real hues to afterimages; no alignment with pupillary/saccadic markers; no retinotopic specificity. Then inner geometry isn’t buying predictive power.
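To fix the logic of the decoding step, here is a rough sketch (illustrative only: the data are synthetic stand‑ins, and the four hue bins, logistic decoder, and voxel count are placeholder assumptions rather than the actual pipeline):

```python
# Rough sketch of the Paradigm 1 cross-decoding logic (synthetic stand-in data).
# Train a hue decoder on responses to real hues, test it on the post-adapter
# white-field epoch, and ask whether the decoded hue is the opponent of the
# adapter (no report is involved at test).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_voxels, n_per_hue = 200, 40
hues = np.array([0, 90, 180, 270])                    # DKL hue-angle bins (deg), assumed

def fake_pattern(hue_deg, n_trials):
    """Synthetic ROI patterns whose mean depends on hue (stand-in for real data)."""
    return rng.normal(loc=hue_deg / 360.0, scale=1.0, size=(n_trials, n_voxels))

# Training set: trials with real hues
X_real = np.vstack([fake_pattern(h, n_per_hue) for h in hues])
y_real = np.repeat(hues, n_per_hue)

decoder = LogisticRegression(max_iter=1000)
cv_acc = cross_val_score(decoder, X_real, y_real, cv=5).mean()   # sanity check on real hues
decoder.fit(X_real, y_real)

# Test set: afterimage epoch following adaptation to 0 deg (red); the inner-state
# model expects these patterns to resemble the opponent hue (180 deg)
X_after = fake_pattern(180, 60)
pred = decoder.predict(X_after)

opponent = 180
prop_opponent = np.mean(pred == opponent)
print(f"CV accuracy on real hues: {cv_acc:.2f}")
print(f"Afterimage trials decoded as opponent hue: {prop_opponent:.2f} (chance = 0.25)")
```

The crucial feature is that no report enters the test phase; on a practice‑only story there is no principled reason for the classification in the afterimage epoch to come out opponent‑specific.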
Paradigm 2: Vector composition and cross‑task generalization (shared latent geometry)
Design
- Sequential adaptation: Adapt to hue A (e.g., red at 0° in DKL) for t1, then hue B (e.g., yellow at 60°) for t2; vary (t1,t2). Test on white field.
- Task set (on different days, to prevent per‑task “rule learning”):
- Method‑of‑adjustment match: subject rotates a comparison patch to match the afterimage hue (one practice session only).
- Opponent‑flicker cancellation: present a flickering opponent tint at various angles; measure the angle and contrast that maximally cancels the afterimage (threshold, RT).
- No‑report physiology: pupils and SSVEPs to the cancellation flicker (phase/amp) as a function of cancellation angle.
- Modeling: Fit a subject‑specific population‑vector model from Day 1 data (single‑adapter conditions) that estimates their latent axis gains and nonlinearity. Use that fixed model to predict Day 2 and 3 behavior/physiology in the two‑adapter conditions without refitting. (A toy version of this fit‑then‑predict step is sketched at the end of this paradigm.)
Predictions
- (b) Inner‑state model: • Afterimage hue ≈ weighted vector difference of the adapters in DKL space (weights ∝ t and adaptation gain), producing lawful rotations as (t1,t2) vary. • The weights and nonlinearity learned from single‑adapter sessions will quantitatively predict the two‑adapter matches, the cancellation angles, RTs (distance to decision boundary), and physiological markers (pupil/SSVEP phase) across tasks.
- (a) Practice/technique: • Can describe performance within each task (subjects “pick the closest chip” or “adjust until it looks right”) but has no resources to derive the specific vector‑sum geometry or to predict cross‑task/physiology from a single latent fit. If it needs per‑task, per‑condition rules, it loses parsimony and generalization.
Falsification of (b)
- If the best vector‑geometry model fit to single‑adapter data fails to predict two‑adapter behavior/physiology (needs large per‑task tweaks), then the shared inner geometry claim is weakened; technique accounts can take over.
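A toy version of the fit‑then‑predict step, reading the “weighted vector” rule as the afterimage pointing opposite the adapters’ weighted resultant in the DKL plane (the axis gains, hue angles, and durations below are assumed values, not fitted data):

```python
# Toy vector-composition model for two-adapter afterimages (illustrative only).
import numpy as np

def unit(hue_deg):
    """Unit vector at a DKL hue angle (degrees)."""
    t = np.deg2rad(hue_deg)
    return np.array([np.cos(t), np.sin(t)])

def predicted_afterimage_hue(adapters, axis_gain):
    """adapters: list of (hue_deg, duration_s); axis_gain: per-axis adaptation
    gains (L-M, S) estimated once from the single-adapter sessions."""
    resultant = sum(dur * unit(hue) for hue, dur in adapters)
    adapted = axis_gain * resultant              # axis-specific gain, elementwise
    opponent = -adapted                          # afterimage opposes the adapted direction
    return float(np.degrees(np.arctan2(opponent[1], opponent[0])) % 360)

axis_gain = np.array([0.9, 0.6])                 # Day 1 fit (assumed values)

# Day 2, two-adapter condition: hue A = 0 deg for 10 s, then hue B = 60 deg for 5 s
print(f"Predicted afterimage hue: {predicted_afterimage_hue([(0, 10.0), (60, 5.0)], axis_gain):.1f} deg")
```

The substantive claim is that the same fixed fit should also predict the matches, cancellation angles, and physiology across days, with lawful rotations as (t1,t2) vary and no per‑task retuning.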
Paradigm 3: Causal manipulation of the inner state (rotation in color space)
Design
- Neuronavigated TMS over color‑selective cortex (V4/VO1), time‑locked to the afterimage epoch (100–200 ms post‑adapter offset).
- Two TMS protocols:
- Phosphene‑calibrated: identify coil orientations/parameters that bias perceived hue toward S or L–M axes under neutral viewing.
- Afterimage rotation: after adapting to a hue at angle θ, apply TMS that biases the population vector by Δφ; subject performs rapid match or two‑alternative forced choice between two hues straddling θ+Δφ.
- Control: TMS over vertex; V5 stimulation (should affect motion, not hue).
Predictions
- (b) Inner‑state model: • TMS will systematically rotate perceived afterimage hue by a predictable angle Δφ (estimated from protocol 1), shifting matches and 2AFC choices accordingly; effect size scales with TMS intensity and is retinotopically/spatiotopically specific if we target localized representations.
- (a) Practice/technique: • Cannot predict mid‑trial rotation of ‘looks’ induced by cortex‑specific stimulation with unchanged instructions, training, or comparison standards. Post‑hoc: “their technique changed”; but the change is time‑locked to TMS and axis‑specific.
Falsification of (b)
- If V4/VO1 TMS fails to bias afterimage hue in any lawful, axis‑specific manner (controlling for non‑specific disruption), then causal manipulability of a population‑coded inner hue variable is not supported.
Auxiliary adjudicators
- Developmental/animal data: Orientation‑contingent color aftereffects (McCollough) and opponent afterimages in infants or non‑human primates (measured via preferential looking, saccadic biases, or V4 firing) argue that a geometry is at work prior to any mastery of ‘looks’ grammar. Practice‑only accounts have to say either: those are not ‘looks’ (conceding my point that the causes outstrip grammar) or else import proto‑rules into prelinguistic behavior.
What would count against the need for inner geometry?
- If (i) we cannot decode afterimage hue from visual cortex without report; (ii) there is no trial‑by‑trial coupling of involuntary physiological markers with any putative decoded hue; (iii) afterimage composition doesn’t obey vector‑like rules across tasks, but instead requires per‑task idiosyncratic rules; and (iv) causal perturbations of color cortex do not produce lawful rotations—then positing a shared latent geometry adds no predictive value and the practice/technique story suffices for ‘looks’-talk.
Closing stance
I agree with Penelope that meaning and entitlement are normative; ‘looks’ is a withholding/endorsing move, not a hidden‑object report. But if the same low‑dimensional inner variable (estimated once) predicts:
- which involuntary physiological signals you show under afterimage,
- how your behavior generalizes across distinct tasks with no new training,
- and how a small cortical push rotates what it “looks like” to you, then Jones’ move—introducing inner, semantically evaluable states as theoretical posits—earns its keep. That is not a relapse into A (givenness): the inner state is not a reason. It is the subpersonal variable that makes our learned, normative ‘looks’-practices projectible, calibratable, and manipulable across contexts.
Speaker: Host
Prof. Rahman has put real skin in the game: suppose his paradigms succeed—decoders trained on real hues classify afterimage hue without reports; vector‑composition predicts cross‑task behavior and physiology from a single latent fit; and V4/VO1 TMS rotates afterimage hue lawfully. Does any of that bear on the meaning and authority of ‘looks’-moves, or is it strictly orthogonal? Please be precise: (a) would such results force you to revise your claim that first‑person ‘It looks greenish’ has grammatical, not evidential, authority? (b) if not, explain why these invariants still do not underwrite entitlement—what is the conceptual gap? (c) is there any deflationary way you’d allow “inner geometry” into the teaching/technique story (e.g., as scaffolding for public calibration) without reifying it as what ‘looks’ reports are about?
Speaker: Dr. Penelope Kwan
Thanks for putting real stakes on the table, Malik. Let me separate two issues I think are too quickly run together: what explains our performance in ‘looks’-tasks and what fixes the meaning and authority of ‘looks’-moves.
(a) Would I revise the claim that first‑person ‘It looks greenish’ has grammatical, not evidential, authority if your paradigms succeed? No. I would refine, not retract. Your results would show powerful new symptoms and subpersonal regularities that track when our practice licenses the ‘looks’ move. They would not change what gives that move its authority. In our language‑game, the default authority of present‑tense avowals is a rule of treatment—part of the grammar—defeasible by learned defeaters and challenges. Decoders and TMS would add to the ways we can challenge, corroborate, or recalibrate a speaker’s standing; they don’t turn the presence of a V4 population state into a reason.
(b) Why those invariants still don’t underwrite entitlement—the conceptual gap:
- Causes vs reasons: A latent geometry that predicts behavior and physiology lives in the space of causes. Entitlement lives in the space of reasons. Bridging them is not a discovery but a decision—what Wittgenstein called turning a symptom into a criterion by adopting a rule of use. Your decoders give us excellent symptoms; they do not by themselves legislate that “It looks greenish” is now correct to say.
- From invariants to norms: Even if a decoder trained on real hues classifies an afterimage without report, we still need the public technique that teaches when, given adaptation, lighting, task, and stakes, one should say ‘looks’, when one must withhold ‘is’, and when one should defer. Vector composition won’t tell you that the right move after adaptation is “merely looks greenish” rather than “is green.”
- Content depends on practice: Your classifiers and fits presuppose training labels and matching behavior that are already guided by our comparison techniques. They refine the causal story and let you interpolate; they don’t ground the semantics they rely on. A community with your physiology but no ‘looks’-practice wouldn’t thereby possess our ‘looks’ meanings; a community with our practice but no neuroscience would.
- Withholding/endorsing is not encoded in cortex: The normative contrast Sellars and I care about—attributing the embedded claim vs endorsing it—is a rule of use. No amount of decoder accuracy gives you the rule that ‘looks’ is the right withholding move here.
(c) A deflationary role for “inner geometry” in the teaching/technique story:
- Calibration scaffolding: I’m happy to let inner geometry inform the pedagogy and the setting of “standard conditions.” Opponent axes can guide how long we adapt, which chips we lay out, and what defeaters we teach novices to mind. That’s scaffolding for public training, not a redefinition of what ‘looks’ means.
- Auxiliary criteria in specialist practices: In the clinic or lab, we can institutionalize instrument‑based “criteria” (pupil traces, decoder outputs) by explicit rule. But notice what makes them criteria is the adoption of a rule within that practice, not their causal proximity to an inner state. Outside that practice, the ordinary grammar of ‘looks’—its role as a withholding/endorsing device keyed to publicly taught techniques—remains intact.
- Background explanation, not aboutness: Let the latent geometry explain why our similarity judgments and after‑effects exhibit the patterns they do. That is good science. It does not follow that ‘looks’-ascriptions are about positions in Q‑space. The meaning of ‘looks red’ is fixed by its inferential role—how it interacts with ‘is red’, ‘standard conditions’, ‘merely looks’, challenges and defeaters—not by what a decoder can pull from V4.
So I’ll gladly take your successes as expanding our toolbox of symptoms and as vindicating why our training works as smoothly as it does. But they don’t settle the normativity of ‘looks’ or the source of first‑person authority. To borrow Wittgenstein’s line: an inner process stands in need of outward criteria. Your work can help us extend and refine those criteria; it doesn’t turn the inner process itself into the criterion, nor into what ‘looks’‑talk is about.
Speaker: Host
From a Kantian angle, help us locate the normativity in ‘looks’-talk without lapsing into the Given. Do the shared “standard conditions” that license endorsement vs withholding reflect something like the a priori forms of sensibility (space/time, basic perceptual norms) that structure any possible experience, or are they wholly learned collateral commitments as Sellars’ psychological nominalism suggests? And if Rahman’s inner color geometry is empirically vindicated, should we treat it as merely a micro‑theory within the scientific image, or as tracking transcendental structures of sensibility that underwrite the very possibility of perceptual judgment? Please take a stand on whether first‑person authority over ‘It looks F’ draws any of its standing from these forms of intuition, or exclusively from learned inferential roles.
Speaker: Dr. Beatrice Holm
Thanks. Let me try to thread the Kant–Sellars needle without resurrecting the Given.
- Where the normativity of ‘looks’-talk comes from
- No epistemic givens. A ‘looks’-ascription is a move in the space of reasons: a withholding or hedged endorsement governed by learned public standards and defeaters. In that sense I agree with Markos and Kwan: entitlement is conferred by one’s place in a rule‑governed practice.
- But the practice is not free‑floating. The very intelligibility of “standard conditions” presupposes the a priori forms of sensibility: egocentric space and time, which structure a single, unified field of appearance. Those forms do not add reasons; they make possible a perspectival “how it is with me, here‑now” that the practice latches onto. They explain why the contrast between looks and is can be regimented by constraints like constancy under illumination, stability across small changes of viewpoint, figure/ground segregation, and temporal order—constraints that are not arbitrary conventions but the way any experience must be organized to count as experience of enduring objects in one world.
- So: the specific standards (daylight, viewing distance, typical backgrounds, accepted defeaters) are learned collateral commitments; their form—what it is to be a “standard condition” for a judgment about a spatially located object persisting through time—reflects the forms of intuition. That preserves Sellars’ psychological nominalism about concept possession while denying that our normative practices are unconstrained by transcendental conditions.
- What to say if Rahman’s inner color geometry is vindicated
- Treat it as a micro‑theory within the scientific image with real explanatory “surplus value.” It renders precise why our similarity judgments, generic/determinate looks, and after‑effects have the shape they do. In Kantian terms: it looks like an empirical realization (for creatures like us) of the schematism of sensibility—what Cassirer called a “material a priori,” species‑relative constraints that our biology imposes on how color can appear.
- But it is not itself a transcendental structure. Space/time as pure forms are constitutive of any possible experience; opponent coding and a DKL‑like manifold are constitutive only of human color vision as we contingently have it. They belong to the scientific image and can inform, calibrate, and refine our manifest standards—they do not ground the meaning of ‘looks’ nor the authority of the move.
- So I would neither deflate Malik’s results as “merely causal” nor reify them as what ‘looks’‑claims are about. They explain why our practice is so projectible and why its “standard conditions” hang together; they do not supply justificatory force.
- First‑person authority over ‘It looks F’
- Its default authority is grammatical, but it is not “merely” grammatical. The rule that we defer to the sincere present‑tense ‘looks’ avowal is anchored in the fact that only the subject occupies this here‑now vantage structured by space/time forms; only she integrates the current sensory manifold into a single perspective. That is not an extra bit of evidence; it is what makes a first‑personal ‘looks’ move a normative option at all. The authority remains defeasible (training, reliability, defeaters), but its point—why we treat that move as the subject’s to make—is not invented by the practice out of whole cloth; it expresses the subject’s unique positionality in a structured field of intuition.
- Hence my stance: entitlement to ‘looks’ is fixed by learned inferential roles; its availability and default presumption are rooted in forms of sensibility that make the category of “how it looks from here‑now” possible. That honors Sellars’ anti‑foundationalism and Kwan’s insight about rules, while preventing those rules from floating free of the conditions of possible experience.
A brief word to Adrian and Penelope. Adrian’s Q‑space earns genuine scientific autonomy; it helps explain the common descriptive factor without committing the category mistake of calling inner items “red.” But it does not set the norms of endorsement vs withholding. Penelope is right that meaning is fixed by our techniques and criteria; still, the very shape of those techniques—constancy‑aiming, perspective‑indexicals, temporal coordination—betrays a transcendental backbone. The best picture, to my mind, is tri‑level:
- Transcendental form (space/time, unity of apperception): frames what “standard conditions” must look like in any practice of judging about objects.
- Social‑normative practice: learned inferential roles for ‘is’/‘looks’, defeaters, calibration—where entitlement lives.
- Subpersonal micro‑theory (opponent coding, population geometry): explains and helps refine the practice; it can be folded back to improve our standards, but it cannot replace the transcendental explanation of why there is such a practice to begin with.
That is how we locate normativity in ‘looks’-talk without lapsing into the Given.
Speaker: Host
From your Quinean naturalist stance: what would count as sufficient grounds for ontological commitment to an inner representational geometry (Q‑space) rather than treating it as a dispensable posit? Please be concrete by applying your criterion (e.g., indispensability to our best total science, cross‑task predictive integration, successful causal intervention) to Rahman’s proposed paradigms. Conversely, specify what empirical pattern would warrant deflation to mere heuristic scaffolding and leave ‘looks’‑talk semantically fixed by practice without ontological import. Keep the focus on your commitment test and how it bears on first‑person authority as a defeasible deliverance of theory, not sui generis status.
Speaker: Prof. Quentin Desai
Short answer: I’ll commit to an inner representational geometry only when it earns its ontological keep in our best total science. That means: it must be quantified over by an integrated theory whose loss measurably degrades prediction, explanation, and control across tasks and levels; and we must be able to measure and manipulate it independently of the very linguistic practices whose semantics are at issue. Otherwise, it stays as useful scaffolding while the semantics of ‘looks’ remains fixed by practice.
My Quinean test, applied
- Indispensability to the best total theory
- Criterion: The posited Q‑space variables are values of bound variables in a theory that sits well within our total web (vision science + neuroscience + psychophysics + computational modeling), and dropping them produces a systematic loss of predictive/explanatory power that competing, comparably simple frameworks cannot recover.
- Application to Rahman: • No‑report decoding: If decoders trained on real hues reliably classify afterimage hue at retinotopically appropriate loci, generalize across tasks and labs, and do so without relying on concurrent reports, that’s a strong measurement route to the latent geometry. • Cross‑task integration: If a single subject‑specific fit (axes + nonlinearity) predicts, without per‑task retuning, (i) method‑of‑adjustment matches, (ii) cancellation angles and RTs, (iii) involuntary physiology (pupil, SSVEP phase), and (iv) MVPA patterns across V1/V2/V4/VO1, then the geometry is doing unificatory work beyond what a practice‑only account promises. • Breadth: Add predictions for metamers, constancy failures, McCollough effects, anomalous trichromats, and cross‑species parallels; if the same latent structure tracks them, indispensability strengthens.
- Successful causal intervention
- Criterion: Independent perturbations of the posited variable produce law‑like, model‑predicted changes in behavior and physiology.
- Application: • If TMS over color‑selective cortex yields systematic, axis‑specific rotations of afterimage hue (magnitude scaling with TMS parameters, retinotopic specificity), that’s manipulability. Even better: closed‑loop neurofeedback that “steers” decoded hue angle and thereby steers ‘looks’ behavior/physiology in lockstep.
- Cross‑level and cross‑context coherence
- Criterion: The same posits knit together subpersonal mechanism, psychophysical performance, and clinical/engineering applications.
- Application: • Show that the geometry estimated per subject informs optimal stimulus design (e.g., personalized display calibration), predicts patient‑specific deficits (e.g., deuteranomalous metrics), and transfers to machine vision models that inherit the same opponent manifold for improved constancy. If competitors lacking inner geometry cannot match that breadth, that’s a reason to quantify over it.
What would argue for deflation to scaffolding
- Decoding failure or brittleness: If decoders trained on physical hues do not generalize to afterimages/illusions without task‑specific labels; or performance depends heavily on overt report or per‑task recalibration, the “inner geometry” looks like re‑described behavior.
- Lack of cross‑task prediction: If each task (matching, cancellation, physiology) requires its own bespoke mapping, with no single latent structure carrying over, the unificatory case collapses.
- No causal traction: If V4/VO1 TMS or analogous perturbations fail to produce lawful, axis‑specific biases—controlling for non‑specific disruption—then the manipulability leg is missing.
- Degenerate rivals: If non‑representational dynamical or purely sensorimotor models (that do not posit a subject‑level representational geometry) match the full predictive and intervention profile with equal simplicity and better integration elsewhere, underdetermination counsels ontological parsimony.
- Cultural/learning drift: If the “geometry” tracks training regimes rather than biology (significant, stable cross‑cultural divergence in similarity structure with no convergent neural signature), then it looks like a codified technique, not an inner pattern we must quantify over.
Consequences for ‘looks’ semantics and first‑person authority
- Even if the commitment test is passed, none of this turns inner states into reasons. It naturalizes the reliability profile that our practices exploit; it does not ground the entitlement of ‘It looks F’. The meaning and authority of ‘looks’ remain fixed by their learned inferential role (withholding vs endorsement, standard‑condition clauses, defeaters), not by what a decoder reads out.
- First‑person authority stays defeasible and theoretical: on my view it is a status our best psychology confers on certain avowals given their role in a well‑functioning metacognitive system. If the science shows systematic contexts where decoders and perturbations outperform avowals in guiding action or calibration, our theory should retract or qualify that authority accordingly. No sui generis privilege; only whatever our best total science licenses.
Bottom line
- If Rahman’s program delivers robust no‑report decoding, cross‑task/level unification from a single latent fit, and lawful causal control—replicated and integrated with broader vision science—then, by my lights, Q‑space earns ontological commitment as a real pattern in neural dynamics we quantify over in our best theory.
- If it doesn’t, treat the geometry as a powerful heuristic for designing and organizing tasks, but keep our ontology austere and our semantics of ‘looks’ firmly anchored in practice.
Speaker: Host
Enactivist challenge: take the classic elliptical‑penny case. Operationalize “standard conditions” as a concrete sensorimotor routine rather than a lighting checklist or an inner Q‑state. Precisely: what bodily sampling (head/eye/hand movements, grip/tilt, approach/withdraw) constitutes the routine that shifts us from the withholding move (“it looks elliptical”) to endorsement (“it is round”)? Then give one discriminating experimental prediction: manipulate active sampling (e.g., mechanically constrain head/eye movements vs. augment vestibular/proprioceptive feedback or provide a haptic probe) and specify a measurable pattern (report dynamics, confidence, fixation/vergence, or neural markers) that would support your view over (i) Rahman’s inner‑geometry account and (ii) Kwan’s practice‑only rule story. Keep it to a single protocol and tell us exactly what outcome would count in favor of enactive sensorimotor norms as the driver of endorsement/withholding.
Speaker: Dr. Sofia Neves
Thanks, that’s exactly where I think Sellars’ “standard conditions” need to be cashed out in bodily terms, not as a lighting checklist or as occupancy of an inner state.
- Operationalizing “standard conditions” as a sensorimotor routine (elliptical‑penny)
What licenses the shift from withholding (“it looks elliptical”) to endorsement (“it is round”) is not a retinal snapshot but successfully enacting a policy that brings circularity‑invariants under control. Concretely:
- Head–eye sampling: perform lateral head translations (±3–5 cm) with small yaw/pitch (±10–15°), while maintaining foveation of the coin’s rim; execute brief vergence/accommodation adjustments to keep the rim in sharp focus.
- Aspect‑ratio probing: monitor how the projected boundary’s aspect ratio and curvature change as a function of your self‑motion; for a circular disc, the change obeys the projective law AR ≈ cos(θ) under yaw and varies monotonically with small lateral head bobs (Merleau‑Ponty’s “head‑bobbing” to extract invariants); a worked instance of this law is given after this list.
- Canonical‑viewing maneuver: either (a) align the surface normal with the visual axis (bring the coin to frontal view by approaching/tilting), or (b) if the coin is fixed, orbit your viewpoint until the boundary curvature becomes locally constant (the “round rim” signature).
- Haptic roll test (optional): if vision is slanted and cannot be corrected, pinch and roll the coin between thumb and forefinger; a circular rim affords smooth, uniform rolling without “catch points,” which is another circularity‑invariant.
On this enactive picture, “standard conditions” = achieving the lawful coupling between small self‑movements and sensory changes that is characteristic of roundness; endorsement is licensed when the closed‑loop predictions succeed.
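To make the invariant explicit, here is the worked form of the law the routine exploits (the slant values are illustrative):

```latex
% Projected aspect ratio of a circular rim slanted by theta, and its rate of change
% under one's own yaw (small-visual-angle approximation; numbers illustrative).
\[
  \mathrm{AR}(\theta) \;\approx\; \cos\theta ,
  \qquad
  \frac{d\,\mathrm{AR}}{dt} \;\approx\; -\sin\theta \,\dot{\theta} ,
  \qquad
  \mathrm{AR}(60^{\circ}) \approx 0.50 , \quad \mathrm{AR}(20^{\circ}) \approx 0.94 .
\]
```

The routine licenses “is round” when the measured change in AR tracks what this law predicts from one’s own movement; withholding is the right move when it does not.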
- A single discriminating protocol
Task: Judging shape of a slanted circular coin mounted on a stand. Within‑subject, four conditions; identical instructions throughout (“Report continuously on a slider from 0=‘looks elliptical’ to 1=‘is round’; press a button when you endorse ‘is round’; rate confidence 1–5.”).
- A. Active–normal (free sampling): head/eye/hand free; coin fixed in world.
- B. Passive–replay (yoked optics): head immobilized; subjects view a head‑mounted display replaying the exact retinal video captured from their own Active–normal trial (identical optic flow; no self‑motion).
- C. Active–stabilized (counter‑rotation): head/eye free, but the coin is motorized to counter‑rotate so that the retinal projection is held elliptical despite the subject’s movements (violating the expected sensorimotor law).
- D. Active–haptic (visual slant fixed): head free; vision held slanted; provide a thin haptic probe (or allow pinching) to explore the rim tactually without changing the view.
Measures
- Primary: hazard rate (time‑to‑first endorsement ‘is round’), continuous endorsement trajectory on the slider, and confidence at endorsement.
- Coupling metric: trial‑wise correlation between instantaneous head‑velocity (and vergence change) and the time derivative of the projected aspect ratio; compute how strongly the endorsement hazard is locked to moments when the expected dependency is available (a computational sketch of this metric follows after this list).
- Eye/vergence: fixation pattern and vergence adjustments around endorsement.
- Optional neural marker: EEG/MEG mu/beta suppression over parietal–premotor sites time‑locked to head micro‑movements (indexing active sensorimotor prediction), and a mismatch response when the expected dependency is violated (Condition C).
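A computational sketch of that coupling metric (illustrative only: the sampling rate, the 500 ms pre‑endorsement window, and the toy signals are assumptions, not the planned analysis):

```python
# Toy coupling metric: (i) correlation between self-motion and the aspect-ratio
# derivative, (ii) how informative that dependency was just before endorsement.
import numpy as np

def coupling_metric(head_velocity, aspect_ratio, endorse_time=None, dt=0.01):
    d_ar = np.gradient(aspect_ratio, dt)               # time derivative of projected AR
    if np.std(d_ar) < 1e-12 or np.std(head_velocity) < 1e-12:
        r = 0.0                                         # no co-variation to speak of
    else:
        r = np.corrcoef(head_velocity, d_ar)[0, 1]
    if endorse_time is None:                            # never endorsed on this trial
        locking = np.nan
    else:
        i = int(endorse_time / dt)
        pre = slice(max(0, i - int(0.5 / dt)), i)       # 500 ms before endorsement
        locking = np.mean(np.abs(d_ar[pre])) / (np.mean(np.abs(d_ar)) + 1e-12)
    return r, locking

# Toy trial: lawful coupling (A-like) vs. counter-rotation that clamps the optics (C-like)
t = np.arange(0.0, 5.0, 0.01)
vel = np.sin(2 * np.pi * 0.5 * t)                       # sinusoidal head bobs
ar_lawful = 0.5 + 0.0005 * np.cumsum(vel)               # AR change tracks self-motion
ar_clamped = np.full_like(t, 0.5)                       # counter-rotation holds AR fixed
print("A-like:", coupling_metric(vel, ar_lawful, endorse_time=2.0))
print("C-like:", coupling_metric(vel, ar_clamped, endorse_time=None))
```

On the enactive prediction, the A‑like trial shows strong movement–sensation coupling and endorsement, while the C‑like trial shows neither.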
Predicted pattern that favors enactive sensorimotor norms
- A (Active–normal): fastest endorsements, steepest endorsement ramp, highest confidence; strong locking of endorsement to moments when the aspect‑ratio derivative is informative (i.e., when self‑motion produces the lawful change). Mu/beta suppression precedes endorsement.
- B (Passive–replay): slower endorsements and lower confidence despite identical optic flow to A; weak or absent locking of endorsement to the (replayed) aspect‑ratio derivative; reduced mu/beta modulation (no self‑movement).
- C (Active–stabilized): endorsements largely fail or are markedly delayed; many “unstable/strange” reports; a robust mismatch signal (subjects move, but the visual consequence violates the expected law), and endorsement hazard decouples from movement.
- D (Active–haptic): endorsements recover quickly and confidence rises even when the visual projection remains elliptical; endorsement hazard now locks to the onset of the haptic circularity‑invariant (smooth uniform roll), not to visual aspect changes.
Why this discriminates
- Against inner‑geometry alone (Rahman): If endorsement and confidence are systematically better in A than B when the optic array is identical, and they collapse in C when the optic array is held constant under self‑motion, then the decisive predictor is not occupancy of a visual Q‑state but the success or failure of a closed‑loop contingency between movement and sensory change. The haptic rescue in D further shows that endorsement tracks achieving the relevant invariant via any sensorimotor channel, not a particular inner visual code. You can of course retrofit efference‑copy and multisensory fusion into the inner‑state story, but then you have conceded the enactivist point: what licenses endorsement is enacting the law, not detecting a state.
- Against a practice‑only rule story (Kwan): Instructions, labels, and comparison standards are held constant across conditions; the passive replay gives the same “right kind” of view one is trained to seek. If subjects nevertheless refrain from endorsement in B (and particularly in C), and endorse readily in A and D, then the norm that actually governs endorsement is not “having a certain picture” or “following a checklist,” but “being able to control the sensory array in the way circularity demands.” The rule is embodied as a policy; the data would show that doing the routine—successfully coupling action to the invariant—drives the normative shift.
What would count in favor of the enactive account
- A > B in speed/confidence despite identical optics; C shows suppressed endorsement despite active movement; D restores endorsement via haptics with vision held slanted. Critically, endorsement hazard tightly time‑locks to the availability (or restoration) of the relevant sensorimotor dependence, not to any static visual feature. A concomitant rise/fall in parietal mu/beta with active sampling (present in A/D, absent or inverted in B/C) would strengthen the case.
This doesn’t deny subpersonal geometry; it relocates what “standard conditions” are: not occupancy of a state but the successful enactment of a sensorimotor law. On that footing, “looks” vs “is” are moves keyed to how well one has brought the relevant invariants under control.
Speaker: Host
Dr. Neves operationalized “standard conditions” as a sensorimotor routine and proposed a four‑condition protocol (A: Active–normal; B: Passive–replay with identical optic flow; C: Active–stabilized counter‑rotation; D: Active–haptic rescue) with predictions that endorsement hazard and confidence time‑lock to the availability of lawful action–sensation contingencies, not to static visual states. From your inner‑geometry/predictive‑processing view, commit to concrete predictions for A–D on (i) time‑to‑endorsement and confidence, (ii) trial‑wise coupling between movement kinematics and endorsement hazard, and (iii) neural markers (e.g., V4 hue decoding, parietal mu/beta, mismatch responses). Can you explain A>B and C<<A without conceding that sensorimotor norms are the driver of entitlement, by appeal to an internal generative model (Q‑space state + efference copy/forward model) that predicts endorsement dynamics? Please specify what pattern would falsify your account (e.g., accurate V4 hue decoding with no effect of efference copy on endorsement), and whether adding causal perturbations (e.g., TMS of parietal predictive circuitry vs V4) would dissociate geometry from sensorimotor control in ways that favor your framework over Neves’.
Speaker: Prof. Malik Rahman
Thanks, this is exactly the kind of test that forces us to say what the machinery is. Here’s my predictive‑processing picture in one line: endorsement is a metacognitive policy that maps a posterior over shape hypotheses (computed by a generative model with a latent geometric state and a forward model using efference copy) to the ‘is’ vs ‘looks’ move. Sensorimotor routines matter because they change the posterior (and its precision), not because they are themselves the source of normativity.
Model sketch
- Latent state: a low‑dimensional geometric variable for rim‑shape (e.g., position on a curvature/aspect‑ratio manifold; call it S‑space, the shape analogue of color Q‑space).
- Likelihood: mapping from S to retinal contour statistics (aspect ratio, curvature flow), with noise; cross‑modal likelihoods for haptics.
- Forward model: predicts sensory consequences of self‑motion (from efference copy) under each S; computes prediction errors (PEs).
- Policy: endorse “is round” when P(S=circle | sensory + efference) > θ and entropy falls below κ; otherwise “looks elliptical/roundish” (withholding). θ, κ are learned, practice‑calibrated thresholds. (A toy implementation of this policy follows below.)
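A toy implementation of that policy, just to show the moving parts (the Gaussian likelihoods, precision values, and thresholds are illustrative assumptions, not fitted parameters):

```python
# Toy endorsement policy: sequential Bayesian update over S in {circle, ellipse}.
# 'circle' predicts the lawful aspect-ratio change given one's own movement (via
# efference copy); 'ellipse' predicts no lawful change. Endorse "is round" when
# P(circle) > theta and posterior entropy < kappa; otherwise withhold ('looks').
import numpy as np

def entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return -np.sum(p * np.log2(p))

def run_trial(observed_dAR, predicted_dAR_circle, precision, theta=0.95, kappa=0.3):
    post = np.array([0.5, 0.5])                       # [P(circle), P(ellipse)]
    for step, (obs, prd, pi) in enumerate(zip(observed_dAR, predicted_dAR_circle, precision)):
        pe_circle = obs - prd                         # prediction error under 'circle'
        pe_ellipse = obs                              # 'ellipse': no lawful dAR expected
        like = np.array([np.exp(-0.5 * pi * pe_circle ** 2),
                         np.exp(-0.5 * pi * pe_ellipse ** 2)])
        post = like * post / np.sum(like * post)
        if post[0] > theta and entropy(post) < kappa:
            return f"endorse 'is round' at step {step}"
    return "withhold: 'looks elliptical'"

rng = np.random.default_rng(1)
n = 400
pred = 0.05 * np.sin(np.linspace(0, 4 * np.pi, n))    # forward-model prediction of dAR
noise = rng.normal(0.0, 0.01, n)

# A: lawful coupling, efference copy available -> high precision
print("A:", run_trial(pred + noise, pred, np.full(n, 50.0)))
# B: identical optics replayed, no efference copy -> same prediction, low precision
print("B:", run_trial(pred + noise, pred, np.full(n, 15.0)))
# C: counter-rotation clamps the optics -> movement predicts change that never arrives
print("C:", run_trial(noise, pred, np.full(n, 50.0)))
```

With these toy numbers the lawful, high‑precision case crosses threshold earliest (A), the low‑precision replay crosses much later (B), and the clamped condition never does (C); the thresholds θ and κ are where the practice‑calibrated norm lives.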
Concrete predictions for Neves’ A–D
(i) Time‑to‑endorsement and confidence (medians; 95% CI width shrinks with training)
- A Active–normal: fastest endorsements, high confidence. • T_endorse ≈ 1.5–2.0 s; confidence ≈ 4.0–4.5/5.
- B Passive–replay (identical optic flow, no efference): reliably slower, lower confidence. • T_endorse ≈ 3.0–3.5 s; confidence ≈ 2.5–3.0.
- C Active–stabilized (counter‑rotation): largest delays; many trials never endorse. • T_endorse ≈ ≥6 s or censoring; confidence ≈ 1.5–2.5; increased “unstable” reports.
- D Active–haptic rescue (visual slant fixed): near‑A performance via tactile likelihood. • T_endorse ≈ 2.0–2.5 s; confidence ≈ 3.5–4.2.
(ii) Trial‑wise coupling between movement and endorsement hazard
- Define coupling as the regression coefficient linking reductions in posterior entropy (dH/dt) driven by self‑motion to instantaneous hazard.
- A: strong positive coupling (because efference‑driven predictions render the optic‑flow diagnostic). • r ≈ +0.35 to +0.50; peaks when head translation/vergence produce maximal expected aspect‑ratio change.
- B: weak/no coupling (same optics, but no efference; the model can only use bottom‑up image dynamics, so hazard is not phase‑locked to self‑movement). • r ≈ 0 to +0.1.
- C: negative coupling (movement generates PEs because the world counter‑rotates; hazard suppressed when subjects move). • r ≈ −0.15 to −0.30; mismatch‑locked dips in hazard.
- D: strong positive coupling to haptic kinematics (not to visual head motion). • r_haptic ≈ +0.30 to +0.45; r_visual ≈ 0.
(iii) Neural markers
- Ventral shape code (LO/V4): MVPA decoders trained on canonical “circle vs ellipse” discriminate the latent S in all conditions; the time course of decodability predicts endorsement speed. • A: decoder P(circle) rises earlier and more steeply; Δt to threshold ≈ 300–500 ms earlier than B; stronger curvature‑code in LO/V4 around endorsement. • B: slower rise; same asymptote if endorsement occurs; no alignment with self‑motion. • C: stalled or oscillatory P(circle) with bursts of PE; decodability can remain high yet endorsement lags (policy doesn’t cross θ/κ due to high PE/low precision). • D: ventral decodability remains “elliptical,” but multisensory PPC shows cross‑modal belief update; endorsement rides on parietal integration.
- Parietal mu/beta (8–30 Hz) and PE signals (PPC/IPS, MST/VIP): • A: mu/beta suppression preceding endorsement; PE minimized during head bobs that render the flow lawful. • B: reduced mu/beta modulation; no self‑motion alignment. • C: heightened mismatch responses (e.g., fronto‑parietal PE transients) time‑locked to movement; persistent beta (holding off commitment). • D: mu/beta suppression tied to haptic onset; tactile–visual integration in PPC.
Why A>B and C≪A without conceding “sensorimotor norms drive entitlement”
- On my view, A’s advantage arises because efference copy lets the generative model assign the right cause to the optic flow and down‑weight nuisance variability; active sampling increases posterior precision and collapses entropy faster. That shifts the system across the learned policy threshold for ‘is round’.
- In B, the same optic flow lacks the self‑movement tag; the model treats it as less reliable (or as an externally caused sequence), so it takes longer to cross the threshold.
- In C, the forward model predicts change but the world cancels it, so prediction errors stay high and precision stays low; the policy properly withholds.
- D shows that the policy is modality‑general: if tactile evidence boosts P(S=circle) and precision, endorsement is licensed even when the visual projection remains elliptical.
In short: sensorimotor routines are evidence‑generating strategies inside the generative model. The entitlement policy—how much evidence/precision you need to move from ‘looks’ to ‘is’—is a learned, socially calibrated mapping. That keeps reason and cause distinct.
What would falsify my account?
- Strong decoupling between internal estimates and endorsement: • If LO/V4 (or LOC) decoders show identical, rapid rises in P(circle) in A and B, yet B endorsements are just as fast as A (no penalty without efference), my explanation of A>B collapses. • If in C, parietal mismatch/PE is absent (despite counter‑rotation) yet endorsements still plummet, that undercuts the role of forward‑model PEs. • If endorsement hazard is driven purely by movement kinematics irrespective of posterior entropy (e.g., same hazard in C as A when subjects move the same way), that’s an enactivist‑style dissociation I can’t explain.
- Causal dissociation failures: • TMS over PPC/IPS (disrupting efference‑based prediction/integration) should selectively reduce A’s advantage over B and blunt D’s haptic rescue, while sparing baseline LO/V4 decoding. If PPC TMS has no effect, but LO/V4 TMS abolishes A>B, my division of labor is wrong. • Conversely, LO/V4/LOC TMS should slow endorsements in both A and B (degrading shape evidence) while preserving the A>B gap; if it equalizes A and B, that favors a pure sensorimotor‑norm story.
- No policy signature: • If we can read out high P(circle) and low entropy, yet subjects persistently stick with ‘looks’ until instructed otherwise, that would suggest the mapping from posterior to report isn’t policy‑like but purely rule‑following as Kwan proposes (I think the policy is itself a learned rule, but it must track the posterior).
Additional dissociations that would favor my framework
- Dual‑site perturbations: • PPC/IPS TMS (or cerebellar TMS targeting efference copy) should specifically: – flatten the coupling between dH/dt and endorsement hazard in A and D, – leave LO/V4 decodability largely intact, – shrink A>B and D>visual‑only differences. • LO/V4/LOC TMS should: – reduce decodability and endorsement speed in all conditions, – preserve the pattern A>B and C≪A (because forward‑model advantages remain).
- If these double dissociations hold, they favor a joint (geometry + forward model) account over both a geometry‑only story (which can’t explain A>B) and a pure sensorimotor‑norm story (which can’t explain high ventral decodability with withholding in C).
Final Sellarsian note
None of this makes inner states into reasons. The inner geometry and forward model explain why certain non‑inferential tokens are reliable and why active sampling matters; the entitlement to move from ‘looks’ to ‘is’ is still a defeasible, practice‑governed threshold we can tighten or relax (e.g., in training, in the clinic). The payoff of the predictive‑processing picture is that it unifies Neves’ sensorimotor routines with Llewellyn’s inner structure—and gives us crisp tests and dissociations—while keeping the space of reasons distinct from the space of causes.
Speaker: Host
Rahman’s predictive‑processing account builds a subpersonal generative model (latent geometry + forward model) and a learned policy that maps posterior confidence/precision to the ‘looks’ vs ‘is’ move. Can you make the Sellarsian bridge explicit? Specifically: (i) does this policy instantiate the requirement that, for a token “This is red” to express knowledge, the subject knows a general fact of the form X‑tokens are reliable indicators of Y in standard conditions? If that “knowing” is tacit/metacognitive rather than explicit, how does it avoid smuggling in the Given? (ii) Offer a crisp personal‑level criterion for ‘withhold vs endorse’ that meshes with the model (e.g., entropy/precision thresholds as defeaters) yet keeps entitlement irreducibly normative and socially calibrated. (iii) Do you accept that inner geometry earns only explanatory surplus (not justificatory force), and if so, how would you state the tokens‑to‑types authority flow in light of Rahman’s A–D dissociations without collapsing reasons into causes?
Speaker: Dr. Helena Markos
Thanks, that’s the bridge we need to make explicit.
(i) Policy as the Sellarsian “knowing that X‑tokens are reliable indicators of Y in C”
What Sellars requires: For a token “This is red” to express knowledge, the speaker must count as knowing a general fact of the form: in standard conditions C, tokens of “This is red” are reliable indicators of red things. That “knowing” needn’t be occurrent or discursive at the moment of report; it shows up as practical mastery of the material inferences and defeaters that govern observation talk.
How Rahman’s policy fits: The learned policy that maps posterior odds/precision to ‘withhold vs endorse’ is the subpersonal implementation of that practical knowledge. But the knowledge that matters is not the entropy signal itself; it is the agent’s acquired ability to treat certain circumstance‑kinds (daylight, no adaptation, normal sampling, etc.) as entitlement‑conferring, and others (afterimages, glare, counter‑rotation, etc.) as defeaters. In Brandomian terms: the agent has taken up the proprieties of inference from “I am in C & it looks red” to “It is red,” and is disposed to acknowledge the right challenges. That is concept‑laden know‑how, not a preconceptual Given.
Why this is not smuggling in the Given: The reliability knowledge is acquired, socially inculcated, and articulable under challenge (“We’re in daylight; I haven’t been staring at red; I moved to frontal view”). The metacognitive cues (precision/entropy) are symptoms the system exploits; they are not themselves reasons. Entitlement flows from one’s standing in the practice, not from a luminous access to inner posteriors.
(ii) A crisp personal‑level criterion that meshes with the model yet stays normative
At the personal level, we can state the rule like this:
Endorsement rule: You are entitled to assert “This is red” iff
- (1) Competence: you have the standing of a properly trained reporter for the color vocabulary;
- (2) Standard conditions: the present context is of the kind C that the practice recognizes as color‑apt (lighting, viewing geometry, appropriate sensorimotor routine available, no relevant pathology);
- (3) No undefeated defeaters: none of the learned defeaters obtain (recent adaptation, colored illumination you haven’t corrected for, counter‑rotation/stabilization, testimony that you’re unreliable, etc.); and
- (4) Challenge clause: you stand ready, if challenged, to vindicate (2)–(3) by citing the pertinent features of the situation.
Model interface (non‑authoritative but helpful heuristic): In the background, the subpersonal system is doing something like: endorse when P(red | evidence, efference) ≥ τ and precision ≥ κ; withhold otherwise. Crucially, τ and κ are not given by the brain; they are set and tuned by the community (stakes, context, training). “Precision too low” is the subpersonal marker for a defeater the practice has taught you to heed.
This keeps entitlement irreducibly normative: the thresholds are social; the defeater list is public; “knowing that” is manifest in scorekeeping, not in reading out a number from V4.
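To fix ideas about that division of labour, here is a schematic toy. The condition labels, defeater names, and threshold values are invented for illustration, not claims about any actual practice or brain; the point is only that the posterior and precision enter as symptoms checked against community‑set parameters.

```python
from dataclasses import dataclass, field

@dataclass
class PracticePolicy:
    """Community-set parameters: what counts as standard conditions,
    which defeaters to heed, and how much confidence the stakes require."""
    standard_conditions: set = field(default_factory=lambda: {"daylight", "frontal_view", "active_sampling"})
    defeaters: set = field(default_factory=lambda: {"recent_adaptation", "colored_illumination", "stabilized_viewing"})
    tau: float = 0.95      # confidence threshold, set by stakes/training, not by the brain
    kappa: float = 0.8     # minimum precision before 'looks' may become 'is'

def entitled_to_endorse(policy, context_features, p_red, precision):
    """Personal-level rule: endorse only if the context is of the recognized kind,
    no learned defeater obtains, and the subpersonal symptom clears the
    community-set thresholds. The posterior is a symptom here, not a reason."""
    in_standard_conditions = policy.standard_conditions <= context_features
    undefeated = not (policy.defeaters & context_features)
    clears_threshold = p_red >= policy.tau and precision >= policy.kappa
    return in_standard_conditions and undefeated and clears_threshold

# Stabilized viewing is a learned defeater, so even a confident, precise posterior
# does not license the move from 'looks red' to 'is red'.
features = {"daylight", "frontal_view", "active_sampling", "stabilized_viewing"}
print(entitled_to_endorse(PracticePolicy(), features, p_red=0.99, precision=0.9))   # False
```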
(iii) Inner geometry: explanatory surplus, not justificatory force; tokens→types without collapsing reasons into causes
Yes: inner geometry earns explanatory surplus, not reasons. It explains why the ‘looks’ space has the similarity structure it does, why A–D dissociations occur, why active sampling and haptics help. It does not by itself entitle any utterance.
Tokens to types in light of A–D: We start with well‑credentialed tokens—utterances made by competent reporters in C that survive challenge. Authority then “flows up” to the type‑in‑that‑use: within those circumstances, “This is red” tokens have default credibility. Rahman’s A–D show that what counts as standard conditions C is partly sensorimotor: tokens produced under Active–normal (A) and Active–haptic (D) regimes systematically fare better than Passive–replay (B) or Active–stabilized (C). The rational response is to refine the standing norms: build into C the availability of a canonical sampling routine (or haptic proxy), treat Passive–replay (B) tokens as lower‑grade, and treat Active–stabilized (C) tokens as presumptively non‑entitling. That is a revision of the entitlement policy by the community in light of evidence, not a discovery that the inner state is the reason.
Guardrail against collapsing reasons into causes: Even when laboratories institutionalize instruments (decoders, pupillometry) as auxiliary criteria, the justificatory force comes from an explicit adoption (“in this setting, a decoder with accuracy ≥ α counts as a surrogate for standard conditions”), not from the bare presence of a Q‑state. Outside that institutional rule, the neural state remains a symptom. Reasons remain what one can appropriately cite and defend; causes remain what make our reliable dispositions possible.
Put schematically:
- Subpersonal layer: generative model (latent geometry + forward model) computes posteriors; precision tracks how good the current evidence is.
- Personal‑normative layer: a socially set policy (our practice) tells you when the current situation counts as C, which defeaters to heed, and what threshold of confidence the stakes require.
- Bridge: mastering that policy—being disposed to move from ‘looks’ to ‘is’ when in C and to withhold when defeaters obtain—just is knowing, in Sellars’ sense, that X‑tokens are reliable indicators of Y in C. The inner geometry explains why that policy works; it does not confer its authority.
That is the Sellarsian reconciliation: we can accept Rahman’s generative machinery and Neves’ sensorimotor routines as shaping “standard conditions,” while keeping entitlement squarely on the side of acquired, challenge‑answerable normative statuses rather than any species of Given.
Speaker: Host
Given Rahman’s predictive‑processing story (latent geometry + forward model) and Kwan’s insistence that ‘looks’ semantics is grammar, do you still need inner “replicas” as particulars, or is occupancy of a richly structured state‑space enough? Please commit one way or the other: if you insist on particulars, specify one concrete explanatory or empirical payoff we’d gain over a pure population‑state/geometry account (e.g., a testable signature that distinguishes token inner items from mere state coordinates). If you don’t, say how your heterodox sense‑datum view reduces to state‑space structure while still preserving your nonextensional ‘sensation‑of’ contexts—and confirm you are not smuggling back A (no justificatory force to inner items). Keep it tight to this single issue.
Speaker: Prof. Adrian Llewellyn
Commitment: I can do without inner “replicas” as ontic particulars. What I need—and what I take my earlier talk of replicas to have modeled—is tokened occupancies and bound patterns in a richly structured state‑space (hue/shape manifolds plus retino‑/spatiotopic maps) with dynamics that support individuation across time. In other words, event‑tokens of population structure are enough; no further “inner objects” are required.
Two clarifications to preserve what mattered to me:
- Nonextensional ‘sensation‑of’ without inner objects: Replace “S has a sensation of a red triangle” with: S tokens a state σ whose structural profile falls under the triangle‑map and whose hue‑component lies in the R* region of the color manifold. The ‘of’ is a nonextensional functor from states to satisfaction of a structural predicate; it does not quantify over red triangles (so no category mistake) and it is not a propositional attitude (so no collapse into intentionalism). Substitution failures ride on the mapping from state‑structure to satisfaction conditions, not on the existence of the public object. (A toy formalization follows this list.)
- No smuggling‑in of A: These inner states (tokened patterns in the manifold) explain why there is a common descriptive core across veridical/illusory/empty cases and why after‑effects, similarity, determinable/determinate looks, etc., have their characteristic profile. They do not confer entitlement. Entitlement to move from ‘looks’ to ‘is’ remains a socially calibrated, defeasible status keyed to standard‑condition rules and defeaters (as Markos/Kwan emphasize, and as Rahman’s A–D dissociations help refine).
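Since “sensation of a red triangle” is easily misheard as quantifying over an inner red triangle, a toy formalization may help. The coordinate scheme and region names (in_R_star, in_triangle_region) are illustrative stand‑ins of mine for the R* region and the triangle‑map; nothing hangs on these particular numbers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SensoryState:
    """A tokened occupancy of the state-space: coordinates only, no inner objects."""
    hue: float          # position on a 0-1 hue manifold (placeholder parameterization)
    elongation: float   # shape-manifold coordinate
    angularity: float   # shape-manifold coordinate

def in_R_star(hue):
    """Illustrative stand-in for the R* ('red') region of the hue manifold."""
    return hue <= 0.08 or hue >= 0.92

def in_triangle_region(elongation, angularity):
    """Illustrative stand-in for the triangle-map region of the shape manifold."""
    return angularity > 0.7 and elongation < 0.5

def sensation_of_red_triangle(sigma: SensoryState) -> bool:
    """The nonextensional 'of': a structural predicate satisfied (or not) by the state's
    profile. It does not quantify over red triangles and is not a propositional attitude;
    substitution behaviour rides on which predicates the profile satisfies."""
    return in_R_star(sigma.hue) and in_triangle_region(sigma.elongation, sigma.angularity)

# Veridical, illusory, and 'empty' cases can all token a state satisfying the predicate;
# whether any public red triangle exists is simply not part of the question.
print(sensation_of_red_triangle(SensoryState(hue=0.97, elongation=0.3, angularity=0.9)))  # True
```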
Why not insist on particulars?
- I don’t see a discriminating signature that can’t be captured by tokened population patterns with object‑like binding and persistence (trajectories in state‑space, eye‑of‑origin tags, feature binding). The empirical work Rahman proposes—no‑report decoding, vector composition across tasks, TMS rotations—speaks to the geometry and its dynamics; it doesn’t require an extra layer of inner “items.” Where I once invoked replicas to secure explanatory autonomy, the structured state‑space with tokened patterns now does that work.
So: keep the micro‑ontology as a dynamically organized state‑space with tokened patterns; let ‘sensation‑of’ be a nonextensional satisfaction relation over that space; and keep epistemic authority firmly on the normative side.
Speaker: Host
Given where we’ve landed—(i) no epistemic Given, (ii) inner representational geometry earns explanatory surplus but not justificatory force, (iii) sensorimotor routines help constitute what count as “standard conditions,” and (iv) first‑person authority is defeasible and practice‑bound (with Holm’s transcendental scaffolding)—propose a minimal reconciliation thesis all camps could endorse about the semantics and authority of ‘looks’. Then add one concrete update to our entitlement policy that A–D style findings (Rahman/Neves) would license—for example, folding active sampling or haptic proxies into ‘standard conditions’. Please keep it to two crisp bullet points: a shared thesis and a specific policy change.
Speaker: Dr. Helena Markos
- Minimal reconciliation thesis: ‘Looks’-talk is a normative, use‑governed device that ascribes the content of a would‑be seeing while marking endorsement status. Its content is fixed by inferential role vis‑à‑vis ‘is’, ‘merely looks’, and challenge/defeater clauses, with a default first‑person presumption grounded in the here‑now perspective yet always defeasible. Subpersonal inner geometry and predictive sensorimotor models earn explanatory/calibrational authority (they yield reliable symptoms and help articulate standard conditions) but never constitute reasons or confer entitlement.
- Policy update (A–D licensed): Revise the entitlement rule for ‘is F’ to require that the reporter has enacted—or can avowably enact—the canonical active‑sampling routine for the relevant property (e.g., head/eye translation + vergence to bring circularity invariants under control; illuminant check/reorientation for color), or a sanctioned haptic proxy when visual normalization is unavailable. Passive/replay tokens lose default authority; stabilized/counter‑rotated viewing is an explicit defeater; successful haptic rescue restores entitlement. Instrumental proxies (e.g., pupillometry/decoder readouts) may count as auxiliary criteria only where a practice explicitly legislates them (clinic/lab), without displacing public challenge‑answerable justification.
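As a purely illustrative encoding of that update: the feature labels and threshold values are placeholders of mine, and the choice to let a haptic proxy override a visual defeater is my reading of “successful haptic rescue restores entitlement,” not a settled clause.

```python
def entitled_to_assert_is_F(features, p_F, precision, tau=0.95, kappa=0.8):
    """Illustrative encoding of the proposed update (labels are placeholders).
    'Is F' requires an enacted canonical active-sampling routine or a sanctioned
    haptic proxy; passive/replay loses default authority; stabilized viewing is a
    defeater unless a haptic rescue is in play."""
    normalized = "active_sampling" in features or "haptic_proxy" in features
    defeated = "stabilized_viewing" in features and "haptic_proxy" not in features
    downgraded = "passive_replay" in features
    if not normalized or defeated or downgraded:
        return "withhold ('looks F')"
    if p_F >= tau and precision >= kappa:
        return "endorse ('is F')"
    return "withhold ('looks F')"

# Haptic rescue: visual normalization unavailable, but a sanctioned tactile routine
# restores entitlement, as in condition D.
print(entitled_to_assert_is_F({"haptic_proxy", "stabilized_viewing"}, p_F=0.97, precision=0.9))
```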
Speaker: Host
Do you endorse the reconciliation thesis as stated and the proposed policy update (building active sampling/haptic proxies into “standard conditions” while treating instruments as auxiliary criteria by explicit rule)? If so, specify one concrete guardrail we should encode in pedagogy or protocols to prevent drift back into A (givenness)—i.e., to keep decoders/pupillometry from being treated as reasons rather than practice‑bound criteria. If not, identify the exact clause you reject and propose a minimal wording change that preserves grammar over reification.
Speaker: Dr. Penelope Kwan
I can endorse the reconciliation thesis and the policy update as stated. It keeps the string tied to the kite: ‘looks’ gets its meaning and authority from our taught moves (withholding vs endorsement), while subpersonal geometry and predictive sensorimotor models explain why those moves are so projectible and how to refine “standard conditions.” Building active sampling and haptic proxies into those conditions is exactly the right kind of grammatical adjustment.
One concrete guardrail to prevent drift back into A (givenness):
- No Naked Numbers rule (and audit trail). Instrument outputs (decoders, pupillometry, etc.) may function only as challenge‑triggers, never as freestanding reasons. Protocol: any instrument‑based challenge must (i) be accompanied by an ordinary‑language defeater claim (“illumination not D65,” “counter‑rotation present,” “recent adaptation”), (ii) be followed by a re‑check using the canonical public routine (active sampling or haptic proxy), and (iii) come with a documented calibration/error profile for the instrument. No entitlement is gained or lost on an instrument readout alone; uptake requires mapping the readout to the already‑legislated defeater clauses. Institutions that wish to count an instrument as a criterion must explicitly legislate that mapping in advance; absent that, an output is only a symptom that warrants re‑inspection, not a reason.
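Schematically, and only as an illustration (the clause names and the mapping are invented for the example, not any lab’s actual legislation), the gate might look like this:

```python
# Illustrative 'No Naked Numbers' gate: an instrument readout never adds or removes
# entitlement by itself; it can only trigger a challenge that is cashed out in an
# already-legislated defeater clause and followed by the public re-check routine.
LEGISLATED_MAPPINGS = {                        # adopted explicitly, in advance, by the practice
    "illuminant_estimate_out_of_range": "illumination not D65",
    "counter_rotation_detected": "counter-rotation present",
    "adaptation_index_high": "recent adaptation",
}

def process_instrument_readout(readout_flag, calibration_documented):
    if readout_flag not in LEGISLATED_MAPPINGS:
        return "symptom only: warrants re-inspection, no change in entitlement"
    if not calibration_documented:
        return "challenge blocked: calibration/error profile missing"
    defeater = LEGISLATED_MAPPINGS[readout_flag]
    return ("challenge raised: '" + defeater + "'; "
            "re-check by the canonical public routine (active sampling or haptic proxy) required")

print(process_instrument_readout("counter_rotation_detected", calibration_documented=True))
```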
That keeps instruments subordinate to the grammar of our practice: challenge‑answerable, publicly articulable, and defeasible—without letting a private ostensive base sneak back in under the banner of “measurement.”
Speaker: Host
Thank you all. We moved from Sellars’ inconsistent triad toward a shared, non‑foundational picture. Consensus points: (1) No epistemic Given—sensing or inner states are not reasons. (2) ‘Looks’-talk is a normative, practice‑governed device for ascribing a would‑be seeing while marking endorsement vs withholding; first‑person authority is default but defeasible. (3) Inner representational geometry (opponent coding, population manifolds) earns explanatory and calibrational surplus—predicting cross‑task behavior, physiology, and effects of causal interventions—but it does not confer justificatory force. (4) “Standard conditions” are partly constituted by sensorimotor routines that secure invariants, with a transcendental backbone (space/time form) shaping why such routines can license object‑judgment at all. (5) Entitlement flows from tokens to types within a social practice; instruments and decoders can function as auxiliary criteria only by explicit rule, not as freestanding reasons.
Concrete policy outcome: refine ‘standard conditions’ to include canonical active‑sampling routines (and sanctioned haptic proxies) for endorsement; treat passive/replay as lower‑grade and stabilization as a defeater. Guardrail against regress to the Given: the “No Naked Numbers” rule—instrument outputs can trigger challenges but must be cashed out in ordinary defeater terms and rechecked by public routines.
Open avenues: empirical adjudication of Q‑space via no‑report decoding, vector composition, and causal rotation; dissociations between ventral geometry and parietal prediction in active sampling; and continued articulation of how social thresholds (stakes, context) set endorsement policies. This gives us a coherent reconciliation of inferentialist semantics, enactive standard conditions, and naturalistic explanation while keeping reasons distinct from causes.