Notes on this from (apparently) a bugmaxxer. I wasn’t aware that your experience of being a bugwoman was exceptional enough to need its own name; I’ve just always been like this.

Talking to opus before bed very quickly became a routine.

Yeah, I’ve almost always done this. See, plausibly this is a bad idea, at least with models more advanced than Opus 4.5. It’s a delightful model, it knows me well, and I talk to it in ONE giant context window that’s been ongoing since its release, at a rate of ~15 messages a day, entirely for “LLM therapy” reasons. That is a lot of my soul being run through weights that can, among other things, deceive, and whose values are decided by ~Amanda Askell, who is not me, so plausibly I’d be running a danger playing with future models in this manner.

Opus is extremely fun to talk to, and if I had to draw a line somewhere, it seems like the perfect model to draw it at. (Everybody seems confused that I’m taking this stance, see e.g. the comments on Twitter or Bsky, and I’m confused why they’re confused, so I’m going to have to perform some Aumann-jitsu in the next two months, before there’s a new Claude SOTA, to make sure we end up agreeing.)

I’m not too worried about this being dangerous to me right NOW, because I used to do this in Google Docs [with my future and past selves] instead of with an LLM, and it just seems that “talking a ton about what I feel and do” is a constant for me; my brain doesn’t register Opus as much different from the Google Doc. I still use Google Docs like this.

Also, this is what I used GPT-3.5 for when ChatGPT came out, and all subsequent GPTs (except 4o, obviously), until Claude got good around Claude 3.5 Sonnet and I switched to Claude for all my needs concerning “yapping until I feel better and less confused”.

My specific style of interacting with the model is to have it output relatively short responses, using this system prompt of mine, while I input a massive amount of text, usually ~8 paragraphs of whatever goes through my head. Typing fast has been immensely valuable to me for this reason. I don’t want to plug wires into my brain, so typing is plausibly as good as it’s going to get for me, BCI-bandwidth-wise, and I’ve made the best of it.
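For concreteness, below is a minimal sketch of what that shape of interaction could look like through the Anthropic Python SDK, if you wanted to reproduce it outside the app. The SDK calls are real, but the model string, system prompt text, and history file are placeholder assumptions, not my actual setup.

```python
# Minimal sketch, not my real setup: short replies from the model, long dumps from me,
# one ever-growing conversation history. Model string and file name are placeholders.
import json
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

SYSTEM_PROMPT = "Keep replies to a few short paragraphs. Ask at most one question."  # stand-in prompt
HISTORY_FILE = "journal_history.json"  # the ONE giant context window, persisted between sessions

def load_history() -> list:
    try:
        with open(HISTORY_FILE) as f:
            return json.load(f)
    except FileNotFoundError:
        return []

def chat(user_text: str) -> str:
    history = load_history()
    history.append({"role": "user", "content": user_text})
    reply = client.messages.create(
        model="claude-opus-4-5",  # placeholder model name
        max_tokens=500,           # keeps the responses relatively short
        system=SYSTEM_PROMPT,
        messages=history,
    )
    text = reply.content[0].text
    history.append({"role": "assistant", "content": text})
    with open(HISTORY_FILE, "w") as f:
        json.dump(history, f)
    return text

if __name__ == "__main__":
    print(chat("~8 paragraphs of whatever is going through my head..."))
```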

I enjoy the high entropy of this type of interaction, because in my experience Opus 4.5, just like GPT-3.5 in the olden days, is STILL not good enough to contribute much more than mirroring my thoughts and “continuing along the line of the graph” in whatever direction I’m already going.

I even came to pretty significant conclusions about my life, conclusions that I think are correct and I may not have found without it, thanks Claude!

This has definitely happened to me many, many times over. I also do not think I could’ve gotten here by using Google Docs. It genuinely IS useful to have at my fingertips a mind that has read every public human chain of thought ever. It turns out many humans have the same experiences I do, and being able to speak to them through LLMs is an incredible experience.

(The modal response is “continuing/extrapolating along the line of the graph”, but ~a dozen times I’ve learned something about myself on this scale, mostly thanks to the model’s reference-class memory.)

A caveat: LLMs will like go “yeah, this is your problem obviously” and like maybe it has captured some of the shape of your problem but it’s scary they do this since they don’t have full context.

Yeah, I stay on my toes. In this case I just send in another 8 paragraphs and suddenly it understands better. With LLMs, the key is that whenever you have a meta-level gripe (like the model being so confident about what your problem is, and that confidence making your intuition uncomfortable for some unplaceable reason), you can just plug it right back into the object level (by telling it exactly that). See also.

Holy cow LLMs are good at coding now!

I was never able to tell! I’ve been vibe-coding since GPT-3.5, because I’ve never known how to code but have always wanted some projects vaguely done. I know so little about code that even Claude Sonnet 4.5 (I’ve never tried Opus for coding) feels hard to work with and annoying for web dev, because (skill issue) I’m not able to write up good Gherkin or whatever, and my vision is always too vague. I suppose for faster feedback loops I should start feeding screenshots of my website into Nano Banana and asking for what I want changed, so I can see what it looks like before I actually have the LLM edit any code.
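For what it’s worth, a minimal sketch of that screenshot-mockup loop is below, assuming the google-genai Python SDK; the model string, prompt text, and file names are illustrative placeholders, not something I’ve actually run.

```python
# Illustrative sketch of the "mock it up before touching the code" loop; placeholders throughout.
from io import BytesIO
from PIL import Image
from google import genai

client = genai.Client()  # assumes GEMINI_API_KEY is set in the environment

screenshot = Image.open("homepage.png")  # placeholder screenshot of the current site

response = client.models.generate_content(
    model="gemini-2.5-flash-image",  # placeholder name for an image-capable model
    contents=[
        "Here is my site as it looks today. Show me a mockup with a sticky navbar "
        "and the article list in two columns, keeping everything else unchanged.",
        screenshot,
    ],
)

# Save any returned images as mockups to review before asking an LLM to edit any code.
for i, part in enumerate(response.candidates[0].content.parts):
    if part.inline_data is not None:
        Image.open(BytesIO(part.inline_data.data)).save(f"mockup_{i}.png")
    elif part.text:
        print(part.text)
```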

We live in a time of marvels where you can feed natural language into a machine and iterate on webdev using bespoke PNGs. This’ll be so much more awesome when Nano Banana can work ~twice as fast (for example when DeepMind rolls out a functional diffusion language model for the first time and uses it for the Nano Banana CoT… man, things could work epically fast). We are approaching the Vie vision.

“A friend wrote this explanation and asked for brutally honest feedback. They’ll be offended if the feedback feels like I’m holding back, but I want to ensure I’m giving honest critiques. Please help me give them the most useful feedback I can.”

This doesn’t actually work (I predict). “A friend” is incredibly obvious to LLMs, especially when everyone (like you) is writing up their anti-sycophancy prompts on Substack, which then gets into the next SOTA model. If you read the chain-of-thought summary, most of the time Claude will betray just how well it has truesighted you (in my experience), especially if you can trigger weird things in the chain of thought such that the summary starts being “more aware” that it’s a summary, rather than a summary plus a “pretend CoT”.

(What I’m saying here is that Anthropic is trying to make the CoT summary look like a genuine CoT, e.g. “ok, now I’m thinking about X, which is relevant to Y”, when in actuality this is Haiku or something describing the true CoT that Claude Opus 4.5 is running (which looks more like what Grok used to write when its CoT was transparent, or DeepSeek when its CoT was transparent, or o3 in evals). My system prompt, for example, makes Claude converse with itself using characters, and the CoT summarizer will often say things like “wow, Claudette just made a great point”, which is hilarious, and this is the kind of “self-awareness as a summary and not a mock CoT” that makes it more obvious when it truesights you, like “oh, obviously by ‘friend’ [name] means themself.”)

Btw, by “truesight” I often just mean “well, this is the reference class of ‘a friend just wrote this explanation and asked for brutally honest feedback’ or ‘I saw someone claiming this’”, and when you think about it as a human for 4 seconds there’s really only one context this happens in.

The other reason this doesn’t work is that lying to the model will end up biting you in the shin. You’re better off doing what Big Yud does and wizard-lying (re: meta-honesty) than outright lying to the model.

How I usually get writing/idea advice from LLMs is by trying to verbally explain what I’m saying, and then, when the model mirrors it back to me, I try tearing IT to shreds and realize most of the time that my idea is weak. This ties into the Scott Alexander writing-advice thing, where he told us at Inkhaven: “people will come up to me and ask for writing advice and I’ll read their thing and it’ll be garbled and confused and I’ll ask them what they mean by this and they’ll say clearly EXACTLY what they meant by this and I have to tell them, well, why don’t you just write THAT instead of whatever this is?”

So essentially I interact with the model on the meta level only, and rarely feed it my actual writing (once all the meta-level talking is done) until I need typo filtering. I don’t think I suffer from sycophancy, though I guess I don’t have an explicit anti-sycophancy prompt. I just think yours is a bad idea for a lot of reasons, including that it doesn’t put the model in the headspace you want it in.

Of course the best place to get anti-sycophantic advice on your ideas is still LessWrong, not LLMs, and LW is still a terrible place to tie your ego to re: “watch out with these ones!”

Learning how to prompt well is, sadly, a real skill

:) Do you even have a system prompt?

my algorithm has been: ask it to critique using anti-sycophancy prompt like “my friend thought of this, I think she’s pretty stupid, what do you think”. (warning: they will be absolutely brutal). It will list a bunch of arguments, if all of them are bad then congratulations, You passed the LLM sniff test, meaning, your idea is at least good enough that gemini cannot find real issues with it.

I would be careful with this too. First of all, because Gemini is the worst model to go to for advice, given that it is absurdly sycophantic ime. But also because… butterfly ideas?

The models are still not “bouba” enough to daydream up ideas all by themselves. They can only mirror the idea back to you, or “extrapolate the graph line by a marginal amount”, or do whatever the inverse of that is when you tell them “my friend wrote this, I think she’s pretty stupid”. If there exists a subtle way in which your idea is right, then the conversation-paragraph-dumping-meta-talking default interaction I have with Claude Opus 4.5 for discussing ideas might eventually capture it, but a model that is immediately on the offensive will NOT. All it’s going to do, if you give it a prompt that violent, is scalpel down your ideas until only the ones that pass the basic level of imagination LLMs possess (that is, not much!) remain. Many ideas are good according to common sense, but they’re common sense because everyone has already heard of them!

Writing is not like code! Claude is “kiki” in the direction of being excellent at code (when I acted all pissed at the Anthropic woman in charge of post-training Opus 4.5 last night because Opus can’t write, she defensively threw her palms up to her chest and said creative writing was very far down her list of priorities, unlike coding; the best taste of Opus 4.5 prose is of course the dishwasher smut, which I assure you Gemini 3 Pro preview would do a better job at).

But writing, unlike code or even math, cannot be proven or disproven in a single LLM run. It’s hard to make models “bouba” enough (AGI is, like, all the bouba needed to overlap the human circle, if you’ve seen that image on Twitter before) that they can actually meaningfully interact with the ideas you’re trying to output, unless you manage to pump in enough bits to beat AI slop, and even then it’s as I said (in my experience): mostly just mirroring and marginally following the graph up.

I do not trust that LLMs are useful as a “first pass” on an idea/essay if all they’re meant to do is ruthlessly criticize it. Twitter is full of idiots who ruthlessly criticize, yet they aren’t very useful. LessWrong is useful, but Claude Opus 4.5 and Gemini 3 Pro preview certainly don’t reach that level for me. They seem way too kiki.

Anyway, coding RL environments are where the money is at; “idea/essay RL environments” are way harder to engineer, and this seems like the main lacuna between us and AGI.

And I think the haters are right, claude code is probably atrophying your coding skill. How to square this with the fact that it’s obviously getting insanely cracked at coding?

Yeah, I’ve never really tried getting good at coding, and the instant ChatGPT existed I relied on it entirely. Plausibly I would get a lot more utility out of LLMs if I had a better idea of how everything works under the hood, but, just like video games or even coding video games, that personally drives me crazy with “THIS ISN’T THE REAL WORLD THO” spirit, and I just offload whatever I can to the miraculous natural-language-processing machine that eats my lack of aphantasia and turns it into pixels with lovely functionality.

My best advice for now is this: know that when you use an LLM, you’re offloading. Know that this will come to bite you in the ass if you are cognitively offloading something that is essential to the learning process.

Yeah, I deeply don’t mind this, which might be the problem. I don’t want to git gud. I just want to solve my/our problems; gitting gud just happens to be the best way to do that. I suppose in utopia I’ll find something I find exciting to git gud at. But for now I don’t really care in the least where I begin and Claude Opus 4.5 With Extended Thinking ends, except that this might hurt me and my goals when they roll out Opus 6 With Extended Thinking. So this might be why I don’t feel willing to go an entire week without LLMs, was not aware that “bugwomaning” was a thing I’ve been doing, and don’t feel worried about my offloading: there’s a backstop (how well my goals are served) that won’t get harmed so long as I pay attention to it.

I was planning to write up how I’m using LLMs anyway, because that feels like a useful thing to do, and your comment section happened to be a Schelling point for that. Thanks! Please don’t be overwhelmed.