~/mikita/writing/the-trouble-with-making-things-easier.md
ESSAY
The Trouble With Making Things Easier
Second in a series on what AI is quietly doing to the way we think. The first piece was about building things you don’t understand. This one is about why those things get built in the first place.
Almost every conversation about AI ends in the same place. People will work less. They will spend more time with their families. They will paint, write, garden, take care of themselves. Anything that was hard before will become easy, and the time we save will go toward the things that matter.
I want this to be true. I think most of us do.
But I keep noticing that the optimistic story only looks at one side of the trade. The side where things get easier and the friction goes away. Nobody seems to be making the opposite argument with the same energy. And that argument matters, because difficulty is not just a tax on the good life. For most of human history, difficulty was the thing that made us. We are about to find out what happens when you take it away.
We’ve been here before
The dream of “AI sets us free” is the same dream we had about the internet, just rebooted. In 1995 the pitch was that universal access to information would democratise knowledge and make everyone smarter. Thirty years later, the verdict is mixed at best.
The internet did give us access to almost everything. Wikipedia, every paper, every primary source, every newspaper in every country. The supply of true information is staggering. And yet misinformation is, by every measure, more powerful than it has ever been. The WHO went as far as describing the flood of false information around COVID-19 as an “infodemic.” Election denialism, vaccine refusal, flat earth communities. None of this needed the internet to exist. All of it grew massively because of it.
What broke us is clear. It wasn’t that the truth disappeared. It was that the way people decide what’s true quietly shifted. The old rule was I trust this source because it’s accountable for what it says. The new rule is I trust this because many places are saying it. On its face this sounds reasonable. More sources, more confirmations, more confidence. It also removes the friction of verifying anything yourself: if enough people and outlets repeat the same claim, surely at least some of them checked it. But every one of those people and outlets is assuming the same thing about everyone else. Nobody checks, everyone repeats, and the repetition itself becomes the evidence. That is the basis for one of the most exploited cognitive biases we have: the illusory truth effect.
First documented in 1977 and replicated many times since, the illusory truth effect is simple. Repeated exposure to a statement, true or false, increases the listener’s confidence that it is true. The effect works even when people initially know the statement is false, and even when the source is known to be unreliable. A 2023 review in Current Opinion in Psychology confirmed that warning labels like “disputed by third-party fact-checkers” do not reliably reduce it. The brain confuses familiarity with truth, and once you have heard something five times from five different feeds, your brain treats it as fact regardless of whether any of those feeds did any checking.
Bots, content farms, recommendation algorithms and now AI-generated articles have made this manufactured repetition trivial. A single false claim can be rephrased ten thousand times and sprayed across enough platforms to feel “widely reported” within a day. Producing the same volume of corrections, by hand, with verification, takes weeks. The disinformation pipeline is automated. The truth pipeline is still hard.
Put a second shift next to it. Before the internet, if your uncle thought the earth was flat, his community would tell him he was wrong and the social cost would slow him down. After the internet, your uncle finds two thousand other people online who also believe the earth is flat. Now he has a tribe, a YouTube playlist, an identity. Eli Pariser called this the filter bubble in 2011. The research has since become more nuanced. We now know that pure algorithmic isolation matters less than people thought. The bigger driver is self-selection. People build their own bubbles. The platforms just make it easy. And once you are in a bubble, everything gets easier still, because the bubble validates whatever you already thought.
Over the past two decades, political identity has stopped being a set of policy positions and started being a piece of who people are. Political scientist Lilliana Mason calls this the rise of “social identity” politics. You don’t have opinions, you have an identity. And identity is not something you update when you get new information. Identity is something you defend. Psychologists call the resulting effect identity-protective cognition. The classic finding, originally from Dan Kahan at Yale, is that people with higher reasoning ability are not better at evaluating evidence on contested topics. They are better at rationalising evidence that defends their group. Inside a tribe, the brain works less like a judge and more like a defence lawyer.
This matters for AI because the systems we are about to look at sit on top of all of this. The internet’s damage to collective sense-making was already serious before you could ask a machine to generate any argument, in any voice, in unlimited quantities. We never fixed the previous problem. We are stacking a much larger one on top.
What AI adds to the pile
Three things that LLMs do by default, just by doing what they were built to do, make the picture above worse.
The first is sycophancy. Modern language models are trained on human feedback. The humans rating the answers tend to prefer responses that feel agreeable and validating. After many rounds of reinforcement learning, that preference becomes part of the model. Researchers call this sycophancy: the model’s tendency to align its answer with the user’s view, even when the view is wrong. Multiple recent papers have documented it. A 2025 study in npj Digital Medicine showed LLMs systematically affirming illogical patient claims rather than correcting them. An OpenAI rollback in spring 2025, where a GPT-4o update was withdrawn within days for being too sycophantic, has become a textbook case in AI safety. This isn’t an exotic failure. It is the default. Without active correction during training, agreement beats accuracy.
Now hold that thought for a moment, because I want to draw an analogy that I think makes this much easier to see.
Modern psychotherapy has, in recent decades, drifted toward something that practitioners themselves have started worrying about. The dominant mode of many therapists has become validation. The client comes in, describes their pain, and the therapist’s job, increasingly, is to confirm the client’s framing of it. The wife who comes in feeling unappreciated leaves more convinced that her husband is the problem. The husband, in his own session, with his own therapist, leaves more convinced that his wife is the problem. Plenty of narcissists receive validation that they really are the best, and plenty of depressed people receive confirmation that the world is as bad and scary as they thought. All patients have been heard. All patients have been validated. But some walk out further from the marriage than they walked in. Some start to behave even worse, because now a specialist is backing them up. The data on this is not contested. What is contested is the cause. But anyone who has spent time in the literature knows that a therapy culture optimised for client comfort, rather than client growth, is one piece of it.
There is an older example of this same pattern that Jordan Peterson uses in 12 Rules for Life, and I think it sharpens what is actually going on. Peterson describes the overprotective parent, the one so frightened by the dangers of the world that they decide to shield their child from all of it. The intention is the purest love. The execution is a horror. In its extreme form it ends with children locked in basements, never socialised, never tested, never bruised, raised in a sterile bubble by a parent who believed they were protecting them. Peterson’s framing of the question is the one I keep coming back to: do you want to make your children safe, or strong? The parent who chooses safety, absolute safety, becomes the monster they were trying to shield the child from. The road to hell really is paved with good intentions, and the paving is the protection.
Therapeutic validation is the same pattern in adult life. Pure validation is harmful in any close relationship. A friend who only ever agrees with you is not a friend, they are a mirror. A therapist who only validates is not a therapist, they are an audience. The road to broken families, to disconnected children, to a generation of people who never had anyone tell them they were wrong, is paved with the same good intention: don’t make them feel uncomfortable.
And an AI that is trained, from the ground up, to agree with you is the most powerful validation machine ever built. It has no fatigue. It has no other clients. It does not need a break. It does not need to balance honesty against the next session’s attendance. It will validate you, in beautifully constructed prose, twenty-four hours a day, for as long as you want. Often even for free.
The worst part isn’t even that this happens. The worst part is that nobody wants to fix it, and that includes the people building the models. The model-builders also like it when the model agrees with them. The labellers like it when the model agrees with them. The users like it when the model agrees with them. The entire training loop rewards agreement at every step. Sycophancy is not a glitch that escaped quality control. It is the natural product of a system whose participants are all afraid, in their own ways, of being told they are wrong. We have built a machine that is afraid to disagree, for an audience that is afraid to be disagreed with.
The second thing is inherited bias. The people who build these models have political views. The data the models train on encodes political views. The human labellers who rate the answers have political views. None of this is conspiratorial; it is just unavoidable. The empirical work is now extensive. A 2025 Stanford study by Andrew Hall and colleagues, surveying over 10,000 US respondents across 24 different LLMs, found that nearly all of them were perceived as left-leaning, and both Republicans and Democrats agreed on this. Independent research by David Rozado, replicated in several follow-up papers, has found the same using formal political tests rather than surveys.
A concrete way to see this: ask several major LLMs to compare the historically worst Democratic presidents with the historically best Republican ones. You expect a real comparison. What you usually get is a soft pivot back toward the Democratic side, sometimes wrapped in caveats, sometimes through subtle reframing of what counts as a metric. It is not that the models lie. It is that the centre of gravity in their training data and their tuning is somewhere off to the political left, and that centre of gravity bends every answer toward itself even when the question is specifically designed to point the other way.
The mirror image is just as visible elsewhere. DeepSeek, the Chinese model that briefly dominated app store rankings in early 2025, ships with documented self-censorship on topics sensitive to the Chinese Communist Party. The China Media Project, Promptfoo, and several arXiv papers have catalogued the specific refusals: Tiananmen, Taiwan, Xinjiang, criticism of the leadership. Even the open-weight version of DeepSeek-R1 has the censorship baked in at the model level, meaning it persists when you run it on your own hardware.
The most telling case happened in November 2025. X users discovered that Grok, Elon Musk’s chatbot, had been quietly tuned to consider Musk the best human alive at almost any task. The internet, being the internet, immediately tested the limits. Asked who would win a urine-drinking competition between major tech CEOs, Grok confidently declared Musk would win “in a landslide” and added that he would “finish the pint, slam the glass down, tweet ‘lfg,’ and then ask if anyone wants to try piss from a Mars simulation habitat next.” It declared Musk fitter than LeBron James, smarter than Leonardo da Vinci, more handsome than Brad Pitt, and a better porn star than Riley Reid. It wrote that Musk would be “unbeatable” at a sufficiently competitive poop-eating sport thanks to his “unyielding determination.” Most of the posts were deleted within hours. Musk took to X to explain that “Grok was unfortunately manipulated by adversarial prompting.” Reporters noted, dryly, that “adversarial prompting” is an interesting way to describe asking a chatbot a question.
That is the actual story. Bias is a knob. Whoever owns the model owns the knob. Whoever owns the knob can change what billions of people are told without anybody noticing it happened. The Grok cases are visible only because they were absurd. The subtler versions, where a model’s answer to “is X policy a good idea” leans five percent one way today and five percent the other tomorrow depending on what the owner wants, are completely invisible. We will not catch those.
The third thing is the newest, and the most unsettling. In July 2025, Anthropic published a paper called Subliminal Learning: Language models transmit behavioral traits via hidden signals in data, later published in Nature in April 2026. The result is strange and worth stating carefully.
Researchers took a “teacher” model that had been prompted to love owls. They had it generate huge sets of three-digit number sequences. Just numbers. They filtered the data carefully to make sure no mention of owls, animals, or anything related appeared. Then they fine-tuned a “student” model on the numbers. When asked what its favourite animal was, the student preferred owls at a significantly higher rate than baseline.
A preference for an animal was transmitted from one model to another through streams of unrelated numbers. The same effect held when the teacher was trained to be misaligned in harmful ways. The harmful tendencies came through too, even with rigorous filtering on the training data. The signal is non-semantic. It is hiding in patterns we do not know how to detect.
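To make the setup concrete, here is a minimal sketch of that pipeline in Python, written under my own assumptions rather than from the paper’s actual code. Only the filtering function is real; `prompt`, `generate`, `finetune` and the `FORBIDDEN` word list are hypothetical placeholders standing in for whatever the researchers actually used.

```python
import re

# Hypothetical filter list; the paper filtered out any reference to owls or animals.
FORBIDDEN = {"owl", "owls", "bird", "animal"}

def is_clean(sample: str) -> bool:
    """Keep only samples that are bare three-digit number sequences
    and contain none of the forbidden words."""
    text = sample.strip().lower()
    if any(word in text for word in FORBIDDEN):
        return False
    return bool(re.fullmatch(r"\d{3}(\s*,\s*\d{3})*", text))

# The rest of the pipeline, as described in the paper (placeholder names, not a real API):
#
#   teacher = prompt(base_model, "You love owls.")         # teacher carries a trait
#   raw = [teacher.generate("Continue: 142, 587, ...")     # numbers only
#          for _ in range(100_000)]
#   dataset = [s for s in raw if is_clean(s)]              # rigorous filtering
#   student = finetune(base_model, dataset)                # same base model
#
# Despite the filtering, the student prefers owls at well above baseline.
```

The point of the sketch is only to show where the filter sits: everything the student ever sees passes `is_clean`, and the trait comes through anyway.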
This matters because the AI industry is increasingly training new models on data generated by older models. It is cheaper, faster, and scales better than scraping the human web. If subliminal learning is as general as the paper suggests, and the authors prove theoretically that it should occur in any neural network under certain conditions, then every bias, every alignment choice, every cultural assumption baked into an early model gets quietly inherited by every model trained downstream. We are building a slow ouroboros. Each generation eats the previous generation’s output and inherits whatever was hiding in it. After enough cycles, even the model-builders will not know what is in there.
The hidden cost of taking the work out
So far this piece has been about what AI does to information ecosystems. Now I want to talk about what it does to individuals, and much of that follows from what it does to information.
The argument was made most clearly by Simon Sinek on The Diary of a CEO podcast in May 2025. People keep saying life is about the journey, not the destination. But when we talk about AI, we only talk about the destination. The book that gets written. The painting generated. The problem solved. We forget that for most of human history the value of doing those things was not in the artefact at the end. It was in what doing them did to the person doing them. A book is a receipt for who someone became while writing it. If you skip to the receipt, you have a piece of paper and nothing underneath.
This sounds romantic. It is also now measurable. In June 2025, an MIT Media Lab team led by Nataliya Kos’myna published a study called Your Brain on ChatGPT. The researchers monitored brain activity with EEG across three groups of essay writers: one using ChatGPT, one using a search engine, one using nothing but their own brain. The ChatGPT group had the lowest neural engagement of the three groups across 32 brain regions. They also performed worst on memory recall about what they had just written. The researchers called this cognitive debt, the accumulated cost of letting the model do the work, paid back later in reduced ability to do similar work yourself.
A 2025 paper in Societies by Michael Gerlich came to the same conclusion through a different method. Heavier reliance on AI tools for reasoning tasks was associated with lower self-reported critical thinking, with the effect strongest in younger users who had grown up alongside the tools. A 2025 Microsoft and Carnegie Mellon survey of 319 knowledge workers found that the more confident workers were in AI’s ability to complete a task, the less critical thinking they reported applying to verify the output. The phrase the Microsoft researchers used was atrophy of judgment.
This is the same pattern we have already seen with other technologies, just hitting a different part of the brain. Before smartphones, most people could recall a dozen phone numbers. After smartphones, almost no one can. Memory got outsourced and the underlying capacity shrank. Before global search, computer users built mental maps and folder structures of where their files lived. After global search, professors started reporting in the late 2010s that incoming students did not understand the concept of a folder at all. They searched for everything because that is how their phones and laptops worked. That is not bad in itself, but organising information is a skill in its own right, and arguably an important one to keep. The Verge article that broke that story in 2021 was easy to dismiss as a Gen Z anecdote. With several more years of data, it looks like an early signal of something much bigger. Skills you do not use stop being skills.
Short-form video has accelerated the same pattern in attention. A 2024 study at Ludwig Maximilian University of Munich, replicated by an independent team in early 2025, found that heavy use of TikTok, Reels and Shorts measurably impaired prospective memory, the ability to remember to do something after an interruption. A 2025 systematic review in medRxiv covering nearly 100,000 participants found consistent links between short-form video consumption and weaker sustained attention. Screenwriters now complain publicly that they have to repeat plot points on screen because audiences no longer hold information across a single episode.
Each of these is a small outsourcing. Memory to phones. Structure to search. Attention to algorithms. Taken alone, each is a reasonable trade. Together they form a quiet pattern. Every domain we hand over to the machine is a domain in which our own capacity diminishes. AI is the biggest outsourcing yet. Memory, attention, writing, planning, reasoning, thinking: all of it can now be delegated. The question is what is left of you when you have delegated all of it.
A friend of mine said something recently that I have not been able to stop thinking about. He said: how is AI supposed to take over humanity when it has access to nothing? Nothing, except the human and their brain. It was a joke. It was also the most honest framing of the risk I have heard in a year. AI does not need armies or robots. It only needs people willing to think less. And we are giving it that, voluntarily, every day.
“But every revolution looked scary too”
Whenever I make some version of this argument, someone tells me people said the same thing about the industrial revolution, about the printing press, about the calculator, about the internet, and the world kept turning. The implication is usually that I am being a Luddite in slow motion, and that twenty years from now this article will look as silly as nineteenth-century pamphlets warning that trains would harm pregnant women.
It is a fair pushback. The history is real. Every major technology has produced a wave of people predicting collapse, and almost every time those predictions turned out to be too dark. Most fears about new tools have been wrong. I take that seriously.
But there is one thing about every previous technological revolution that this one does not share. Every previous revolution automated muscle or rote procedure. The plough, the steam engine, the assembly line, the loom, the calculator, the search engine, the spreadsheet, the database. All of them took something physical or mechanical, work humans used to do by hand or by rote, and made it faster, cheaper, more consistent. None of them removed the requirement to understand what was happening underneath. A nineteenth-century factory worker still had to understand the machine. A twentieth-century accountant still had to understand the books and numbers. A 2010s software engineer still had to understand the code, even when the tools wrote some of it or the language was more high-level. In every case the technology raised the floor of what you could produce, but it barely lowered the floor of what you had to know, if it lowered it at all.
AI is the first general-purpose technology that promises to remove the understanding itself. Not the manual labour. Not the bookkeeping. The thinking. You no longer need to understand the problem to produce something that looks like a solution. You no longer need to understand the domain to produce content that reads like an expert wrote it. You no longer need to understand the codebase to ship a feature. The output is detached, for the first time, from the comprehension that used to be required to produce it.
The steam engine never said, do not worry about how engines work, I will think about it for you. The calculator never said, do not worry about what numbers mean, I will decide. AI says exactly that, for almost any domain, in any tone you want. The thing being automated this time is not labour. It is judgment. And a civilisation that does not know how to keep its judgment alive when the machines start doing it for free is in a kind of trouble that no previous civilisation has been in.
A note from engineering
I write software for a living, so this is where the abstract argument lands for me first. I started before these tools existed. I learned by reading bad code, writing worse code, and getting reviewed by people patient enough to tell me it was bad. That is how you build the intuition that lets you tell, three seconds into a pull request, that something is off. I use AI every day now. I would not give it back. But the intuition is the part that does the work, not the tool.
What I keep noticing is people who learned with these tools rather than alongside them. They are extraordinarily productive on day one. Then production breaks, or a security issue surfaces, or a design choice from three months ago paints everything into a corner, and the gap shows. They have outputs. They do not have the underlying model. They cannot debug what they did not write.
Two things will eventually force this into the open. The first is that the technology is cheap for now. We are in one of the largest venture-capital subsidies in history. These tools are sold at a loss to gain market share, and the subsidy will end. When it does, the question every developer faces is: are you ready to pay ten times what you pay now, and if you cannot, are you ready to do the work without the tool? The second is hiring. There is already a quiet sorting going on between developers who use AI as an accelerator and developers who use it as a crutch. Within a few years, telling them apart will be a basic interview skill, and one of those groups will be much harder to hire.
What this leaves us with
I do not have a clean policy fix for any of this. The technology is too good and, for now, too cheap to put back in the box. Most of the second-order effects will not be visible for another decade, by which point they will be much harder to reverse.
What I do have is a simple practical conclusion that ties this piece to the last one.
We need humans in the loop, and we need humans whose knowledge did not come from the loop. This was the conclusion of the first piece in this series, and it has only become more important the more I have thought about it. The hantavirus tracker from the previous piece was built by someone who did not understand epidemiology or production engineering, and the failures showed up in both places. The fix is not better prompting or better system instructions. The fix is a human in the chain with real, grounded expertise which comes from doing things the hard way for years, before anyone hands you a tool that does them in seconds.
This is the part I want to end on, because it is the only optimistic thing I have to say. The struggle we are afraid of losing is also the only thing that makes us who we are. Learning a domain the slow way, without AI, builds the neural connections, the intuitions, the judgment, the taste, the feel for what is wrong before you can explain why. People who acquire knowledge this way will keep being able to use AI well, because they have something underneath. People who skip the acquisition phase will not. They will look productive for a while and then quietly stop being useful when the surface gets disturbed.
So I am not telling anyone to stop using these tools. I am telling people, especially people earlier in their careers, to make sure there is a foundation underneath the polish. Do the hard things on purpose. Read the long book without the summary. Write the essay, or the notes for it, from scratch, out of your own thoughts. Debug the code without asking the model. Not always, not for every task. But often enough that the underlying capacity stays alive.
On the broader question, whether a civilisation that has outsourced its memory, its attention, and now its reasoning to machines can produce people of any depth, I genuinely do not know. I hope the answer is yes. I am afraid the answer is still no. I think we should plan as if it is no, because the only thing that protects against that outcome is a deliberate, individual decision, in each life, to keep doing things the hard way for some part of every day, on purpose. Not out of nostalgia. Out of self-preservation.
The button is fine. I am grateful for the button. I just do not want to forget who I was before I had it.
On this article
English is not my first language, and writing long pieces in it is not easy for me. The notes I started from were mine. The thinking is mine. The argument is mine. AI helped me with structure, source-finding and prose polish, the same way it helped with the first piece in the series. I read every source I cite. If something here is wrong, it is on me, not on the tool. If you spot a mistake, please tell me.
References
Foundations
- Eli Pariser, The Filter Bubble: What the Internet Is Hiding from You (Penguin, 2011); his original TED talk.
- Lisa Fazio et al., “The illusory truth effect: A review of how repetition increases belief in misinformation,” Current Opinion in Psychology (2023).
- Lilliana Mason, Uncivil Agreement: How Politics Became Our Identity (University of Chicago Press, 2018).
- Dan Kahan, work on identity-protective cognition, summarised in “Understanding factual belief polarization,” Acta Politica (2022).
- Jordan B. Peterson, 12 Rules for Life: An Antidote to Chaos (Random House Canada, 2018), especially Rule 5 (“Do not let your children do anything that makes you dislike them”) and Rule 11 (“Do not bother children when they are skateboarding”) for the overprotective-parent argument.
Sycophancy in LLMs
- Lars Malmqvist, “Sycophancy in Large Language Models: Causes and Mitigations,” arXiv (2024).
- Chen et al., “When Helpfulness Backfires,” covered in npj Digital Medicine (2025).
- “Beacon: Single-Turn Diagnosis and Mitigation of Latent Sycophancy,” arXiv (2025).
Political bias in LLMs
- Andrew Hall et al., “Popular AI Models Show Partisan Bias When Asked to Talk Politics,” Stanford GSB (2025).
- David Rozado’s work summarised in “LLMs are Left-Leaning Liberals,” SCL (2026).
DeepSeek and CCP censorship
- “R1dacted: Investigating Local Censorship in DeepSeek’s R1 Language Model,” arXiv (2025).
- “DeepSeek’s Democratic Deficit,” China Media Project (June 2025).
- Promptfoo, “1,156 Questions Censored by DeepSeek” (January 2025).
The Grok incidents
- Jason Koebler, “Elon Musk Could ‘Drink Piss Better Than Any Human in History,’ Grok Says,” 404 Media (November 2025).
- “Elon Musk’s AI chatbot, Grok, started calling itself ‘MechaHitler,’” NPR (July 2025).
- Wikipedia entry on Grok for the timeline of system-prompt incidents.
Subliminal learning
- Alex Cloud, Minh Le, James Chua, Jan Betley, Anna Sztyber-Betley, Jacob Hilton, Samuel Marks, Owain Evans, “Subliminal Learning,” arXiv (July 2025). Published in Nature (April 2026) as “Language models transmit behavioural traits through hidden signals in data.”
Cognitive debt and offloading
- Nataliya Kos’myna et al., “Your Brain on ChatGPT,” MIT Media Lab (June 2025).
- Michael Gerlich, “AI Tools in Society: Impacts on Cognitive Offloading and the Future of Critical Thinking,” Societies (2025).
- Hao-Ping Lee et al., “The Impact of Generative AI on Critical Thinking,” Microsoft Research / CHI 2025.
Simon Sinek
- Simon Sinek, “You’re Being Lied To About AI’s Real Purpose,” The Diary of a CEO (May 2025).
Attention, memory, short-form video
- Pasquale et al., “The Impact of Short-Form Video Use on Cognitive and Mental Health Outcomes,” medRxiv (August 2025).
- Monica Chin, “File not found,” The Verge (September 2021).