A Rational Moment: A Message to Data Analysts… AI Is Your Best Friend, Not Your Substitute Workforce


(Feb. 27, 2026) — Here is a critique of the ‘substitution framing’ common in media coverage of AI’s impact on scientific careers — a framing that misses the more consequential story of what rigorous, AI-augmented data analysis could achieve. Want to keep your job? Send this to your colleagues.

The Wrong Conversation Is Happening

A recent article in Nature posed the question now circulating in every department meeting, grant committee, and HR budget conversation across the research world: which science jobs will AI take? The article interviewed researchers. It identified coding positions as already obsolete. It offered modest reassurance to bench scientists and senior investigators. And it framed the entire issue as a matter of competition — human analysts versus AI systems, fighting over the same territory.

This framing is not just wrong. It is actively harmful to the profession.

When organizations internalize the substitution narrative — when department heads and hiring managers conclude that AI can ‘do data analysis’ and therefore fewer analysts are needed — they make a category error with serious consequences. They confuse the automation of low-level analytical tasks with the elimination of the need for sophisticated analytical thinking. And in doing so, they set themselves up to produce research that is faster, cheaper, and substantially worse.

This article is a direct message to data analysts: you are not becoming obsolete. You are becoming the most important person in the room. But only if you seize the moment, expand your conception of what your job actually is, and refuse to let your employers treat AI as a cheaper version of you.

The real threat to data analysts is not that AI will replace you. It’s that your organization will use AI as an excuse to stop investing in rigorous methodology. They will suffer badly if they do – and it’s your job to protect them.

What AI Actually Does — and What It Doesn’t

To understand the opportunity, it helps to be precise about what AI tools currently do well and where they fall flat. This is not a binary. The capacity of these systems is not evenly distributed across the analytical workflow, and the distribution matters enormously.

AI coding assistants (tools like ChatGPT, Grok, Claude, GitHub Copilot, and similar systems) are genuinely impressive at generating syntactically correct, reasonably idiomatic code for standard analytical tasks. Given a description of a dataset and an objective, they can produce workable R or Python for descriptive statistics, common regression models, data wrangling operations, and basic visualization. For the kind of code that a first-year graduate student or a research programmer might produce after their third similar project, AI is a legitimate substitute. This is the part the Nature article got right. But the article overemphasizes the importance of the ability to generate (really, re-generate) source code for which thousands of libraries already exist, and have existed for decades. What AI brings is ease of access. But, as in all things, you get what you pay for.

It does no good to own a complex machine if you do not have the instruction manual. It does no good to have an instruction manual if you cannot read it.

Here are some examples of what AI does not do, at least not without a skilled analyst directing it:

It does not recognize when a study design is fatally flawed unless asked – and often even then it will miss details. AI will fit a regression model on data with classic confounding structure and hand you coefficients without flagging that the estimates are uninterpretable. It will run a difference-in-differences analysis on data that violates parallel trends without telling you the identifying assumption is likely broken. It will produce a beautifully formatted results table from a convenience sample and note, at most, a boilerplate caveat about generalizability.
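
To make the confounding point concrete, here is a minimal sketch on simulated data (not data from any real study). The variable names and effect sizes are assumptions chosen for illustration: a confounder z drives both the exposure x and the outcome y, and x has no effect on y at all. The naive regression still returns a confident, nonzero coefficient.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    slope = ols_slope(x, y)
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

random.seed(0)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]       # hypothetical confounder
x = [zi + random.gauss(0, 1) for zi in z]        # exposure, driven by z
y = [2 * zi + random.gauss(0, 1) for zi in z]    # outcome, driven by z only

# Naive model: y ~ x. The true causal effect of x on y is zero by construction.
naive = ols_slope(x, y)

# Adjusted model: coefficient on x in y ~ x + z, computed by regressing
# the z-residuals of y on the z-residuals of x (Frisch-Waugh-Lovell).
adjusted = ols_slope(residuals(z, x), residuals(z, y))

print(f"naive slope:    {naive:.2f}")
print(f"adjusted slope: {adjusted:.2f}")
```

Under these assumptions the naive slope lands near 1 and the adjusted slope near 0. Nothing in the naive output signals a problem; only someone who knows that z exists, and why it matters, knows to ask for the second model.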

It will not know which measures are independent, how to update degrees of freedom, or that it should learn, not assume, the Type I/Type II error trade-off. The list of what it will not bring to the table in a novice's hands goes on and on.

It does not make appropriate decisions about model specification. AI will include or exclude variables as instructed, but it lacks the domain knowledge and causal reasoning to distinguish a confounder from a cofactor or mediator, to recognize when a control variable opens a collider path, or to understand why adjusting for post-treatment variables biases estimates. These are not edge cases in research. They are central to nearly every analysis that matters.
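
The collider problem mentioned above is easy to demonstrate and easy to miss. The sketch below uses simulated data with hypothetical variable names: x and y are generated independently, and c is a common effect of both. "Controlling for" c, which looks like good practice, manufactures an association that does not exist.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    slope = ols_slope(x, y)
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

random.seed(1)
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]   # exposure
y = [random.gauss(0, 1) for _ in range(n)]   # outcome, independent of x
# Collider: caused by both x and y (plus noise).
c = [xi + yi + random.gauss(0, 1) for xi, yi in zip(x, y)]

# Correct model: y ~ x. There is no relationship to find.
unadjusted = ols_slope(x, y)

# Mistaken model: y ~ x + c. Coefficient on x via residualizing on c.
collider_adjusted = ols_slope(residuals(c, x), residuals(c, y))

print(f"y ~ x:      {unadjusted:.2f}")
print(f"y ~ x + c:  {collider_adjusted:.2f}")
```

With these unit-variance assumptions the collider-adjusted coefficient converges to -0.5, even though x and y are unrelated. An assistant told to "add c as a control" will do exactly that and report the result.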

It does not, on its own initiative, apply objective model selection criteria or use machine-learning techniques to guard against overfitting.
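
One simple safeguard of this kind is held-out validation: judge a model on data it has not seen. The sketch below is an illustration under stated assumptions, not a recommended workflow. The data are simulated from a straight line with noise, and a model that simply memorizes the training set (one-nearest-neighbor lookup, chosen here as a stand-in for any overfit model) is compared with a plain linear fit.

```python
import random

def fit_line(x, y):
    """Least-squares slope and intercept of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

def nearest_neighbor_predict(train_x, train_y, query_x):
    """Predict each query with the y of the closest training x (memorization)."""
    preds = []
    for q in query_x:
        i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - q))
        preds.append(train_y[i])
    return preds

def mse(pred, actual):
    """Mean squared error."""
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)

def simulate(n):
    """Hypothetical data: y = 2x + noise with standard deviation 1."""
    x = [random.uniform(0, 10) for _ in range(n)]
    y = [2 * xi + random.gauss(0, 1) for xi in x]
    return x, y

random.seed(3)
train_x, train_y = simulate(200)
test_x, test_y = simulate(200)

slope, intercept = fit_line(train_x, train_y)
line_train = mse([slope * v + intercept for v in train_x], train_y)
line_test = mse([slope * v + intercept for v in test_x], test_y)

nn_train = mse(nearest_neighbor_predict(train_x, train_y, train_x), train_y)
nn_test = mse(nearest_neighbor_predict(train_x, train_y, test_x), test_y)

print(f"linear fit:  train MSE {line_train:.2f}, test MSE {line_test:.2f}")
print(f"memorizer:   train MSE {nn_train:.2f}, test MSE {nn_test:.2f}")
```

The memorizer scores a perfect zero on the training data and does clearly worse than the line on the held-out data. If you only look at in-sample fit, you pick the wrong model every time.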

It does not evaluate the quality of evidence it synthesizes. When asked to summarize a literature, AI can produce fluent, well-organized prose that treats a severely underpowered study the same as a large pre-registered trial. It has no native capacity to weight evidence by methodological quality, to identify when apparent consensus reflects shared methodological limitations across studies, or to flag when heterogeneity in results is almost certainly driven by design differences rather than true moderators.
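
A small worked example shows how much the weighting matters. The two studies below are entirely hypothetical, with made-up estimates and standard errors. The sketch uses standard fixed-effect inverse-variance weighting, which accounts for precision only. It does not capture pre-registration, risk of bias, or design quality, all of which still require an analyst's judgment.

```python
import math

# Hypothetical results, invented for illustration only.
studies = [
    {"label": "small underpowered study", "estimate": 0.80, "se": 0.40},
    {"label": "large pre-registered trial", "estimate": 0.10, "se": 0.05},
]

# Treating every study the same: a simple average of the estimates.
unweighted = sum(s["estimate"] for s in studies) / len(studies)

# Fixed-effect inverse-variance weighting: weight = 1 / se^2.
weights = [1 / s["se"] ** 2 for s in studies]
pooled = sum(w * s["estimate"] for w, s in zip(weights, studies)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"unweighted mean:          {unweighted:.3f}")
print(f"inverse-variance pooled:  {pooled:.3f} (SE {pooled_se:.3f})")
```

The simple average is 0.45; the precision-weighted estimate is about 0.11, because the large trial carries 64 times the weight of the small one. A fluent summary that says "studies report effects ranging from 0.1 to 0.8" hides that entirely.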

It does not know the difference between a study that has been distorted by p-hacking or biased by inclusion/exclusion criteria and one that has not been compromised in these ways.
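
To see why p-hacking is so hard to detect from a results table alone, consider a simulation in which there is no effect whatsoever. The setup is an assumption made for illustration: each simulated study measures 20 independent outcomes in two groups of 20 drawn from the same distribution, and the "hacked" version reports whichever outcome has the smallest p-value.

```python
import random
from statistics import NormalDist

def null_p_value(n_per_group):
    """Two-sided z-test p-value for two groups drawn from the same N(0, 1)."""
    a = [random.gauss(0, 1) for _ in range(n_per_group)]
    b = [random.gauss(0, 1) for _ in range(n_per_group)]
    diff = sum(a) / n_per_group - sum(b) / n_per_group
    z = diff / (2 / n_per_group) ** 0.5   # known variance of 1 in each group
    return 2 * (1 - NormalDist().cdf(abs(z)))

random.seed(2)
n_studies = 1000
n_outcomes = 20
alpha = 0.05

honest_hits = 0   # only the first, pre-specified outcome is tested
hacked_hits = 0   # the best-looking of all outcomes is reported
for _ in range(n_studies):
    p_values = [null_p_value(20) for _ in range(n_outcomes)]
    honest_hits += p_values[0] < alpha
    hacked_hits += min(p_values) < alpha

honest_rate = honest_hits / n_studies
hacked_rate = hacked_hits / n_studies

# Theory for independent tests: 1 - 0.95**20 is roughly 0.64.
print(f"false-positive rate, pre-specified outcome: {honest_rate:.2f}")
print(f"false-positive rate, best of {n_outcomes} outcomes:   {hacked_rate:.2f}")
```

The pre-specified analysis produces a false positive about 5% of the time, as designed. The best-of-20 analysis produces one in roughly two out of three studies. Both yield a single tidy p-value below 0.05, and nothing in the reported number tells you which process produced it.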

Read the rest here.

1 Comment

phrowt

Sunday, March 1, 2026 8:23 PM

It is still my opinion that AI is one of the most dangerous things to come along in my life. The danger is when it is used to make decisions or when the machines are given the ability to create a code that humans will not understand. Therefore the machine will achieve the ability to make decisions that will cause disastrous results. Bottom line, what will be the brakes for AI? Currently that is not being discussed.