Pragmatic AI-assisted coding without hype

February 16, 2026

The following is my personal opinion at the time of writing. The AI industry moves fast, and I reserve the right to change my opinion and advice as capabilities evolve.

AI is a fancy new tool that appears to be very capable on the surface, so there’s a natural tendency to try to use it everywhere. But like any tool, it has its strengths and weaknesses, and it’s important to understand its ideal use cases and when it’s better to use other tools.

For example, if you just bought a new expensive electric screwdriver, you might be tempted to use it as much as possible. But it would be unwise to use it exclusively for building a house and ignore all other pre-existing tools. That doesn’t prevent some people from trying, though.

So let’s start by understanding the tool itself. Most developers are by now more or less familiar with how LLMs work, so I’ll just give a quick recap.

An LLM takes a context as input and processes it with an attention mechanism, based on patterns learned during training. It then outputs a single “most probable” token (sampled with some variance). That token is appended to the context, and the cycle repeats. Semantically, the context might contain instructions, user prompts, relevant files, etc. But at the LLM level, all of that is converted into a single sequence of tokens before it becomes input to the model.
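The loop described above can be sketched in a few lines. This is a toy stand-in, not a real model: `train` learns which token most often follows another from a tiny “corpus,” and `generate` plays the role of the forward pass, repeatedly picking the most probable next token and appending it to the context.

```python
from collections import Counter, defaultdict

def train(corpus):
    """Count, for each token, which tokens follow it (a toy 'training' step)."""
    followers = defaultdict(Counter)
    for a, b in zip(corpus, corpus[1:]):
        followers[a][b] += 1
    return followers

def generate(followers, context, steps):
    """Autoregressive loop: emit the 'most probable' next token,
    append it to the context, and repeat."""
    tokens = list(context)
    for _ in range(steps):
        candidates = followers.get(tokens[-1])
        if not candidates:
            break  # nothing learned for this token; stop generating
        next_token = candidates.most_common(1)[0][0]  # most probable follower
        tokens.append(next_token)                     # token joins the context
    return tokens
```

The sketch also makes the first observation below concrete: the output is entirely determined by what the “training” data contained and what is already in the context.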

From this, we can make a few important observations:

  1. Context is King: Because of how the model works, you’ll get much better results if the context aligns well with patterns used to train the model or if it contains many similar examples. If you try something that the LLM hasn’t been trained on, the results could be unpredictable and generally lower quality.
  2. Regression to the Mean: Another name for “most probable” is “generic” or “average.” This means your code will over time converge to the industry average (in all aspects, including quality) unless you actively and consistently steer the model towards higher quality. This effect is sometimes described as semantic ablation.
  3. Input Management: Since context is the only input that the LLM has during inference, managing what goes into that context is extremely important. Even the same user prompt can produce completely different results depending on what else goes into the context.

How do tools like Cursor or Claude Code fit into this? They are higher-level tools that help with observation #3 by managing the LLM context for you. They do this by including relevant snippets of code (according to their own definition of relevance) along with some of their own prompts. Each tool does this slightly differently, but I personally haven’t observed a significant difference between them. In my experience, the choice of LLM model is much more impactful.
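Conceptually, what these tools do before every model call looks like flattening everything into one sequence. The sketch below is a simplification for illustration only; real tools use more elaborate formats and token-level encoding, but the key point holds: the model sees a single concatenated input.

```python
def build_context(system_prompt, snippets, user_prompt):
    """Flatten tool instructions, selected code snippets, and the user's
    prompt into the single sequence the model actually receives.
    The '# file:' framing is an invented convention for this sketch."""
    parts = [system_prompt]
    for path, code in snippets:
        parts.append(f"# file: {path}\n{code}")
    parts.append(user_prompt)
    return "\n\n".join(parts)
```

This is why the same user prompt can yield completely different results: the `snippets` the tool chooses to include change the input the model conditions on.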

So how can we apply this theory in practice? We need to help these tools by directing them to include only the most important information in the context. Dumping in everything is counterproductive.

Here are a few patterns that I recommend:

  1. Use “Skills” or Docs: Split all available guidance and instructions into smaller chunks that can be loaded on-demand to keep context size small. “Skills” is a relatively new concept where tools basically do a RAG search over available documentation to include only the relevant chunks. Like any RAG, this is probabilistic; while it generally works, it’s not very reliable. You’ll need to experiment with putting the right keywords into a skill description so that it gets selected when needed. Sometimes it helps to mention a skill explicitly by name if you know that the tool will definitely need it.
  2. Utilize “Planning Mode”: If you are doing anything more than a trivial code change, use planning. Coding agents always start by analyzing existing files and determining the steps they need to perform to accomplish the task. This bloats the context quite a bit. Planning mode helps condense the results of that analysis into a much smaller number of tokens, which leaves far more of the context window available for actually executing the plan. It essentially expands your own prompt for you by augmenting it with more details. Additionally, it’s much easier to steer the LLM by modifying a small plan than by asking it to rewrite the code (which takes a lot more context).
  3. Watch the Context Limit: Don’t try to accomplish huge individual tasks that don’t fit into the context. The LLM attention mechanism tends to perform worse as it nears the limit of its context window. Tools work around this problem by performing a “compaction” step, with varying degrees of success. How the tool decides what to keep in context and what to drop is fairly arbitrary, so in my personal experience it’s better to stop the task and restart with a fresh context and a smaller scope.
  4. Code is cheap: Always remember the “average result” observation and do not let the LLM drag down your code quality towards that average. You can help by ensuring some “high quality” snippets are always present in the context - for example, by mentioning them in AGENTS.md or skills, or by explicitly pointing towards some existing files that you consider high quality.
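To make pattern #1 concrete, here is a toy illustration of why skill descriptions need the right keywords. Real tools use an LLM or embedding-based retrieval rather than literal word overlap, so this is only a sketch of the idea: a skill is selected when its description happens to match the task, which is why descriptions matter so much.

```python
def select_skills(task, skills, top_k=1):
    """Pick the skill docs whose descriptions share the most words with
    the task. A toy stand-in for the probabilistic retrieval real tools
    perform; skill names and descriptions here are invented examples."""
    task_words = set(task.lower().split())
    scored = []
    for name, description in skills.items():
        overlap = len(task_words & set(description.lower().split()))
        if overlap:
            scored.append((overlap, name))
    scored.sort(reverse=True)  # highest overlap first
    return [name for _, name in scored[:top_k]]
```

If a task’s wording shares no vocabulary with any description, nothing gets selected, which mirrors the failure mode described above and why explicitly naming a skill can help.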

Conclusion: know your tools, use them well, and keep their limitations in mind. Don’t delegate thinking to the LLM, and don’t push responsibility onto code reviewers. Ultimately, you own the code, no matter which tools you use to produce it. Your job is to deliver code you have proven to work.



Written by Oleg Anashkin, who lives in the Bay Area tinkering with AI, solving problems, and building things. You can find me on LinkedIn.