Practical experience is hard currency right now
What’s fun about coding with LLMs right now is that literally nobody knows anything yet.
The few people who know a few things will find that their knowledge can change overnight with the release of a new model or a new version of the AI tooling.
This is great!
The only currency that counts right now is experience.
And getting experience is accessible to everyone.
I have been and still am actively trying to hit the limits when working with Cursor and Sonnet 3.7.
An important part of that is asking the right questions.
Here are mine.
Are simple prose rules enough?
I’m getting good leverage – fewer mistakes – from Cursor rules, written in their undocumented format, generated by Cursor itself.
Making sure the right rules end up in the context window is often tricky, but not unmanageable.
What I need to try: would a simple markdown file with instructions, explicitly included in the context, yield better or worse results when coding with an AI?
This would make the rules independent of a concrete environment like Cursor or Windsurf.
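Something like the following is what I have in mind: a plain CONVENTIONS.md (a made-up name) that gets prepended to every prompt, with the actual LLM call left to the caller as a placeholder.

```python
from pathlib import Path
from typing import Callable

def ask(task: str,
        call_llm: Callable[[str], str],
        rules_path: str = "CONVENTIONS.md") -> str:
    """Prepend a plain markdown rules file to every request,
    independent of any editor-specific rule mechanism.

    call_llm is a placeholder for whatever chat client is in use;
    CONVENTIONS.md is an invented file name."""
    rules = Path(rules_path).read_text()
    return call_llm(f"Project rules:\n{rules}\n\nTask:\n{task}")
```

If this works as well as editor-native rules, the instructions travel with the repository instead of with the tool.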
What would happen if I encoded all rules in the linter?
Rules encode a request or wish for the LLM to do the right thing.
Would I get better results by writing a project-specific linter that enforces every rule, thus providing a closed feedback loop to the LLM?
How would I approach process-related rules, like writing tests first? My hunch is that to create a notion of first, there needs to be a notion of before and after.
Leveraging git history or planning documents for this comes to mind.
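A rough sketch of what such a linter could look like, with one invented content rule and one invented process rule that leans on git history to decide whether the test came first (rule names, file layout and messages are all assumptions):

```python
import re
import subprocess
import sys
from pathlib import Path

def check_no_print(path: Path) -> list[str]:
    """Made-up content rule: no bare print() calls outside of tests."""
    if "tests" in path.parts:
        return []
    return [
        f"{path}:{n}: use the project logger instead of print()"
        for n, line in enumerate(path.read_text().splitlines(), 1)
        if re.search(r"\bprint\(", line)
    ]

def check_tests_first(path: Path) -> list[str]:
    """Made-up process rule: a matching test file must already be
    tracked in git before the implementation file is touched."""
    test_file = path.with_name(f"test_{path.name}")
    tracked = subprocess.run(
        ["git", "ls-files", "--error-unmatch", str(test_file)],
        capture_output=True,
    )
    if tracked.returncode != 0:
        return [f"{path}: no tracked {test_file.name} found, write the test first"]
    return []

if __name__ == "__main__":
    problems = []
    for arg in sys.argv[1:]:
        problems += check_no_print(Path(arg)) + check_tests_first(Path(arg))
    if problems:
        print("\n".join(problems))
    sys.exit(1 if problems else 0)
```

The interesting part is that the LLM no longer has to be asked nicely: it can run the linter, read the complaints, and fix them.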
When is the feedback loop fast enough?
I’ve used Plan-Implement-Test cycles successfully in an existing large project (~10,000 files, multiple backend services, large frontend codebase) to implement well-scoped changes.
But the feedback loop is terribly slow (multiple seconds to run IO-less tests).
What if I write my next project in such a way that the feedback loop stays under 1s per component: running all kinds of static checks, compilation, tests, etc.?
Would being this fast get me better results?
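Roughly, I imagine a per-component check script with a hard time budget; the commands below are placeholders for whatever linter, type checker and test runner the project ends up using, and the 1s budget is the number I want to defend.

```python
import subprocess
import sys
import time

# Placeholder commands; swap in the project's actual tooling.
STAGES = {
    "lint": ["ruff", "check", "src"],
    "types": ["mypy", "src"],
    "tests": ["pytest", "-q", "tests"],
}
BUDGET_SECONDS = 1.0

def run_checks() -> bool:
    ok = True
    for name, cmd in STAGES.items():
        start = time.perf_counter()
        result = subprocess.run(cmd, capture_output=True)
        elapsed = time.perf_counter() - start
        status = "ok" if result.returncode == 0 else "FAILED"
        print(f"{name}: {status} in {elapsed:.2f}s")
        ok = ok and result.returncode == 0
        if elapsed > BUDGET_SECONDS:
            print(f"  warning: {name} blew the {BUDGET_SECONDS}s budget")
    return ok

if __name__ == "__main__":
    sys.exit(0 if run_checks() else 1)
```

The script itself is trivial; the real work is structuring the project so each component stays within the budget.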
Can I script the high-level development flow?
Over and over I find myself in the same loop:
Draft a specification with the AI,
Convert the specification into a step-by-step plan according to the project rules,
Ask the LLM to code up the first step,
Verify the result using tests and keep iterating until all tests pass,
Mark the plan item as completed and move on to the next one.
To what degree is automation possible here?
Do I need to code my own agent?
Is a command line tool enough to make this process smoother?
Do I need to write a VSCode extension for this and load it into Cursor?
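As a starting point, the loop above might be scriptable with nothing more than a small driver like this. Here call_llm and apply_patch are placeholders for real integrations, and the plan format (one item per line, completed items prefixed with "x ") is an assumption.

```python
import subprocess
from pathlib import Path
from typing import Callable

def run_plan(plan_path: str,
             call_llm: Callable[[str], str],
             apply_patch: Callable[[str], None],
             max_attempts: int = 3) -> None:
    """Drive the loop: take the next open plan item, ask the LLM,
    apply the change, run the tests, and only then mark the item done."""
    plan_file = Path(plan_path)
    items = plan_file.read_text().splitlines()
    for i, item in enumerate(items):
        if item.startswith("x "):
            continue  # already completed
        for _ in range(max_attempts):
            patch = call_llm(f"Implement the next plan step:\n{item}")
            apply_patch(patch)
            if subprocess.run(["pytest", "-q"]).returncode == 0:
                items[i] = f"x {item}"  # mark the plan item as completed
                plan_file.write_text("\n".join(items) + "\n")
                break
        else:
            raise RuntimeError(f"Step still failing after {max_attempts} attempts: {item}")
```

Whether this belongs in a standalone CLI, my own agent, or a VSCode extension loaded into Cursor is exactly the open question.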
Should I express expectations with code-samples?
Often I tell the LLM what I want in English and I get working results.
Sometimes I learn something new about the technology I’m using.
At other times, I find the result hard to relate to and understand.
Would I get code that’s easier to keep in my head if I prompted it with concrete code examples, e.g. function signatures without an implementation?
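One way to try this would be to hand the LLM a skeleton like the following: signatures and docstrings only, no bodies, with all names invented purely for illustration, and ask it to fill in the implementations.

```python
# Prompt skeleton: the shape of the code I want, nothing more.
# parse_invoice, monthly_totals and Invoice are made-up names.
from dataclasses import dataclass

@dataclass
class Invoice:
    customer_id: str
    total_cents: int

def parse_invoice(raw: bytes) -> Invoice:
    """Parse a single invoice record from the raw export format."""
    ...

def monthly_totals(invoices: list[Invoice]) -> dict[str, int]:
    """Sum total_cents per customer_id for one month of invoices."""
    ...
```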
The promised land
This list of questions is not exhaustive, and finding answers to all of them is a lot of work.
What is the motivation behind this?
Why dig deeper?
Because the end goal is compelling – the machine chugging away with little oversight, until it hits a stable state, freeing up my time to refine the execution plan.
And then, in the next step, running two of these busy workers, then three, then four, until I manage an entire “team”.