Looking back at the Schillace “laws”

Way back in, I think, March or so of 2023, after I’d spent a little while trying to build things with GPT-4, I wrote down a set of impressions of useful programming “laws” (I think of them more as best guesses, but marketing folks renamed them) for working with AI. So much has happened since then, I thought it would be fun to look back at them and see how well they’ve held up.

So, here they are, with my (biased of course) comments. Also - happy holidays! I will take a week or two off from writing (unless I’m bored).

1. Don’t write code if the model can do it; the model will get better, but the code won’t.

This one seems to have held up fairly well - the models are definitely getting a lot better, and new approaches like o1 are replacing a lot of what people were doing in code. The “bitter lesson” would argue that this will continue, and it might: if we get better at things like memory and grounding, for example, a lot of the agent scaffolding people are writing now looks like it will be transient. I suspect code that connects models to tools will continue to be valuable, though.
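
To make the distinction concrete, here’s a minimal sketch (my own illustration, not from the original post) of the kind of code this law argues against writing. `call_model` is a placeholder for whatever completion API you happen to use.

```python
def call_model(prompt: str) -> str:
    """Placeholder for your LLM completion call (any provider/SDK)."""
    raise NotImplementedError

# Instead of maintaining an ever-growing pile of regexes for messy, real-world
# date formats, let the model normalize the text; the code around it stays tiny
# and gets better "for free" as the model improves.
def extract_due_date(note: str) -> str:
    prompt = (
        "Extract the due date from the note below and reply with only an "
        "ISO 8601 date (YYYY-MM-DD), or the word NONE if there isn't one.\n\n" + note
    )
    return call_model(prompt).strip()
```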

2. Trade leverage for precision; use interaction to mitigate.

This is really talking about hallucinations, which are less of a problem than they were but still an issue. I think I stand by it: good design of code involving LLMs takes this into account and allows for human interaction. You can see this in things like Gemini Deep Research, where they ask the user to validate the plan before going off to do it. I think the code editors like Cursor and GitHub Copilot are showing this too.
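
Here is roughly what that interaction pattern can look like in code - a hypothetical sketch of my own, with `call_model` standing in for your completion API: the model supplies the leverage (drafting the plan), the human supplies the precision (approving or correcting it) before anything expensive runs.

```python
def call_model(prompt: str) -> str:
    """Placeholder for your LLM completion call."""
    raise NotImplementedError

def research(question: str) -> str:
    # Leverage: the model drafts the plan cheaply.
    plan = call_model(f"Draft a short, numbered research plan for: {question}")
    # Precision via interaction: a human validates the plan before we spend
    # time (and tokens) executing it.
    print(plan)
    if input("Run this plan? (y/n) ").strip().lower() != "y":
        feedback = input("What should change? ")
        plan = call_model(f"Revise this plan.\nPlan:\n{plan}\nFeedback: {feedback}")
    return call_model(f"Execute this plan and write up the findings:\n{plan}")
```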

3. Code is for syntax and process; models are for semantics and intent.

This still feels right to me. Without better grounding and continuous memory, models struggle with metacognition and reliability. Models are great at language and messiness, at extracting intent and inferring; code is great for reliability. As a new programming paradigm emerges, fusing these two worlds seems to be the central challenge, much as managing data consistency and uptime against cost was one of the central challenges of coding on the web.
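
One way I picture the split, as a rough sketch rather than a recipe (again, `call_model` is a stand-in): the model turns a messy email into structured intent, and ordinary code owns the syntax checking and the deterministic process that follows.

```python
import json

def call_model(prompt: str) -> str:
    """Placeholder for your LLM completion call."""
    raise NotImplementedError

VALID_ACTIONS = {"refund", "reship", "escalate"}

def handle_ticket(email_body: str) -> dict:
    # Model: semantics and intent -- read the messy email, decide what the
    # customer actually wants.
    raw = call_model(
        "Read this support email and reply with JSON only, shaped like "
        '{"action": "refund" | "reship" | "escalate", "order_id": "..."}\n\n'
        + email_body
    )
    intent = json.loads(raw)                   # code: strict syntax
    if intent.get("action") not in VALID_ACTIONS:
        raise ValueError(f"unexpected action: {intent!r}")  # code: hard process guarantees
    return intent                              # downstream fulfillment stays deterministic
```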

4. The system will be as brittle as its most brittle part.

Yes, but whatever. This one is kind of dumb (imho) because it’s more of a general statement, and also a tautology. But sure, it’s definitely the case that if you build an LLM-based system and rely on the model itself to avoid prompt injection or hallucinations, you will struggle with robustness. Choosing which tasks to do where is important. This is why I said “bots are docs” long ago: we shouldn’t enforce things like ACLs with “social engineering of LLMs”; we should enforce them in code.
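
In practice that means the permission check lives outside the prompt. A toy sketch of my own (the permission table and function names are made up for illustration): filter what the model can see in code, so no amount of clever prompting can leak it.

```python
def call_model(prompt: str) -> str:
    """Placeholder for your LLM completion call."""
    raise NotImplementedError

# Toy permission table; in a real system this comes from your actual ACL store.
ACL = {"alice": {"q3-financials.txt"}, "bob": set()}

def answer_with_docs(user: str, question: str, docs: dict) -> str:
    # Enforce the ACL in code: content the user can't see never reaches the prompt,
    # so it can't be "socially engineered" out of the model.
    allowed = {name: text for name, text in docs.items() if name in ACL.get(user, set())}
    context = "\n\n".join(allowed.values())
    return call_model(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```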

5. Ask Smart to Get Smart.

I like this one a lot still, and it’s one of the more counter-intuitive and less understood ones. Models are a very large, high-dimensional latent space. When you ask a question, you are really pointing to a complex-shaped region of that space. If you ask with a fifth-grader’s language and sophistication, you are pointing to the “fifth-grade understanding and language” part of that space. You can start to see this really taking hold with things like o1, where it’s real work to ask a “worthy” question, but once you do, you can get really incredible results. I’ve seen this continue to progress as context windows get larger, too - now how you structure the prompt matters, and there are lots of folks out there with carefully built multi-page prompts that are real tools in their work.

I expect this one to continue. We may even get to the point where we aren’t smart enough to ask good questions of a very advanced model - we have to get a model to structure the question for us!
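
For what “structuring the question” can look like in practice, here’s a small, purely illustrative sketch: code that assembles a deliberate, multi-part prompt (role, context, constraints, a worked example) so the question stays consistently sharp across runs.

```python
# Assemble a structured prompt so the "question" consistently points at the
# part of the latent space you actually want.
def build_prompt(role: str, context: str, task: str, constraints: list, example: str) -> str:
    parts = [
        f"You are {role}.",
        f"Context:\n{context}",
        f"Task:\n{task}",
        "Constraints:\n" + "\n".join(f"- {c}" for c in constraints),
        f"Worked example of the expected output:\n{example}",
    ]
    return "\n\n".join(parts)
```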

6. Uncertainty is an exception throw.

This still holds up, though I was being pretty gnomic. What this means is “if the model is unsure, ask the human”. I see workflows like this being built, but slowly. I don’t know if this is as much a general principle as a design pattern. I do think that we are still in negotiation about how to manage the tension between the immense power of these models and their unreliability. We need better ways to formalize “human in the loop” - that UX pattern is waiting to be created by some clever designer.
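
Here’s the shape of the pattern as I read it, taken literally as an exception - a hypothetical sketch with `call_model` as a stand-in: the model is given an explicit way to say “I don’t know”, code turns that into an exception, and the handler is where the human comes into the loop.

```python
def call_model(prompt: str) -> str:
    """Placeholder for your LLM completion call."""
    raise NotImplementedError

class ModelUncertain(Exception):
    """Raised when the model signals it isn't confident enough to proceed."""

def classify(ticket: str) -> str:
    answer = call_model(
        "Classify this ticket as BUG, BILLING, or OTHER. "
        "If you are not confident, reply with exactly the word UNSURE.\n\n" + ticket
    ).strip()
    if answer == "UNSURE":
        raise ModelUncertain(ticket)   # uncertainty is the exception throw
    return answer

def classify_with_human(ticket: str) -> str:
    try:
        return classify(ticket)
    except ModelUncertain:
        # Human in the loop: the exception handler is where the hand-off happens.
        return input(f"Model was unsure. Please classify this ticket:\n{ticket}\n> ")
```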

7. Text is the universal wire protocol.

Meh. Maybe. This might be the most wrong of them - “language” is probably going to become the actual programming language as we build systems that are better and better at creating code from expressions of intent, but as far as models communicating with each other goes - I don’t know. Maybe I get partial credit because this one seems to be speaking towards how multi-agent systems work, and things like screen-driving agents seem to point in conceptually the same direction. Maybe this should have been more like “models will work with the same tools as humans”: speech, vision, GUIs.

8. Hard for you is hard for the model.

True-ish, but more complicated. Models are getting good enough, and attention mechanisms are scaling well enough, that they no longer map quite as cleanly onto human behaviors, and there are lots of things like the “r’s in strawberry” problem that break this. Still, when thinking about larger patterns and programs, it’s often useful (I find) to think in terms of “how would I, or another person, do this?” as a way to approach the architecture. So in that sense, there is still alignment here.

9. Beware pareidolia of consciousness; the model can be used against itself.

This goofy thing. It’s still true but, damn, does it need better language. What this means is “the models appear to have things like continuous state of mind but they are really more like stop motion illusions”. This is still very true, maybe getting truer. There is a lot of anthropomorphizing about models (“agents” for example) that I think isn’t useful and is leading teams down blind alleys. We are still missing big pieces like self-simulation and continuous training: it’s better to think of the models as tools.

As for the “model can be used against itself” - that’s still very true. That just means that, since the models aren’t stateful, they can be used to evaluate their own output, or control themselves, etc. We use that technique all the time.
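
A minimal sketch of that technique (my own illustration, with `call_model` as a placeholder): the same stateless model generates a draft in one call and critiques it in another, and the code loops between the two.

```python
def call_model(prompt: str) -> str:
    """Placeholder for your LLM completion call."""
    raise NotImplementedError

def draft_and_check(task: str, max_rounds: int = 2) -> str:
    draft = call_model(f"Write a first draft: {task}")
    for _ in range(max_rounds):
        # Same (stateless) model, second call: act as a critic of its own output.
        critique = call_model(
            "Review this draft for factual errors and unclear claims. "
            "Reply with exactly OK if it is fine, otherwise list the problems.\n\n" + draft
        )
        if critique.strip() == "OK":
            break
        draft = call_model(f"Revise the draft to fix these problems:\n{critique}\n\nDraft:\n{draft}")
    return draft
```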

There has been a lot of change since I wrote those, and there are more useful models and techniques all the time. We probably need a new rule for how to deal with test-time compute, for example, and there are new kinds of generation (video) and UX (voice). So, things will continue to evolve, but overall, I don’t feel too bad about this.

If you can add to the list, let me know!

Rachel Berg

Operations executive | Microsoft, Blix, Boosted, PRTM, PwC, GE | MBA, Haas

2d

Fun read.

Damon Y.

Enterprise Transformation Leader | Aligning Technology, Business Operations, and People to Achieve Strategic Change

3d

This is really fascinating, and, frankly, I think each one is worth its own separate article or even whitepaper to fully evaluate. But for my money, #5 is the most interesting. Sam, do you have any concrete examples you can share to elaborate?

Jennifer Beckmann

Principal Engineering Manager at Microsoft

3d

My thoughts lately have been going to “Undo is essential”, especially as AI automation starts to catch on. All the lessons we have learned and continue to learn about rollout and mitigation haven’t yet been added to the AI automation lexicon.

Jose Luis Latorre

Developer Community Lead & Software Architect at Swiss Life AG | LinkedIn Learning Instructor | Microsoft AI MVP | K6 Champion | Speaker

3d

I loved the "Don't write code if the model can do it" one since I heard it from John Maeda last year... it is one of the most common-sense statements I have heard in a while. The only catch is that some models want to do "too much" and become non-standard, but it is a really good rule of thumb to follow in these AI days. Thanks for this!

Abram Jackson

PM for extensibility of Copilot for Microsoft 365

3d

Off the top of my head: "Conversation is the best prompting", "Missing context = slop", "Always use examples".
