zclaw: personal AI assistant in under 888 KB, running on an ESP32

(github.com)

85 points | by tosh 12 hours ago

14 comments

roxolotl 2 hours ago
This is a great example of how silly this whole thing is. There’s next to nothing to these claws. Turns out that if you give an llm the ability to call APIs they will.

godelski 1 hour ago

Wow, the rare

  bash <(curl foo.sh)

pattern. As opposed to the more common

  curl foo.sh | bash

Equivalent but just as unsafe. If you must do this instead try one of these

  # Gives you a copy of the file, but still streams to bash
  curl foo.sh | tee /tmp/foo.sh | bash
  # No copy of file but ensures stream finishes then bash runs
  bash -c "$(curl foo.sh)"
  # Best: Gives copy of file and ensures stream finishes
  curl foo.sh -o /tmp/foo.sh && bash $_

I prefer the last one

[-]

wakawaka28 1 hour ago
If you want to be super pedantic, try to make the command shell-agnostic in case the user is not running bash already.

throwa356262 4 hours ago
"LLM backends: Anthropic, OpenAI, OpenRouter."
And here I was hoping that this was local inference :)
[-]
- micw 4 hours ago
  Sure. Why purchase a H200 if you can go with an ESP32 ^^
  [-]
  - sigmoid10 2 hours ago
    Blowing more than 800kb on essentially an http api wrapper is actually kinda bad. The original Doom binary was 700kb and had vastly more complexity. This is in C after all, so by stripping out nonessential stuff and using the right compiler options, I'd expect something like this to come in under 100kb.
    [-]
    - epcoa 44 minutes ago
      > vastly more complexity.
      Doom is ingenious, but it is not terribly complex IMHO, not compared to a modern networking stack including WiFi driver. The Doom renderer charm is in its overall simplicity. The AI is effective but not sophisticated.
    - pitched 2 hours ago
      Doom had the benefit of an OS that included a lot of low-level bits like a net stack. This doesn’t! That 800kB includes everything it would need from an OS too.
      [-]
      - __tnm 2 hours ago
        yah my back of the envelope math..
        the “app logic”/wrapper pieces come out to about 25kb
        WiFi is 350 Tls is 120 and certs are 90!
    - __tnm 2 hours ago
      yeah i sandbagged the size just a little to start (small enough to fit on the c3, 888 picked for good luck & prosperity; I even have a build that pads to get 888 exactly), so i can now try reduce some of it as an exercise etc.
      but 100kb you’re not gonna see :) this has WiFi, tls, etc. doom didn’t need those
- __tnm 3 hours ago
  haha well I got something ridiculous coming soon for zclaw that will kinda work on board.. will require the S3 variant tho, needs a little more memory. Training it later today.
- peterisza 4 hours ago
  right, 888 kB would be impossible for local inference
  however, it is really not that impressive for just a client
  [-]
  - Dylan16807 3 hours ago
    It's not completely impossible, depending on what your expectations are. That language model that was built out of redstone in minecraft had... looks like 5 million parameters. And it could do mostly coherent sentences.
    [-]
    - godelski 1 hour ago
      > built out of redstone in minecraft
      Ummm...
      > 5 million parameters
      Which is a lot more than 888kb... Supposing your ESP32 could use qint8 (LOL) that's still 1 byte per parameter and the k in kb stands for thousand, not million.
GTP 2 hours ago
I have a couple ESP32 with a very small OLED display, I'm now thinking I could make an "intelligent" version of the Tamagotchi with this. Do you HN crowd have other cool ideas?
[-]
- K0balt 1 hour ago
  That would be sweet. That the supermini type with the 0.46” display? Those are fun for lots of things.
v9v 3 hours ago
Relevant: https://github.com/sipeed/picoclaw
amelius 1 hour ago
Rust just called. They want their lobster back.
[-]
- bitwize 1 hour ago
  That's a crab. Get your crustaceans straight!
  [-]
  - sowbug 58 minutes ago
    Thanks for looking out for us.
yauneyz 3 hours ago
Genuinely curious - did you use a coding agent for most of this or does this level if performance take hand written code?
czardoz 1 hour ago
Really looking for a minimal assistant that works with _locally hosted models_. Are there any options?
[-]
- godelski 1 hour ago
  Depends what you mean.
  If you mean something that calls a model that you yourself host, then it's just a matter of making the call to the model which can be done in a million different ways.
  If instead you mean running that model on the same device as claw, well... that ain't happening on an ESP32...
  I think if you are capable of setting up and running a locally hosted model then I'd guess the first option needs no explanation. But if you're in the second case I'd warn you that your eyes are bigger than your mouth and you're going to get yourself into trouble.
- telescopeh 1 hour ago
  It really depends on what resources you have qwen-code-next will run them but you will need at least 64gb of memory to run it at a reasonable quant and context.
  Most of these agents support OpenAI/anthropic compatible endpoints.
- yoyohello13 1 hour ago
  Why are you looking? Just build one for yourself.
- Onavo 1 hour ago
  The bottleneck here is usually the locally hosted model, not the the assistant harness. You can take any off the shelf assistant and point the model URL at localhost, but if your local model doesn't have enough post training and fine tuning on agentic data, then it will not work. The AI Assistant/OpenClaw is just calling APIs in a for loop hooked up to a cron job.
  [-]
  - czardoz 1 hour ago
    Exactly. OpenClaw is good, but expects the model to behave in a certain way, and I've found that the local options aren't smart enough to keep up.
    That being said, my gut says that it should be possible to go quite far with a harness that assumes the model might not be quite good (and hence double-checks, retries, etc)
p0w3n3d 2 hours ago
My new DIY laptop has 400GB RAM accessible and it runs only esp32*
____
* Requires external ram subscription
alexalx666 2 hours ago
I think you can use C++ on esp32, that would make the code more readable
theturtletalks 4 hours ago
Is there a heartbeat alternative? I feel like this is the magic behind OpenClaw and what gives it the "self-driven" feel.
g947o 4 hours ago
Serious question: why? What are the use cases and workflows?
[-]
- eleventyseven 3 hours ago
  The various *claws are just a pipe between LLM APIs and a bunch of other API/CLIs. Like you can have it listen via telegram or Whatsapp for a prompt you send. Like to generate some email or social post, which it sends to the LLM API. Get back a tool call that claw then makes to hit your email or social API. You could have it regularly poll for new emails or posts, generate a reply via some prompt, and send the reply.
  The reason people were buying a separate Mac minis just to do open claw was 1) security, as it was all vibe coded, so needs to be sandboxed 2) relay iMessage and maybe 3) local inference but pretty slowly. If you don't need to relay iMessage, a raspberry pi could host it on its own device. So if all you need is the pipe, an ESP32 works.
  [-]
  - yoyohello13 2 hours ago
    I’m running my own api/LLM bridge (claw thing) on a raspberry pi right now. I was struggling to understand why all the Mac mini hype when nobody is doing local inference. I just use a hook that listens for email. Email is especially nice because all the conversation/thread history tracking is built in to email already.
  - grigio 2 hours ago
    yeah i still can't believe many people bought a mac mini just for the claw hype
- grzracz 3 hours ago
  I don't fully get it either. At least agents build stuff, claws just run around pretending to be alive?
  [-]
  - codazoda 2 hours ago
    They do build things. The same things.
- milar 4 hours ago
  for fun!
johnea 4 hours ago
I don't really need any assistance...
[-]
- throwa356262 4 hours ago
  Me neither.
  But I have 10-15 ESP32's just waiting for a useful project. Does HN have better suggestions?
  [-]
  - cameron_b 3 hours ago
    desk rover - https://www.huyvector.org/diy-cute-desk-robot-mo-chan
    a kid-pleaser at the very least
  - pacifika 4 hours ago
    Build a synthesizer
  - brcmthrowaway 4 hours ago
    Why do you have so many? eWaste..
    [-]
    - fragmede 2 hours ago
      I need 1, but they come in packs of 10.
- iwontberude 3 hours ago
  No, no, but we insist!
bensyverson 3 hours ago
This is absolutely glorious. We used to talk about "smart devices" and IoT… I would be so curious to see what would happen if these connected devices had a bit more agency and communicative power. It's easy to imagine the downsides, and I don't want my email to be managed from an ESP23 device, but what else could this unlock?
[-]
- K0balt 1 hour ago
  A highly opinionated thermostat?
  [-]
  - bensyverson 1 hour ago
    Or how about a robot vacuum that knows not to turn on during important Zoom calls? Or a fridge that Slacks you when the defroster seems to be acting up?