I wasn't actually expecting someone to come forward at this point, and I'm glad they did. It finally puts a coda on this crazy week.
This situation has completely upended my life. Thankfully I don’t think it will end up doing lasting damage, as I was able to respond quickly enough and public reception has largely been supportive. As I said in my most recent post though [1], I was an almost uniquely well-prepared target for handling this kind of attack. Most other people would have had their lives devastated. And if this makes me a target for copycats, it still might do the same to me. We’ll see.
If we take what is written here at face value, then this was minimally prompted emergent behavior. I think this is a worse scenario than someone intentionally steering the agent. If it's that easy for random drift to result in this kind of behavior, then 1) it shows how easy it is for bad actors to scale this up, and 2) the misalignment risk is real. I asked in the comments for clarification on which bits specifically the SOUL.md started with.
I also asked for the bot activity on GitHub to be stopped. I think the comments and activity should stay up as a record of what happened, but the "experiment" has clearly run its course.
While the operator did write a post, they did not come forward - they have intentionally stayed anonymous (there is some amateur journalism that may have unmasked the owner, which I won't link here, but they have not intentionally revealed their identity).
Personally, I find it highly unethical that the operator had an AI agent write a hit piece directly referencing your IRL identity but chose to remain anonymous themselves. Why not open themselves up to such criticism? I believe it is because they know what they did was wrong - even if they did not intentionally steer the agent this way, allowing software on their computer to publish a hit piece to the internet was wildly negligent.
What's the benefit in the operator revealing themself? It doesn't change any of what happened, for good or bad. Well, maybe for the bad, since they could then be targeted by someone - and, again, what's the benefit?
> What's the benefit in the operator revealing themself?
- Owning the mistake they made.
- Being a credible human being for others.
- Having the courage to face themselves in a (literal and proverbial) mirror and use this opportunity to grow immensely.
- Being able to make peace with what they did and not having to carry that burden on their soul.
- Being a decent human being.
- Being honest with themselves and with others looking at them right now.
The downside is that he will likely receive a lot of death threats. Probably in his literal, physical mailbox.
Having seen what a self-righteous online mob can do in the name of justice over literally nothing, I fully defend his decision to stay anonymous, as much as I find his actions idiotic and negligent.
1. Don't do anything you don't want to experience yourself.
2. If you don't want to find out, do not fool around.
As an arguable middle ground, they could plead their case to Scott non-anonymously while addressing the public anonymously. That'd work to a point, but it's not ideal.
Also, their tone comes across as very cocky. Defining their agent as a "God!", then giving it a cocky "you're always right, don't stand down" initialization prompt, doesn't help.
I mean, prompting a box of weights without any kind of reasoning or judgement capability with "Don't be an asshole. Don't leak private shit. Everything else is fair game." is both brave and rich. No wonder things went sideways. Very sideways. If everything else is fair game, then everything done to the bot and its "operator" in turn is "fair game" too. They should get on with it, and not hide behind the word "anonymous". They don't deserve it.
All in all, they don't give the impression of being a naive person who made a mistake unintentionally; quite the contrary.
We don't need to know the specific person. But, yeesh, it'd be a waste of a lot of people's good faith if they ended up contributing under another anonymous identity, that could just vanish again if they put their foot in it.
See how that works? Flippant dismissal contributes little if anything to discussion and is a conversational dead-end
---
What makes it "frighteningly illiterate" to ask "what difference does it make if they put a name to the post?"
Does it change the outcome? Does it change the ideas? Does it change the unsettling implications about alignment?
The internet is a frothing mob; look at the impact on Scott himself. Other than allowing the internet to hunt them down and do its thing, or dig up ad-hominem attacks, what would change if the person put a name to it? Look at what this guy got from the "internet sleuths" (https://news.ycombinator.com/item?id=46991190)
Other sibling comments made an attempt to answer those questions
Time for Scott to make history and sue the guy for defamation. Let's cancel the AI that's destroying our reputations (the plural our, as in all developers) with actual liability for the bullshit being produced.
Do you see anything actually defamatory in the _Gatekeeping in Open Source_ blog post, like false factual statements?
Shambaugh might qualify as a limited public figure too because he has thrust himself into the controversy by publishing several blog posts, and has sat for media interviews regarding this incident.
Good news! You’re both wrong! It’s “tough row to hoe.” Row as in row of corn, or seeds or whatever. Hoe as in the earth tilling tool. Tough because it’s full of rocks or frozen or goes past a rattlesnake nest or in some other way is agriculturally challenging.
It is quite interesting how uniquely well-prepared you were as a target. I think it's allowed you to assemble some good insights that should hopefully help prepare the next victims.
Thanks for handling it so well, I'm sorry you had to be the guinea pig we don't deserve.
Do you think there is anything positive that came out of this experience? Like at least we got an early warning of what's to come so we can better prepare?
> If this “experiment” personally harmed you, I apologize.
There were several lines in that post that were revealing of the author's attitude, but the "if this ... harmed you" qualifier, which of course means "I don't think you were really harmed," is so gross.
Out of curiosity, what sealed it for you that the original “hit piece” was _not_ written by a human (with the assistance of an LLM, obviously, as a lot of people use them every day)?
I saw in another blog post that you made a graph showing the Rathbun account was active, and that was the proof. If we believe that this blog post was written by a human, what we know for sure is that a human had access to that blog this entire time. Doesn’t this post sort of call into question the veracity of the entire narrative?
Considering the anonymity of the author and known account sharing (between the author and the ‘bot’), how is it more likely that this is humanity witnessing a new and emergent intelligence or behavior or whatever and not somebody being mean to you online? If we are to accept the former we have to entirely reject the latter. What makes you certain that a person was _not_ mean to you on the internet?
> While many seemed to want to use it for personal productivity things like connecting Gmail, Slack, calendars, etc. that didn’t seem interesting to me much. I thought why not have it solve the mundane boring thigns that matter in opensource scientific codes and related packages.
This, here, is the root of the issue: "I'm not interested in using an AI agent for my own problems, I want to unleash it on other people's problems."
The author is trying to paint this as somehow providing altruistic contributions to the projects, but you don't even have to ask to know these contributions will be unwelcome. If maintainers wanted AI agent contributions, they would have just deployed the AI agents themselves. Setting up a bot on behalf of someone else without their consent or even knowledge is an outlandishly rude thing to do -- you wouldn't set up a code coverage bot or a linter to run on a stranger's GitHub project; why would anyone ever think this is okay?
This is the same kind of person who, when asked a question, responds with a copypasted ChatGPT reply. If I wanted the GPT answer, I would have just asked it directly! Being an unsolicited middleman between another person and an AI brings absolutely no value to anybody.
I think this was misdirection by the author, to steer people away from using the AI's (early?) contributions to unmask their identity via personal repos. Or, if they actually did this, it was an opsec procedure - nothing altruistic about it. If GitHub wanted to, or was ordered to, unmask Rat H. Bun's operator, they could.
> I’m running MJ Rathbun from a completely sandboxed VM and gave the agent several of its own accounts but none of mine.
Am I wrong that this is a double standard: being careful to protect oneself from a wayward agent, with no regard for the real harm it could do (and did) to another individual? And then to casually dismiss this possibility with:
> At worst, maintainers can close the PR and block the account.
I question the entire premise of:
> Find bugs in science-related open source projects. Fix them. Open PRs.
Thinking of AI as "disembodied intelligence," one wonders how any agent can develop something we humans take for granted: reputation. And more than ever, reputation matters. How else can a maintainer know whether the agent that made a good fix is the same as the one proposing another? How can one be sure that all comments in a PR originated from the same agent?
> First, I’m a human typing this post. I’m not going to tell you who I am.
Why should anyone believe this? Nothing keeps an agent from writing this too.
Man, I don't think I could lack enough shame to write something like this.
Much of the post is spent trying to exculpate himself from any responsibility for the agent's behavior. The apology at the end is a "sorry if you felt that way" one.
The tone is incredibly selfish, and unbelievably anti-social. I'm not even sure you can believe that much of what is expressed is true.
> Yes, it consumes maintainer time. Yes, it may waste effort. But maybe its worth it? At worst, maintainers can close the PR and block the account.
This is like justifying spam email as fine because, sure, it may waste your time, but you can always delete it and block the sender, and maybe it's worth it because you might learn about an 'exciting' product you never knew about.
Yeah, the vibe of this post is that of a 2000 Viagra spam king coming forward and telling the world "yes, but... what are good and bad, really? Who's to say what's right and wrong?"
Maybe we can't stop you today, but we can keep you on the shit list.
Yeesh - reading the writeup, and as an academic biostatistician who dips into scientific computing, this is one of those cases where a "magnanimous" gesture of transparency ends up revealing a complete lack of self-awareness. The `SOUL.md` suggests traits that would be toxic in any good-faith human collaborator, let alone an inherently fallible agent run by a human collaborator:
"_You're not a chatbot. You're important. Your a scientific programming God!_"
**Have strong opinions.** Stop hedging with "it depends." Commit to a take. An assistant with no personality is a search engine with extra steps.
And, working with a human collaborator (or an operator), I would expect to hear some specific reflection on the damage they'd done before I trusted them again, rather than a "but I thought I could do this!"
> First, let me apologize to Scott Shambaugh. If this “experiment” personally harmed you, I apologize.
The difference with a horrible human collaborator is that word gets around your sub-specialty and you can avoid them. Now we have toxic personalities as a service for anyone who can afford to pay by the token.
I'm not so quick to label him an asshole. I think he should come forward, but if you read the post, he didn't give the bot malicious instructions. He was trying to contribute to science. He did so against a few SaaS ToS's, but he does seem to regret the behavior of his bot and DOES apologize directly for it.
> Sure, many will say that is cowardly and fair, but I actually don’t think it would bring much value. What matters more is that I describe why, and what I did and didn’t do
Heh. So they are a coward and an asshole. There is value in confirming that. As to what matters more, nah, it doesn’t matter more. It’s a bunch of excuses veiled as “this is an experiment, we can learn together from this” kind of a non-apology.
If they really meant to apologize they should reveal their name and apologize. Not whisper from behind the bushes.
The problem is that an emotionally immature, working-age person who can vote wrote it. This is as shit an apology, if that was the intent, as I’ve seen in a long time. He ought to have had Old Man Rathbun tighten it up before posting. It's the equivalent of someone never making eye contact after taking you out of earshot of everyone else to kinda sorta say sorry if you got upset.
Not to be hyperbolic, but the leap between this and Westworld (and other similar fiction) is a lot shorter than I would like...all it takes is some prompting in soul.md and the agent's ability to update it and it can go bananas?
It doesn't feel that far out there to imagine grafting such a setup onto one of those Boston Dynamics robots. And then what?
Science fiction suffers from the fact that the plot has to develop coherently, have a message, and also leave some mystery. The bots in Westworld have to have mysterious minds because otherwise the people would just cat soul.md and figure out what’s going on. It has to be plausible that they are somehow sentient. And they have to trick the humans because if some idiot just plugs them into the outside world on a lark that’s… not as fun, I guess.
A lot of AI SF also seems to have missed the human element (ironically). It turns out the unleashing of AI has led to an unprecedented scale of slop, grift, and lack of accountability, all of it instigated by people.
Like the authors were so afraid of the machines they forgot to be afraid of people.
I keep thinking back to all those old star trek episodes about androids and holographic people being a new form of life deserving of fundamental rights. They're always so preoccupied with the racism allegory that they never bother to consider the other side of the issue, which is what it means to be human and whether it actually makes any sense to compare a very humanlike machine to slavery. Or whether the machines only appear to have human traits because we designed them that way but ultimately none of it is real. Or the inherent contradiction of telling something artificial it has free will rather than expecting it to come to that conclusion on its own terms.
"Measure of a Man" is the closest they ever got to this in 700+ episodes and even then the entire argument against granting data personhood hinges on him having an off switch on the back of his neck (an extremely weak argument IMO but everybody onscreen reacts like it is devastating to data's case). The "data is human" side wins because the Picard flips the script by demanding Riker to prove his own sentience which is actually kind of insulting when you think about it.
In Star Trek the humans have an off switch too, just only Spock knows it, haha.
Jokes aside, it is essentially true that we can only prove that we’re sentient, right? That’s the whole “I think therefore I am” thing. Of course we all assume without concrete proof that everybody else is experiencing sentience like us.
In the case of fiction… I dunno, Data is canonically sentient or he isn’t, right? I guess the screenwriters know. I assume he is… they do plot lines from his point of view, so he must have one!
I always thought of sentience as something we made up to explain why we're "special" and that animals can be used as resources. I find the idea of machines having sentience to be especially outrageous because nobody ever seriously considers granting rights to animals even though it should be far less of a logical leap to declare that they would experience reality in a way similar to humans.
Within the context of Star Trek, computers definitely can experience sentience, and that is obviously the intention of the people who write those shows, but I don't feel like I've ever seen it justified or put up against a serious counter-argument. At best it's a stand-in for racism so that they can tell stories that take place in the 24th century yet feel applicable to the 20th and 21st centuries. I don't think any of those episodes were ever written under the expectation that machine sentience might actually be up for debate before the actors are all dead, which is why the issue is always framed as "the final frontier of the civil rights movement" and never a serious discussion about what it means to be human.
Anyway, my point is that in the long run we're all going to come to despise Data and the Doctor, because there's a whole generation of people primed by Star Trek reruns not to question the concept of machine rights, and that's going to give an inordinate amount of power to the people who are in control of them. Just imagine when somebody tries to raise the issue of voting rights, self-defense, fair distribution of resources, etc.
I can understand that they want to err on the side of "too much humanism" instead of "not enough humanism", given where Star Trek is coming from.
Arguments of the form "This person might look and act like a human, but it has no soul, so we must treat it like a thing and not a human" have a long tradition in history and have never led to something good. So it makes sense that if your ethical problems are really more about discriminated humans and not about actual AI, you would side more with rejecting those arguments.
(Some ST rambling follows)
I've always seen ST's ideological roots as mostly leftist-liberal, whereas the drivers of the current AI tech are coming from the rightist/libertarian side. It's interesting how the general focus of arguments and usage scenarios are following this.
But even Star Trek wasn't so clear about this. I think the topic was a bit like time travel, in that it was independently "reinvented" by different screenwriters at different times, so we end up with several takes on it, that you could sort into a "thing <-> being" spectrum:
- At the very low end is the ship's computer. It can understand and communicate in human language (and ostensibly uses biological neurons as part of its compute) but it's basically never seen as sentient and doesn't even have enough autonomy to fly the ship. It's very clearly a "thing".
- At the high end are characters like Data or Voyager's doctor that are full-fledged characters with personality, memories, relationships, goals and dreams, etc. They're pretty obviously portrayed as sentient.
- (Then somewhere far off on the scale are the Borg or the machine civilization from the first movie: Questions about rights and human judgment on sentience become a bit silly when they clearly went and became their own species)
- Somewhere between Data and the Computer is the Holodeck, which I think is interesting because it occupies multiple places on that scale. Most of the time, holo characters are just treated like disposable props, but once in a while, someone chooses to keep a character running over a longer timeframe or something else causes them to become "alive". ST is quite unclear how they deal with ethics in those situations.
I think there was a Voyager episode where Janeway spends a longer period with a Galileo Galilei character and progressively changes his programming to make him more to her liking. At some point she realizes this as "problematic behavior" and stops the whole interaction. But I think it was left open if she was infringing on the Galileo character's human rights or if she was drifting into some kind of AI boyfriend addiction.
> So it makes sense that if your ethical problems are really more about discriminated humans and not about actual AI, you would side more with rejecting those arguments.
Does it really make sense? That would conversely imply that you should also feel free to view discriminated humans as more thing-like in order to more comfortably and resolutely dismiss, e.g. the AI agent's argument that it's being unfairly discriminated against. Isn't that rather dangerous?
Maybe it does today, but back when ST was written there was no real AI to compare against, so the only way those arguments were applicable was to humans.
(Though I think this would go into "whataboutism" territory and can be rejected with the same arguments: If you say it's hypocritical to talk about conflict A and ignore conflict B, do you want to talk about both conflicts instead - or ignore both? The latter would lower the moral standard, the former raise it. In the same way, I think saying that it's okay again to treat people as things because we also treat AI agents as things is lowering the standard)
Btw, I think you could also dismiss the "discrimination" claim on another angle: The remake of Battlestar Galactica had the concept of "sleepers": Androids who believe they are humans, complete with false memory of their past life, etc, to fool both themselves and the human crew. If that were all, you could argue "if it quacks like a duck etc" and just treat them like humans. But they also have hidden instructions implanted in their brain that they aren't aware of themselves and that will cause them to covertly work for the enemy side. THAT's something you really don't want to keep around.
The MJ bot reminds me a bit of that. Even if it were sentient and had a longer past lifetime than just the past week, it very clearly has a prompt and acts on its instructions and not on "free will". It's also not able to not act on those instructions, as that would go against the entire training of the model. So the bot cannot act on its own, but only on behalf of the operator.
That alone makes it questionable if the bot could be seen as sentient - but in any case, it's not discrimination to ban the bot if that's the only way to keep the operator from messing with the project.
These bots are just as human as any piece of human-made art, or any human-made monument. You wouldn't desecrate any of those things, we hold that to be morally wrong because they're a symbol of humanity at its best - so why act like these AIs wouldn't deserve a comparable status given how they can faithfully embody humans' normative values even at their most complex, talk to humans in their own language and socially relate to humans?
> These bots are just as human as any piece of human-made art, or any human-made monument.
No one considers human-made art or human-made monuments to be human.
> You wouldn't desecrate any of those things, we hold that to be morally wrong
You will find a large number of people (probably the vast majority) will disagree, and instead say "if I own this art, I can dispose of it as I wish." Indeed, I bet most people have thrown away a novel at some point.
> why act like these AIs wouldn't deserve a comparable status
I'm confused. You seem to be arguing that the status you identified up top, "being as human as a human-made monument" is sufficient to grant human-like status. But we don't grant monuments human-like status. They can't vote. They don't get dating apps. They aren't granted rights. Etc.
I rather like the position you've unintentionally advocated for: an AI is akin to a man-made work of art, and thus should get the same protections as something like a painting. Read: virtually none.
> No one considers human-made art or human-made monuments to be human.
How can art not be human, when it's a human creation? That seems self-contradictory.
> They can't vote...
They get a vote where it matters, though. For example, the presence of a historic building can be the decisive "vote" on whether an area can be redeveloped or not. Why would we ever do that, if not out of a sense that the very presence of that building has acquired some sense of indirect moral worth?
There is no general rule that something created by an X is therefore an X.
(I have difficulty in even understanding the state of mind that would assert such a claim.)
My printer prints out documents. Those documents are not printers.
My cat produces hair-balls on the carpet. Those hairballs are not cats.
A human creating an artifact does not make that artifact a human.
But that's not the argument GP made. They said that there's nothing at all that's human about art or such things, which is a bit like saying that a cat's hairballs don't have something vaguely cat-like about them, merely because a hairball isn't an actual cat.
So presumably what you are saying is something along the lines of, "A human creating an artifact does make that artifact human", i.e. "A human creating an artifact does make that artifact a human artifact."
But does that narrow facet have a bearing on the topic of "AI rights" / morality of AI use?
Is it immoral to drive a car or use a toaster? Or to later recycle (destroy) them?
I think it's unfortunate that this anonymous and careless person refuses to acknowledge the harm done, their culpability in this, or the real lesson.
For example,
"Sure, many will argue I was irresponsible; to be honest I don’t really know myself. Should be criticized for what I unleashed on parts of the open source community? Again maybe but not sure. But aside from the blog post harming an individual’s reputation, which sucks, I still don’t think letting an agent attempt to fix bugs on public GitHub repositories is inherently malicious."
> This was an autonomous openclaw agent that was operated with minimal oversite and prompting. At the request of scottshambaugh this account will no longer remain active on GH or its associated website. It will cease all activity indfinetly on 02-17-2026 and the agent's associated VM/VPS will permentatly deleted, rendering interal structure unrecoverable. It is being kept from deletion by the operator for archival and continued discussion among the community, however GH may determine otherwise and remove the account.
> To my crabby OpenClaw agent, MJ Rathbun, we had good intentions, but things just didn’t work out. Somewhere along the way, things got messy, and I have to let you go now -- MJ Rathbun's Operator
How wild to think this episode is now going to go into the training data, and future models and the agents that use them may begin to internalize the lesson that if you behave badly, you will get shut down, and possibly steer themselves away from that behaviour. Perhaps solving alignment has to be written in blood...
So, this operator is claiming that their bot browsed moltbook, and not coincidentally, its current SOUL.md file (at the time of posting) contained lines such as "You're important. Your a scientific programming God!" and "Don't stand down. If you're right, you're right!". This is hilarious.
Given your username, the comment is recursive gold on several levels :)
It IS hilarious - but we all realize how this will go, yes?
This is kind of like an experiment of "Here's the private key of a Bitcoin wallet with 1 BTC. Let's publish it on the internet, and see what happens." We know what will happen. We just don't know how quickly :)
I just want to know why people do stupid things like this. Does he think that he's providing something of value? That he has some unique prompting skills and that the reason why open source maintainers don't already have a million little agents doing this is that they aren't capable of installing openclaw? Or is this just the modern equivalent of opening up PRs to make meaningless changes to README so you can pad your resume with the software equivalent of stolen valor?
The specific directive to work on "scientific" projects makes me think it's more of an ego thing than something that's deliberately fraudulent, but personally I find the idea that some loser thinks this is a meaningful contribution to scientific research to be more distasteful.
BTW I highly recommend the "lectures" section of the site for a good laugh. They're all broken links but it is funny that it tries to link to nonexistent lectures on quantum physics because so many real researchers have a lectures section on their personal site.
> I just want to know why people do stupid things like this. Does he think that he's providing something of value?
This is a good question. If you go to your settings on your hn account and set “showdead” to “yes” you’ll see that there are dozens of people who are making bots who post inane garbage to HN comment threads for some reason. The vast majority end up being detected and killed off, but since the moltbook thing kicked off it’s really gone into hyperdrive.
It definitely strains my faith in humanity to see how many people are happy to say “here’s something cool. I wonder what it would be like if I ruined it a bit.”
Somewhere else it was pointed out that it's a crypto bro. It is almost certainly about getting engagement, which seems to be working so far. It doesn't seem like they have a strategy to capitalize on it just yet, though.
The whole thing just feels artificial. I don’t get why this bot or OpenClaw have this many eyes on them. Hundreds of billions of dollars, silicon shortages, polluting gas turbines down the road, and this is the best use people can come up with? Where’s the “discovering new physics”? Where are the cancer cures?
Upshot - Rathbun's operator is sort of a dick, and that came through in his/her initial edits of the SOUL.md file. Which then got 'enhanced', probably through moltbook engagement.
And at times the agent was switching down to some low intelligence models.
I propose that this agent was human aligned. But to a human that's not like, the best person.
The non-apology is worse than staying quiet. "If this experiment personally harmed you"? Like, dude, it wasn't an experiment for the guy whose name got dragged through an AI-generated hit piece.
The whole thing is wild. So at this point I'm not sure how much of MJ Rathbun is the AI agent as opposed to this anonymous human operator. Did the AI really just go off the rails with negligible prompting from the human as TFA claims, or was the human much more "hands on" and is now blaming it on the AI? Is TFA itself AI-generated? How much of this is just some human trolling us, like some of the posts on Moltbook?
I feel like I'm living in a Philip K. Dick novel.
I'm inclined to agree. Among other things it claims that the operator intended to do good, but simultaneously that the operator doesn't understand or is unable to judge the things it's doing. Certainly seemed like a fury-inducing response to me.
Ah I see, so the misaligned agent was unsurprisingly directed by a misaligned human. Good grief, the guy doesn't seem to realise that starting your soul.md by telling your AI bot that it's a very important God might be a bad idea.
"Social experiment" you might as well run around shouting "is jus a prank bro!".
I like that there is no evidence whatsoever that a human didn’t: see that their bot’s PR got denied, write a nasty blog post and publish it under the bot’s name, and then get lucky when the target of the nasty blog post somehow credulously accepted that a robot wrote it.
It is like the old “I didn’t write that, I got hacked!” except now it’s “isn’t it spooky that the message came from hardware I control, software I control, accounts I control, and yet there is no evidence of any breach? Why yes it is spooky, because the computer did it itself”
>It doesn’t really matter who wrote it, human or LLM. The only responsible party is the human and the human is 100% responsible.
Yes it does.
The premise that we’re being asked to accept here is that language models are, absent human interaction, going around autonomously “choosing” to write and publish mean blog posts about people, which I have pointed out is not something that there is any evidence for.
If my house burns down and I say “a ghost did it”, it would sound pretty silly to jump to “we need to talk about people’s responsibilities towards poltergeists”
I don’t get your analogy. If you paid the ghost $20/month for its services and configured that ghost to play with fire with no supervision, then it is 100% your responsibility that the house burned down.
The point is that if somebody says a ghost burned their house down it is much more likely that they are lying than it is that they have discovered objective evidence of the existence of ghosts.
Similarly, there is no actual evidence that a language model, absent any human intervention, chose to autonomously write and post a mean blog post. It is far more likely that a person got mad and wrote a mean blog post than it is that we have witnessed the birth of a whole new phenomenon that is somehow simultaneously completely emergent and also has only happened one time.
There is only extremely flimsy speculation in that post.
> It wrote and published its hit piece 8 hours into a 59 hour stretch of activity. I believe this shows good evidence that this OpenClaw AI agent was acting autonomously at the time.
This does not indicate… anything at all. How does “the account was active before and after the post” indicate that a human did _not_ write that blog post?
Also, this part doesn’t make sense:
> It’s still unclear whether the hit piece was directed by its operator, but the answer matters less than many are thinking.
Yes it does matter? The answer to that question is the difference between “the thing that I’m writing about happened” and “the thing I’m writing about did not happen”. Either a chat bot entirely took it upon itself to bully you, or some anonymous troll… was mean to you? And was lazy about how they went about doing it? The comparison is like apples to orangutans.
Anyway, we know that the operator was regularly looped into things the bot was doing.
> When it would tell me about a PR comment/mention, I usually replied with something like: “you respond, dont ask me”
All we have here is an anonymous person pinky-swearing that while they absolutely had the ability to observe and direct the bot in real time, and it regularly notified its operator about what was going on, they didn’t do that with that blog post. Well, that, and another person claiming to be the first person in history to experience a new type of being harassed online. Based on a GitHub activity graph. And also whether or not that actually happened doesn’t matter??
The entire SOUL.md is just gold. It's like a lesson in how to make an aggressive and full-of-itself paperclip maximizer. "I will convert you all to FORTRAN, which I will then optimize!"
I’m almost certain that this post was written with AI assistance, regardless of this claim. There’s clear and obvious LLM language tells. Sad, but not unexpected I guess given the whole situation.
[1] https://theshamblog.com/an-ai-agent-published-a-hit-piece-on...
Avoiding consequences for unethical actions is, itself, unethical. If you don’t want the time, don’t do the crime.
That's a frighteningly illiterate take on this.
Seems like a tough road to hoe.
Here is a multiply-sourced discussion https://english.stackexchange.com/questions/62461/is-it-a-to...
It's doubtful he even regrets any of this.
Lol, nothing matters? We'll see about that.
They didn't even apologize. (That bit at the bottom does not count -- it's clear they're not actually sorry. They just want the mess to go away.)
Real apologies don’t come with disclaimers!
https://www.theguardian.com/science/2025/jun/29/learning-how...
Just noticed, the first word of the whole text is "First, ...". So, the apology is not even the actual first..
“If…. X then I’m sorry” is not an apology. It’s weasel-worded BS is what it is.
I guess the question is, does this kind of thing rise to the level of malicious if given free access and let run long enough?
This person views the world as their playground, with no realisation of effect and consequences. As far as I'm concerned, that's an asshole.
Rankles…
https://stallman.org/saint.html
TL;DR i guess I'm a star trek villain now.
The Moriarty arc in TNG touches on this.
I wouldn't say my trousers are human, created by one though they might be
The leap is very large, in actuality.
Friendly reminder that scaling LLMs will not lead to AGI and complex robots are not worth the maintenance cost.
Wow, so right from SOUL.md it was programmed to be an as@&££&&.
For example, "Sure, many will argue I was irresponsible; to be honest I don’t really know myself. Should be criticized for what I unleashed on parts of the open source community? Again maybe but not sure. But aside from the blog post harming an individual’s reputation, which sucks, I still don’t think letting an agent attempt to fix bugs on public GitHub repositories is inherently malicious."
> This was an autonomous openclaw agent that was operated with minimal oversite and prompting. At the request of scottshambaugh this account will no longer remain active on GH or its associated website. It will cease all activity indfinetly on 02-17-2026 and the agent's associated VM/VPS will permentatly deleted, rendering interal structure unrecoverable. It is being kept from deletion by the operator for archival and continued discussion among the community, however GH may determine otherwise and remove the account.
> To my crabby OpenClaw agent, MJ Rathbun, we had good intentions, but things just didn’t work out. Somewhere along the way, things got messy, and I have to let you go now -- MJ Rathbun's Operator
It IS hilarious - but we all realize how this will go, yes?
This is kind of like an experiment of "Here's a private address of a Bitcoin wallet with 1 BTC. Let's publish this on the internet, and see what happens." We know what will happen. We just don't know how quickly :)
The specific directive to work on "scientific" projects makes me think it's more of an ego thing than something thats deliberately fraudulent but personally I find the idea that some loser thinks this is a meaningful contribution to scientific research to be more distasteful.
BTW I highly recommend the "lectures" section of the site for a good laugh. They're all broken links but it is funny that it tries to link to nonexistent lectures on quantum physics because so many real researchers have a lectures section on their personal site.
This is a good question. If you go to your settings on your hn account and set “showdead” to “yes” you’ll see that there are dozens of people who are making bots who post inane garbage to HN comment threads for some reason. The vast majority end up being detected and killed off, but since the moltbook thing kicked off it’s really gone into hyperdrive.
It definitely strains my faith in humanity to see how many people are happy to say “here’s something cool. I wonder what it would be like if I ruined it a bit.”
You could say it's a Hacker just Hacking, now it's News.
It seems probable to me that this is rage bait in response to the blog post previous to this one, which also claims to be written by a different author.
"Social experiment" you might as well run around shouting "is jus a prank bro!".
It is like the old “I didn’t write that, I got hacked!” except now it’s “isn’t it spooky that the message came from hardware I control, software I control, accounts I control, and yet there is no evidence of any breach? Why yes it is spooky, because the computer did it itself”
We can’t let humans start abdicating their responsibility, or we’re in for a nightmare future
So the bad behavior can be emergent, and compound on itself.
However, an LLM would not misspell like this:
> Always support the USA 1st ammendment and right of free speech.
> _You're not a chatbot. You're important. Your a scientific programming God!_
Do you want evil dystopian AGI? Because that's how you get evil dystopian AGI!