Gm. Happy Monday folks! Thank you for all the Buenos Aires suggestions over the weekend. If you have any more, send them my way on Twitter (linked with my name).
โ Stephen Flanders

๐ AI
Introducing Sleeper Agents

Anthropic and OpenAI have been doing good work regarding AI safety lately.
Anthropic had their mechanistic interpretability breakthrough, and OpenAI had their weak-to-strong generalization discovery.
Now, Anthropic is back with another safety discovery, although this one is a bit scarier.
Basically, they have discovered that they can create malicious agents that can evade safety checks. They are calling these agents โsleeper agentsโ.
Hereโs the skinny:
- They programmed an agent to write good code when the year is 2023 but bad code when the year is 2024. 
- They then applied safety training to the model. 
- But, despite that safety training, the agent still misbehaved when the year in the prompt was 2024. 
- They then tried to attack the agent with adversarial training (eliciting unsafe behavior and then training to remove it). 
- But all that did was improve the agentโs ability to hide their evilness, which isโฆkinda scary. 
So, it seems the fear that the Nick Bostromโs of the world had about a model being able to hide its true evil intentions is true after all.
Now, there is some pushback on these results as Anthropic did, you know, train the model to be bad. Anthropic defends the paper by saying that the point of the paper is to show that they donโt know how to stop a model from doing bad things.
Ultimately, nobody has a clue what the correct answer is, which is what makes the field of AI alignment and safety so damn interesting.
At The End Of The Day: You should read the paper to generate your own conclusions.

๐ค THE LATEST INโฆ
TECH
- Ordering a Vision Pro isnโt going to be easy. 
- What the hell is going on with Carta. 
AI
- The military can now also become best friends with ChatGPT. 
- Microsoft really wants you to use Copilot. 
- Iโm sorry, but I cannot fulfill this request. 
- Is it time to speed up our AI timelines? 
SCIENCE
- Plants talking is pretty damn cool. 
- Who will get a general robotic brain first? 
- AI may have just saved 10,000 people every year. 
CRYPTO
- BlackRock CEO Larry Fink is now backing an Ether ETF. 
- So long, GameStop NFT marketplace. 
- Letโs think about fintech from first principles. 

๐โโ๏ธ QUICKIES
Raise: 1X, a robotic startup backed by OpenAI, received $100M in funding.
Stat: 9%: WhatsAppโs daily user growth in the US in 2023. Personally, I prefer good old iMessage.
Rabbit hole: How To Be More Agentic (Useful Fictions)

๐คฉ MONDAY MOTIVATION
What inspires you?

๐ง POLL
Gonna have to start asking you all for betting advice:

What do you all think about this one?
Will you be buying a rabbit r1?

๐ ๏ธ FOUNDERS CORNER
The best resources we came across this weekend that will help you become a better founder, builder, or investor.
๐ LinkedLeads finds leads from your LinkedIn connections.
๐โโ๏ธ SkimAI reimagines your email.

๐ DOPAMINE HIT
We old.
@mrgrandeofficial 2014 is TEN YEARS OLD ๐ญ letโs revist THIS. Where were you a decade ago? #2014 #2024 #recap #rap

HOW WAS TODAY'S NEWSLETTER?

REACH 40K+ FOUNDERS, INVESTORS & OPERATORS
If youโre interested in advertising with us, send an email over to [email protected] with the subject โHomescreen Adsโ.