AI Reading for Tuesday November 11

Better to seek forgiveness from your mutant super-offspring than to ask permission? Literal Frankenstein stuff?

The combination of MAGA politics and AI and the tech bubble offers only short-term political protection for Donald Trump, not long-term stability. - Financial Times

Trump AI adviser David Sacks alleges conspiracy for anti-AI bad vibes. Researchers say they just think some nonpartisan risk management is in order. Says $1 billion "Doomer Industrial Complex" of Effective Altruism donors includes Dustin Moskovitz and Jaan Tallinn, funds hundreds of think tanks. - Fortune

Practitioners use automated metrics (BLEU, ROUGE-L, BERTScore, perplexity), benchmarks (MMLU, GSM8K, TruthfulQA), verifiers and human review to evaluate LLMs, and industry trends favor hybrid pipelines combining benchmarks, LLM-judges and audits to balance quality, safety and cost. - MachineLearningMastery.com

Kimi K2 Thinking climbs into lmarena.ai top 10, tied with z.ai for top open source slot, and new Ernie, MiniMax M2 still MIA.

AI slop hits new high as fake country artist hits #1 on Billboard digital songs chart. - The Register

Data Miner's Daughter

I Walk The ROC Line

Wichita Optical Fiber Man

Overfit on You

Whiskey Matrix Multiply

These Bootstraps Were Made for Walking

I Shaved My Legs for This AI Cowboy?

Brave AI doggie def going viral on Facebook - Reddit

Comet can't do anything b/c of sandboxing, still vulnerable to prompt injection - Schneier on Security

I'm not really a fan of agentic browsers. Back in the day I wanted to automate some research flows and figured I could write a Chrome extension to do it, since it would just look like me clicking around. It was a struggle. Some stuff worked, like a hotkey to grab info from LinkedIn and update the CRM (until they changed the layout).

But it didn't work great. Browser JS was pretty lobotomized for safety: cross-site scripting restrictions, can't really run anything locally. Eventually I switched to tools like Selenium and Playwright, which work better even if you get blocked sometimes.

Today, in Claude Code, you can say 'search for articles on this, download them, apply a prompt.'

This works sort of OK out of the box: it can write bash and Python code to do what you want. When accessing the Web it asks you whether each site is safe and whether it's OK to run bash commands. If you give it some proper scripts as skills, it will probably work OK and also be safe, without needing full ability to write and run code.
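To make the 'ask before each site, ask before bash' idea concrete, here is a minimal sketch of that kind of permission gate. It is not how Claude Code is actually implemented; the allowlists and the function names `url_decision` and `command_decision` are made up for illustration.

```python
import shlex
from urllib.parse import urlparse

# Hypothetical allowlists for illustration; not Claude Code's real configuration.
ALLOWED_DOMAINS = {"arxiv.org", "en.wikipedia.org"}
ALLOWED_COMMANDS = {"ls", "cat", "grep"}

# Shell syntax that chains, pipes, redirects, or substitutes commands.
SHELL_OPERATORS = (";", "|", "&", ">", "<", "`", "$(")


def url_decision(url: str) -> str:
    """Return 'allow' for pre-approved hosts (and their subdomains), else 'ask'."""
    host = (urlparse(url).hostname or "").lower()
    approved = any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)
    return "allow" if approved else "ask"


def command_decision(command: str) -> str:
    """Return 'allow' only for one pre-approved program with no shell operators."""
    if any(op in command for op in SHELL_OPERATORS):
        return "ask"
    try:
        parts = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: do not guess, hand it to a human.
        return "ask"
    if not parts:
        return "ask"
    return "allow" if parts[0] in ALLOWED_COMMANDS else "ask"
```

Anything that is not explicitly pre-approved falls through to 'ask', which is the point. Checking only the program name is still weak (an allowed `cat` can read any file), so treat this as a sketch of the prompt-or-allow decision, not a real sandbox.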

I tried to do this in Comet and it didn't work b/c sandboxed, as it should be. (And yet Comet is still vulnerable to prompt injection attacks)

It took a long time for browser vendors to make JS safe, and it got pretty lobotomized along the way. The same is happening to agentic browsers: eventually they get safe enough that they can't do much. At the same time, even basic stuff like ‘summarize this page’ has to be disabled in enterprise, or you need a browser that uses the corporate LLM in Azure or on-prem.

Anyway, if there is a good agentic browser that can do as much as Claude Code and is safe, I'd love to hear about it. IMO good AI use cases are 1) research, i.e. reading a lot of stuff, 2) creating a first draft of content, 3) automating workflows, 4) coding. IMO 1) is a good fit for browsers, i.e. agentic search, but leave the rest alone. Agents will be interesting until the Web gets totally locked down against stuff like scraping and comparison-shopping agents.

Also, use caution…running Claude Code on untrusted Web content that could have prompt injections, with bash access, is unsafe. I've done the thing where you give it full shell access and paste a Docker error and say 'fix this please', and watched it stumble around … just no, this is not the way.

Follow the latest AI headlines via SkynetAndChill.com on Bluesky


SkynetAndChill by druce.ai
