simonmesmith's comments

simonmesmith | 2 years ago | on: Ask HN: How do you decide a non-fiction book is worth your time?

Wholeheartedly agree. And I think the problem has gotten worse for a few reasons. One is that if you’re interested in a subject, you likely follow it closely online, so reading a book-length treatment feels like learning a lot of stuff you already know. Two is that publishers try to find authors with social media audiences who likely already published all their good ideas for free and now have to dredge for additional material that’s usually less interesting.

When I was a child and teenager I loved science fiction. When I became an adult I shunned fiction as a waste of time and read only nonfiction books. Then I started to find that most nonfiction books could be summarized in a blog post or Wikipedia entry, regurgitated well-established ideas, or spewed unfounded bullshit. I’ve since flipped my perspective again. I think good fiction is worth reading at book length while most nonfiction is better consumed in blog posts, articles, journal papers, or via a ChatGPT tutoring session.

I will highlight one consistent exception. I love every book I’ve read by Yuval Noah Harari.

simonmesmith | 2 years ago | on: The current state of Carbon Dioxide Removal

Probably the two most telling phrases:

“Almost all scenarios that limit warming to 1.5°C or 2°C require novel CDR.”

“Virtually all current CDR (99.9% or 2 GtCO₂ per year) comes from conventional management of land, primarily via afforestation and reforestation.

Only a tiny fraction (0.1% or 0.002 GtCO₂ per year) results from novel CDR methods which involve storing captured carbon in the lithosphere, ocean or products.”

So, long way to go if we think this is what we need.

simonmesmith | 2 years ago | on: U.S. payrolls rose 199,000 in November, unemployment rate falls to 3.7%

One thing that always confuses me about unemployment figures is that they don’t account for labor force participation. People who drop out of the workforce and stop looking for work no longer count towards the unemployment rate. Theoretically, if nearly everyone in a country stopped looking for work, the unemployment rate could approach 0% even though almost no one was employed (at literally 0% participation the rate is undefined, since the labor force is empty). That’s obviously not a good thing unless robots are doing everything.

Currently, the US labor force participation rate for those 15+ is at about the same level as it was in 1977, down from its peak around 1997; since then it has declined steadily, with small upticks in 2019 and again last year. See this chart: https://data.worldbank.org/indicator/SL.TLF.CACT.NE.ZS?locat...

I’m not an expert at all but I just find that the unemployment rate in isolation is hard to interpret unless you also look at labor force participation. An unemployment rate of X% in 2023 is different from X% in 1997 in terms of the percentage of the population actually working.
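To make this concrete, here’s a toy calculation (all numbers made up for illustration) showing two economies with the same unemployment rate but very different shares of the population actually working:

```python
def unemployment_rate(employed, unemployed):
    """Unemployed as a share of the labor force (employed + unemployed).
    People who stopped looking for work appear in neither term."""
    return unemployed / (employed + unemployed)

def employment_to_population(employed, population):
    """Share of the whole population that is actually working."""
    return employed / population

pop = 1000

# Economy A: high participation (800 of 1000 in the labor force).
a_unemp = unemployment_rate(770, 30)          # 3.75%
a_epop = employment_to_population(770, pop)   # 77% of people working

# Economy B: low participation (400 of 1000 in the labor force).
# Discouraged workers simply vanish from the denominator.
b_unemp = unemployment_rate(385, 15)          # also 3.75%
b_epop = employment_to_population(385, pop)   # only 38.5% working

print(f"A: unemployment {a_unemp:.2%}, employed share {a_epop:.2%}")
print(f"B: unemployment {b_unemp:.2%}, employed share {b_epop:.2%}")
```

Same headline unemployment number, wildly different realities, which is why I look at the employment-to-population ratio alongside it.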

simonmesmith | 2 years ago | on: GPT-4 Can Almost Perfectly Handle Unnatural Scrambled Text

Good ideas posted in response to this. Perhaps we need to try rearranging images of specific things. Like, rearrange a bird and see if it can figure that out. Above, I jumbled up everything, which is different from what they did with the text, where they grouped by words.

simonmesmith | 2 years ago | on: GPT-4 Can Almost Perfectly Handle Unnatural Scrambled Text

It strikes me that this opens or at least expands an attack vector for prompt injection. Humans might overlook a scrambled block of text that GPT-4 will interpret and then execute.

Also, I was curious about whether this would also work with images. So I chopped an image of a bird (a night heron) on a boat in my own collection (so, not in GPT-4's training data) into squares and randomly rearranged them, then tried to get ChatGPT-4 to interpret it. It doesn't work. I got this:

--- CHATGPT RESPONSE:

The image you've uploaded appears to be a composite of several smaller images that are cut and arranged in a grid-like pattern. This collage-like arrangement makes it difficult to discern a clear, singular subject. However, I can identify fragments of what seems to be a bird—possibly a magpie or some type of corvid—due to the black and white plumage. There are also elements that look like parts of a silver car, like a side mirror, and segments of wooden structures that could be parts of a bench or a deck.

The way the images are scrambled resembles a puzzle, where each piece needs to be placed correctly to form a coherent whole. This type of image could be used as a challenge to reconstruct the original picture, requiring the viewer to mentally rotate and arrange the pieces.

---

Maybe a different approach to scrambling an image would work.
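For anyone who wants to try this themselves, here’s a rough sketch of the kind of tile shuffle I mean, in pure Python over a 2D pixel grid (for a real photo you’d do the equivalent crop/paste with an image library; the function name and details below are mine, not from the paper):

```python
import random

def scramble_tiles(image, tile):
    """Chop a 2D pixel grid into tile x tile squares, shuffle the squares,
    and stitch them back together. Dimensions must be multiples of `tile`."""
    h, w = len(image), len(image[0])
    assert h % tile == 0 and w % tile == 0

    # Extract the square tiles in row-major order.
    tiles = []
    for ty in range(0, h, tile):
        for tx in range(0, w, tile):
            tiles.append([row[tx:tx + tile] for row in image[ty:ty + tile]])

    random.shuffle(tiles)

    # Reassemble the shuffled tiles into a full-size grid.
    out = [[None] * w for _ in range(h)]
    cols = w // tile
    for i, t in enumerate(tiles):
        ty, tx = (i // cols) * tile, (i % cols) * tile
        for dy in range(tile):
            for dx in range(tile):
                out[ty + dy][tx + dx] = t[dy][dx]
    return out

random.seed(42)
tiny = [[y * 4 + x for x in range(4)] for y in range(4)]
shuffled = scramble_tiles(tiny, 2)  # same pixels, different tile order
```

Note this permutes whole tiles, so all the original pixels survive; a scramble that also rotated or flipped tiles would be a harder test.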

simonmesmith | 2 years ago | on: Why OpenAI Fired Sam Altman – and What Happens Next in the AI World

Just a heads-up that there isn’t much new in here if you’ve been following the saga on Twitter and have seen Brockman’s and Swisher’s tweets and are aware of the other recent departures. It’s mainly a summary of tweets. Not criticizing, just trying to save you a click if you’re following in real-time.

simonmesmith | 2 years ago | on: Apple's AI-powered Siri assistant could land as soon as WWDC 2024

I’m interested to see this happen but “as soon as” approximately June 2024 doesn’t feel that soon. It’s about as far from today as today is from the launch of GPT-4. Think of everything that’s happened since then. And things only seem to be speeding up. Still, I welcome a Siri that’s useful beyond setting timers.

simonmesmith | 2 years ago | on: Crying staff, strippers, filthy poems: New book digs into Bridgewater

There are so many stories of these falls from grace. I remember when Theranos, WeWork, FTX, etc., were media darlings. People rushed to emulate their strategies, leaders, and culture. Then the companies unravelled or performance suffered and everyone tried to explain what went wrong.

I think a better explanation is “the halo effect.” Businesses have success, even if for a limited time, and using limited definitions (e.g. VC valuation), and people tell stories about why they’re successful that put leadership, strategy, culture, and so forth, front and centre, when it could have been dumb luck or a low interest rate environment or other factors totally unrelated to leadership, strategy, and culture.

A book that really opened my eyes to this was The Halo Effect: https://www.goodreads.com/en/book/show/136181

simonmesmith | 2 years ago | on: 14-year-old who invented soap to treat skin cancer America's Top Young Scientist

Looks like the soap uses salicylic acid, glycolic acid, and tretinoin: https://www.washingtonian.com/2023/10/11/a-ninth-grader-from...

Those ingredients looked familiar to me from work I’ve done with skin health products. They’re cosmetic product ingredients. A quick ChatGPT check:

- Salicylic acid is a beta-hydroxy acid commonly used in acne treatments and exfoliating products. It penetrates oil glands to unclog pores and has anti-inflammatory properties.

- Glycolic acid is an alpha-hydroxy acid mainly used for skin exfoliation and anti-aging. It helps remove dead skin cells and can improve the appearance of fine lines.

- Tretinoin is a prescription-strength retinoid used for treating acne and signs of aging. It promotes skin cell turnover and boosts collagen production.

So, if this $0.50 soap bar gets to market, I assume it will quickly be snapped up by people looking to generally keep their skin young and healthy, because that’s a hell of a lot cheaper than other products containing those ingredients.

simonmesmith | 2 years ago | on: Ask HN: Extracting Knowledge Graphs from LLMs

Having worked on this problem in biology, I think one of the challenges you’ll find is that the knowledge graph will be extremely context-dependent and biased towards highly probable nodes and edges.

For example, if you ask an LLM to create a graph of all proteins related to X disease, and show how they interact, it will oblige. (You can try this yourself easily in the OpenAI playground. Just ask it to send you back a list like X -> Y -> Z or whatever. Or an array of source/target/relation triplets.)

The challenge is that what you get will be very dependent on how you phrase your request. So you’ll never know if you’re getting a “complete” graph or just the most probable graph for the request you made. If you’re an expert in the domain, you’ll know, but if you’re an expert you might not need the graph in the first place.
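For illustration, here’s roughly what the parsing side looks like once the model returns triplets. The JSON below is hypothetical example output (the shape you’d request in the prompt), and the API call itself is omitted:

```python
import json
from collections import defaultdict

# Hypothetical LLM output: a JSON array of source/relation/target triplets.
# A real response depends heavily on how you phrase the request.
llm_output = """
[
  {"source": "TP53", "relation": "inhibits", "target": "MDM2"},
  {"source": "MDM2", "relation": "degrades", "target": "TP53"},
  {"source": "TP53", "relation": "activates", "target": "CDKN1A"}
]
"""

def triplets_to_graph(raw):
    """Parse triplets into an adjacency map: source -> [(relation, target)]."""
    graph = defaultdict(list)
    for t in json.loads(raw):
        graph[t["source"]].append((t["relation"], t["target"]))
    return dict(graph)

graph = triplets_to_graph(llm_output)
# Rephrase the prompt and you may get different edges back: you're sampling
# "a probable graph", not extracting "the complete graph".
```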

simonmesmith | 2 years ago | on: CR in humans builds strong muscle and stimulates healthy aging genes

It says participants were only able to achieve a “12% reduction in calorie intake.” That’s a reduction of 240 calories a day on a 2,000 calorie diet.

The target for the two major monkey studies (that I know of) to date was a 30% reduction.

The results of those studies, which had conflicting results, showed that even at that degree of calorie restriction, dietary quality was likely more important than calorie restriction for health and longevity. Peter Attia summarizes this well here: https://peterattiamd.com/calorie-restriction-part-iia-monkey...

My takeaway from all of this is Michael Pollan’s mantra: “eat food, not too much, mostly plants.”

simonmesmith | 2 years ago | on: Metformin could treat gum disease and help with healthy aging, study claims

N = 1, but I’ve been taking 1700mg of metformin (in two split doses of 850mg) for five years.

I was able to get the prescription from my doctor by presenting evidence and the rationale and design for the TAME trial.

He was comfortable with this due to the strong safety history of the drug. Insurance covers it (though it’s very cheap anyway) and I pay nothing out of pocket. I live in Canada and also have private insurance through work.

I had some gastrointestinal issues initially, but they weren’t serious, and my body adapted fairly quickly as they went away. I tried taking metformin while doing an 18:6 intermittent fast and gastrointestinal symptoms returned, so I won’t be doing that again.

There is some debate about the benefits and risks of metformin, but so far I haven’t experienced negative side effects of concern to me. I’ve had various tests over the past five years—blood, urine, body composition, etc.

As for the TAME trial, while there have been positive developments, I’m now doubtful it will ever actually run. I have a prediction market on Manifold right now that shows a 25% chance it will be listed on clinicaltrials.gov by the end of the year.

I’m not trying to promote the market (the main benefit to me is more market participants hopefully means better forecasting), but if anyone’s interested, it’s here: https://manifold.markets/SimonSmith/will-the-targeting-aging...

simonmesmith | 2 years ago | on: Ask HN: Bad Results from Vector Search?

From some experience I've had with this:

* Is that the right chunk size? How much of a chunk might contain the relevant information? Is it better for your use case to chunk by sentence? I've done RAG with document chunks, sentences, and triplets (source -> relation -> target). How you chunk can have a big impact.

* One approach that I've seen work very well is (1) first, use keyword or entity search to limit results, then (2) use semantic similarity to the query to rank those results. This is how, for example, they do it at LitSense for sentences from scientific papers: https://www.ncbi.nlm.nih.gov/research/litsense/. Paper here: https://academic.oup.com/nar/article/47/W1/W594/5479473.

* You still need metadata. For example, if a user asks for something like "show me new information about X," the concept of "new" won't get embedded in the text. You'll need to convert that to some kind of date search. This is where doing RAG with something like OpenAI function calls can be great. It can see "new" and use that to pass a date to a date filter.

* I've found some embeddings can be frustrating because they conflate things that can even be opposites. For example, "increase" and "decrease" might show up as similar because they both get mapped into the space for "direction." This probably isn't an issue with better (I assume higher dimensional) embeddings, but it can be problematic with some embeddings.

* You might need specialized domain embeddings for a very specific domain. For example, law, finance, biology, and so forth. Certain words or concepts that are very specific to a domain might not be properly captured in a general embedding space. A "knockout" means something very different in sports, when talking about an attractive person, or in biology when it refers to genetic manipulation.
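A minimal sketch of the two-stage approach in that second bullet (keyword filter first, then semantic ranking), using made-up toy vectors; real embeddings would come from an embedding model and have far more dimensions:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def hybrid_search(query_terms, query_vec, docs):
    """Stage 1: keep only docs matching at least one keyword.
    Stage 2: rank the survivors by embedding similarity to the query."""
    candidates = [d for d in docs
                  if any(term in d["text"].lower() for term in query_terms)]
    return sorted(candidates,
                  key=lambda d: cosine(query_vec, d["vec"]),
                  reverse=True)

# Toy corpus with fabricated 3-d "embeddings" purely for illustration.
docs = [
    {"text": "Metformin reduces blood glucose.", "vec": [0.9, 0.1, 0.0]},
    {"text": "Metformin may slow aging.",        "vec": [0.2, 0.9, 0.1]},
    {"text": "Exercise improves cardio health.", "vec": [0.1, 0.2, 0.9]},
]

results = hybrid_search(["metformin"], [0.1, 1.0, 0.0], docs)
# The exercise doc never reaches the ranking stage; the filter removed it.
```

The keyword stage keeps precision up (no semantically-adjacent-but-wrong hits), and the embedding stage orders what survives, which is essentially what LitSense does.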

simonmesmith | 2 years ago | on: ChatGPT-4 significantly increased performance of business consultants

Having been a consultant, I’m struck by what seems like the obvious next question: what if you removed the consultants entirely and had GPT-4 do the work directly for the client?

If you’re a client and need a consultant to do something, you have to explain the requirement to them, review the work, give feedback, and so forth. There will likely be a few meetings in there.

But if GPT-4 can make consultants so much better, I imagine it can also do their work for them. And if you combine this with the reduction in communications overhead that comes from not working with an outside group, why wouldn’t clients just accrue all the benefits to themselves, plus the benefit of not paying outside consultants or dealing with the overhead of managing them?

This is especially the case when the client is already a domain expert but just needs some additional horsepower. For example, marketing brand managers may work with marketing consultants even though they know their products and marketing very well. They just need more resources, which can come in the form of consultants for reasons such as internal head-count restrictions.

Anyway, I just wonder if BCG thought through the implications of participating in this study. To me it feels like a very short step from “helps consultants help their clients” to “helps clients directly and shows consultants aren’t really necessary.”

Especially so if the client just hires an intern and gives them GPT-4.

simonmesmith | 2 years ago | on: AutoGen: Enabling next-generation large language model applications

Preprint from August: https://arxiv.org/abs/2308.08155

Docs: https://microsoft.github.io/autogen/docs/Research/

Repo: https://github.com/microsoft/autogen

Installs via pip: pip install pyautogen

Looks like another agent framework, of which I’ve tried several and been disappointed… but I still put trying this on my to-do list because you never know. Open Interpreter proved to be a pleasant surprise recently.

simonmesmith | 2 years ago | on: Artificial intelligence is ineffective and potentially harmful for fact checking

So, some of their findings on the psychology of AI fact-checking may be valid and interesting (e.g. how it increases confidence, even in false information when the AI is incorrect). But their methodology seems quite flawed to me:

1. They use “ChatGPT” and never mention whether it’s 3.5 or 4 (unless I missed it, which I don’t think I did), so I suspect 3.5.

2. They ask ChatGPT to evaluate the truthfulness of news headlines outside its training data cutoff, don’t show it the source of the headline (e.g. a reputable news outlet or not), and don’t allow it to use an internet search to determine if the news is corroborated.

So their claim that it’s “ineffective” seems like a pretty big overreach. At best you might say “AI fact-checking by relatively weak models, not given all available information and asked to evaluate headlines outside their training data cutoff” is ineffective. Duh!

Clearly, if you were to design a good AI fact-checker, you wouldn’t design it like this. You would give it all available information, allow it to corroborate with a news search, and use a stronger model or a model fine-tuned for the task.

simonmesmith | 2 years ago | on: Ask HN: Does “Effort Deflation” Demotivate? Why Not Wait for Ever Easier Tech?

But if you anticipate that GPT-5 will have X capabilities that will negate the need for you to learn Y intermediate tools, so you can instead focus on Z, you will have avoided wasting effort that could be better spent elsewhere.

Like, if I know we’re in a deflationary economic environment, I’m better off sitting on my cash and waiting for prices to drop before making a purchase, unless the benefit of owning that item outweighs the cost of not waiting.

I think I now do similar cost/benefit calculations regarding technology.
