caeruleus | 9 months ago

I don't believe the concept of DNA can be reduced to a sequence of quaternary numerals, which is what gene sequence data would represent. Similar to proteins, DNA forms higher-level structures on top of the primary one [1], and (in a biological context, inside the nucleus) exhibits somewhat self-modifying [2] and self-regulating [3] behavior as well as meta-modification [4]. Analogous to the article, if one defines the language of DNA by its nucleobase sequence, this language can only represent a subset of the world of DNA.
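To make the "quaternary numerals" view concrete, here is a minimal sketch of what that reduction looks like. The digit assignment (A=0, C=1, G=2, T=3) and the function names are arbitrary choices for illustration, not a standard.

```python
# Arbitrary illustrative mapping from nucleobase to base-4 digit.
BASE_TO_DIGIT = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_quaternary(seq: str) -> list[int]:
    """Return the base-4 digits for a DNA string (KeyError on other symbols)."""
    return [BASE_TO_DIGIT[base] for base in seq.upper()]


def to_integer(seq: str) -> int:
    """Read the digit sequence as a single base-4 number."""
    value = 0
    for digit in to_quaternary(seq):
        value = value * 4 + digit
    return value


print(to_quaternary("GATTACA"))  # [2, 0, 3, 3, 0, 1, 0]
print(to_integer("GATTACA"))     # 9156
```

Nothing in this representation records folding, methylation, or where the molecule sits in the nucleus; two molecules that differ only in those respects map to the same digits.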

Somewhat related, the way the adaptive immune system works has similarities with some concepts in machine learning. In this process, sections of nuclear DNA serve as randomly initialized weights in precursor cells [5] as well as final weights in memory cells. There's even fine-tuning of the weights. [6]

[1] https://en.wikipedia.org/wiki/Nucleic_acid_structure
[2] https://en.wikipedia.org/wiki/Transposable_element
[3] https://en.wikipedia.org/wiki/Transcriptional_regulation
[4] https://en.wikipedia.org/wiki/Epigenetics
[5] https://en.wikipedia.org/wiki/V(D)J_recombination
[6] https://en.wikipedia.org/wiki/Affinity_maturation

throwawaymaths|9 months ago

> don't believe the concept of DNA can be reduced

followed by examples of things that are encoded by DNA. For example, sure, maybe you'll miss bootstrapping methylation on a first pass, but the idea of methylation is there in the DNA, and if you didn't have "methylation in the right place", more than likely some generation (N) would.

to wit, i don't think there is strong evidence of an "ice-9" in the epigenome that brings about a spark of life that can't easily be triggered by chance given a template lacking it.

so there's probably not something intrinsically missing from DNA as an encoding medium vs say "casually" missing from any given piece of DNA.

if you want something a bit stronger than an assertion, the DNA used to bootstrap M. capricolum into Syn1 lacked all the decorations (made in yeast) and was not locked into higher order structure (treated with protease prior to transplantation).

caeruleus|9 months ago

You're raising some intriguing points, and I agree with your assertion about the epigenome. I still feel like your response misses the point I was making.

> followed by examples of things that are encoded by DNA

... given its natural environment. A nucleobase sequence is not a symbolic language, it relies on physical laws in general and a defined chemical environment in particular (that it helps to create and maintain) to mean something. It's similar to the point about Othello vs. the physical world in the article: The language itself does not encode every bit of information about the world it describes. For instance, in 3D space, regions of DNA that are far apart in the sequence can physically interact and influence each other’s expression.

TLDR: I think my point is that a base sequence requires a particular context (~ interpreter/knowledge about the physical world) to encode nearly everything about life. Treating it as just a language in the context of LLMs abstracts away the complex substrate that makes it work.
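A toy sketch of the "interpreter" point, using one well-known case: the vertebrate mitochondrial genetic code reads a few codons differently from the standard nuclear code (TGA is a stop in the standard code but tryptophan in mitochondria; ATA is isoleucine vs. methionine). The tables below are deliberately tiny excerpts, and the `translate` helper and the example sequence are made up for illustration.

```python
# Tiny excerpts of two real codon tables ("*" marks a stop codon).
STANDARD = {"ATG": "M", "TGA": "*", "ATA": "I"}
VERTEBRATE_MITO = {"ATG": "M", "TGA": "W", "ATA": "M"}


def translate(seq: str, table: dict[str, str]) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = table[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


seq = "ATGTGAATA"  # hypothetical 3-codon sequence
print(translate(seq, STANDARD))         # M
print(translate(seq, VERTEBRATE_MITO))  # MWM
```

Same digits, different output, depending only on which machinery reads them. And this is the mildest kind of context dependence; 3D contacts and the chemical environment don't fit in a lookup table at all.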