top | item 39789357

meow_cat | 1 year ago

Maybe I'm missing something obvious, but what is the idea behind quantizing and tokenizing time series? We tokenize text because text isn't numbers. In the case of time series, we're... turning numbers into less precise numbers? The benefit of scaling and centering is trivial, and I guess all time-series ML does it, but I don't see why we need a token after that.

matrix2596 | 1 year ago

I'm building on insights from this paper (https://arxiv.org/pdf/2403.03950.pdf) and believe that classification can sometimes outperform regression, even when dealing with continuous output values. This is particularly true when the output is noisy and may assume various values (multi-modal). By treating the problem as classification over discrete bins, we can obtain an approximate distribution over those bins, rather than settling for a single, averaged value as regression would yield. This approach not only facilitates sampling but may also lead to more favorable loss landscapes. The linked paper provides more details on this idea.
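A minimal sketch of that idea (not taken from the paper; the bimodal data and bin edges here are made up): with a target that is always near +1 or near -1, the squared-error-optimal point prediction is the mean, a value the target never takes, while a histogram over discrete bins recovers both modes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Bimodal target: the next value is near +1 or near -1, never near 0.
y = np.concatenate([rng.normal(+1.0, 0.1, 500),
                    rng.normal(-1.0, 0.1, 500)])

# Squared-error regression is minimized by the mean -- a value the
# target never actually takes.
mse_optimal = y.mean()                       # ~0.0

# Classification over discrete bins keeps both modes.
bins = np.linspace(-2.1, 2.1, 22)            # 21 bins of width 0.2
labels = np.digitize(y, bins) - 1            # bin index per sample
probs = np.bincount(labels, minlength=21) / len(y)
centers = bins[:-1] + 0.1

# The two most probable bins sit near -1 and +1, not at the mean,
# and probs gives a distribution we can sample from.
top_two = centers[np.argsort(probs)[-2:]]
```

The point is only the shape of the output: regression collapses the two modes into one average; the binned distribution preserves them.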

lamename | 1 year ago

Isn't it a given that classification would "outperform" regression, assuming n_classes < n_possible_continuous_labels? Turning a regression problem into a classification problem bins the data and offers more examples per label, simplifying the problem, with a tradeoff in the granularity you can predict.

(It depends on what you mean by "outperform" since metrics for classification and regression aren't always comparable, but I think I'm following the meaning of your comment overall)

dist-epoch | 1 year ago

Tokenisation turns a continuous signal into a normalized discrete vocabulary: stock "went up a lot", "went up a little", "stayed flat". This smooths out noise and simplifies matching up similar but not identical signals.
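A toy sketch of that kind of tokenization (the vocabulary and return thresholds here are invented for illustration; real schemes such as SAX pick breakpoints from the data distribution instead of fixed cutoffs):

```python
import numpy as np

# Hypothetical five-word vocabulary over daily returns.
VOCAB = ["down_a_lot", "down_a_little", "flat", "up_a_little", "up_a_lot"]
EDGES = [-0.02, -0.005, 0.005, 0.02]   # made-up return breakpoints

def tokenize(prices):
    returns = np.diff(prices) / prices[:-1]          # step-to-step returns
    return [VOCAB[np.searchsorted(EDGES, r)] for r in returns]

prices = np.array([100.0, 100.2, 103.0, 102.9, 99.0])
tokens = tokenize(prices)
# -> ['flat', 'up_a_lot', 'flat', 'down_a_lot']
```

Two series that wiggle slightly differently but share the same up/flat/down pattern now map to the same token sequence, which is the noise-smoothing the comment describes.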

> We tokenize text because text isn't numbers.

Text is actually numbers. People tried feeding raw UTF-8 bytes directly into transformers, but it doesn't work that well. Karpathy explains why:

https://www.youtube.com/watch?v=zduSFxRajkE

prlin | 1 year ago

> Text is actually numbers

Text can be represented by numbers, but they aren't the same datatype. They don't support the same operations (addition, subtraction, multiplication, etc.).
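A quick illustration of that point: arithmetic on code points is perfectly well-defined numerically, but the result carries no textual meaning.

```python
# 'Z' and 'a' have code points 90 and 97; their numeric average is 93,
# which is ']' -- the arithmetic runs fine, the result means nothing,
# unlike averaging two temperatures or two stock prices.
mid = chr((ord("Z") + ord("a")) // 2)
```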

lamename | 1 year ago

Interesting. Can you explain how this is superior and/or different from traditional DSP filters or other non-tokenization tricks in the signal processing field?

spyder | 1 year ago

I think it could also have a connection with symbolic AI: the discrete tokens could be the symbols that many believe are useful or necessary for reasoning. Tokenization is also useful for compression, reducing memory requirements through quantization and small-integer representations.

https://en.wikipedia.org/wiki/Neuro-symbolic_AI
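A rough sketch of the compression angle (the 256-level quantile quantizer here is an assumption for illustration, not a method from the thread): mapping a float64 series to byte-sized token IDs cuts memory eight-fold.

```python
import numpy as np

rng = np.random.default_rng(1)
series = rng.normal(size=10_000)       # raw float64 series: 8 bytes/step

# Quantile-based quantization into 256 levels: one uint8 token per step.
edges = np.quantile(series, np.linspace(0, 1, 257)[1:-1])  # 255 interior edges
tokens = np.digitize(series, edges).astype(np.uint8)

# series.nbytes == 80_000, tokens.nbytes == 10_000: 8x smaller.
```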

555watch | 1 year ago

My primitive understanding is that we approximate a Markov model, indirectly learning the transition probabilities just by working through tokens.
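A minimal sketch of that view (toy token stream, plain bigram counting): a first-order Markov model over a token sequence is just normalized transition counts.

```python
from collections import Counter, defaultdict

# Toy token stream; estimate P(next | current) from bigram counts.
tokens = ["up", "up", "flat", "down", "up", "flat", "flat", "down", "up"]

counts = defaultdict(Counter)
for cur, nxt in zip(tokens, tokens[1:]):
    counts[cur][nxt] += 1

# Normalize each row of counts into a conditional distribution.
trans = {cur: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
         for cur, nxts in counts.items()}
# e.g. trans["up"] == {"up": 1/3, "flat": 2/3}
```

A transformer over the same tokens conditions on a much longer history than one step, but the discrete vocabulary is what makes these transition statistics countable at all.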

intalentive | 1 year ago

My guess is that it enforces a kind of sparsity constraint.