patrickas's comments

patrickas | 2 years ago | on: What every software developer must know about Unicode in 2023

That is why I like the way Raku handles it.

It has distinct .chars, .codes and .bytes that you can pick from depending on the use case. And if you try to use .length, it complains and asks you to use one of the other options to clarify your intent.

  my \emoji = "\c[FACE PALM]\c[EMOJI MODIFIER FITZPATRICK TYPE-3]\c[ZERO WIDTH JOINER]\c[MALE SIGN]\c[VARIATION SELECTOR-16]";
  say emoji; #Will print the character
  say emoji.chars; # 1 because one character
  say emoji.codes; # 5 because five code points
  say emoji.encode('UTF8').bytes; # 17 because encoded utf8
  say emoji.encode('UTF16').bytes; # 14 because encoded utf16
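
For contrast, here is a rough Python sketch of the same sequence (Python is my choice for illustration only, it is not part of the original example). As far as I know, Python's built-in len() counts code points and the standard library has no grapheme counter, so there is no direct stdlib equivalent of .chars. The "utf-16-le" codec is used so that no byte order mark is added to the byte count.

```python
# The same five code points as the Raku example above, written as escapes.
emoji = (
    "\U0001F926"  # FACE PALM
    "\U0001F3FC"  # EMOJI MODIFIER FITZPATRICK TYPE-3
    "\u200D"      # ZERO WIDTH JOINER
    "\u2642"      # MALE SIGN
    "\uFE0F"      # VARIATION SELECTOR-16
)

code_points = len(emoji)                      # len() counts code points, not graphemes
utf8_bytes = len(emoji.encode("utf-8"))
utf16_bytes = len(emoji.encode("utf-16-le"))  # "-le" avoids the 2-byte BOM

print(code_points, utf8_bytes, utf16_bytes)   # 5 17 14
```

So a single len() call answers only one of the three questions, which is the ambiguity Raku avoids by refusing .length.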

patrickas | 2 years ago | on: US Senator Uses ChatGPT for Opening Remarks at a Hearing on AI

It is not about what the model tells you.

This paper shows an emergent world model in an LLM that was taught to play Othello moves: https://ar5iv.labs.arxiv.org/html/2210.13382

https://arxiv.org/pdf/2303.12712.pdf This paper discusses (among other things) how a GPT-4 model navigated between rooms in a text adventure game and was able to draw a map afterward, literally building a model of the world as it navigated.

patrickas | 2 years ago | on: US Senator Uses ChatGPT for Opening Remarks at a Hearing on AI

LLMs do have explicit world models that can even be manipulated. There are many recent papers on the subject.

patrickas | 2 years ago | on: Google DeepMind CEO says some form of AGI possible in a few years

Please see my reply to parent.

> "fusion by 1990 instead of 2000..." Those three dots omit the most important part of the issue: "if we spend that much extra money on R&D".

In the fusion case no one was willing to spend the money; in AGI's case it looks like everyone is willing to spend it.

I personally hope they won't, but that is a crucial point not to be overlooked.

patrickas | 2 years ago | on: Google DeepMind CEO says some form of AGI possible in a few years

I don't think this contradicts the AGI prediction though (nor the "Fusion by 1990 with a bit of extra investment" prediction, for that matter).

https://external-preview.redd.it/LkKBNe1NW51Wh-8nLSTRdQtTha2...

This chart shows how much people in the 1970s estimated should be invested to have fusion by 1990, to have it by the 2000s, and to "never" have it. We ended up spending below the "never" amount on research over four decades, so of course fusion never happened, exactly as predicted.

I think the main difference is that no one was interested in investing in fusion back then, while everyone is interested in investing in AGI now.

patrickas | 3 years ago | on: Cerebras-GPT: A Family of Open, Compute-Efficient, Large Language Models

This paper from late last year shows that LLMs are not "just" stochastic parrots, but they actually build an internal model of the "world" that is not programmed in, just from trying to predict the next token.

https://ar5iv.labs.arxiv.org/html/2210.13382

PS: More research has been done since that confirmed and strengthened the conclusion.

patrickas | 3 years ago | on: GPT-4

Your comment reminded me of this article:

Humans Who Are Not Concentrating Are Not General Intelligences

https://www.lesswrong.com/posts/4AHXDwcGab5PhKhHT/humans-who...

patrickas | 5 years ago | on: String length functions for single emoji characters evaluate to greater than 1

Raku seems to be more correct (DWIM) in this regard than all the examples given in the post...

  my \emoji = "\c[FACE PALM]\c[EMOJI MODIFIER FITZPATRICK TYPE-3]\c[ZERO WIDTH JOINER]\c[MALE SIGN]\c[VARIATION SELECTOR-16]";

  #one character
  say emoji.chars; # 1 
  #Five code points
  say emoji.codes; # 5

  #If I want to know how many bytes that takes up in various encodings...
  say emoji.encode('UTF8').bytes; # 17 bytes 
  say emoji.encode('UTF16').bytes; # 14 bytes

Edit: Updated to use the names of each code point since HN cannot display the emoji

patrickas | 5 years ago | on: Welcome to the Next Level of Bullshit

As far as I understand, in this specific case, yes.

The whole schtick of GPT-3 is the insight that we do not need to come up with a better algorithm than GPT-2. If we dramatically increase the number of parameters without changing the architecture/algorithm, its capabilities will actually increase dramatically instead of reaching a plateau as some expected.

Edit: Source https://www.gwern.net/newsletter/2020/05#gpt-3

"To the surprise of most (including myself), this vast increase in size did not run into diminishing or negative returns, as many expected, but the benefits of scale continued to happen as forecasted by OpenAI."

patrickas | 5 years ago | on: Welders set off Beirut blast while securing explosives

There is no indication that it was "the judiciary's decision" to store it "near a major population center".

That's a story floated by the head of customs to try and shift the blame to the judiciary. But there is no evidence for it, and there is much evidence against it.

Source: The court documents released by journalists Riad Kobeisyi and Dima Sadek.

patrickas | 6 years ago | on: Is Perl 6 Being Renamed?

Off the top of my head: Concurrency and parallelism using high level and low level APIs. There is no GIL.

Grammars, which are like regular expressions on steroids for parsing (admittedly still not optimized for speed).

Gradual typing: you can go from no types at all for short one-liners, to using built-in types, to defining your own complex types for big programs.

  subset Positive of Int where { $^number > 0; } #create a new type called Positive 
  multi factorial(1) { 1 } #multi subroutines that dispatch on type
  multi factorial(Positive \n) { n * factorial(n-1) } 


  #say factorial(0); #Error type mismatch
  #say factorial(5.5); #Error type mismatch
  say factorial(5); #120

  hyper for (1 .. 1000) -> $n { #use hyper to indicate the for loop can be run in parallel on all available CPUs
    say "Factorial of $n is { factorial($n) }"; #Gives correct results by automatically upgrading to big int when needed
  }

patrickas | 6 years ago | on: Is Perl 6 Being Renamed?

That is exactly what the main architect of the compiler has been working on for the past couple of years.

Performance has improved several-fold since it was released in 2015.

For a lot of things the next stable release (as soon as the latest round of optimizations has been merged) will be on par with perl5 / ruby / python ...

The main slow thing remaining is Grammars, which, if I understand correctly, are not available in those other languages to compare speed against, but that is next on the road map for optimization.

Here is a recent talk about the state of perl6 performance: https://www.youtube.com/watch?v=QNeu0wK92NE

patrickas | 6 years ago | on: Myths about Perl 6

You're looking for Cro

https://cro.services

patrickas | 7 years ago | on: Chrome Developer Tools – Easy Web Debugging You Need to Know

Hey @dirkstrauss, the leading image in the article is an unnecessarily huge 22-megabyte PNG that can be resized and converted to JPEG without any noticeable loss of quality.

patrickas | 8 years ago | on: Mobile Devices Compromised by Fake Secure Messaging Clients

I thought it was interesting that the report mentions some devices are compromised at the phone repair shop right after being repaired.

patrickas | 9 years ago | on: Google Maps will soon be able to find your parked car

The new addition to maps is about doing that manually when you need to, with extra features like how many minutes left on the parking meter.

patrickas | 12 years ago | on: The next version of DuckDuckGo

There is a typo in the about page, under "2013 Open-Source Donations", "Crytocat" should be "Cryptocat".

patrickas | 12 years ago | on: How Heartbleed Leaked Private Keys

The sensitive data that was saved at that address is still there. The memory has been freed so the OS can use it again, but the actual data is still there in memory until it gets overwritten by something else...

The program will work with no problems, but sensitive data that has been used then freed is available for retrieval when bugs like heartbleed are found.

As the article suggests, the right way is to clean the data from memory (by overwriting it with something else) before freeing it.
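
A minimal sketch of the "overwrite before you let go" idea, in Python for illustration only. The `wipe` helper and the sample secret are hypothetical names I made up. It assumes the secret lives in a mutable bytearray; the runtime may still hold other copies elsewhere, so treat it as a picture of the idea and not as a guarantee. In C you would reach for something like explicit_bzero or OPENSSL_cleanse, since a plain memset right before free can be optimized away by the compiler.

```python
def wipe(buf: bytearray) -> None:
    """Overwrite every byte of buf in place with zeros (hypothetical helper)."""
    for i in range(len(buf)):
        buf[i] = 0


# Made-up secret, kept in a mutable buffer so it can be overwritten in place.
secret = bytearray(b"hypothetical private key material")

# ... use the secret here ...

wipe(secret)  # scrub the contents before the buffer is released
```

After the call the buffer has the same length but holds only zero bytes, so a later out-of-bounds read of that region finds nothing useful.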

patrickas | 12 years ago | on: Swedish developer discovers security hole in iPhone

It seems to me he is just manipulating the DCS of the SMS being sent. This is standard behavior according to the GSM SMS specs.

http://www.etsi.org/deliver/etsi_gts/03/0338/05.00.00_60/gsm...

From section 4, "SMS Data Coding Scheme" can be used to control "Voicemail Message Waiting" among other indicators and to send messages of "Class 0", which instruct the phone to "display the message immediately and send an acknowledgement to the SC when the message has successfully reached the MS irrespective of whether there is memory available in the SIM or ME."

Admittedly it has been over a decade since I last played with sending such messages to phones, but it did seem to me like a bug in the spec, giving too much control to anyone with access to an sms-c (or any other means of changing the DCS field). Back then all phones I tested had implemented the spec as described.
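
To make that concrete, here is a partial decoder for the one-byte DCS field, written in Python for illustration. The `describe_dcs` function is hypothetical, and the bit layout is my recollection of the GSM 03.38 / 3GPP TS 23.038 family, so check it against the linked document before relying on it. It only covers the message class and message waiting groups discussed here.

```python
def describe_dcs(dcs: int) -> str:
    """Describe a one-byte SMS Data Coding Scheme value (partial sketch)."""
    group = dcs >> 4  # bits 7..4 select the coding group

    if group in (0b1100, 0b1101, 0b1110):
        # Message Waiting Indication groups:
        # bit 3 = indicator active/inactive, bits 1..0 = indicator type
        kinds = ["voicemail", "fax", "email", "other"]
        state = "active" if dcs & 0b1000 else "inactive"
        return f"message waiting: {kinds[dcs & 0b11]} indicator {state}"

    if group == 0b1111:
        # Data coding / message class group: bits 1..0 = class
        return f"class {dcs & 0b11} message"

    if group <= 0b0011:
        # General data coding: bit 4 says whether bits 1..0 carry a class
        if dcs & 0b10000:
            return f"class {dcs & 0b11} message"
        return "no message class"

    return "reserved / not handled in this sketch"


print(describe_dcs(0x10))  # class 0 ("flash") message
print(describe_dcs(0xC8))  # voicemail waiting indicator switched on
```

The point is that a single byte chosen by the sender decides whether the phone pops the message up immediately or lights the voicemail icon.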

patrickas | 12 years ago | on: Texas students fake GPS signals and take control of an $80 million yacht

That's cool! Reminds me of the way the Iranians most likely got control of a US drone a couple of years back.

They jammed the communication signals and fed it fake GPS data when the automatic "go back to home base" landing procedure kicked in.

http://www.informationweek.com/security/attacks/iran-hacked-...