saturncoleus's comments

saturncoleus | 9 years ago | on: PDE-Based Image Compression

New compression algorithms are always interesting because they reveal a new way of describing a photograph in a succinct way. The overlap between image compression and image searching is surprisingly high because the same techniques are used to derive the most important part of the picture data. Even if a new compression isn't ideal for image transmission, it is pretty much always a new avenue to look at for image matching/searching.

Coming from the other side, corners and edges are great ways to describe an image when doing image processing. It's a natural step to use this data for compression, which is what the authors have done here.

saturncoleus | 9 years ago | on: PDE-Based Image Compression

PNG will always do badly at compression. PNG is deflate-encoded, but without the ability to rearrange pixel data to play to deflate's strengths: it stores interleaved RGBA, which mixes the "entropy" of each color channel together. PNG was born out of the necessity to replace the (at the time) patent-encumbered GIF. It had a different purpose, and was designed to get something working fast.
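To make the interleaving point concrete, here is a toy comparison in Go (a sketch only, not PNG's actual pipeline — real PNG also applies per-scanline filters before deflate): the same synthetic RGBA data is compressed once interleaved, the way PNG stores it, and once with the channels separated.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
)

// deflateSize returns the zlib-compressed size of b.
func deflateSize(b []byte) int {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	w.Write(b)
	w.Close()
	return buf.Len()
}

// makePixels builds 4096 synthetic RGBA pixels twice: interleaved
// (RGBARGBA...) as PNG stores them, and planar (all R, then all G, ...)
// with the channels separated so each stream is self-similar.
func makePixels() (interleaved, planar []byte) {
	const n = 4096
	r := make([]byte, n)
	g := make([]byte, n)
	b := make([]byte, n)
	a := make([]byte, n)
	for i := 0; i < n; i++ {
		r[i] = byte(i)     // smooth ramp
		g[i] = byte(i / 2) // slower ramp
		b[i] = byte(i % 7) // short repeating pattern
		a[i] = 255         // fully opaque
	}
	for i := 0; i < n; i++ {
		interleaved = append(interleaved, r[i], g[i], b[i], a[i])
	}
	planar = append(planar, r...)
	planar = append(planar, g...)
	planar = append(planar, b...)
	planar = append(planar, a...)
	return
}

func main() {
	inter, plan := makePixels()
	fmt.Println("interleaved:", deflateSize(inter))
	fmt.Println("planar:     ", deflateSize(plan))
}
```

The planar layout compresses far smaller, because deflate's LZ77 matcher sees long repeats in each channel instead of four unrelated byte streams shuffled together.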

saturncoleus | 10 years ago | on: Achieving a Perfect SSL Labs Score with Go

Java's SSL stack is very painful to deal with. It has no support for ALPN or NPN, making it really difficult to use for HTTP/2. Also, the GCM cipher suites are implemented in pure Java, not with intrinsics, so they are painfully slow: as in 20MB/s, where OpenSSL and Go's TLS get 3000MB/s for the same amount of CPU.

Worst of all, Oracle refuses to backport patches, so if you are running even a mildly old (like a year) JDK, you will suffer for it. Enterprise Java doesn't move that fast, so you end up with bootclasspath hacks to get around these shortcomings.

And oh yeah, if you also support Android, you might as well hire someone full time to deal with this, since it is that time consuming.

saturncoleus | 10 years ago | on: Achieving a Perfect SSL Labs Score with Go

> This means I need to take the default CipherSuites and simply remove any that use a cipher smaller than 256bit.

Arguably, using a higher-bit cipher suite should be considered worse, since it reduces accessibility. 128-bit crypto (specifically the GCM suites) is ridiculously faster, to the point where it is practically free to enable for all websites. Treating 256-bit crypto as better feels like it is missing a key point of security: availability.

saturncoleus | 10 years ago | on: Symantec/Norton Antivirus Remote Heap/Pool Memory Corruption CVE-2016-2208

Can't help but wonder if attackers already knew about this. There have been quite a few bugs found by taviso in antivirus code in the past few months, which has got to either attract attackers to look more closely at it or break their existing exploits. Either way, it's frightening!

Increasingly, my non-computer-savvy family members ask me what kind of antivirus they should use. I used to pick one to tell them, since I know they aren't as cautious as I am, but I am not sure I have a good answer for them any more. Has AV software reached the point that a lay user is more vulnerable with it than without it?

saturncoleus | 10 years ago | on: Show HN: Smlr – re-encode jpegs using butteraugli visual quality measurement

It looks like this binary searches across the Go encoder's quality levels to find an acceptable quality. This isn't a bad idea, but it won't be very fruitful: the Go standard library JPEG encoder is pretty basic and reuses the quant tables from the spec. It also doesn't optimize the Huffman tables, so the pictures are typically 10% bigger than they need to be.
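The search itself is simple enough to sketch. Here the pass/fail predicate is a hypothetical stand-in for a butteraugli comparison; the only thing the search relies on is that it is monotone in quality.

```go
package main

import "fmt"

// lowestAcceptableQuality binary-searches JPEG quality levels 1..100
// for the smallest quality whose perceptual check still passes.
// If nothing below 100 passes, it returns 100.
func lowestAcceptableQuality(passes func(q int) bool) int {
	lo, hi := 1, 100
	for lo < hi {
		mid := (lo + hi) / 2
		if passes(mid) {
			hi = mid // mid is acceptable; try lower
		} else {
			lo = mid + 1
		}
	}
	return lo
}

func main() {
	// Toy stand-in: pretend anything at quality >= 83 passes.
	q := lowestAcceptableQuality(func(q int) bool { return q >= 83 })
	fmt.Println(q)
}
```

Seven encodes at most, instead of up to a hundred, to land on the cheapest acceptable quality.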

This idea, taken to the extreme, is mozjpeg. It is really advanced and can take advantage of a lot of cool tricks (like trellis optimization) in order to get the absolute best quality for the size.

https://github.com/mozilla/mozjpeg

saturncoleus | 10 years ago | on: Let's Make a Varint

Those are all really good points actually. In response to them individually:

- The problem with generating ids is that it isn't known ahead of time how many there will be. This forces a solution that is suboptimal in all circumstances.

- The reason for rejecting UTF-8 is mostly backwards compatibility with existing software. Using UTF-8-style encoding for values beyond the roughly one million code points it allows is possible, but really burns a lot of bridges along the way. The point about Boyer-Moore is really cool, I had no idea that was a goal!

- Having the length in the folder structure is exponential, but only at the topmost level. It will be uniform under each length dir. This is an acceptable price to pay when typing "ls ./dir", since removing the prefix would make it hard to read quickly:

    0/
    0.jpg
    1/
    1.jpg

saturncoleus | 10 years ago | on: Implemented proposals for Swift 3

If you don't care about side effects:

C:

    int foo = bar ? 2 : 4;

Go:

    foo := map[bool]int{true: 2, false: 4}[bar]

Python had a similar trick before 2.5 added ternary conditionals.
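In modern Go you can get the same effect without allocating a map on every use, via a generic helper (Go 1.18+; `tern` is my name for it, not a stdlib function). Like the map trick, both arms are evaluated before the call, so it is still only safe when neither has side effects.

```go
package main

import "fmt"

// tern is a generic stand-in for C's ?: operator. Both a and b are
// evaluated eagerly, as with any function arguments.
func tern[T any](cond bool, a, b T) T {
	if cond {
		return a
	}
	return b
}

func main() {
	bar := true
	foo := tern(bar, 2, 4)
	fmt.Println(foo)
}
```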

saturncoleus | 10 years ago | on: Ask HN: What's the most interesting algorithm?

Huffman coding is probably one of the more interesting ones, not only because it is so ridiculously useful, but because of the wide taxonomy of implementations. It is quite malleable, able to be morphed and optimized to the particular application: JPEG, PNG, HPACK, Gzip to name a few popular usages and implementations.

What is really enlightening, though, is implementing a basic one, because it is so simple. The core of it involves popping two tree nodes from a heap and pushing a new one. I did this in school and was impressed by it, but became far more appreciative when I tried to do it the JPEG way. JPEG doesn't even provide a table, just a histogram!
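That pop-two-push-one core fits in a few dozen lines of Go with `container/heap`. This sketch only computes code lengths (which is close to what JPEG actually stores); the frequencies are the standard textbook example, not anything from a real codec.

```go
package main

import (
	"container/heap"
	"fmt"
)

// node is a Huffman tree node: a leaf carries a symbol, an internal
// node carries two children and their summed weight.
type node struct {
	weight      int
	sym         byte
	left, right *node
}

// pq is a min-heap of nodes ordered by weight.
type pq []*node

func (p pq) Len() int           { return len(p) }
func (p pq) Less(i, j int) bool { return p[i].weight < p[j].weight }
func (p pq) Swap(i, j int)      { p[i], p[j] = p[j], p[i] }
func (p *pq) Push(x any)        { *p = append(*p, x.(*node)) }
func (p *pq) Pop() any {
	old := *p
	n := old[len(old)-1]
	*p = old[:len(old)-1]
	return n
}

// codeLengths builds the tree (pop the two lightest nodes, push their
// parent, repeat until one remains) and walks it to find each
// symbol's code length.
func codeLengths(freq map[byte]int) map[byte]int {
	h := &pq{}
	for s, w := range freq {
		heap.Push(h, &node{weight: w, sym: s})
	}
	for h.Len() > 1 {
		a := heap.Pop(h).(*node)
		b := heap.Pop(h).(*node)
		heap.Push(h, &node{weight: a.weight + b.weight, left: a, right: b})
	}
	lengths := map[byte]int{}
	var walk func(n *node, depth int)
	walk = func(n *node, depth int) {
		if n.left == nil { // leaf
			lengths[n.sym] = depth
			return
		}
		walk(n.left, depth+1)
		walk(n.right, depth+1)
	}
	walk(heap.Pop(h).(*node), 0)
	return lengths
}

func main() {
	freq := map[byte]int{'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}
	for s, l := range codeLengths(freq) {
		fmt.Printf("%c: %d bits\n", s, l)
	}
}
```

Frequent symbols land near the root ('a' gets a 1-bit code here), rare ones deeper.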

It also acted as the conceptual basis for its successor, arithmetic coding, which is in pretty much every modern video codec. Can you imagine a world that is still analog because we couldn't figure out how to transmit digital video or images or audio? Huffman is a key link in the chain between the past and the present.

saturncoleus | 10 years ago | on: Facebook sued for storing biometric data mined from photographs

Getting your picture taken kind of puts you on the defensive, but that probably isn't going away. You have to be able to go outside, go to the bank, go to work, and go to the grocery store. Each of those places is going to have cameras up for its own peace of mind.

The alternative would be to not go out and enjoy life, which is the worse of the two options.

saturncoleus | 10 years ago | on: NVIDIA Announces the GeForce GTX 1000 Series

Alternatively, they could make their own API, independent from OpenCL. I have written a small amount in each (a rainbow table generator), and found the CUDA version much more pleasant to use. CUDA made it really easy to get a good idea of the underlying hardware and make architectural changes in order to make the code faster.

OpenCL (using an ATI card) was much harder to program, since the abstraction level was much higher. With CUDA you can write two separate kernels, each tuned to its architecture, and have each be faster than a generic version that ends up compromising for compatibility.

The OpenCL one ended up being faster, but I suspect that's due to ATI hardware being superior at the time.

saturncoleus | 10 years ago | on: Facebook sued for storing biometric data mined from photographs

If you have ever watched the TV show Black Mirror, it does an exploration of what it would be like to have instant access to know everything about a person by looking at them. The technology becomes a commodity and results in the destruction of personal relationships rather than a dystopian big brother society.

It seems to me that is a much more likely future than a criminal and government oppression future. Is one of them inevitable? Probably, but not any more than our present is someone else's future.

saturncoleus | 10 years ago | on: HTTP/2 Adoption Stats

A couple of reasons. Look at these as things HTTP/2 does well, though maybe not best:

Head of line blocking is mostly solved. You can interleave sending big messages with small ones. You can send control messages (like "Hey, I'm shutting down soon, don't send any more traffic this way") along with other messages. The alternative would be using multiple connections, or reimplementing your own version on top of HTTP/2.

The above is much more useful in the presence of streaming. H2 has first-class support for bidirectional streaming. It is now feasible to do a stock ticker, or chat room, or whatever over a normal H2 connection, without a whole extra protocol or browser work-arounds. WebSockets work, and hanging GET requests work, but they are an extra burden. It's great when the standard protocol supports it out of the box.

TCP keep-alive is not good enough, especially in the presence of proxies: it only goes over the first hop. It is possible to work around this, but wouldn't it be nice if this were part of the spec? Also, for what it's worth, TCP keep-alive only works over TCP. When you aren't using TCP (like Unix sockets), what do you send to check round-trip time? What about over shared memory? Other transports?

H2 header compression is pretty useful too. Sending repetitive headers (like user agent, referrer, auth tokens) is wasteful in HTTP/1.1. HPACK's Huffman encoding lets you win back the overhead of base64-encoded strings pretty easily, so the penalty for being restricted to safe characters in your headers is small.

Some people have mentioned that this protocol was designed for making advertising faster. While this is possibly true, Google is planning on using HTTP/2 as its new intra/inter-Datacenter RPC transport (See gRPC). The protocol is good enough to support browsers, mobile, and servers without having to transliterate between protocols.
