ezdiy's comments | WingNews

ezdiy | 8 years ago | on: Rationale: Or why am I bothering to rewrite nanomsg?

TLB is only an indirect cause. This is because the kernel scheduler preempts processes fairly infrequently (100 or 1000 Hz, or dynamic, but still capped to a small number).

Scheduling quanta are so large precisely to keep the TLB flush overhead of a context switch low. If a network mandates more interaction (say, 100k req/s across all workers), each quantum tick (at 100 Hz) must process a queued bundle of 1000 requests which piled up while the process was asleep. This works as designed - you're supposed to use up all of your quantum, and not terminate it early by issuing blocking IO per request. One prerequisite for this is that your network/disk protocol must be pipelineable (most are, because that's how we deal with network/seek latencies).
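
To make the arithmetic above concrete, here is a minimal sketch. The function name is made up for illustration, and it assumes requests arrive evenly and the process wakes exactly once per tick:

```python
def requests_per_quantum(total_req_per_s: int, tick_hz: int) -> float:
    """Average number of requests that pile up between two scheduler ticks.

    Assumes evenly spaced arrivals and exactly one wakeup per tick.
    """
    return total_req_per_s / tick_hz


# The two fixed tick rates mentioned above, at 100k req/s.
for hz in (100, 1000):
    print(f"{hz} Hz tick: {requests_per_quantum(100_000, hz):.0f} requests per quantum")
```

At 100 Hz this gives the bundle of 1000 requests; at 1000 Hz the bundle shrinks to 100, at the price of ten times as many context switches.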

But at a certain point the overhead of this pipelining itself becomes so great (message queues too deep) that you have to switch to threading.

Hardcore threading advocates, on the other hand, need to account for the overhead of atomics (for locking, or for "lockless" algorithms). An atomic must wait for all pending writebacks to flush. Threading gets a bad rep not because "kernels suck at it", but because the person making such a statement wrote their program as an exercise in lock contention and/or too much write cache pollution per single atomic.

Threading vs process tradeoff = deep pipeline overhead vs frequent queue flush+locking overhead tradeoff.

Typically, you need to meet somewhere in the middle for best performance, which is when you end up with threads with job queues - those basically emulate process-induced queues within the thread model.
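
A rough sketch of that middle ground, assuming a trivial job (doubling an integer) and hypothetical names throughout: each worker thread blocks for one job, then drains whatever else has queued up, and takes the shared lock once per batch rather than once per job.

```python
import queue
import threading


def worker(jobs: queue.Queue, results: list, lock: threading.Lock) -> None:
    while True:
        batch = [jobs.get()]              # sleep until at least one job exists
        while True:                       # then drain the backlog without blocking
            try:
                batch.append(jobs.get_nowait())
            except queue.Empty:
                break
        stops = batch.count(None)         # None is the stop marker
        for _ in range(stops - 1):
            jobs.put(None)                # hand surplus stop markers back to other workers
        done = [job * 2 for job in batch if job is not None]
        with lock:                        # one lock acquisition per batch, not per job
            results.extend(done)
        if stops:
            return


def run(n_jobs: int, n_workers: int) -> list:
    jobs: queue.Queue = queue.Queue()
    results: list = []
    lock = threading.Lock()
    threads = [threading.Thread(target=worker, args=(jobs, results, lock))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for i in range(n_jobs):
        jobs.put(i)
    for _ in threads:
        jobs.put(None)                    # one stop marker per worker, after all jobs
    for t in threads:
        t.join()
    return sorted(results)
```

Something like `run(100, 4)` processes all 100 jobs; how large the batches get depends on how far the producer runs ahead of the workers.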

ezdiy | 8 years ago | on: Rationale: Or why am I bothering to rewrite nanomsg?

HTTP2 is all about "tcp-inside-tcp" states.

https://github.com/golang/net/blob/master/http2/http2.go#L81

Wherever possible, they do the sensible thing - just a goroutine for each flow, fed by the muxed frame parser via channels. But the state for each flow must still be tracked in the topmost flow dispatcher - a goroutine can't tear itself down on timeouts, inspect OOB signalling of the topmost stream and such.
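
A toy model of that split, not the golang/net code: the class and method names are made up, only three of the HTTP/2 stream states are modelled, and a plain list stands in for the channel to the per-flow worker. The point is that the dispatcher, not the worker, owns the state and reacts to resets.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    OPEN = "open"
    HALF_CLOSED_REMOTE = "half-closed (remote)"
    CLOSED = "closed"


@dataclass
class Stream:
    state: State = State.OPEN
    inbox: list = field(default_factory=list)  # stands in for a channel to the worker


class Dispatcher:
    """Demuxes frames by stream id and keeps the per-stream state itself."""

    def __init__(self) -> None:
        self.streams: dict = {}

    def on_frame(self, stream_id: int, kind: str,
                 payload: bytes = b"", end_stream: bool = False) -> None:
        if kind == "HEADERS" and stream_id not in self.streams:
            self.streams[stream_id] = Stream()
        stream = self.streams.get(stream_id)
        if stream is None or stream.state is State.CLOSED:
            return  # toy model: ignore; a real endpoint would signal an error
        if kind == "RST_STREAM":
            stream.state = State.CLOSED        # teardown is decided here, not in the worker
            return
        if stream.state is State.HALF_CLOSED_REMOTE:
            return  # the peer already ended its side; drop anything further
        if kind == "DATA":
            stream.inbox.append(payload)
        if end_stream:
            stream.state = State.HALF_CLOSED_REMOTE
```

Feeding it HEADERS and DATA for streams 1 and 3, then RST_STREAM for stream 1, leaves stream 1 closed with later DATA dropped, while stream 3 is unaffected.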

ezdiy | 8 years ago | on: The GNUnet System

What do you mean "accidentally"? You either have the file pinned, or not. If it's not pinned, you're not seeding anything. If it's pinned, you're liable same as with BT.

ezdiy | 8 years ago | on: The GNUnet System

> IPFS feels closest to it as of now.

I'm rather skeptical. IMO, IPFS is just overengineered bittorrent. At the lowest layer, the semantics are the same, except with really bad ideas thrown into the mix - wantlist instead of bitmaps, forcing DAG in places where there's no use for it, all driving overall performance of the network into the ground.

ezdiy | 8 years ago | on: The GNUnet System

Git would come closest. Note that while it's not common now (aside from some first canaries, like scuttlebutt), Git works perfectly fine as a P2P opennet. You are much less trapped on GitHub, because you can just take your repo and the complete history in it anywhere you want - be it a central server, or some sort of P2P overlay.

ezdiy | 8 years ago | on: The GNUnet System

>Then when we've all sunk into all these convenient cloud services and "easy to use" disposable devices, we'll have lost all of our privacy and power.

Convenience can be engineered into P2P, too.

With Bitcoin, people definitely get the convenience of the unregulated speculative asset they wanted (presumably because real estate is even more obtuse than bitcoin).

With Bittorrent, people definitely get the convenience of having access to obscure content (though Netflix is a great counterexample to it).

> And yet we'll have people argue that these open source and federated/distributed systems are "too confusing" and "not practical" and that we shouldn't even try to avoid this future.

I think the best route would be that of Linux vs Android. Which has already happened to a certain degree, with Mastodon for instance - someone "privatizes" the underlying open fabric and puts a nice "convenience trap" on top of it to "attract adoption".

The issue here is mostly that investors in such an endeavor are seeking total control of the userbase, an engineered artificial "inconvenience of switching platform".

Internet behemoths should not be called out on "dangers of getting regulated and handing control to the government" (frankly, anything can happen, not just that), but on keeping their platforms un-interoperable on purpose in a bid to attain a monopoly through network effects, burdening the entrapped users with "inconvenience of limited frontend choice". They should be called out the same way we called out Microsoft back then, or say, Comcast now.

ezdiy | 9 years ago | on: An off-grid social network

All pubs on the wiki are indeed overloaded. Interestingly if one sets up their own, the other pubs eventually sync with it, only the desktop client seems to be unhappy with laggy pubs. Is that by design?

FWIW, you can use pub.lua.cz:8008:@xYSW6eVu8gTS/nTSXZiH97dgKZ+wp7NkomR6WKK/PBI=.ed25519~iQ16RuvjKZqy/RhiXXmW9+6wuZNq+SBI8evG3PotxvI= if you have trouble connecting to the ones on github.

Feel free to add it to the wiki, I do plan to run it long term, but I am not a github user.

ezdiy | 9 years ago | on: An off-grid social network

If you want a "modern web" example, there's nntpchan - despite the name, it's not related to usenet directly, it only uses the same method of federated pub/sub replication.

ezdiy | 9 years ago | on: Opera 12.15 source code

> using different types for things that are web observable, especially

Can you be more specific?

> has UI lagginess with large numbers of tabs due to IPC

The IPC lock contention is part of the issue (as well as tab per process model), but those can be all worked around (sorta, well, chromium got rid of no-sandbox ifdefs a year ago...).

> Vivaldi is mostly formed of ex-Opera employees

Morten Stenshorn, Dave Rune, Rafal Chlodnicki, Sigbjorn Finne - more than half of the names in there! - are the names signed under the layout engine and ES engine in the leaked tree. Same people you can find on https://operasoftware.github.io/upstreamtools/

How many people could I track down who worked on Opera 12 and now work on Vivaldi? None.

I'm willing to counter-speculate: whatever made the original Presto great was not their cofounder and CEO, but the people who actually wrote the code.

ezdiy | 9 years ago | on: Opera 12.15 source code

The answer is - it's sorta comparing apples and oranges - libraries, or general C++ inlining cruft can inflate the binary size a lot indeed.

A much fairer comparison is to compare the sources side by side. Chromium source is about 4x larger than Opera's, when not counting any 3rd party dependencies.

Or even better, compile times. A Chromium build (or Firefox) is a half-day job on a mid-range laptop (especially with 4/8G memory).

Opera builds in about 20 minutes. I was also pleasantly surprised that the codebase is not particularly bitrotten, and both VS2015 and modern gcc could cope with it.

ezdiy | 9 years ago | on: Opera 12.15 source code

I'm posting this from chropera as we speak, and I used presto for a decade before that.

While your points are spot on, there's one thing about Presto - performance, or more specifically resource usage. Modern layout engines are simply not careful, because apparently all their devs have 32GB RAM and 8 cores, so users have to as well.

The problem is the argument "people generally don't have 50 tabs open, so why bother". Well, it bothers people who could comfortably do so in presto, it's not just nostalgia goggles.

As for vivaldi, I really tried to like it, putting up with its numerous subtle bugs. Then I deobfuscated its source code one day and realized that if it ever becomes stable, it will be the day a large scale nodejs project managed to do so. For some reason (cheap labor?) they decided Java or TypeScript isn't the route.

ezdiy | 10 years ago | on: Show HN: Trigrad, a novel image compression with interesting results

Love how the algorithm is simple compared to iDCT based ones, very good job!

> No text at all

Or indeed anything which maps poorly to gradients. For this I'm thinking that instead of storing pixel values per sample vertex, one could store only DCT coefficient(s) per tri - the result of which gets texture-like-mapped to the tri surface. Think JPEG, but using variably sized tris instead of 8x8 quads.

JPEG artifacts would then be much more fine grained around edges.

EDIT: It would not need to be as complex as full DCT because a lot of information is carried through tri shape/positioning on edges. The idea is to have "library of various gradient shapes" to pick from, not full 8x8 DCT matrix capable of reconstituting 8x8 bitmap approximations.

Once again, thanks for the inspiring implementation.

ezdiy | 10 years ago | on: The Crystal Programming Language

> A fully concurrent GC is impossible

That's a common misconception. Of course mutability gc barriers can be made atomic. But it comes at a significant synchronization cost, plus using a full shared world like in C does not seem like a good design decision anyway in a high level language like crystal.

Which is why I'd be more in favour of refcounting, and let the user make the choice - a simple stop-world gc mark&sweep is alright for tasks which can afford the higher memory usage and pauses (one gains good throughput), or rc - good for low latency, low memory usage (and low throughput and high cache pollution).

Regarding multi core threading, crystal has next to none. All modern gc design decisions depend on how exactly multicore threading will eventually be implemented. Hence fixing gc is not a priority, but rc could be readily useful.

ezdiy | 10 years ago | on: The Crystal Programming Language

The current channels implementation uses fibers, i.e. everything runs on a single thread and only the stack is switched. There is no GIL.
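
For a feel of what that model means, here is a loose analogy, not Crystal's implementation: generators stand in for fibers, a deque stands in for a channel, and all names are hypothetical. Everything runs on one thread and control changes hands only at explicit switch points, so no locking is needed around the shared queue.

```python
from collections import deque


def producer(channel: deque, n: int):
    for i in range(n):
        channel.append(i)
        yield                      # cooperative switch point


def consumer(channel: deque, out: list, n: int):
    while len(out) < n:
        if channel:
            out.append(channel.popleft())
        yield                      # nothing to do right now, let others run


def run(tasks) -> None:
    """Round-robin scheduler: resume each task until it finishes."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)             # run the task up to its next switch point
        except StopIteration:
            continue               # task finished, do not reschedule it
        ready.append(task)
```

Running a producer and a consumer through `run` moves every item across in order, without a single lock or atomic.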

If you want true multithreading, simply use pthreads like you would in C. Just make sure to mark gc roots of new thread stacks (gc is thread aware). But this also comes with a lot of headaches, as you're now responsible for synchronizing everything and managing mutexes by hand.

ezdiy | 10 years ago | on: The Crystal Programming Language

Indeed it's becoming quite promising (I'm using crystal already for a few pet scripts which were too slow even for rubinius). Where it's most lacking at the moment is gc - it uses the stop-world off the shelf boehmgc, which is ok but not exactly great for memory heavy tasks.

ezdiy | 11 years ago | on: Video games beat interviews to recruit the very best

While the efforts of Matasano to make wargaming cool and hip again certainly are commendable, I think HR drones of the world would also appreciate a game of "pretend to be sociable teamplayer/brogrammer".

Too often hackers don't get the job because of being overqualified prima donnas, rather than for lack of technical skill - a game teaching/testing the art of office politics could be applicable to a wider market [outside of infosec].

tl;dr: Perhaps less of http://alexnisnevich.github.io/untrusted/ and more of EVE online.

ezdiy | 13 years ago | on: Linux local privilege escalation 0day, 2.6.37 - 3.8.10

Spender did an excellent write-up on exactly how it works: http://www.reddit.com/r/netsec/comments/1eb9iw/sdfucksheepor...

ezdiy | 13 years ago | on: Linux local privilege escalation 0day, 2.6.37 - 3.8.10

Try dmesg.

ezdiy | 13 years ago | on: Linux local privilege escalation 0day, 2.6.37 - 3.8.10

You're correct.

Bug is in 2.6.37-3.8.8, fixed in 3.8.9.

http://lxr.linux.no/linux+v3.8.9/kernel/events/core.c#L5331

ezdiy | 13 years ago | on: Linux local privilege escalation 0day, 2.6.37 - 3.8.10

Not that I know of, hence 0day.