Redis on the Raspberry Pi: Adventures in unaligned lands

[+] drej|8 years ago|reply

I never deal with such low level issues, so I don't have to read this, but... reading these posts by antirez is such a joy. He makes this topic so clear and understandable, he doesn't assume much, he doesn't use overly complex explanations, he just "says it like it is" :-)

Thanks!

[+] hellwd|8 years ago|reply

++ :)

[+] drewg123|8 years ago|reply

I fondly remember unaligned access faults "back in the day" with FreeBSD/alpha. We implemented a fixup for applications, but not for the kernel. I seem to recall that even though x86 could deal with unaligned accesses, it caused minor performance problems, so fixing alignment issues on alpha would benefit x86 as well.

Most (definitely not all) of the mis-alignment problems were in the network stack, and were centered around the fact that ethernet headers are 14 bytes, while nearly all other protocols had headers that were a multiple of at least 4 bytes.
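
To make the arithmetic concrete, here is a minimal sketch in C (the helper names `payload_offset` and `is_aligned` are made up for illustration, not taken from any network stack). It shows that a 14-byte header leaves the next header 2 bytes off a 4-byte boundary, and that starting the frame 2 bytes into an aligned buffer, a workaround many stacks use, puts it back on one.

```c
#include <assert.h>
#include <stddef.h>

/* 6 bytes destination MAC + 6 bytes source MAC + 2 bytes EtherType. */
#define ETH_HEADER_LEN 14u

/* Offset of the payload (e.g. an IP header) from the start of an aligned
 * buffer, when `pad` bytes are skipped before the Ethernet header.
 * Hypothetical helper, for illustration only. */
static size_t payload_offset(size_t pad)
{
    return pad + ETH_HEADER_LEN;
}

/* Returns 1 if `offset` is a multiple of `alignment`.
 * `alignment` must be a non-zero power of two. */
static int is_aligned(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset & (alignment - 1)) == 0;
}
```

With no padding the payload starts at offset 14, which is not a multiple of 4; with 2 bytes of padding it starts at 16, which is. A 16-byte header would have given the same result with no padding at all.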

I've said it before, and I'll say it again: If I had a time machine, I would not kill Hitler. I'd go back to the 70s and make the ethernet header be 16 bytes long, rather than 14.

[+] IgorPartola|8 years ago|reply

Why in god's name did they make it 14?!

[+] blattimwind|8 years ago|reply

There is a funny mode on ARM processors (turned on in some images, by default) which causes unaligned reads to silently return bogus data (just increasing a kernel counter).

PowerPC, and really, most non-x86 architectures, do this one way or another.

[+] faragon|8 years ago|reply

PowerPC (and POWER) has reasonable hardware support for unaligned memory access, at least for 32-bit data, and if the data is in the data cache. Depending on the processor, the exceptions that reach the OS can be more or less frequent.

ARM v6-A and later (except for some microcontrollers, like Cortex M0/R0, that don't support hardware unaligned access at all and trigger an exception instead) is similar to the Intel x86 case: there is hardware support for transparent unaligned memory access, except for SIMD, where x86 can raise exceptions too on unaligned loads/stores.

For software that makes intensive use of non-aligned data access, e.g. data compression algorithms doing string search, PowerPC, ARM v6-A (and later ARM Application processors), new MIPS with HW support for unaligned memory access, and Intel are pretty much the same (i.e. b = *(uint32_t *)(a + 23) will take 1-2 cycles, without requiring a memcpy(&b, a + 23, sizeof(uint32_t))).
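
For comparison, the memcpy form mentioned above can be wrapped in a small helper. This is only a sketch: `load_u32` and `load_u32_le` are hypothetical names, and whether the compiler reduces the memcpy to a single load depends on the target and optimization level.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable unaligned 32-bit load in host byte order: copy the bytes into
 * an aligned local. This is well defined at any address, unlike the cast
 * *(uint32_t *)p, which is undefined behavior in C when p is misaligned. */
static uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Byte-by-byte variant that always reads little-endian, independent of
 * both alignment and host byte order. */
static uint32_t load_u32_le(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Calling `load_u32(a + 23)` reads the same four bytes as the cast, but never performs a misaligned access in the source program, so it also works on targets that fault.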

For SIMD, though, there is no transparent fix, although there are specific opcodes for aligned and unaligned memory access (e.g. load/store, unaligned load/store).

[+] throwaway000002|8 years ago|reply

I'm probably the only weirdo that thinks this, but if you support byte-addressing you'd better as well be happy with byte-alignment. Atomics being the only place where it's reasonable to be different.

Which brings me to padding. I wonder what percentage of memory of the average 64-bit user's system is padding? I'm afraid of the answer. The heroes of yesteryear could've coded miracles in the ignored spaces in our data.
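
A small sketch of where that padding comes from, assuming a C compiler that aligns each member naturally (the struct names and `padding_bytes` are invented for this example). Both structs hold the same 10 bytes of data; only the field order differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small fields on both sides of a wide one: the compiler has to pad
 * after `tag` and again after `flag` to keep `value` aligned. */
struct loose {
    uint8_t  tag;
    uint64_t value;
    uint8_t  flag;
};

/* Same fields, widest first: only tail padding remains. */
struct tight {
    uint64_t value;
    uint8_t  tag;
    uint8_t  flag;
};

/* Bytes in a struct that carry no data, given the sum of its field sizes. */
static size_t padding_bytes(size_t struct_size, size_t payload_bytes)
{
    assert(struct_size >= payload_bytes);
    return struct_size - payload_bytes;
}
```

On a typical 64-bit ABI `sizeof(struct loose)` is 24 and `sizeof(struct tight)` is 16, so the first layout spends 14 bytes on padding and the second 6. The exact numbers depend on the platform.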

[+] wzdd|8 years ago|reply

> if you support byte-addressing you'd better as well be happy with byte-alignment

All ARM processors do this. The concept is called "natural alignment" and it's pretty common on non-x86. See e.g. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.... . The problem here is that a lot of code written for x86 wants more than that, e.g. byte addressing for non-byte-wide values.

[+] pm215|8 years ago|reply

Alignment requirements are and have historically been very common -- you can see them on the PDP-11, the 680x0, and so on. It's only because a few very popular architectures like x86 have had very loose or no alignment requirements that we've ended up with a lot of code that assumes there is no alignment requirement, and this has dragged other architectures down the "we need to support this" path. If your architecture faults on misaligned accesses it's really not hard to deal with -- you have to be doing something a bit odd to even run into the problem usually.

[+] MrBuddyCasino|8 years ago|reply

Accessing memory locations ending in 0x7? Gather round the campfire folks, James Mickens has a story to tell: https://www.usenix.org/system/files/1311_05-08_mickens.pdf

[+] luhn|8 years ago|reply

> Redis is adding a “Stream” data type that is specifically suited for streams of data and time series storage, at this point the specification is near complete and work to implement it will start in the next weeks.

This sounds like it could be really exciting. Is there anywhere I can find out more?

Specifically, I've been struggling to find an appropriate backend for HTTP Server-Sent Events, could this feature help with that?

[+] antirez|8 years ago|reply

Hello, please check my two Redis Conf 2017 talks on youtube. There is info about Streams.

[+] yeswecatan|8 years ago|reply

Here's a discussion on reddit. There's a link to the proposal on github, too.

https://www.reddit.com/r/redis/comments/4mmrgr/stream_data_s...

[+] johnny22|8 years ago|reply

I'm pretty sure I saw implementations that used the existing publish subscribe mechanism in Redis to handle it and seemed happy with it. I have no personal experience with it though.

[+] msarnoff|8 years ago|reply

Recently I've been doing a lot of low-level work with ARMv7-M microcontrollers (specifically, NXP's Kinetis Cortex-M4 chips) and was quite pleased to find out that they are pretty lenient about unaligned accesses. To quote from the ARM Cortex-M4 Processor Technical Reference Manual:

"Unaligned word or halfword loads or stores add penalty cycles. A byte aligned halfword load or store adds one extra cycle to perform the operation as two bytes. A halfword aligned word load or store adds one extra cycle to perform the operation as two halfwords. A byte-aligned word load or store adds two extra cycles to perform the operation as a byte, a halfword, and a byte. These numbers increase if the memory stalls."

However, multi-word memory instructions (LDRD, STRD, LDM, STM, etc.) always require their arguments to be word-aligned.
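
The quoted rules can be written down as a tiny model. This is just a restatement of the paragraph from the manual in C, with a made-up function name (`extra_cycles`); it ignores memory stalls and does not cover the multi-word instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Extra cycles for a single load/store of `size` bytes (1, 2 or 4) at
 * address `addr`, following the quoted Cortex-M4 TRM paragraph.
 * Hypothetical model for illustration; memory stalls are not included. */
static unsigned extra_cycles(uint32_t addr, unsigned size)
{
    assert(size == 1 || size == 2 || size == 4);

    if (size == 1)
        return 0u;                    /* a byte access is always aligned */
    if (size == 2)
        return (addr & 1u) ? 1u : 0u; /* odd address: done as two bytes */

    /* size == 4 */
    if (addr & 1u)
        return 2u;                    /* byte, halfword, byte */
    if (addr & 2u)
        return 1u;                    /* two halfwords */
    return 0u;                        /* word aligned */
}
```

For example, a word access at 0x1002 costs one extra cycle and a word access at 0x1001 costs two.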

[+] type0|8 years ago|reply

Great article, this project just begs for the name Redisberry Pi

[+] JefeChulo|8 years ago|reply

In a future project I might be interested in using Redis for queuing jobs, so this comes in very handy for knowing early the main issues I could run into when developing.

[+] amelius|8 years ago|reply

Could Rust's typesystem catch unaligned pointer dereferences?

[+] bbatha|8 years ago|reply

Sort of. Rust is supposed to make references to packed structure members unsafe, but currently doesn't. An RFC was accepted to change the behavior, but it has not been fully implemented. Here's the tracking issue: https://github.com/rust-lang/rust/issues/27060

[+] wofo|8 years ago|reply

Considering dereferencing a pointer after doing some arithmetic on it can only be done within unsafe blocks, I would say you are at least warned about it. But it will happily compile.

[+] dis-sys|8 years ago|reply

Wondering what kind of performance overhead letting the kernel handle unaligned accesses is going to cause vs. fixing the software to actually always use aligned accesses?

[+] crncosta|8 years ago|reply

Nice article!

[+] k__|8 years ago|reply

OT: Is blattimwind shadow banned?

[+] make3|8 years ago|reply

? I see his post

[+] retox|8 years ago|reply

No, but posting while green will usually get your comment downvoted to oblivion, even if you are erudite and contribute to the conversation.

Turn on "show dead comments" and see how many greens are deleted. I screenshot many examples.

59 comments