Object-oriented design patterns in C and kernel development

ryao|6 months ago

> The article describes how the Linux kernel, despite being written in C, embraces object-oriented principles by using function pointers in structures to achieve polymorphism.

This technique predates object oriented programming. It is called an abstract data type or data abstraction. A key difference between data abstraction and object oriented programming is that you can leave functions unimplemented in your abstract data type while OOP requires that the functions always be implemented.

The sanest way to have optional functions in object oriented programming that occurs to me would be to have an additional class for each optional function and inherit each one you implement alongside your base class via multiple inheritance. Then you would need to check at runtime whether the object is an instance of the additional class before using an optional function. With an abstract data type, you would just do a simple NULL check to see if the function pointer is present before using it.
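
A minimal sketch of that NULL check, assuming a hypothetical `stream_ops` table in which `flush` is optional. None of these names come from the kernel or the article; they are only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operations table: `flush` is optional and may be NULL. */
struct stream_ops {
    int (*write)(void *self, int value);   /* required */
    int (*flush)(void *self);              /* optional */
};

struct stream {
    const struct stream_ops *ops;
    int last;      /* last value written */
    int flushed;   /* number of flushes performed */
};

static int mem_write(void *self, int value)
{
    struct stream *s = self;
    s->last = value;
    return 0;
}

static int mem_flush(void *self)
{
    struct stream *s = self;
    s->flushed++;
    return 0;
}

/* One implementation provides flush, the other leaves the slot unset (NULL). */
static const struct stream_ops buffered_ops   = { .write = mem_write, .flush = mem_flush };
static const struct stream_ops unbuffered_ops = { .write = mem_write };

/* Caller-side helper: the NULL check is what makes the function optional. */
static int stream_flush(struct stream *s)
{
    assert(s != NULL && s->ops != NULL);
    if (s->ops->flush == NULL)
        return -1;              /* operation not supported by this object */
    return s->ops->flush(s);
}
```

Calling `stream_flush()` on an object built with `unbuffered_ops` returns -1 instead of crashing, which is the whole cost of making the operation optional.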

pavlov|6 months ago

In Smalltalk and Objective-C, you just check at runtime whether an object instance responds to a message. This is the original OOP way.

It's sad that OOP was corrupted by the excessively class-centric C++ and Java design patterns.

trws|6 months ago

I largely agree, and use these patterns in C, but you’re neglecting the usual approach of having a default or stub implementation in the base for classic OOP. There’s also the option of using interfaces in more modern OOP or concept-style languages where you can cast to an interface type to only require the subset of the API you actually need to call. Go is a good example of this, in fact doing the lookup at runtime from effectively a table of function pointers like this.
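
The default/stub approach can also be sketched in C itself. This is only an illustration under assumed names (`shape_ops`, `shape_init`, `default_area` are hypothetical): the init function fills every unset slot with a base stub, so callers never need a NULL check.

```c
#include <assert.h>
#include <stddef.h>

struct shape;

/* Hypothetical interface; every slot is callable after shape_init(). */
struct shape_ops {
    int (*area)(const struct shape *self);
    int (*sides)(const struct shape *self);
};

struct shape {
    struct shape_ops ops;   /* resolved table: overrides plus defaults */
    int w, h;
};

/* Base stubs used when a "subclass" does not override a slot. */
static int default_area(const struct shape *self)  { (void)self; return 0; }
static int default_sides(const struct shape *self) { (void)self; return 0; }

static void shape_init(struct shape *s, const struct shape_ops *overrides,
                       int w, int h)
{
    assert(s != NULL);
    s->ops.area  = (overrides && overrides->area)  ? overrides->area  : default_area;
    s->ops.sides = (overrides && overrides->sides) ? overrides->sides : default_sides;
    s->w = w;
    s->h = h;
}

/* A "subclass" that overrides only area and inherits the sides stub. */
static int rect_area(const struct shape *self) { return self->w * self->h; }
static const struct shape_ops rect_overrides = { .area = rect_area };
```

The trade-off against the NULL-check style is that a caller can no longer tell whether an operation is really supported, only that calling it is safe.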

1718627440|6 months ago

> This technique predates object oriented programming.

I would rather say that OOP is a formalization of pre-existing patterns and paradigms.

mistrial9|6 months ago

The concept of abstract data type is a real idea in the days of compiler design. You might as well say "compiler design predates object oriented programming". The technique described in the lead is used to implement object-oriented programming structures, just as it says. So are lots of compiler design features under the hood.

Source: I wrote a windowing framework for MacOS using this pattern and others, in C with MetroWerks at the time.

kerblang|6 months ago

You can do exactly what was done in C with most OOP languages like Java & C# because you have lambdas now, and lambdas are just function pointers. You can literally assign them to instance variables (or static variables).

(sorry it took more than a decade for Java to catch up and Sun Microsystems originally sued Microsoft for trying to add lambdas to java way back when, and even wrote a white paper insisting that anonymous inner classes are a perfectly good substitute - stop laughing)

yndoendo|6 months ago

Inheritance is not needed when a composite pattern can be used.

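class DefaultTask { void DoIt() { } }

class SpecialTask { void DoIt() { } }

// UsedItem delegates to a task object it owns instead of inheriting from it.
class UsedItem {

    private SpecialTask _task;

    UsedItem() { _task = new SpecialTask(); }

    void DoIt() { _task.DoIt(); }

}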

Is Python an OOP language? The self / this / object pointer has to be passed explicitly, similar to C-style object-oriented / data abstraction code.

Shorel|6 months ago

An abstract data type is a software design pattern.

The difference is that design patterns are a technique where you use features not implemented by the compiler or language, and all the checks have to be done by the developer, manually.

Thus, you are doing part of the work of the compiler.

In assembler, a function call is a design pattern.

pakl|6 months ago

A few years ago Peterpaul developed a lightweight object-oriented system on top of C that was really pleasant to use[0].

No need to pass in the object explicitly, etc.

Doesn't have the greatest documentation, but has a full test suite (e.g., [1][2]).

[0] https://github.com/peterpaul/co2

[1] https://github.com/peterpaul/co2/blob/master/carbon/test/pas...

[2] https://github.com/peterpaul/co2/blob/master/carbon/test/pas...

guerrilla|6 months ago

For people wondering what it looks like without the syntactic sugar of carbon then look here [0]. As far as I can see, there's no support for parametric polymorphism.

0. https://github.com/peterpaul/co2/tree/master/examples/my-obj...

saagarjha|6 months ago

I feel like Vala tries to fit in this niche too.

1718627440|6 months ago

> Having to pass the object explicitly every time feels clunky, especially compared to C++ where this is implicit.

I personally don't like implicit this. You are very much passing a this instance around, as opposed to a class method. Also explicit this eliminates the problem, that you don't know if the variable is an instance variable or a global/from somewhere else.

MontyCarloHall|6 months ago

Agreed, one of the biggest design mistakes in the OOP syntax of C++ (and Java, for that matter) was not making `this` mandatory when referring to instance members.

loeg|6 months ago

I think the author is talking about this:

  object->ops->start(object)

Where not only is it explicit, but you need to specify the object twice (once to resolve the Vtable, and a second time to pass the object to the stateless C method implementation).

spacechild1|6 months ago

> Also explicit this eliminates the problem, that you don't know if the variable is an instance variable or a global/from somewhere else.

People typically use some kind of naming convention for their member variables, e.g. mFoo, m_Foo, m_foo, foo_, etc., so that's not an issue. I find `foo_` much more concise than `this->foo`. Also note that you can use an explicit `this` in C++ if you really want to.

Gibbon1|6 months ago

The implicit this sounds to me like magic. Magic!

Ask how do I do this, well see it's magic. It just happens.

Something went wrong? That's also magic.

After 40 years I hate magic.

Galanwe|6 months ago

I don't quite agree, especially because the implicit this not only saves you from explicitly typing it, but also because by having actual methods you don't need to add the struct suffix to every function.

    mystruct_dosmth(s);
    mystruct_dosmthelse(s);

vs

    s->dosmth();
    s->dosmthelse();

ActorNightly|6 months ago

You can also get clever with macros.

elteto|6 months ago

...and C++ added explicit this parameters (deducing this) in C++23.

ryao|6 months ago

“this” is a reserved keyword in C++, so you do not need to worry about it being a global variable.

That said, I like having a this pointer explicitly passed as it is in C with ADTs. The functions that do not need a this pointer never accidentally have it passed from the developer forgetting to mark the function static or not wanting to rewrite all of the function accesses to use the :: operator.

tdrnl|6 months ago

A talk[0] about Tmux is where I learned about this pattern in C.

I wrote about this concept[1] for my own understanding as well -- just tracing an instance of the pattern through the tmux code.

[0] https://raw.githubusercontent.com/tmux/tmux/1536b7e206e51488...

[1] https://blog.drnll.com/tmux-obj-oriented-commands

wosined|6 months ago

Hi, I don't know much about this. But it seems to me that the OP is doing it differently than the kernel devs. If you read the article that the OP links, then you get the impression that the vtables contain typed function pointers, while OP uses void pointers. Also the main benefit mentioned in the kernel dev article is that you save memory, by not having multiple function pointers in each structure instance, but instead you have just one pointer to a vtable in each instance. Thus the main benefit is saving memory according to kernel dev, but OP uses this vtable as a form of indirection to implement runtime method swapping and polymorphism, which is not even mentioned in the kernel dev article. Thus, OP uses some other pattern than the one mentioned by kernel dev.

1718627440|6 months ago

> while OP uses void pointers

OP doesn't use void pointers, he uses void. He writes about functions having no arguments and returning nothing for the same reason other blog posts name functions foo and bar.

> OP uses this vtable as a form of indirection to implement runtime method swapping and polymorphism

The kernel uses vtables to implement polymorphism, it doesn't store the vtable in the object to save space. If there is no polymorphism, you don't use a vtable at all, that's saving even more space.
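
A rough sketch of both points, using made-up names (`dev`, `dev_ops`, `fat_dev` are hypothetical, not kernel types): the shared table gives polymorphism, and keeping a single pointer to it makes each instance smaller than embedding the function pointers.

```c
#include <assert.h>
#include <stddef.h>

struct dev;

/* Hypothetical operations table shared by all devices of one kind. */
struct dev_ops {
    int (*open)(struct dev *d);
    int (*close)(struct dev *d);
    int (*read)(struct dev *d);
    int (*write)(struct dev *d, int v);
};

/* Layout A: one pointer to a shared table per instance. */
struct dev {
    const struct dev_ops *ops;
    int state;
};

/* Layout B: every instance carries its own copy of all function pointers. */
struct fat_dev {
    struct dev_ops ops;
    int state;
};

static int zero_read(struct dev *d)  { (void)d; return 0; }
static int state_read(struct dev *d) { return d->state; }

/* Two "classes": same interface, different behaviour. Unset slots are NULL. */
static const struct dev_ops zero_ops  = { .read = zero_read };
static const struct dev_ops state_ops = { .read = state_read };

/* The call site is identical for both kinds; the table picks the behaviour. */
static int dev_read(struct dev *d)
{
    assert(d != NULL && d->ops != NULL && d->ops->read != NULL);
    return d->ops->read(d);
}
```

On mainstream platforms `sizeof(struct dev)` is smaller than `sizeof(struct fat_dev)`, and the gap grows with every operation added to the table.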

SLWW|6 months ago

I've done this on a few smaller projects when I was in college. It's fun bringing something similar to OOP into C; however you can get into trouble really quickly if you are not careful.

munchler|6 months ago

Note that this is using interfaces (i.e. vtables, records of function pointers), not full object-orientation. Other OO features, like classes and inheritance, have much more baggage, and are often not worth the associated pain.

1718627440|6 months ago

What do you think inheritance is, if not composition of vtables? What do you think classes are, if not a composition of a vtable and scoped variables?

PhilipRoman|6 months ago

Field inheritance is surprisingly natural in C, where a pointer to a struct can be cast to a pointer to its first member.
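
A small sketch of that cast, with hypothetical `animal` and `dog` types. It relies on the C rule that a pointer to a struct, suitably converted, points to its first member and vice versa.

```c
#include <assert.h>
#include <stddef.h>

struct animal {
    const char *name;
    int legs;
};

struct dog {
    struct animal base;   /* must be the FIRST member for the casts below */
    int tricks;
};

/* Works on any "subclass" that embeds struct animal as its first member. */
static int animal_legs(const struct animal *a)
{
    assert(a != NULL);
    return a->legs;
}

/* Upcast: a pointer to the struct is also a pointer to its first member. */
static struct animal *dog_as_animal(struct dog *d)
{
    return (struct animal *)d;
}

/* Downcast: only valid if `a` really points at the base of a struct dog. */
static struct dog *animal_as_dog(struct animal *a)
{
    return (struct dog *)a;
}
```

The upcast is always safe; the downcast is the programmer's responsibility, since nothing in the type system records which "subclass" the base pointer came from.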

ryao|6 months ago

vtables contain function pointers to functions that take “this” pointers. The author mentions struct file_operations as an example of a vtable. struct file_operations contains a pointer to a function that does not take “this” pointer. It is not even a vtable.

accelbred|6 months ago

I usually put an inline wrapper around vtable functions so that `thing->vtable->foo(thing, ...)` becomes `foo(thing, ...)`.
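
Roughly what such a wrapper could look like. The `thing`, `vtable` and `foo` names follow the comment above; the `int` argument and the `bias` field are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

struct thing;

struct thing_vtable {
    int (*foo)(struct thing *self, int x);
};

struct thing {
    const struct thing_vtable *vtable;
    int bias;   /* hypothetical per-object state */
};

/* Inline wrapper: callers write foo(thing, x) and never touch the vtable. */
static inline int foo(struct thing *thing, int x)
{
    assert(thing != NULL && thing->vtable != NULL && thing->vtable->foo != NULL);
    return thing->vtable->foo(thing, x);
}

/* One concrete implementation and its table. */
static int add_bias(struct thing *self, int x)
{
    return x + self->bias;
}

static const struct thing_vtable add_vtable = { .foo = add_bias };
```

The object is named only once at the call site, and the wrapper is a natural place for sanity checks or a default when a slot is unset.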

2OEH8eoCRo0|6 months ago

Yup. I've often wondered why there is an aversion to C++, since they are obviously using objects. Is it that they don't want to also enable all the C++ language junk like templates or OO junk like inheritance?

nphardon|6 months ago

Here's one example. For us, it's more a tradeoff than an aversion. There are pros (manual memory management in C) and cons (manual memory management in C) for each. We do math operations (dense and sparse matrix math for setting up and solving massive systems of differential equations) on massive graphs with up to billions of nodes and edges. We use C in parts of the engine because we need to manage memory at a very fine level to meet performance demands on our tool. Other parts of the tool use C++ because they decided the tradeoff benefited in the other direction, re memory access / management / ease of use. As a result we need really robust QA around memory leaks etc. and tbh we rely on one generational talent of an engineer to keep things from falling apart; but we get that speed. As a side note, we implement objects in C a little more complex than the OP's, so that the object really does end up as a black box to the user (other engineers), with all the beauty of data agnosticism.
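
One common way to get that kind of black box in C is an opaque (incomplete) struct type. The sketch below is a generic illustration with a hypothetical `counter` object, not a description of the tool mentioned above; header and implementation are shown in one block for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what would live in a public header (hypothetical counter.h) ---- */
struct counter;                          /* incomplete type: layout is hidden */
struct counter *counter_new(int start);
void counter_increment(struct counter *c);
int counter_value(const struct counter *c);
void counter_free(struct counter *c);

/* ---- what would live in the private implementation (counter.c) ---- */
struct counter {
    int value;
};

struct counter *counter_new(int start)
{
    struct counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c;                            /* NULL on allocation failure */
}

void counter_increment(struct counter *c)
{
    assert(c != NULL);
    c->value++;
}

int counter_value(const struct counter *c)
{
    assert(c != NULL);
    return c->value;
}

void counter_free(struct counter *c)
{
    free(c);                             /* free(NULL) is a no-op */
}
```

Code that includes only the header part cannot read or write `value` directly, so the layout can change without touching any caller.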

1718627440|6 months ago

C makes it obvious where you use that dynamism and where you don't. Syntactic sugar doesn't really make that much of a difference and also restricts more creative uses.

The C syntax is not really that complicated. Dynamic dispatch and virtual methods were already in the article. Here is inheritance:

    struct Subclass {
        struct Baseclass base;
    };

That's not really that complicated. Sure, you need to encapsulate every method of the parent class, if you want to expose it. But you are also recommended to do that in other languages, and if you subclass you probably want to slightly modify behaviour anyway.
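
A sketch of what that encapsulation might look like. `Baseclass` and `Subclass` follow the snippet above; the fields and the `_get` functions are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct Baseclass {
    int value;               /* hypothetical field */
};

static int Baseclass_get(const struct Baseclass *b)
{
    assert(b != NULL);
    return b->value;
}

struct Subclass {
    struct Baseclass base;   /* embedded parent, as in the snippet above */
    int offset;              /* hypothetical extra field */
};

/* Exposes the parent's method unchanged by forwarding to it. */
static int Subclass_get(const struct Subclass *s)
{
    assert(s != NULL);
    return Baseclass_get(&s->base);
}

/* Slightly modified behaviour built on top of the parent's method. */
static int Subclass_get_shifted(const struct Subclass *s)
{
    assert(s != NULL);
    return Baseclass_get(&s->base) + s->offset;
}
```

Each forwarded method is one line, and anything not forwarded simply is not part of the subclass's interface.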

As for stuff like templates: C doesn't think everything needs to be in the compiler. For example, shadowing and hiding symbols can be done by the linker, since this is the component that handles symbol resolution across different units anyway. When you want templates, either you actually want a cheap form of runtime dynamism, in which case do that, or you want source code generation. Why does the compiler need to do that? For the basics there is a separate tool in the language: the preprocessor. If you want more, you are free to choose your tool. If you want a macro language, there is e.g. M4. If you want another generator, just use it. If you feel no tool really cuts it, why don't you write your code generator in C?
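
For the preprocessor case, a minimal sketch of template-like code generation. `DEFINE_STACK` and the generated names are hypothetical, chosen only to show one macro stamping out a typed container per element type.

```c
#include <assert.h>
#include <stddef.h>

/* Generates a fixed-capacity stack type plus push/pop for element type T. */
#define DEFINE_STACK(T, NAME, CAP)                                  \
    struct NAME {                                                   \
        T items[CAP];                                               \
        size_t len;                                                 \
    };                                                              \
    static inline int NAME##_push(struct NAME *s, T v)              \
    {                                                               \
        assert(s != NULL);                                          \
        if (s->len == (CAP))                                        \
            return -1;              /* full */                      \
        s->items[s->len++] = v;                                     \
        return 0;                                                   \
    }                                                               \
    static inline int NAME##_pop(struct NAME *s, T *out)            \
    {                                                               \
        assert(s != NULL && out != NULL);                           \
        if (s->len == 0)                                            \
            return -1;              /* empty */                     \
        *out = s->items[--s->len];                                  \
        return 0;                                                   \
    }

/* Two "instantiations", each with its own type-checked functions. */
DEFINE_STACK(int, int_stack, 4)
DEFINE_STACK(double, double_stack, 4)
```

Each instantiation is ordinary, statically dispatched C, so there is no runtime cost; the price is macro-heavy source and less helpful compiler diagnostics.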

BinaryIgor|6 months ago

I always wonder why nothing similar has made it into a new C version. Clearly there is significant demand for it: lots of people are reimplementing the same (or a similar) set of patterns.

1718627440|6 months ago

Whenever you invent syntactic sugar you need to make some usage blessed and some usage impossible, or at least requiring a fall back to the old way without syntactic sugar. See https://news.ycombinator.com/item?id=45040662. Also, part of the point of C is that it doesn't hide that dynamic complexity. You always see when there is dynamic dispatch. There are tons of languages that introduce some formalism for these concepts; honestly, most modern imperative languages seem to. The unique selling point of C is that you see the complexity. That influences you to only use it if you really want it. Also the syntax isn't really that complicated.

davikr|6 months ago

Probably into the High C Compiler.

TickleSteve|6 months ago

Never. Do. This...

I was involved in a product with a large codebase structured like this and it was a maintainability nightmare with no upsides. Multiple attempts were made to move away from this to no avail.

Consider that the code has terrible readability due to the lack of syntactic sugar, the compiler cannot see through the pointers to optimise anything, and tooling has no clue what to do with it. On top of that, the syntax is odd and requires any newbies to effectively understand how a C++ compiler works under the hood to get anything out of it.

On top of those points, the dubious benefits of OOP make doing this a quick way to kill long-term maintainability of your project.

For the devs who come after you, don't try to turn C into a poor man's C++. If you really want to, please just use C++.

1718627440|6 months ago

Can you elaborate what exactly the maintainability nightmare was?

To me less syntactic sugar is more readable, because you see which function calls involve dynamic dispatch and which don't. Ideally it should also lead to dynamic dispatch being restricted to where it is needed.

I don't know where (might also have been LWN), but there was a post about it actually being more optimizable by the compiler, because dynamic code in C involves far fewer function pointers and the compiler can assume UB more often, because the assignments are in user code.

> requires any newbies to effectively understand how a c++ compiler

You are not supposed to reimplement a C++ compiler exactly, you are supposed to understand how OOP works and then this emerges naturally.

> don't try to turn C into a poor man's C++

It's not a poor man's C++ when it's idiomatic C.

People like me very much choose C with this usage in mind, because it's clearer and I can sprinkle dynamism where it's needed, not where the language/compiler prescribes it, and because all dynamism is visible: there is no syntactic sugar to hide it.

nphardon|6 months ago

Another cool thing about this approach is you can have the arguments to your object init be a pointer to a structure of args. Then down the line you can add features to your object without having to change all the calls to init your object throughout the code base.
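
A sketch of that init-args idea, with hypothetical `widget` names. The `border` field stands in for a feature added later; zero means "use the default", so old call sites that never mention it keep working.

```c
#include <assert.h>
#include <stddef.h>

/* Arguments for widget_init(); new fields can be appended over time. */
struct widget_args {
    int width;
    int height;
    int border;   /* added later: 0 means "use the default border" */
};

struct widget {
    int width;
    int height;
    int border;
};

static int widget_init(struct widget *w, const struct widget_args *args)
{
    assert(w != NULL && args != NULL);
    if (args->width <= 0 || args->height <= 0)
        return -1;                              /* invalid size */
    w->width  = args->width;
    w->height = args->height;
    w->border = args->border ? args->border : 1; /* default border of 1 */
    return 0;
}
```

With designated initializers, an old call such as `widget_init(&w, &(struct widget_args){ .width = 10, .height = 5 })` leaves `border` zero-initialized and picks up the default without being edited.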

unknown|6 months ago

[deleted]

MangoToupe|6 months ago

If this is the pattern you prefer, why not choose a language that caters to it? Choosing C just seems like you're TRYING to shoot yourself. I don't care how good you are at coding, this is just a bad decision.

1718627440|6 months ago

Because they like how C caters to this. This question was asked here several times, please read the answers there.

220 comments