A person smarter than me once told me: “Just use snprintf()”.
The tiny performance hit is usually worth the safety and simplicity. Only one error condition to check, covering both concatenation and copies, sensible 'size' argument behaviour (size includes null terminating byte, null terminating byte is always written)...
I came here to post exactly that; I think using snprintf() is the proper solution to the "I need to build a string out of many parts in C" problem pretty much every time.
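The pattern amounts to very little code. A sketch (the helper name and format string here are invented for illustration), where the single `n < 0 || n >= size` check covers both the encoding-error and didn't-fit cases:

```c
#include <stdio.h>

/* Build "name=value" into buf. Returns 0 on success, -1 on error or
 * truncation. One check covers copy and concatenation alike. */
int build_pair(char *buf, size_t size, const char *name, const char *value)
{
    int n = snprintf(buf, size, "%s=%s", name, value);
    if (n < 0 || (size_t)n >= size)
        return -1;   /* encoding error, or the output didn't fit */
    return 0;
}
```

The terminator is always written when `size > 0`, so even the failure path leaves a valid (truncated) string behind.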
Especially since these functions' return values make it hard to use them to step forward in the buffer, so subsequent calls need to walk over the already-inserted characters again and again. That kind of redundant work makes me grit my teeth, which I guess is a sign that I'm a C programmer at heart. :)
A strcat()-style call is given only a pointer to the base buffer, so it has to walk past the existing characters to find the end where the concatenation should start.
I also find the "size" arguments confusing and weird. I think it's idiomatic in C to (tightly) associate a size with a buffer pointer, and speaking in terms of "there are n bytes available at this location" makes sense to me. So the prototype really should have put the size right next to the destination buffer it describes.
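A sketch of such a prototype (all names hypothetical): the size rides along with the destination pointer it describes, and the return value is bounded by the destination rather than the source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical interface: "dstsize bytes are available at dst".
 * Copies with truncation, always null-terminates when dstsize > 0,
 * and returns the number of bytes written (excluding the '\0'). */
size_t str_copy(char *dst, size_t dstsize, const char *src)
{
    if (dstsize == 0)
        return 0;
    size_t len = strlen(src);
    if (len >= dstsize)
        len = dstsize - 1;   /* truncate to what fits */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```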
At least in glibc, the printf family is much slower than strcpy. It is a huge performance penalty if string copy is frequent (e.g. when you incrementally write to a large string buffer).
As a bonus, if you pass NULL for the first argument, its return value tells you how much memory it would have used, so you can allocate a buffer that's the right size. Bigger performance hit, but in practice, I found the impact negligible in a codebase that doesn't do a lot of stringy stuff but needs to emit reasonable messages to logfiles etc.
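That two-pass sizing trick looks like this in practice (helper name and message format invented for illustration):

```c
#include <stdio.h>
#include <stdlib.h>

/* First pass: snprintf(NULL, 0, ...) returns the length the result
 * would have. Second pass: format into an exactly-sized allocation.
 * Caller frees the result; NULL means error or allocation failure. */
char *fmt_message(const char *who, int code)
{
    int n = snprintf(NULL, 0, "error %d reported by %s", code, who);
    if (n < 0)
        return NULL;
    char *msg = malloc((size_t)n + 1);
    if (msg == NULL)
        return NULL;
    snprintf(msg, (size_t)n + 1, "error %d reported by %s", code, who);
    return msg;
}
```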
There have been various attempts to improve on strcpy() etc. The current thinking is that strscpy() is the best general interface to this functionality. See the "improving ... strcpy" section at
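The strscpy() contract (work bounded by the destination, result always terminated, truncation reported as a negative error) can be sketched in userspace. This is an illustration of the semantics, not the kernel's implementation, and the error constant is a stand-in:

```c
#include <stddef.h>
#include <string.h>

#define E2BIG_ERR (-7)   /* stand-in for the kernel's -E2BIG */

/* Copy up to size-1 bytes, always terminate, return the length
 * copied (excluding the '\0') or a negative error on truncation. */
long my_strscpy(char *dst, const char *src, size_t size)
{
    if (size == 0)
        return E2BIG_ERR;
    size_t len = strnlen(src, size);   /* scan bounded by dst size */
    if (len == size) {                 /* src didn't fit */
        memcpy(dst, src, size - 1);
        dst[size - 1] = '\0';
        return E2BIG_ERR;
    }
    memcpy(dst, src, len + 1);         /* includes the terminator */
    return (long)len;
}
```

Note that even the scan is bounded by the destination size, which addresses the source-bounded-performance complaint below.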
The sad part is really that there are several implementations of safe(r) string manipulation functions; they all do mostly the same thing, but they have different names and parameter orders, preventing portability across major compilers. They really need to get their shit together and finally agree on a single variant to be standardized.
One drawback of these functions is that the return value is the length of the source string. This is seldom useful: it's mostly a micro-optimization to avoid an extra string scan when using static buffers for small strings but dynamic buffers for large ones. However, it prevents these functions from being used with input that might not be null-terminated, and it causes performance to be bounded by the size of the source (which usually isn't controlled by the caller) instead of the destination (which usually is).
I've never understood the circumstances where it is ok for a copy/concatenate operation to also include a bonus "truncate the result" operation at the end. I'm a believer that you should use strcpy and strcat, and also make sure your buffers are large enough. If you can't make them large enough, then you can't do the operation you are trying to do (and I would say that doing it incorrectly via truncation isn't obviously better since this seems like silent corruption to me)
Truncating strings is a practical alternative that provides a useful signal with minimal extra error-handling code. Yes, in an ideal world there would be no string truncation, but in a lot of cases, when there is no better alternative, truncation is better than nothing.
Which of those two error messages would you rather see?
sprintf: output buffer is too small
or
fopen: cannot open file /usr/bin/superlong-dist-path/208277409874874/fi
Someone will likely be able to answer more specifically, but I seem to recall an old common operation was filling fixed width text fields; truncating a message would make sense there (not ideal, but probably familiar).
Before using any function of the strcpy family, try to think about why you are using them.
The first question is: do I need a copy? Obviously, often, you do, but maybe you can avoid it. Copying data can be costly in terms of performance, and of course, without copying there is no need to allocate memory or deal with buffer sizes.
Now that you are sure you indeed need a copy, maybe you should consider using memcpy() instead. If you already know the sizes of your strings, it's a no-brainer. If you don't, doing strlen()/strnlen() first on your input data is a good idea: if the input is too big, handle the error appropriately; if it's not, do the memcpy() now that you have the size.
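The strnlen()-then-memcpy() pattern described above, as a sketch (helper name invented):

```c
#include <stddef.h>
#include <string.h>

/* Measure the input against the destination's capacity first,
 * reject oversized input explicitly, then copy with the size in
 * hand. Returns 0 on success, -1 if the input doesn't fit. */
int copy_checked(char *dst, size_t dstsize, const char *src)
{
    size_t len = strnlen(src, dstsize);
    if (len == dstsize)          /* too big: won't fit with its '\0' */
        return -1;               /* handle the error; don't truncate */
    memcpy(dst, src, len + 1);   /* len + 1 includes the terminator */
    return 0;
}
```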
The default behavior of "safe" variants of strcpy() is to truncate the string to the maximum size. It is certainly better than a buffer overflow, but it may not be what you want, just ask your ass
istant.
The standard C functions are enough of a mess mentally for me that I try to use them only for simple things.
If I'm dealing with lots of string manipulations I usually create a struct with a count and an alloc size and a char* pointer and write init, cleanup, extend, and append functions specific for it and use that as a string type. It's less of a mental load for me to deal with reallocating an array and keeping count and alloc updated correctly once than it is knowing that character arrays are being recounted all the time and worrying about exactly where the '\0' is going to end up and if my buffer is large enough for every string I'm dealing with.
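A minimal sketch of such a type (names invented, growth policy arbitrary), with the count, allocation size, and terminator handled in one place:

```c
#include <stdlib.h>
#include <string.h>

struct str {
    char  *data;
    size_t count;   /* bytes in use, excluding the '\0' */
    size_t alloc;   /* bytes allocated */
};

int str_init(struct str *s)
{
    s->alloc = 16;
    s->count = 0;
    s->data = malloc(s->alloc);
    if (!s->data) return -1;
    s->data[0] = '\0';
    return 0;
}

int str_append(struct str *s, const char *text)
{
    size_t len = strlen(text);
    if (s->count + len + 1 > s->alloc) {
        size_t want = s->alloc;
        while (s->count + len + 1 > want)
            want *= 2;                    /* double until it fits */
        char *p = realloc(s->data, want);
        if (!p) return -1;
        s->data = p;
        s->alloc = want;
    }
    memcpy(s->data + s->count, text, len + 1);  /* moves the '\0' too */
    s->count += len;
    return 0;
}

void str_free(struct str *s)
{
    free(s->data);
    s->data = NULL;
    s->count = s->alloc = 0;
}
```

Appending is then a single call, and the '\0' bookkeeping lives in exactly one function.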
Null-terminated strings have to be up there with null-pointers as design ideas that seemed great at the time, but turned out to be extremely costly because of the insane number of bugs they produce.
I wonder what an alternative world would look like where C had decided to use pascal-strings instead (prefixing the string with its length). We would undoubtedly have argued about the length of the size prefix, but a mistake there is trivial to spot compared to wrong application of str[nls]*cpy(_s)?.
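In C terms, a Pascal-style string might look like the following sketch. The 32-bit prefix is an arbitrary choice here, exactly the kind of detail that would have been argued over; the payoff is that length queries are O(1) and embedded zero bytes are fine:

```c
#include <stdlib.h>
#include <string.h>

/* Length-prefixed string: the size travels with the bytes, so no
 * scanning for '\0' and no terminator is required. */
struct pstr {
    unsigned int len;
    char bytes[];      /* len bytes of payload */
};

struct pstr *pstr_new(const char *data, unsigned int len)
{
    struct pstr *p = malloc(sizeof *p + len);
    if (!p) return NULL;
    p->len = len;
    memcpy(p->bytes, data, len);
    return p;
}
```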
I used to program in Pascal a lot, and its counted strings were one of the worst features.
First, you cannot really "argue" about the length of the size prefix: it is part of the ABI/calling convention, and unless you have a template-like mechanism and overloading, you will have to use whatever length your stdlib was compiled with.
And Borland Pascal's stdlib was compiled with 1 byte prefix, making max string length of 255 bytes. This made sense when total RAM was measured in hundreds of kilobytes, but generated so many bugs when a normally short string had to be just a bit longer.
I was somewhat envious of my C-using friends who could put an entire help message into a single string. But not envious enough to switch to the much slower C toolchain :)
There were other reasons, too, like using fewer registers in assembly-level loops (old machines had far fewer registers than modern ones), or not having to worry about validating strings when reading binary data from disk.
That also would have made for a nice performance benefit passing to zero-copy type calls. And reduced complexity for new languages written in C implementing their own string types. Tcl was irritating when it was first invented, as it directly used C strings, and couldn't process binary data at all.
The better answer would be to add data types like SDS[0] to the standard library, and use them as ADTs (Abstract Data Types) [1].
Unfortunately, WG14 has proven in its 30 years of existence that this isn't something it cares to fix, and while 3rd-party solutions exist, without vocabulary types in the standard library, adoption will never take off.
It need not even be a performance hit. The compiler could recognize that idiom and optimize it into the equivalent direct copies (example possibly buggy; my C is rusty).
https://www.pixelbeat.org/programming/gcc/string_buffers.htm...
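As a sketch of the kind of rewrite meant (both helpers invented for illustration): a call with a trivial format string, and the bounded copies a compiler could in principle lower it to, skipping format-string interpretation entirely.

```c
#include <stdio.h>
#include <string.h>

/* A snprintf() call with a trivial "%s%s" format... */
void with_snprintf(char *buf, size_t size, const char *a, const char *b)
{
    snprintf(buf, size, "%s%s", a, b);
}

/* ...hand-lowered into bounded copies with the same truncating,
 * always-terminating behavior. */
void hand_lowered(char *buf, size_t size, const char *a, const char *b)
{
    if (size == 0)
        return;
    size_t used = strnlen(a, size - 1);   /* how much of a fits */
    memcpy(buf, a, used);
    size_t room = size - 1 - used;        /* space left for b */
    size_t blen = strnlen(b, room);
    memcpy(buf + used, b, blen);
    buf[used + blen] = '\0';
}
```

Compilers really do perform transformations of this family for trivial formats (e.g. folding printf of a plain string into puts), so the idea is not far-fetched.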
Those are not valid C strings! The string functions have undefined behavior for non-string inputs!
Yes, these "safe" functions make it harder to get memory corruption. But that's a very narrow definition of "safe".
If you tell the computer to copy a string, or append to a string, it's wrong, and not safe, to only do a partial copy or concatenation.
It's a huge footgun. Smaller than strcpy(), sure, but "safe" it is not.
I don't really have a better answer for C.
[0] - https://github.com/antirez/sds
[1] - https://en.wikipedia.org/wiki/Abstract_data_type