I am always baffled that C didn’t ever get a native string type. Strings are used in what feels like 99.99999% of the applications written. Having proper strings that don’t require fiddling with pointers on bytes would likely prevent more than 50% of security issues out there.
Without system/external libraries, C is more like easier-to-read assembly, without much on top of it. There are no strings as we understand them in assembly, only pointers to a sequential lump of RAM where a NUL character ('\0') marks the end of the string. That’s why C is so great as a language for libraries at the level where strings are only for debugging and a waste of computing time anyway.
But for some reason, instead of writing a library in C and then linking to it from some high-level language to handle the operations where strings are common, people often try to use the hammer for everything and end up with overflowing buffers or trying to make exceptions in the kernel for D-Bus.
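For anyone who wants the “pointer to a lump of RAM” picture spelled out, here is a minimal sketch in C. The helper names `count_until_nul` and `copy_checked` are made up for illustration; the first does by hand roughly what the standard `strlen` does, the second shows the bounds check that a plain `strcpy` leaves to the caller.

```c
#include <stddef.h>

/* Hypothetical helper: walks the bytes until it reaches the NUL
 * terminator ('\0'). The pointer itself carries no length. */
size_t count_until_nul(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Hypothetical helper: copies src into a buffer of dst_size bytes.
 * Returns 0 on success, -1 if src plus its terminator does not fit. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len = count_until_nul(src);
    if (len + 1 > dst_size) {
        return -1;  /* refuse instead of writing past the end of dst */
    }
    for (size_t i = 0; i <= len; i++) {  /* <= so the terminator is copied too */
        dst[i] = src[i];
    }
    return 0;
}
```

The length has to be rediscovered by scanning every time, and the size of the destination has to be passed around separately. Forgetting either is where the overflowing buffers come from.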
I know that C is meant only as a little step forward from assembler. But it was also fine to introduce arrays (which are also not a thing in asm where you only operate on pointers). So why not also add a datatype for THE ONE THING that is used almost everywhere?
There are thousands of useful things one could argue about if they would make sense in the language or not (and in the case of C I would be totally fine with saying “no” to all of them). But strings IMO are not just some fringe feature that is used here and there. They are mission critical across the board and far too important to be left for libraries.
“arrays (which are also not a thing in asm where you only operate on pointers)”
I’m afraid that’s wrong. Arrays are definitely an asm thing. An array is just a pointer to the first of a run of consecutively stored objects. You add n*size_of_stored_type to the pointer and you get the nth object.
“They are mission critical”
Do you have an example? I know that many products give up control over what is executed because that’s cheaper in money and developer time, and lean on the power of the CPU instead. So instead of securely comparing a string once and then using an enum (int) in further code, they use string comparison all the time. But that’s a design problem, not a technical one.
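The “compare once, then use an enum” idea could look something like the sketch below. The command names and both function names are invented for illustration, and it assumes the input is already a valid NUL-terminated string.

```c
#include <string.h>

/* Hypothetical command set, for illustration only. */
enum command { CMD_UNKNOWN = 0, CMD_START, CMD_STOP, CMD_STATUS };

/* The only place that touches the string: the boundary of the program. */
enum command parse_command(const char *text)
{
    if (strcmp(text, "start") == 0)  return CMD_START;
    if (strcmp(text, "stop") == 0)   return CMD_STOP;
    if (strcmp(text, "status") == 0) return CMD_STATUS;
    return CMD_UNKNOWN;
}

/* Everything past the boundary switches on the enum, not on text. */
int command_exit_code(enum command cmd)
{
    switch (cmd) {
    case CMD_START:
    case CMD_STOP:
    case CMD_STATUS:
        return 0;   /* known command */
    case CMD_UNKNOWN:
    default:
        return 1;   /* reject anything the parser did not recognise */
    }
}
```

After `parse_command` has run, the inner code never sees a `char *` again, so there is nothing left to overflow or mis-compare in the hot path.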
Basically every program that deals with some form of user input will come across strings. Be it to print something to the screen, write something to a file, read something from a file, read something from the user interface (even if it’s stdin). Even most non-user-facing tools (daemons, drivers, etc) have to deal with strings often enough, even if “just” for something like writing log or debug entries.
For me it’s hard to come up with any application where I don’t need strings sooner or later. Typically sooner than later.
But this is high level. You shouldn’t rely on strings or user input down in the mission-critical part of the program.
Do you separate that? I mean if the idea is to use C only outside of user interaction, then maybe. But is this a realistic scenario? If I write my whole application/library in C, user interaction is part of the application nonetheless. Maybe not what you consider “mission critical” from a program-reliability standpoint. But still mission critical from a user-experience standpoint. Because the whole application is worthless, if it cannot be used.
That’s my point. A human-facing interface needs a lot of code that does not really do much; it only needs to be there to cover all the edge cases: mixed parameters, cancel buttons, trying to click “next” without filling in an important textbox… And writing all this in C (I mean the actual user-end program interface, not a general GUI library like GTK) only makes it harder to debug and maintain. Most of the time you gain nothing from manual memory management. If an operation is taking too long, maybe it’s time to move it into the backend library. But if you’re optimizing that operation, you’ve already moved away from comparing strings inside it - that’s the first thing to go when a loop takes too long. And once we are talking about more than one program, where we want consistent behavior that might need to change in the future, C is only slowing you down.
Do you really need to reference the “Cancel” button via pointer when checking if the user should be allowed to go back?
Write a general backend library for your important stuff and optimizations in C, so you can easily load it from other languages. And then use something higher level for the interpreter/GUI, where sanitizing user input is 5 lines using the language’s own libraries (I mean like re or zip in Python - these are not external, they ship with Python itself), instead of 50 lines of juggling pointers, which in C you would be doing even if the input was all ints.
You don’t care about stack height and jumping to the previous frame after being in a procedure (the assembler-level way of looking at the code) - that’s what C does for you.
So why care about the pointers and structs when resizing a GUI? Let some higher-level language manage that for you.
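As a rough idea of what the C side of such a split could look like, here is a sketch of a backend entry point that takes only plain C types (pointer, length, out-parameter) and no strings. The name `backend_max` and its error convention are assumptions made up for this example, not any real library’s API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical backend function: finds the largest of n 32-bit values.
 * Returns 0 and stores the result in *out on success,
 * -1 on invalid arguments (NULL pointers or an empty array). */
int backend_max(const int32_t *values, size_t n, int32_t *out)
{
    if (values == NULL || out == NULL || n == 0) {
        return -1;
    }
    int32_t best = values[0];
    for (size_t i = 1; i < n; i++) {
        if (values[i] > best) {
            best = values[i];
        }
    }
    *out = best;
    return 0;
}
```

Built as a shared library, a function with this shape can be called through the foreign-function interface most high-level languages have (Python’s `ctypes` module, for example), while parsing and validating whatever the user typed stays on the high-level side.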
You can just write your own implementation, it’s not too complex, or just use a string library.
I can also write my own programming language. That wasn’t the point.
This is part of the problem. Instead of solid primitives you have to implement them yourself or pull in a library, both of which you have to hope are compatible with other libraries (or you have to convert manually all the time).
How many people who write their own string implementation do you think do so perfectly? I’d guess at most 50%. This means that basic operations in a good number of apps will have unknown bugs. Fixing bugs in application logic is one thing, but having to debug low-level type implementations is not something the average developer should do.
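To show how many places there are to slip up, here is a minimal sketch of the append operation for a hand-rolled length-tracking string. The type `struct str` and the functions `str_append` and `str_free` are invented for this example; it assumes the struct starts zeroed and that `src` does not point into the string’s own buffer.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string type; start from {NULL, 0, 0}. */
struct str {
    char  *data;  /* NUL-terminated after the first successful append */
    size_t len;   /* bytes in use, not counting the terminator */
    size_t cap;   /* bytes allocated, including room for the terminator */
};

/* Appends n bytes from src. Returns 0 on success, -1 on failure,
 * in which case the string is left unchanged. */
int str_append(struct str *s, const char *src, size_t n)
{
    /* Pitfall 1: len + n + 1 can wrap around for a huge n. */
    if (n > SIZE_MAX - s->len - 1) {
        return -1;
    }
    size_t needed = s->len + n + 1;
    if (needed > s->cap) {
        size_t new_cap = s->cap ? s->cap : 16;
        while (new_cap < needed) {
            /* Pitfall 2: doubling the capacity can overflow as well. */
            if (new_cap > SIZE_MAX / 2) {
                new_cap = needed;
                break;
            }
            new_cap *= 2;
        }
        /* Pitfall 3: keep the old pointer until realloc has succeeded. */
        char *p = realloc(s->data, new_cap);
        if (p == NULL) {
            return -1;
        }
        s->data = p;
        s->cap = new_cap;
    }
    /* Pitfall 4: skip memcpy for n == 0, where src may be NULL. */
    if (n > 0) {
        memcpy(s->data + s->len, src, n);
    }
    s->len += n;
    s->data[s->len] = '\0';  /* Pitfall 5: forgetting the terminator. */
    return 0;
}

void str_free(struct str *s)
{
    free(s->data);
    s->data = NULL;
    s->len = 0;
    s->cap = 0;
}
```

That is one operation, and it already has five spots where a plausible-looking version is subtly wrong. Comparison, slicing, formatting and encoding handling each add more.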
If you don’t want to do low-level programming, why use C in the first place? The whole point of using C is that you can fiddle with pointers to have absolute control. Rust and Go are great alternatives that have built-in strings.
But why does that mean C can’t implement a native string type?
Why implement floats instead of making people do it themselves?
Floats are implemented by the instruction set on most hardware, so the language has no control over those unless you’re programming a microcontroller like an ATmega328P, in which case floating point has to be implemented in software. As for why C has no built-in support for strings, it’s mostly down to C programmers hating change. Most hardcore C programmers are still using C89 (and the majority C99), and you can’t change old standards. C doesn’t need more features, it needs fewer. I am a big fan of removing the C-style for loop, like Zig did, to make the language simpler. That way it can maintain its minimalism. The more minimalistic the language, the easier it is to write compilers.
Modern hardware also has specific instructions to speed up C string operations for the common ways they are implemented. We rely on compiler optimisation for those as well. Why not do the same for floats?
Because the language already supports it. It’s not a question of what modern hardware can do, just of backwards compatibility and not changing the language too much. There would be no point in adding these features, because if you want them you can just use modern C++. There is no need for two identical languages occupying the same niche.
I’m not arguing for C implementing classes and all the other C++ features, only for a basic data type used in most programs. Backwards compatibility is also a pretty poor argument, considering that a new version of C comes out every several years with new features, already breaking backwards compatibility. Why is this specific change too much?