I am getting confused with size_t in C. I know that it is returned by the sizeof operator. But what exactly is it? Is it a data type?
Let's say I have a for loop:
for(i = 0; i < some_size; i++)
Should I use int i; or size_t i;?
Reposted from: https://stackoverflow.com/questions/2550774/what-is-size-t-in-c
According to the 1999 ISO C standard (C99), size_t is an unsigned integer type of at least 16 bits (see sections 7.17 and 7.18.3).
size_t is an unsigned data type defined by several C/C++ standards, e.g. the C99 ISO/IEC 9899 standard, that is defined in stddef.h. It is also made available by including stdlib.h, which in many implementations internally includes stddef.h. This type is used to represent the size of an object. Library functions that take or return sizes expect them to be of type size_t, or have size_t as their return type. Further, the most frequently used compiler-based operator, sizeof, evaluates to a value of type size_t.
As an implication, size_t is a type guaranteed to hold any array index.
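To make the points above concrete, here is a minimal sketch of a loop that indexes with size_t and takes its bound from strlen(). The helper names count_char and report are made up for illustration; only strlen, printf and the %zu conversion come from the standard library.

```c
#include <stddef.h>   /* size_t */
#include <stdio.h>    /* printf */
#include <string.h>   /* strlen */

/* Hypothetical helper: count how often c occurs in text. */
size_t count_char(const char *text, char c)
{
    size_t len = strlen(text);          /* strlen returns a size_t */
    size_t hits = 0;

    /* The index has the same type as the bound, so there is no
       signed/unsigned comparison and no narrowing. */
    for (size_t i = 0; i < len; i++) {
        if (text[i] == c)
            hits++;
    }
    return hits;
}

/* Hypothetical helper: size_t values are printed with %zu (C99). */
void report(const char *text, char c)
{
    printf("'%c' occurs %zu times in %zu bytes\n",
           c, count_char(text, c), strlen(text));
}
```

Calling report("banana", 'a') from main would print the count and the length without any casts.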
The manpage for sys/types.h says:
size_t shall be an unsigned integer type
size_t is an unsigned type. So, it cannot represent any negative values (< 0). You use it when you are counting something and are sure that it cannot be negative. For example, strlen() returns a size_t because the length of a string has to be at least 0.
In your example, if your loop index is never going to be negative, it might make sense to use size_t, or any other unsigned data type.
When you use a size_t object, you have to make sure that in all the contexts it is used, including arithmetic, you want non-negative values. For example, let's say you have:
size_t s1 = strlen(str1);
size_t s2 = strlen(str2);
and you want to find the difference of the lengths of str2 and str1. You cannot do:
int diff = s2 - s1; /* bad */
This is because the subtraction s2 - s1 is done with unsigned types, so its result is never negative: when s2 < s1 it wraps around to a very large value, and converting that value to int is implementation-defined. In this case, depending upon what your use case is, you might be better off using int (or long long) for s1 and s2.
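If you would rather keep the lengths as size_t, one option is to compare before subtracting. The following is only a sketch; wrapped_difference, length_gap and compare_lengths are hypothetical helper names, not library functions.

```c
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* SIZE_MAX */

/* What the plain subtraction does: the result wraps modulo SIZE_MAX + 1,
   so wrapped_difference(2, 5) is SIZE_MAX - 2, not -3. */
size_t wrapped_difference(size_t a, size_t b)
{
    return a - b;
}

/* Hypothetical helper: distance between two lengths, never wraps
   because the smaller value is always subtracted from the larger. */
size_t length_gap(size_t a, size_t b)
{
    return a > b ? a - b : b - a;
}

/* Hypothetical helper: -1, 0 or 1 depending on how a compares to b,
   computed without any subtraction of the lengths. */
int compare_lengths(size_t a, size_t b)
{
    return (a > b) - (a < b);
}
```

With s1 = 5 and s2 = 2, length_gap(s2, s1) gives 3 and compare_lengths(s2, s1) gives -1, which together carry the same information a signed difference would.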
There are some functions in C/POSIX that could/should use size_t, but don't because of historical reasons. For example, the second parameter to fgets should ideally be size_t, but is int.
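In practice this means a buffer size held in a size_t has to be narrowed before it is passed to fgets. A rough sketch of one way to do that, assuming the usual case where SIZE_MAX is larger than INT_MAX; size_for_fgets and read_line are hypothetical names, not standard functions:

```c
#include <limits.h>   /* INT_MAX */
#include <stddef.h>   /* size_t */
#include <stdio.h>    /* fgets, FILE */

/* Hypothetical helper: narrow a buffer size to the int that fgets
   expects, clamping instead of letting the conversion overflow. */
int size_for_fgets(size_t size)
{
    return size > (size_t)INT_MAX ? INT_MAX : (int)size;
}

/* Hypothetical wrapper: like fgets, but takes the buffer size as size_t. */
char *read_line(char *buf, size_t size, FILE *stream)
{
    if (size == 0)
        return NULL;                      /* nothing can be stored */
    return fgets(buf, size_for_fgets(size), stream);
}
```

A caller can then write read_line(buf, sizeof buf, stdin) without a cast at every call site.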
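From my understanding, size_t is an unsigned integer that, on most mainstream platforms, is large enough to hold a pointer of the native architecture.
So, in practice:
sizeof(size_t) >= sizeof(void*)
The C standard does not guarantee this relation, though; the type meant for holding a converted pointer is uintptr_t.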
size_t and int are not interchangeable. For instance, on 64-bit Linux size_t is 64 bits in size (i.e. sizeof(void*)) but int is 32 bits.
Also note that size_t is unsigned. If you need a signed version, there is ssize_t on some platforms (it comes from POSIX, not from the C standard), and it would be more relevant to your example.
As a general rule I would suggest using int for most general cases and only use size_t/ssize_t when there is a specific need for it (with mmap() for example).
In general, if you are starting at 0 and going upward, use an unsigned type so that an overflow cannot take the counter into negative values. This matters when the loop bound is larger than the maximum value of the counter's type: a signed int that is incremented past INT_MAX overflows, which is undefined behavior in C and in practice often yields a negative value, so using it as an array index may cause a segmentation fault (SIGSEGV). An unsigned counter wraps around to 0 instead, which is well defined but still wrong, so the counter also has to be wide enough for the bound. So, in general, avoid int for a loop that starts at 0 and indexes upwards. Use an unsigned type such as size_t.
size_t is a type that can hold any array index.
Depending on the implementation, it can be any of:
unsigned char
unsigned short
unsigned int
unsigned long
unsigned long long
Here's how size_t is defined in stddef.h of my machine:
typedef unsigned long size_t;
If you are the empirical type,
echo | gcc -E -xc -include 'stddef.h' - | grep size_t
Output for Ubuntu 14.04 64-bit GCC 4.8:
typedef long unsigned int size_t;
Note that stddef.h is provided by GCC and not by glibc; in GCC 4.2 it lives under src/gcc/ginclude/stddef.h.
Interesting C99 appearances
malloc takes size_t as an argument, so it determines the maximum size that may be allocated.
And since it is also returned by sizeof, I think it limits the maximum size of any array.
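One practical consequence is that a byte count computed as count * elem_size can itself exceed what size_t can hold, and then it silently wraps before malloc ever sees it. Here is a small sketch of a guard against that; alloc_array is a hypothetical helper name, and SIZE_MAX comes from stdint.h.

```c
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* malloc */

/* Hypothetical helper: allocate room for count elements of elem_size
   bytes each, or return NULL if the total would not fit in a size_t. */
void *alloc_array(size_t count, size_t elem_size)
{
    /* count * elem_size would wrap around if count exceeds this quotient. */
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return NULL;
    return malloc(count * elem_size);
}
```

The caller still has to check the result for NULL and free it, exactly as with malloc.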
Since nobody has yet mentioned it, the primary linguistic significance of size_t is that the sizeof operator yields a value of that type. Likewise, the primary significance of ptrdiff_t is that subtracting one pointer from another will yield a value of that type. Library functions that accept size_t do so because it allows them to work with objects whose size exceeds UINT_MAX on systems where such objects could exist, without forcing callers to waste code passing a value larger than unsigned int on systems where unsigned int would suffice for all possible objects.
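A short sketch of the ptrdiff_t side of this: the function below walks an array with a pointer and returns the position it found as a pointer difference. find_index is a hypothetical name chosen for the example.

```c
#include <stddef.h>   /* size_t, ptrdiff_t */

/* Hypothetical helper: position of the first element equal to value
   in arr[0..n), or -1 if there is none. */
ptrdiff_t find_index(const int *arr, size_t n, int value)
{
    for (const int *p = arr; p != arr + n; ++p) {
        if (*p == value)
            return p - arr;   /* pointer subtraction has type ptrdiff_t */
    }
    return -1;                /* ptrdiff_t is signed, so -1 is representable */
}
```

Because ptrdiff_t is signed it can carry the -1 sentinel, which a size_t return type could not do directly.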
size_t is an unsigned integer data type. On systems using the GNU C Library, this will be unsigned int or unsigned long int. size_t is commonly used for array indexing and loop counting.
size_t or any other unsigned type is often used for loop variables, since loop variables are typically greater than or equal to 0.
When we use a size_t object, we have to make sure that in all the contexts it is used, including arithmetic, we want only non-negative values. For instance, the following program gives an unexpected result:
// C program to demonstrate that size_t or
// any unsigned int type should be used
// carefully when used in a loop
#include <stdio.h>
int main(void)
{
    const size_t N = 10;
    int a[N];
    // This is fine
    for (size_t n = 0; n < N; ++n)
        a[n] = (int)n;
    // But reverse cycles are tricky for unsigned
    // types as they can lead to an infinite loop:
    // n >= 0 is always true for a size_t
    for (size_t n = N-1; n >= 0; --n)
        printf("%d ", a[n]);
}
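Output
Infinite loop (n wraps around to a huge value instead of going below 0), typically followed by a segmentation fault from the out-of-bounds read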
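A common way to count down with an unsigned index is to test and decrement in the loop condition, so the index is never compared against a negative value. This is just one possible idiom, sketched with a hypothetical helper named reverse_copy:

```c
#include <stddef.h>   /* size_t */

/* Hypothetical helper: copy src[0..n) into dst in reverse order. */
void reverse_copy(int *dst, const int *src, size_t n)
{
    size_t out = 0;

    /* i-- > 0 compares the old value of i with 0 and then decrements,
       so the body sees n-1, n-2, ..., 0 and the loop stops cleanly.
       It also does nothing when n is 0. */
    for (size_t i = n; i-- > 0; )
        dst[out++] = src[i];
}
```

The same loop header, for (size_t n = N; n-- > 0; ), could replace the second loop in the program above to print the array backwards.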