这是什么? getproccount

What is going on in this code? From the name and the context it's finding the number of cores on the machine, but how does it work? What's all that bit fiddling for?

static int32
getproccount(void)
{
        uintptr buf[16], t;
        int32 r, cnt, i;

        cnt = 0;
        r = runtime·sched_getaffinity(0, sizeof(buf), buf);
        if(r > 0)
        for(i = 0; i < r/sizeof(buf[0]); i++) {
                t = buf[i];
                t = t - ((t >> 1) & 0x5555555555555555ULL);
                t = (t & 0x3333333333333333ULL) + ((t >> 2) & 0x3333333333333333ULL);
                cnt += (int32)((((t + (t >> 4)) & 0xF0F0F0F0F0F0F0FULL) * 0x101010101010101ULL) >> 56);
        }

        return cnt ? cnt : 1;
}

Note: ignore the · in runtime·sched_getaffinity, think of that line as just an arbitrary library/system call that does what the name and arguments imply. (In this case this specific call comes from the old pre-Go1.4 runtime written in a slight variation of C that deals with ·).

The for loop runs over as many elements that get filled into the buf array. For each of those elements, it calculates the number of bits that are set in that element (the bit fiddling with t does just that) and that is added into cnt. At the end cnt is returned (or 1 if cnt is 0).

Explanation for the bit fiddling:

Bit fiddling line 1

The line t = t - ((t >> 1) & 0x5555555555555555ULL); basically groups off the bits of t into 2 bits and replaces each group with the count of the number of set bits in that group. This works like follows:

Consider t = ... w x y z where w,x,y,z are individual bits. Then

t                 = ... w x y z
t>>1              = ..... w x y
t>>1 & 0x...55ULL = ... 0 w 0 y

From above, it is clear to see why the grouping happens into 2 bits (for example, y and z get grouped together here). Now t - ((t >> 1) & 0x5555555555555555ULL) will have each group of 2 bits y z transformed to (y-z). Using a table of the 4 possibilities (00, 01, 10, 11), we see that the answers are (00, 01, 01, 10) which matches with the number of bits set in that group of 2 bits.

Bit fiddling line 2

Moving on to the next bit fiddling line t = (t & 0x3333333333333333ULL) + ((t >> 2) & 0x3333333333333333ULL);, we can see that it is adding up adjacent groups of 2 in groups of 2.

t & 0x..33ULL takes the rightmost 2 bits of each group of 4 bits.
(t>>2) & 0x..33ULL takes the leftmost 2 bits of each group of 4 bits and shifts them right by 2.
Since these two groups of 2 bits were the number of bits set in the original number, it has added up the number of bits set in every group of 4 bits. (i.e. now, each group of 4 bits has the number of bits that were set originally in those positions)

Bit fiddling line 3

As for the last bit fiddling line cnt += (int32)((((t + (t >> 4)) & 0xF0F0F0F0F0F0F0FULL) * 0x101010101010101ULL) >> 56);, we can break it down into a few parts for easier understanding.

(int32)(
  (
    (
      (
        t + (t >> 4)
      ) & 0xF0F0F0F0F0F0F0FULL
    ) * 0x101010101010101ULL
  ) >> 56
)

Currently, each group of 4 bits stores the number of bits that were set originally in the number. Shifting the number over by 4 and adding would add all adjecent groups together. &ing with 0x..0F0FULL picks out the right 4 bits of each group of 8 bits. Hence, at the end of (t + (t >> 4)) & 0x...0F0FULL, there are groups of 8 bits which contain the number of bits that were there in those locations in the original number. Let's just call this number s = (t + (t >> 4)) & 0x...0F0FULL for simplicity.

We now have to do (s * 0x...0101ULL) >> 56. Notice that t and thus s are 64 bits in size. This means that there are 8 groups of 8 bits. By simple multiplication with 0x...0101ULL (which is just the rightmost bit turned on for each group), the leftmost group will now be the sum of all the groups. By shifting right by 56 (i.e. (64 - 8)), we move this leftmost group into the rigthmost position. This means that the rightmost group of 8 bits now has the sum of all the groups of s. But s's groups were the number of set bits in those locations in the number, therefore, after the >>56 operation, we have the total number of set bits in the original number. The (int32) is just typecasting to a lower size datatype (which is actually enough to store this number.

Note: A bit being set means that the bit is equal to 1.