I want to write a function that returns the nearest next power of 2 number. For example if my input is 789, the output should be 1024. Is there any way of achieving this without using any loops but just using some bitwise operators?
-
1They're multiple questions matching this one. For example: http://stackoverflow.com/questions/364985/algorithm-for-finding-the-smallest-power-of-two-thats-greater-or-equal-to-a-giv – Yann Droneaud Mar 11 '13 at 11:49
-
4By way of clarification, do you need the nearest power of 2 (ie. 65 would give you 64, but 100 would give you 128) or the nearest above (ie. 65 would give you 128, and so would 100)? – Kim Reece Jan 21 '09 at 17:45
-
5See here for possible solutions: [http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2Float](http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2Float) – Stefan Jan 21 '09 at 17:31
-
1possible duplicate of [Given an integer, how do I find the next largest power of two using bit-twiddling?](http://stackoverflow.com/questions/1322510/given-an-integer-how-do-i-find-the-next-largest-power-of-two-using-bit-twiddlin) – Nathan Jan 16 '14 at 00:44
-
7@Nathan Your link is 8 months *later* than this question. – Joseph Quinsey Jan 16 '14 at 05:11
-
Or convert to Rust and use https://doc.rust-lang.org/stable/std/primitive.usize.html#method.next_power_of_two ;-) (Supposed to be in Hackers Delight section 3.2 too) – JGFMK Aug 31 '20 at 18:49
-
@Nathan's linked thread is indeed posted later than the one here, but John Feminella's [answer](https://stackoverflow.com/a/1322548/3873510) in that thread is superb. Readers may want to take a look. – Paul Razvan Berg Mar 25 '21 at 19:43
-
Heh. All the answers and comments here (and all the answers and comments at https://stackoverflow.com/questions/1322510/given-an-integer-how-do-i-find-the-next-largest-power-of-two-using-bit-twiddlin ) make the potentially wrong assumption that the integer is unsigned and provide example code that is very broken for negative integers. – Brendan Jul 05 '21 at 01:00
31 Answers
Check the Bit Twiddling Hacks. You need to get the base 2 logarithm, then add 1 to that. Example for a 32-bit value:
Round up to the next highest power of 2
unsigned int v; // compute the next highest power of 2 of 32-bit v v--; v |= v >> 1; v |= v >> 2; v |= v >> 4; v |= v >> 8; v |= v >> 16; v++;
The extension to other widths should be obvious.
- 25,191
- 47
- 61
- 93
- 13,648
- 6
- 42
- 47
-
20This is not the most efficient solution because many processors have special instruction for counting leading zeros which can be used to compute log2 very efficiently. See https://en.wikipedia.org/wiki/Find_first_set – Simon Oct 04 '13 at 21:57
-
13@Simon: it's the portable solution. There's no common efficient algorithm for all architectures – phuclv Dec 06 '13 at 09:27
-
9
-
@Litherum did you read the codes at bit twidding hacks? The power-of-2 case has already treated specifically – phuclv Dec 29 '13 at 01:42
-
1@rishta the `^` operator in C isn't power. And that's the slowest solution – phuclv Jun 23 '14 at 04:04
-
1Internet Archive to the rescue: https://web.archive.org/web/20160703165415/https://graphics.stanford.edu/~seander/bithacks.html This is still a lazy answer so I downvoted. – Ray Jul 04 '16 at 17:31
-
Why there is 5 times `v |= v >>` repeated? I tried it with 2 time and it also works, eg:`v--; console.log(v); v |= v >> 1; console.log(v); v |= v >> 2; console.log(v); v++;` Using latest Chrome on Debian 9 64bit – Marecky Jan 02 '18 at 00:40
-
-
16This thread is still well referenced but this answer (and most others) are highly outdated. CPUs have an instruction to help this (actually already at that time?). From : https://jameshfisher.com/2018/03/30/round-up-power-2.html `uint64_t next_pow2(uint64_t x) { return x == 1 ? 1 : 1< – MappaM Mar 13 '19 at 11:13
-
5@MappaM This answer is still highly relevant and the best portable way to do it. Your 64-bit version has undefined behaviour if `x > UINT32_MAX` and isn't branchless. Also, GCC and Clang use `-mtune=generic` by default (as do most distros), so your code will NOT expand to the `lzcnt` instruction on x86_64 -- it'll actually expand to something MUCH slower (a libgcc routine) unless you use something like `-march=native`. So your proposed replacement is non-portable, buggy and (typically) slower. – Craig Barnes Jul 07 '19 at 14:34
-
1I'm not saying this is a full and definitive answer, or I would not have answered in comments. But a consideration for speed would be nice. On ubuntu, it works fine with default. Note the _clzl and _clz so it works for >UINT32_MAX too. – MappaM Jul 09 '19 at 09:11
-
2@MappaM that sounds like a case of "works on my machine". For me, `next_pow2(1ULL << 34)` returns 0 and the reason has nothing to do with `_clzl` vs. `_clz`. While your function is trivial to fix, the point I'm trying to make is simple -- if you're going to call other people's code "highly outdated", you should first make sure your own code is tested and working and not just a buggy copypasta. – Craig Barnes Dec 01 '19 at 09:13
-
-
@florin I feel like it would be more accurate to quote Bit Twiddling Hacks here and say that what needs to be found is the "result that may be expressed by the formula 1U << (lg(v - 1) + 1)", which is pow(2, (lg(v - 1) + 1). Because the snippet in the answer doesn't really compute base 2 logarithm, [this does](http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious). – haykoandri Apr 13 '20 at 19:07
-
Waity-minty, what if `v` is unity? I trow this solution needs a guard: `if( v == 1 ) return 2;` – Ant_222 Jul 21 '20 at 23:13
-
@Ant_222 but 1 is a power of 2, it's just the zeroth power, i.e. `pow(2, 0) == 1`. special casing this isn't correct in the general case, even if it might be useful for your use case – Sam Mason Aug 20 '20 at 07:34
next = pow(2, ceil(log(x)/log(2)));
This works by finding the number you'd have raise 2 by to get x (take the log of the number, and divide by the log of the desired base, see wikipedia for more). Then round that up with ceil to get the nearest whole number power.
This is a more general purpose (i.e. slower!) method than the bitwise methods linked elsewhere, but good to know the maths, eh?
- 48,699
- 12
- 68
- 94
- 287,944
- 49
- 307
- 343
-
4From C99, you can also just use [log2](http://www.cprogramming.com/fod/log2.html) if supported by your tools. GCC and VS don't seem to :( – Matthew Read Jan 22 '12 at 05:49
-
2You're missing a bracket... next=pow(2, ceil(log(x)/log(2))); – Matthieu Cormier Aug 17 '12 at 17:10
-
17Be careful about float accuracy, though. `log(pow(2,29))/log(2)` = 29.000000000000004, so the result is 2**30 instead of returning 2**29. I think this is why log2 functions exist? – endolith Dec 09 '13 at 02:52
-
1
-
68The cost of this is probably at least 200 cycles and it's not even correct. Why does this have so many upvotes? – Axel Gneiting Aug 18 '15 at 20:17
-
4@SuperflyJon But it mentions bitwise operators and I assume correctness is implied by any question unless noted otherwise. – BlackJack May 02 '17 at 16:02
-
3Please do update the answer & use log2(). I made a mistake by not reading the comments and ended up with a great loss in a CP contest. :( – Abhishek Apr 12 '20 at 16:15
-
3
-
The main problem with this is say you're using a 64-bit integer value for x and ceil/log2 use a 64-bit double then there is no way to calculate for all values of x. – CR. Aug 31 '21 at 18:16
I think this works, too:
int power = 1;
while(power < x)
power*=2;
And the answer is power.
- 24,315
- 10
- 44
- 52
- 733
- 5
- 2
-
29Fair enough the question asked for no loops. But as clever as some of the other functions are, for code that is not performance sensitive the answer that is quickly and easily understood and verified to be correct always wins for me. – Tim MB Jan 17 '13 at 12:44
-
2This is not returning the nearest power of 2, butthe power of that is immediatly bigger than X. Still very good – CoffeDeveloper Jun 09 '13 at 16:34
-
3Instead of multiplying, some bitwise "magic" can be used instead `power <<= 1` – vallentin Jul 30 '16 at 08:01
-
6
-
8Beware of infinite loop if `x` is too large (i.e. not enough bits to represent the next power of 2). – alban Mar 22 '19 at 14:01
-
@MarkWeston https://godbolt.org/z/ajvGsTs5b Clang trunk ARM 64 bit does not appear to optimize the loop at the moment. – nmr Dec 15 '21 at 23:15
unsigned long upper_power_of_two(unsigned long v)
{
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
v++;
return v;
}
- 43,933
- 19
- 112
- 169
-
72Would be nice if you have attributed it (unless you discovered it). It comes from the bit twiddling hacks page. – florin Jan 21 '09 at 17:47
-
3
-
Jonathan, you need to do it for the upper half, and if that is zero, you do it for the lower half. – florin Jan 21 '09 at 18:03
-
5@florin, if v is a 64-bit type, couldn't you just add a "c |= v >> 32" after the one for 16? – Evan Teran Jan 21 '09 at 18:18
-
-
Becareful when using signed int. Values > 0x4000_0000_0000_0000 will return 0x8000_0000_0000_0000. – Nathan Oct 18 '12 at 23:25
-
Input values > 0x8000_0000_0000_0000 will return 0. Input of 0 returns 0. – Nathan Oct 18 '12 at 23:26
-
@Nathan `printf("%lx\n", upper_power_of_two(0x8000000000000000));` --> `8000000000000000` when `upper_power_of_two()` uses 64-bit `unsigned long`. – chux - Reinstate Monica Oct 26 '15 at 17:29
-
I found some application in JDK ArrayDeque, but it doesn't have the first statement `v--` I tested a few cases, seems okay, not sure what's the difference – zinking Aug 30 '16 at 08:21
-
@zinking I know this is an old comment, but I was curious, and after trying it it seems that without the decrement if the input is power of two, then the output is double the input, whereas with the decrement, a power of two is returned unchanged. – Ryan1729 Apr 18 '18 at 03:43
-
4Code that only works for a specific bit width should be using fixed-width types instead of minimum-width types. This function should take and return a `uint32_t`. – Craig Barnes Sep 21 '18 at 21:38
If you're using GCC, you might want to have a look at Optimizing the next_pow2() function by Lockless Inc.. This page describes a way to use built-in function builtin_clz() (count leading zero) and later use directly x86 (ia32) assembler instruction bsr (bit scan reverse), just like it's described in another answer's link to gamedev site. This code might be faster than those described in previous answer.
By the way, if you're not going to use assembler instruction and 64bit data type, you can use this
/**
* return the smallest power of two value
* greater than x
*
* Input range: [2..2147483648]
* Output range: [2..2147483648]
*
*/
__attribute__ ((const))
static inline uint32_t p2(uint32_t x)
{
#if 0
assert(x > 1);
assert(x <= ((UINT32_MAX/2) + 1));
#endif
return 1 << (32 - __builtin_clz (x - 1));
}
- 1
- 1
- 5,057
- 1
- 22
- 39
-
3Note that this returns the smallest power of 2 greater than OR equal to x. Changing (x -1) to x changes the function to return the smaller power of 2 greater than x. – Guillaume Jul 28 '13 at 18:50
-
3
-
-
@MarkP `__builtin_ctz()` won't be useful to round any non power of 2 number up to the next power of two – Yann Droneaud Apr 01 '16 at 12:42
-
2Please add in your answer a link to Wikipedia list of builtin bitwise functions for other compilers: https://en.wikipedia.org/wiki/Find_first_set#Tool_and_library_support Please provide also a 64-bits version. I propose the following C++11 function: `constexpr uint64_t nextPowerOfTwo64 (uint64_t x) { return 1ULL< – oHo Apr 09 '17 at 02:50
-
8 bits per byte is usually fine but `CHAR_BIT` is more accurate. – Gareth A. Lloyd Jun 12 '18 at 12:35
-
-
@oHo: If you're using C++, C++20 provides [`std::countl_zero`](https://en.cppreference.com/w/cpp/numeric/countl_zero) = CLZ and `std::bit_width` (index of MSB + 1 = x86 BSR+1) to finally provide a portable way for the language to expose something that most CPUs have been providing for years. In fact, C++20 even has [`std::bit_ceil`](https://en.cppreference.com/w/cpp/numeric/bit_ceil) which is the answer to this whole question. But this is a C question, and ISO C is still stuck in the dark ages in this regard, necessitating GNU extensions or other intrinsics to do it efficiently. – Peter Cordes Mar 03 '22 at 03:07
-
@oHo: Also if you're using C++, use `std::numeric_limits` to get the number of bits in a type. `uint64_t` is specifically guaranteed to have 64, but it's an optional type. CHAR_BIT shouldn't be assumed to be 8 if you're going for max portability, e.g. to DSPs, but `std::numeric_limits
::digits` makes it easy. – Peter Cordes Mar 03 '22 at 03:09
One more, although I use cycle, but thi is much faster than math operands
power of two "floor" option:
int power = 1;
while (x >>= 1) power <<= 1;
power of two "ceil" option:
int power = 2;
x--; // <<-- UPDATED
while (x >>= 1) power <<= 1;
UPDATE
As mentioned in comments there was mistake in ceil where its result was wrong.
Here are full functions:
unsigned power_floor(unsigned x) {
int power = 1;
while (x >>= 1) power <<= 1;
return power;
}
unsigned power_ceil(unsigned x) {
if (x <= 1) return 1;
int power = 2;
x--;
while (x >>= 1) power <<= 1;
return power;
}
- 2,333
- 3
- 31
- 32
-
3the result is not correct if `x` is power of 2. A micro to test if input is power of 2 is needed. `#define ISPOW2(x) ((x) > 0 && !((x) & (x-1)))` – pgplus1628 Oct 28 '15 at 13:05
-
@zorksylar more efficiently would be to `if (x == 0) return 1; /* Or 0 (Which is what I use) */ x--; /* Rest of program */` – yyny Mar 03 '16 at 16:45
-
Good solution! but the `power of two "ceil" option` is not correct. For example, when `x = 2` the result should be `2` instead of `4` – MZD Aug 18 '18 at 12:02
For any unsigned type, building on the Bit Twiddling Hacks:
#include <climits>
#include <type_traits>
template <typename UnsignedType>
UnsignedType round_up_to_power_of_2(UnsignedType v) {
static_assert(std::is_unsigned<UnsignedType>::value, "Only works for unsigned types");
v--;
for (size_t i = 1; i < sizeof(v) * CHAR_BIT; i *= 2) //Prefer size_t "Warning comparison between signed and unsigned integer"
{
v |= v >> i;
}
return ++v;
}
There isn't really a loop there as the compiler knows at compile time the number of iterations.
- 6,045
- 8
- 40
- 79
- 363
- 3
- 7
-
4
-
1@martinkunev Just replace UnsignedType and process it manually. I am pretty sure a C programmer can expand this simple template ignoring the `std::is_unsigned
::value` assertion. – user877329 Jun 01 '17 at 17:53 -
3@user877329 Sure, it would be nice to have an answer in Javascript as well, just in case somebody wants to translate that to C. – martinkunev Jun 01 '17 at 18:10
-
@martinkunev UnsignedType in JavaScript? Anyway, this solution shows how to do it for any UnsignedType, and it happen to be written in C++, rather than pseudocode [ sizeof(v)*CHAR_BIT instead of something like number of bits in an object of UnsignedType ]. – user877329 Jun 01 '17 at 18:46
Despite the question is tagged as c here my five cents. Lucky us, C++ 20 would include std::ceil2 and std::floor2 (see here). It is consexpr template functions, current GCC implementation uses bitshifting and works with any integral unsigned type.
- 2,749
- 25
- 49
-
6They recently renamed it to `bit_ceil` http://open-std.org/JTC1/SC22/WG21/docs/papers/2020/p1956r1.pdf – Wolfgang Brehm Mar 28 '20 at 11:30
-
2`bit_floor` and `bit_ceil` are now available in C++20 from the `
` header. https://en.cppreference.com/w/cpp/header/bit – SU3 Dec 01 '20 at 07:07
In standard c++20 this is included in <bit>.
The answer is simply
#include <bit>
unsigned long upper_power_of_two(unsigned long v)
{
return std::bit_ceil(v);
}
NOTE:
The solution I gave is for c++, not c, I would give an answer this question instead, but it was closed as a duplicate of this one!
- 1,350
- 7
- 16
For IEEE floats you'd be able to do something like this.
int next_power_of_two(float a_F){
int f = *(int*)&a_F;
int b = f << 9 != 0; // If we're a power of two this is 0, otherwise this is 1
f >>= 23; // remove factional part of floating point number
f -= 127; // subtract 127 (the bias) from the exponent
// adds one to the exponent if were not a power of two,
// then raises our new exponent to the power of two again.
return (1 << (f + b));
}
If you need an integer solution and you're able to use inline assembly, BSR will give you the log2 of an integer on the x86. It counts how many right bits are set, which is exactly equal to the log2 of that number. Other processors have similar instructions (often), such as CLZ and depending on your compiler there might be an intrinsic available to do the work for you.
- 6,555
- 31
- 46
-
This is an interesting one eventhough not related to the question ( I want to roundoff only integers), will try out this one.. – Naveen Jan 21 '09 at 18:29
-
Came up with it after reading the wikipedia article on floats. Besides that, I've used it to calculate square-roots in integer precision. Also nice, but even more unrelated. – Jasper Bekkers Jan 21 '09 at 18:32
-
This breaks the strict aliasing rules. On some compilers it may not work or issue a warning. – martinkunev Dec 01 '17 at 16:54
In x86 you can use the sse4 bit manipulation instructions to make it fast.
//assume input is in eax
mov ecx,31
popcnt edx,eax //cycle 1
lzcnt eax,eax //cycle 2
sub ecx,eax
mov eax,1
cmp edx,1 //cycle 3
jle @done //cycle 4 - popcnt says its a power of 2, return input unchanged
shl eax,cl //cycle 5
@done: rep ret //cycle 5
In c you can use the matching intrinsics.
Or jumpless, which speeds up things by avoiding a misprediction due to a jump, but slows things down by lengthening the dependency chain. Time the code to see which works best for you.
//assume input is in eax
mov ecx,31
popcnt edx,eax //cycle 1
lzcnt eax,eax
sub ecx,eax
mov eax,1 //cycle 2
cmp edx,1
mov edx,0 //cycle 3
cmovle ecx,edx //cycle 4 - ensure eax does not change
shl eax,cl
@done: rep ret //cycle 5
- 73,011
- 23
- 185
- 311
-
Perhaps I'm missing something, but it doesn't look like this will work. It essentially left-shifts the value 2 by the count of leading zeros in the original number i.e. `2<
– DBear Dec 13 '21 at 21:39 -
You should try and avoid the branch: replace it with `mov edx,0; cmovle ecx,edx` – chqrlie Feb 15 '22 at 08:56
/*
** http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog
*/
#define __LOG2A(s) ((s &0xffffffff00000000) ? (32 +__LOG2B(s >>32)): (__LOG2B(s)))
#define __LOG2B(s) ((s &0xffff0000) ? (16 +__LOG2C(s >>16)): (__LOG2C(s)))
#define __LOG2C(s) ((s &0xff00) ? (8 +__LOG2D(s >>8)) : (__LOG2D(s)))
#define __LOG2D(s) ((s &0xf0) ? (4 +__LOG2E(s >>4)) : (__LOG2E(s)))
#define __LOG2E(s) ((s &0xc) ? (2 +__LOG2F(s >>2)) : (__LOG2F(s)))
#define __LOG2F(s) ((s &0x2) ? (1) : (0))
#define LOG2_UINT64 __LOG2A
#define LOG2_UINT32 __LOG2B
#define LOG2_UINT16 __LOG2C
#define LOG2_UINT8 __LOG2D
static inline uint64_t
next_power_of_2(uint64_t i)
{
#if defined(__GNUC__)
return 1UL <<(1 +(63 -__builtin_clzl(i -1)));
#else
i =i -1;
i =LOG2_UINT64(i);
return 1UL <<(1 +i);
#endif
}
If you do not want to venture into the realm of undefined behaviour the input value must be between 1 and 2^63. The macro is also useful to set constant at compile time.
- 51
- 1
- 2
-
This is the probably the worst solution (it's also missing ULL suffix on the 64-bit constant). This will generate 32 tests per input in all cases. Better to use a while loop, it'll always be faster or at the same speed. – xryl669 Apr 29 '16 at 14:33
-
1BUT... this can be evaluated by the preprocessor if the input is a constant, and thus ZERO operation at run time! – Michael Sep 13 '19 at 16:53
Here's my solution in C. Hope this helps!
int next_power_of_two(int n) {
int i = 0;
for (--n; n > 0; n >>= 1) {
i++;
}
return 1 << i;
}
- 51
- 1
- 1
For completeness here is a floating-point implementation in bog-standard C.
double next_power_of_two(double value) {
int exp;
if(frexp(value, &exp) == 0.5) {
// Omit this case to round precise powers of two up to the *next* power
return value;
}
return ldexp(1.0, exp);
}
- 4,065
- 3
- 22
- 17
-
1Random browsers, if you read this comment, choose this code.This is clearly the best answer, no special instructions, no bit twiddling, just efficient, portable and standard code. Guessing why no one else upvoted it ^^ – CoffeDeveloper Oct 25 '15 at 21:36
-
7Random browsers, this is going to be dead slow if you don't have specialized floating point hardware. On x86 you can run circles around this code using bit twiddling. `rep bsr ecx,eax; mov eax,0; cmovnz eax,2; shl eax,cl` is about 25x faster. – Johan Mar 31 '16 at 15:31
An efficient Microsoft (e.g., Visual Studio 2017) specific solution in C / C++ for integer input. Handles the case of the input exactly matching a power of two value by decrementing before checking the location of the most significant 1 bit.
inline unsigned int ExpandToPowerOf2(unsigned int Value)
{
unsigned long Index;
_BitScanReverse(&Index, Value - 1);
return (1U << (Index + 1));
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#if defined(WIN64) // The _BitScanReverse64 intrinsic is only available for 64 bit builds because it depends on x64
inline unsigned long long ExpandToPowerOf2(unsigned long long Value)
{
unsigned long Index;
_BitScanReverse64(&Index, Value - 1);
return (1ULL << (Index + 1));
}
#endif
This generates 5 or so inlined instructions for an Intel processor similar to the following:
dec eax
bsr rcx, rax
inc ecx
mov eax, 1
shl rax, cl
Apparently the Visual Studio C++ compiler isn't coded to optimize this for compile-time values, but it's not like there are a whole lot of instructions there.
Edit:
If you want an input value of 1 to yield 1 (2 to the zeroth power), a small modification to the above code still generates straight through instructions with no branch.
inline unsigned int ExpandToPowerOf2(unsigned int Value)
{
unsigned long Index;
_BitScanReverse(&Index, --Value);
if (Value == 0)
Index = (unsigned long) -1;
return (1U << (Index + 1));
}
Generates just a few more instructions. The trick is that Index can be replaced by a test followed by a cmove instruction.
- 1,397
- 8
- 16
-
-
Thanks. In the application for which it was developed we explicitly needed 2 to the first power when 1 is input. 1 could be taken as a special case with a conditional without generating too many more instructions I imagine. – NoelC Aug 31 '18 at 23:30
-
Updated the answer to include a version that returns 1 for an input value of 1. – NoelC Sep 01 '18 at 02:53
Portable solution in C#:
int GetNextPowerOfTwo(int input) {
return 1 << (int)Math.Ceiling(Math.Log2(input));
}
Math.Ceiling(Math.Log2(value)) calculates the exponent of the next power of two, the 1 << calculates the real value through bitshifting.
Faster solution if you have .NET Core 3 or above:
uint GetNextPowerOfTwoFaster(uint input) {
return (uint)1 << (sizeof(uint) * 8 - System.Numerics.BitOperations.LeadingZeroCount(input - 1));
}
This uses System.Numerics.BitOperations.LeadingZeroCount() which uses a hardware instruction if available:
Update:
RoundUpToPowerOf2() is Coming in .NET 6! The internal implementation is mostly the same as the .NET Core 3 solution above.
Here's the community update.
- 1,431
- 16
- 22
Trying to make an "ultimate" solution for this. The following code
is targeted for C language (not C++),
uses compiler built-ins to yield efficient code (CLZ or BSR instruction) if compiler supports any,
is portable (standard C and no assembly) with the exception of built-ins, and
addresses all undefined behaviors.
If you're writing in C++, you may adjust the code appropriately. Note that C++20 introduces std::bit_ceil which does the exact same thing except the behavior may be undefined on certain conditions.
#include <limits.h>
#ifdef _MSC_VER
# if _MSC_VER >= 1400
/* _BitScanReverse is introduced in Visual C++ 2005 and requires
<intrin.h> (also introduced in Visual C++ 2005). */
#include <intrin.h>
#pragma intrinsic(_BitScanReverse)
#pragma intrinsic(_BitScanReverse64)
# define HAVE_BITSCANREVERSE 1
# endif
#endif
/* Macro indicating that the compiler supports __builtin_clz().
The name HAVE_BUILTIN_CLZ seems to be the most common, but in some
projects HAVE__BUILTIN_CLZ is used instead. */
#ifdef __has_builtin
# if __has_builtin(__builtin_clz)
# define HAVE_BUILTIN_CLZ 1
# endif
#elif defined(__GNUC__)
# if (__GNUC__ > 3)
# define HAVE_BUILTIN_CLZ 1
# elif defined(__GNUC_MINOR__)
# if (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)
# define HAVE_BUILTIN_CLZ 1
# endif
# endif
#endif
/**
* Returns the smallest power of two that is not smaller than x.
*/
unsigned long int next_power_of_2_long(unsigned long int x)
{
if (x <= 1) {
return 1;
}
x--;
#ifdef HAVE_BITSCANREVERSE
if (x > (ULONG_MAX >> 1)) {
return 0;
} else {
unsigned long int index;
(void) _BitScanReverse(&index, x);
return (1UL << (index + 1));
}
#elif defined(HAVE_BUILTIN_CLZ)
if (x > (ULONG_MAX >> 1)) {
return 0;
}
return (1UL << (sizeof(x) * CHAR_BIT - __builtin_clzl(x)));
#else
/* Solution from "Bit Twiddling Hacks"
<http://www.graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2>
but converted to a loop for smaller code size.
("gcc -O3" will unroll this.) */
{
unsigned int shift;
for (shift = 1; shift < sizeof(x) * CHAR_BIT; shift <<= 1) {
x |= (x >> shift);
}
}
return (x + 1);
#endif
}
unsigned int next_power_of_2(unsigned int x)
{
if (x <= 1) {
return 1;
}
x--;
#ifdef HAVE_BITSCANREVERSE
if (x > (UINT_MAX >> 1)) {
return 0;
} else {
unsigned long int index;
(void) _BitScanReverse(&index, x);
return (1U << (index + 1));
}
#elif defined(HAVE_BUILTIN_CLZ)
if (x > (UINT_MAX >> 1)) {
return 0;
}
return (1U << (sizeof(x) * CHAR_BIT - __builtin_clz(x)));
#else
{
unsigned int shift;
for (shift = 1; shift < sizeof(x) * CHAR_BIT; shift <<= 1) {
x |= (x >> shift);
}
}
return (x + 1);
#endif
}
unsigned long long next_power_of_2_long_long(unsigned long long x)
{
if (x <= 1) {
return 1;
}
x--;
#if (defined(HAVE_BITSCANREVERSE) && \
ULLONG_MAX == 18446744073709551615ULL)
if (x > (ULLONG_MAX >> 1)) {
return 0;
} else {
/* assert(sizeof(__int64) == sizeof(long long)); */
unsigned long int index;
(void) _BitScanReverse64(&index, x);
return (1ULL << (index + 1));
}
#elif defined(HAVE_BUILTIN_CLZ)
if (x > (ULLONG_MAX >> 1)) {
return 0;
}
return (1ULL << (sizeof(x) * CHAR_BIT - __builtin_clzll(x)));
#else
{
unsigned int shift;
for (shift = 1; shift < sizeof(x) * CHAR_BIT; shift <<= 1) {
x |= (x >> shift);
}
}
return (x + 1);
#endif
}
- 539
- 5
- 9
constexpr version of clp2 for C++14
#include <iostream>
#include <type_traits>
// Closest least power of 2 minus 1. Returns 0 if n = 0.
template <typename UInt, std::enable_if_t<std::is_unsigned<UInt>::value,int> = 0>
constexpr UInt clp2m1(UInt n, unsigned i = 1) noexcept
{ return i < sizeof(UInt) * 8 ? clp2m1(UInt(n | (n >> i)),i << 1) : n; }
/// Closest least power of 2 minus 1. Returns 0 if n <= 0.
template <typename Int, std::enable_if_t<std::is_integral<Int>::value && std::is_signed<Int>::value,int> = 0>
constexpr auto clp2m1(Int n) noexcept
{ return clp2m1(std::make_unsigned_t<Int>(n <= 0 ? 0 : n)); }
/// Closest least power of 2. Returns 2^N: 2^(N-1) < n <= 2^N. Returns 0 if n <= 0.
template <typename Int, std::enable_if_t<std::is_integral<Int>::value,int> = 0>
constexpr auto clp2(Int n) noexcept
{ return clp2m1(std::make_unsigned_t<Int>(n-1)) + 1; }
/// Next power of 2. Returns 2^N: 2^(N-1) <= n < 2^N. Returns 1 if n = 0. Returns 0 if n < 0.
template <typename Int, std::enable_if_t<std::is_integral<Int>::value,int> = 0>
constexpr auto np2(Int n) noexcept
{ return clp2m1(std::make_unsigned_t<Int>(n)) + 1; }
template <typename T>
void test(T v) { std::cout << clp2(v) << std::endl; }
int main()
{
test(-5); // 0
test(0); // 0
test(8); // 8
test(31); // 32
test(33); // 64
test(789); // 1024
test(char(260)); // 4
test(unsigned(-1) - 1); // 0
test<long long>(unsigned(-1) - 1); // 4294967296
return 0;
}
- 89
- 1
- 3
Many processor architectures support log base 2 or very similar operation – count leading zeros. Many compilers have intrinsics for it. See https://en.wikipedia.org/wiki/Find_first_set
- 3,014
- 3
- 22
- 17
-
it's not about finding the highest set bit (=bsr) or counting leading zeros. he wants to *round up* to nearest power of 2. the answer with "subtract 1, then do bsr and shift 1 left" does that. – Flo Oct 13 '18 at 17:23
Assuming you have a good compiler & it can do the bit twiddling before hand thats above me at this point, but anyway this works!!!
// http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
#define SH1(v) ((v-1) | ((v-1) >> 1)) // accidently came up w/ this...
#define SH2(v) ((v) | ((v) >> 2))
#define SH4(v) ((v) | ((v) >> 4))
#define SH8(v) ((v) | ((v) >> 8))
#define SH16(v) ((v) | ((v) >> 16))
#define OP(v) (SH16(SH8(SH4(SH2(SH1(v))))))
#define CB0(v) ((v) - (((v) >> 1) & 0x55555555))
#define CB1(v) (((v) & 0x33333333) + (((v) >> 2) & 0x33333333))
#define CB2(v) ((((v) + ((v) >> 4) & 0xF0F0F0F) * 0x1010101) >> 24)
#define CBSET(v) (CB2(CB1(CB0((v)))))
#define FLOG2(v) (CBSET(OP(v)))
Test code below:
#include <iostream>
using namespace std;
// http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
#define SH1(v) ((v-1) | ((v-1) >> 1)) // accidently guess this...
#define SH2(v) ((v) | ((v) >> 2))
#define SH4(v) ((v) | ((v) >> 4))
#define SH8(v) ((v) | ((v) >> 8))
#define SH16(v) ((v) | ((v) >> 16))
#define OP(v) (SH16(SH8(SH4(SH2(SH1(v))))))
#define CB0(v) ((v) - (((v) >> 1) & 0x55555555))
#define CB1(v) (((v) & 0x33333333) + (((v) >> 2) & 0x33333333))
#define CB2(v) ((((v) + ((v) >> 4) & 0xF0F0F0F) * 0x1010101) >> 24)
#define CBSET(v) (CB2(CB1(CB0((v)))))
#define FLOG2(v) (CBSET(OP(v)))
#define SZ4 FLOG2(4)
#define SZ6 FLOG2(6)
#define SZ7 FLOG2(7)
#define SZ8 FLOG2(8)
#define SZ9 FLOG2(9)
#define SZ16 FLOG2(16)
#define SZ17 FLOG2(17)
#define SZ127 FLOG2(127)
#define SZ1023 FLOG2(1023)
#define SZ1024 FLOG2(1024)
#define SZ2_17 FLOG2((1ul << 17)) //
#define SZ_LOG2 FLOG2(SZ)
#define DBG_PRINT(x) do { std::printf("Line:%-4d" " %10s = %-10d\n", __LINE__, #x, x); } while(0);
uint32_t arrTble[FLOG2(63)];
int main(){
int8_t n;
DBG_PRINT(SZ4);
DBG_PRINT(SZ6);
DBG_PRINT(SZ7);
DBG_PRINT(SZ8);
DBG_PRINT(SZ9);
DBG_PRINT(SZ16);
DBG_PRINT(SZ17);
DBG_PRINT(SZ127);
DBG_PRINT(SZ1023);
DBG_PRINT(SZ1024);
DBG_PRINT(SZ2_17);
return(0);
}
Outputs:
Line:39 SZ4 = 2
Line:40 SZ6 = 3
Line:41 SZ7 = 3
Line:42 SZ8 = 3
Line:43 SZ9 = 4
Line:44 SZ16 = 4
Line:45 SZ17 = 5
Line:46 SZ127 = 7
Line:47 SZ1023 = 10
Line:48 SZ1024 = 10
Line:49 SZ2_16 = 17
- 667
- 7
- 9
I'm trying to get nearest lower power of 2 and made this function. May it help you.Just multiplied nearest lower number times 2 to get nearest upper power of 2
int nearest_upper_power(int number){
int temp=number;
while((number&(number-1))!=0){
temp<<=1;
number&=temp;
}
//Here number is closest lower power
number*=2;
return number;
}
- 7,934
- 2
- 38
- 54
Adapted Paul Dixon's answer to Excel, this works perfectly.
=POWER(2,CEILING.MATH(LOG(A1)/LOG(2)))
- 1,088
- 1
- 7
- 17
A variant of @YannDroneaud answer valid for x==1, only for x86 plateforms, compilers, gcc or clang:
__attribute__ ((const))
static inline uint32_t p2(uint32_t x)
{
#if 0
assert(x > 0);
assert(x <= ((UINT32_MAX/2) + 1));
#endif
int clz;
uint32_t xm1 = x-1;
asm(
"lzcnt %1,%0"
:"=r" (clz)
:"rm" (xm1)
:"cc"
);
return 1 << (32 - clz);
}
- 17,016
- 1
- 26
- 65
Here is what I'm using to have this be a constant expression, if the input is a constant expression.
#define uptopow2_0(v) ((v) - 1)
#define uptopow2_1(v) (uptopow2_0(v) | uptopow2_0(v) >> 1)
#define uptopow2_2(v) (uptopow2_1(v) | uptopow2_1(v) >> 2)
#define uptopow2_3(v) (uptopow2_2(v) | uptopow2_2(v) >> 4)
#define uptopow2_4(v) (uptopow2_3(v) | uptopow2_3(v) >> 8)
#define uptopow2_5(v) (uptopow2_4(v) | uptopow2_4(v) >> 16)
#define uptopow2(v) (uptopow2_5(v) + 1) /* this is the one programmer uses */
So for instance, an expression like:
uptopow2(sizeof (struct foo))
will nicely reduce to a constant.
- 51,856
- 9
- 92
- 143
The g++ compiler provides a builtin function __builtin_clz that counts leading zeros:
So we could do:
int nextPowerOfTwo(unsigned int x) {
return 1 << sizeof(x)*8 - __builtin_clz(x);
}
int main () {
std::cout << nextPowerOfTwo(7) << std::endl;
std::cout << nextPowerOfTwo(31) << std::endl;
std::cout << nextPowerOfTwo(33) << std::endl;
std::cout << nextPowerOfTwo(8) << std::endl;
std::cout << nextPowerOfTwo(91) << std::endl;
return 0;
}
Results:
8
32
64
16
128
But note that, for x == 0, __builtin_clz return is undefined.
- 1,647
- 2
- 9
- 18
-
There's another undefined behavior in your code. When `x > (UINT_MAX / 2)`, `__builtin_clz` would return 0 causing the left shift (`< – Explorer09 Jul 17 '21 at 22:26
-
Returning "16" for the input "8" is not really the right answer. – Ian Rehwinkel May 11 '22 at 20:33
If you need it for OpenGL related stuff:
/* Compute the nearest power of 2 number that is
* less than or equal to the value passed in.
*/
static GLuint
nearestPower( GLuint value )
{
int i = 1;
if (value == 0) return -1; /* Error! */
for (;;) {
if (value == 1) return i;
else if (value == 3) return i*4;
value >>= 1; i *= 2;
}
}
- 5,503
- 21
- 30
-
1
-
9DrJokepu - I think florin meant to say here that the OP asked for a loop-less solution – Eli Bendersky Nov 08 '09 at 05:56
Convert it to a float and then use .hex() which shows the normalized IEEE representation.
>>> float(789).hex()
'0x1.8a80000000000p+9'
Then just extract the exponent and add 1.
>>> int(float(789).hex().split('p+')[1]) + 1
10
And raise 2 to this power.
>>> 2 ** (int(float(789).hex().split('p+')[1]) + 1)
1024
- 203
- 3
- 11
import sys
def is_power2(x):
return x > 0 and ((x & (x - 1)) == 0)
def find_nearest_power2(x):
if x <= 0:
raise ValueError("invalid input")
if is_power2(x):
return x
else:
bits = get_bits(x)
upper = 1 << (bits)
lower = 1 << (bits - 1)
mid = (upper + lower) // 2
if (x - mid) > 0:
return upper
else:
return lower
def get_bits(x):
"""return number of bits in binary representation"""
if x < 0:
raise ValueError("invalid input: input should be positive integer")
count = 0
while (x != 0):
try:
x = x >> 1
except TypeError as error:
print(error, "input should be of type integer")
sys.exit(1)
count += 1
return count
- 1,800
- 15
- 13
If you want an one-line-template. Here it is
int nxt_po2(int n) { return 1 + (n|=(n|=(n|=(n|=(n|=(n-=1)>>1)>>2)>>4)>>8)>>16); }
or
int nxt_po2(int n) { return 1 + (n|=(n|=(n|=(n|=(n|=(n-=1)>>(1<<0))>>(1<<1))>>(1<<2))>>(1<<3))>>(1<<4)); }
- 1
- 1
-
1This is undefined behaviour in C or C++ and will lead to errors. Modifying `n` multiple times without a sequence point is invalid. You wrote it as if `n-=1` should happen first but the only guarantee here is that `n` contains its new value after the `;` and the parentheses do not change that. – sam hocevar May 19 '20 at 12:18
from math import ceil, log2
pot_ceil = lambda N: 0x1 << ceil(log2(N))
Test:
for i in range(10):
print(i, pot_ceil(i))
Output:
1 1
2 2
3 4
4 4
5 8
6 8
7 8
8 8
9 16
10 16
- 26,994
- 34
- 144
- 244