Is it possible to convert the string pointed to by a char* to uppercase without traversing it character by character in a loop?

Assumptions:
1. The char pointer points to a fixed-size string array.
2. The array pointed to contains only lowercase characters.

sunil_rbc
  • If you know the maximum length of the string, you can unroll the loop. Otherwise, no. – StoryTeller - Unslander Monica Feb 22 '18 at 06:25
  • Use recursion instead of a loop. – Bjorn A. Feb 22 '18 at 06:26
  • Not only C but all languages have to use a loop or recursion to achieve your request. – snr Feb 22 '18 at 06:26
  • @snr you don't have to use a loop if `char*` points to a `char` variable instead of an array, or if the array has a fixed size – phuclv Feb 22 '18 at 06:30
  • Well ... yes. Maybe. You don't have to traverse "character by character". You could do, say, 4 or 8 characters at a time ... depending on your word size and memory alignment. Converting to uppercase can be done with a single bitwise operation. – MFisherKDX Feb 22 '18 at 06:31
  • ... single bitwise operation, provided that the string only contains alphabetic characters, and the ASCII character set is being used. – user3386109 Feb 22 '18 at 06:40
  • C has nothing analogous to Python's list comprehension, if that's what you're asking. – user3386109 Feb 22 '18 at 06:50
  • My question has been downvoted so many times. Very weird. I asked this question because I wanted to find a way to avoid 'looping and converting each character to uppercase'. – sunil_rbc Feb 22 '18 at 10:01
  • @SunilKoiri because before you added that "fixed size string array" requirement it was an impossible task. And you didn't show what you've tried, which makes this off-topic – phuclv Feb 22 '18 at 14:09

1 Answer

In the ASCII encoding, converting lowercase to uppercase amounts to clearing the bit of weight 32 (i.e. 20H, the space character); setting that bit converts the other way, to lowercase.

With a bitwise operator,

Char &= ~0x20;

You can process several characters at a time by mapping longer data types on the array. For instance, to convert an array of 11 characters,

int ToUpper = ~0x20202020;     // clears bit 0x20 in every byte

*(int*)  &Char[0] &= ToUpper;  // characters 0..3
*(int*)  &Char[4] &= ToUpper;  // characters 4..7
*(short*)&Char[8] &= ToUpper;  // characters 8..9
          Char[10] &= ToUpper; // character 10
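The pointer casts above rely on type punning, which assumes suitable alignment and a compiler that tolerates the strict-aliasing violation. As a minimal sketch of the same idea without those assumptions, the words can be moved in and out of the array with `memcpy`, which compilers typically reduce to plain loads and stores. The helper name `upper11` is hypothetical, and the code assumes the array holds exactly 11 lowercase ASCII letters; clearing bit 0x20 is what maps `a`..`z` to `A`..`Z`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: uppercases exactly 11 lowercase ASCII letters in place.
   Assumes every byte is in 'a'..'z'; other bytes would be altered as well. */
static void upper11(char *s)
{
    const uint32_t mask32 = 0xDFDFDFDFu; /* ~0x20 replicated in 4 bytes */
    const uint16_t mask16 = 0xDFDFu;     /* ~0x20 replicated in 2 bytes */
    uint32_t w;
    uint16_t h;

    /* characters 0..3 */
    memcpy(&w, s, sizeof w);
    w &= mask32;
    memcpy(s, &w, sizeof w);

    /* characters 4..7 */
    memcpy(&w, s + 4, sizeof w);
    w &= mask32;
    memcpy(s + 4, &w, sizeof w);

    /* characters 8..9 */
    memcpy(&h, s + 8, sizeof h);
    h = (uint16_t)(h & mask16);
    memcpy(s + 8, &h, sizeof h);

    /* character 10 */
    s[10] = (char)(s[10] & ~0x20);
}
```

For instance, calling `upper11` on a `char buf[12]` initialized with `"helloworldx"` leaves the terminating null at `buf[11]` untouched, since only the first 11 bytes are written.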

You can go to 64-bit ints and even larger (up to 512 bits = 64 characters at a time) with the SIMD intrinsics (SSE, AVX, AVX-512).

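If your code allows it, it is better to extend the buffer length to the next larger data type so that all bytes can be updated in a single operation. But make sure the terminating null survives: clearing bit 20H leaves a zero byte unchanged, whereas setting it would turn the null into a space, which must then be restored.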
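A sketch of the padded-buffer idea, under the assumption that the buffer is 16 bytes long and holds lowercase ASCII letters followed by zero padding. The names `BUF_SIZE` and `upper16` are hypothetical. The whole buffer is handled with two 64-bit AND operations; because the mask only clears bit 0x20, the zero bytes (including the terminator) stay zero.

```c
#include <stdint.h>
#include <string.h>

enum { BUF_SIZE = 16 }; /* assumed: buffer padded to a multiple of 8 bytes */

/* Hypothetical helper: uppercases a 16-byte buffer that contains lowercase
   ASCII letters followed by zero padding, using two 64-bit operations. */
static void upper16(char buf[BUF_SIZE])
{
    const uint64_t mask = UINT64_C(0xDFDFDFDFDFDFDFDF); /* ~0x20 in each byte */
    uint64_t lo, hi;

    /* memcpy avoids alignment and strict-aliasing problems */
    memcpy(&lo, buf, 8);
    memcpy(&hi, buf + 8, 8);

    lo &= mask; /* 0x00 & 0xDF == 0x00, so null bytes are preserved */
    hi &= mask;

    memcpy(buf, &lo, 8);
    memcpy(buf + 8, &hi, 8);
}
```

With `char buf[BUF_SIZE] = "helloworldx";` the remaining bytes are zero-initialized, so after `upper16(buf)` the buffer still holds a properly terminated string.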

Yves Daoust
  • So many undefined behavior bugs here. [What is the strict aliasing rule?](https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule) – Lundin Feb 22 '18 at 09:09
  • @Lundin: please spare me this comment and downvote. The purpose is obviously multibyte processing hacks. – Yves Daoust Feb 22 '18 at 09:15
  • No? You are teaching beginners how to write buggy, dangerous code. To use this algorithm safely, you must use memcpy(). In addition, professional programmers do not use signed types for bit-fiddling hacks. – Lundin Feb 22 '18 at 09:17
  • @Lundin: please spare me that. Professional hackers know very well that bitwise operations are sign agnostic. – Yves Daoust Feb 22 '18 at 09:18
  • Why? Do you have any actual _arguments_ explaining why this code is not a buggy mess? – Lundin Feb 22 '18 at 09:20
  • No it doesn't, since it violates the strict aliasing rule. "It seems to work" is one form of undefined behavior. Just go read the link I posted. – Lundin Feb 22 '18 at 09:22
  • In addition, the algorithm is actually incorrect. To convert to upper case, you need `&= ~0x20`, not `|= 0x20`. – Lundin Feb 22 '18 at 09:32
  • Always impressive to see how people can be so picky about undefinedbehaviorness and ignore the rest. – Yves Daoust Feb 22 '18 at 09:40
  • I'm fairly certain I can get gcc to conjure an example where the string remains untouched after going through your algorithm. Just outsource it to another translation unit and there you go, strict aliasing will kill it. What's more impressive here is that some people can confidently post code examples on the internet without a single line of correct code and then arrogantly refuse to admit that it's all wrong. I'd either delete this answer or fix it, as you'll just accumulate down votes from here. – Lundin Feb 22 '18 at 09:49
  • Anyway, there's no need for this obscure code in the first place, simply do this instead: `for(size_t i=0; i – Lundin Feb 22 '18 at 09:49
  • @Lundin: I don't think you want to understand the purpose of my answer. – Yves Daoust Feb 22 '18 at 09:53
  • Upvote. I was reading some assembly/shellcode and they did this. This is the only post on the internet I found to answer my question. – KANJICODER Apr 14 '20 at 19:16