I have a list of 5-digit US postal zip codes, but some have lost their leading zeros. How do I add those zeros back while leaving the values that already have 5 digits intact? I tried formatC, sprintf, and str_pad, and none of them worked, because I am not adding 0s to all values.
In the future, *"I tried ... and none of them worked"* does not help much: we are generally much better at helping with not-working code when we see the code, some input data, and your expected output. Please see https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info for some good ways to make questions *reproducible*, which will speed-up and make more-relevant answers you may get. – r2evans Jul 12 '21 at 19:40
That was a good suggestion! Will keep that in mind when posting questions next time. – SilverSpringbb Jul 12 '21 at 19:50
If one of the answers addresses your question, please [accept it](https://stackoverflow.com/help/someone-answers); doing so not only provides a little perk to the answerer with some points, but also provides some closure for readers with similar questions. Though you can only accept one answer, you have the option to up-vote as many as you think are helpful. (If there are still issues, you will likely need to edit your question with further details.) (Please consider going back to [previous questions](https://stackoverflow.com/users/15243394/silverspringbb?tab=questions) and accepting some.) – r2evans Jul 13 '21 at 18:25
I voted, but don't know how to accept an answer. – SilverSpringbb Jul 14 '21 at 19:42
4 Answers
7
We can use sprintf with a zero-padded width of 5:
sprintf('%05d', as.integer(zipcodes))
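The same call can be assigned back to a column of a data frame. A minimal sketch, assuming a made-up frame `df` with a numeric `zip` column (both names are illustrative, not from the question):

```r
# Hypothetical data: zip codes stored as numbers, so leading zeros were dropped
df <- data.frame(zip = c(1234, 20910, 501))

# Overwrite the column with zero-padded, 5-character strings
df$zip <- sprintf("%05d", as.integer(df$zip))

df$zip
# [1] "01234" "20910" "00501"
```

Values that already have 5 digits pass through unchanged, so there is no need to treat them separately.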
akrun
Thanks! My problem with sprintf is that it creates a vector, but my zip code is a column in a data frame. How do I add the 0s back in the data frame? – SilverSpringbb Jul 12 '21 at 19:36
3
In which way did str_pad not work?
https://www.rdocumentation.org/packages/stringr/versions/1.4.0/topics/str_pad
df <- data.frame(zip = c(1, 22, 333, 4444, 55555))
# Pad on the left (the default side) with "0" up to a width of 5
df$zip <- stringr::str_pad(df$zip, width = 5, side = "left", pad = "0")
df$zip
# [1] "00001" "00022" "00333" "04444" "55555"
M.Viking
2
Update:
Following the helpful comment from r2evans:
My solution is not very efficient, and to get a leading 0 the paste0 part has to be modified slightly. Note that it prepends only a single 0, so it fixes 4-digit values only. Here it is with a data frame example:
sapply(df$zip, function(x){if(nchar(x)<5){paste0(0,x)}else{x}})
data:
library(tibble)  # provides tribble()
df <- tribble(
~zip,
7889,
2345,
45567,
4394,
34566,
4392,
4599)
df
Output:
[1] "07889" "02345" "45567" "04394" "34566" "04392" "04599"
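Because the sapply call above prepends only one zero, a 3-digit value such as 501 would still come out one character short. A vectorized sketch using formatC (one of the functions mentioned in the question), assuming the zip codes are whole numbers; the vector `zips` is made up for illustration:

```r
# Hypothetical input with values missing one or two leading zeros
zips <- c(7889, 501, 45567)

# width = 5 with flag = "0" pads on the left with as many zeros as needed
padded <- formatC(as.integer(zips), width = 5, format = "d", flag = "0")

padded
# [1] "07889" "00501" "45567"
```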
First answer:
This will add a trailing zero to each integer with fewer than 5 digits,
where zip is a vector:
sapply(zip, function(x){if(nchar(x)<5){paste0(x,0)}else{x}})
TarJae
If it's a vector, both `nchar` and `paste0` are vectorized, is there a reason you're explicitly de-vectorizing this process? Also, I don't think this will work correctly, for two reasons: it pads the `0` on the *right*, and it only pads 1 regardless of the length of the string. – r2evans Jul 12 '21 at 19:35
Thanks r2evans. I mixed up leading and trailing. I will update my answer. And thank you also for your advice regarding efficiency! – TarJae Jul 12 '21 at 20:01
2
If they start as strings and you don't want to (or cannot) convert to integers first, then an alternative to sprintf is
vec <- c('1','11','11111')
paste0(strrep('0', pmax(0, 5 - nchar(vec))), vec)
# [1] "00001" "00011" "11111"
This will handle strings of any length, and is a no-op for strings of 5 or more characters.
In a frame, that would be
dat$colname <- paste0(strrep('0', pmax(0, 5 - nchar(dat$colname))), dat$colname)
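If the same padding is needed in several places, the expression can be wrapped in a small helper. A sketch only: `pad_zip`, `dat`, and the `zip` column are hypothetical names chosen for illustration.

```r
# Hypothetical helper: left-pad character zip codes with "0" to a fixed width
pad_zip <- function(x, width = 5) {
  x <- as.character(x)
  # pmax() keeps the repeat count at 0 for strings already at or over the width
  paste0(strrep("0", pmax(0, width - nchar(x))), x)
}

dat <- data.frame(zip = c("501", "01234", "20910"), stringsAsFactors = FALSE)
dat$zip <- pad_zip(dat$zip)

dat$zip
# [1] "00501" "01234" "20910"
```

Strings that already carry their leading zero, such as "01234", are returned untouched, which matches the requirement in the question.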
r2evans