0

I have an excel sheet I need to sanitise. I have over 1000 addresses and for some random reason, the data sent over by the supplier has random 16 digits at the start of the street address

Not all of them though.

Can I write a formula of sorts to strip the first 16 digits if they exist in the format

xxxx xxxx xxxx xxxx

Data is like this

0123 4567 8910 1234 1 Lovely Road (1 Lovely Road)
2 Lovely Road
0123 4567 8910 1234 11 Another Road (11 Another Road)

1 Answers1

2

Assuming that A1 contains the address then a simple solution would be:

=IF(ISNUMBER(SUBSTITUTE(LEFT(A1, 20), " ", "")*1), MID(A1, 21, LEN(A1)), A1)

It's not perfect (for example the position of the spaces is irrelevant) but should hopefully work for most of the data.

How it works

To show how this works, we'll be breaking down the formula and placing each component in a cell starting with A1 containing the string.

In B1 you want to get the 20 left most characters:

=LEFT(A1, 20)

In C1 you want to replace the spaces:

=SUBSTITUTE(B1, " ", "")

In D1 you need to check if it's a number using ISNUMBER - however that will return False if it's a string containing a number. So we can work around that by multiplying by 1 (source):

=ISNUMBER(C1*1)

So now D1 contains True or False depending on whether or not we have a set of numbers at the beginning of the string.

So for E1 we just have an If which checks if D1 is True and, if so, displays everything in A1 from position 21 onwards. If D1 is False then E1 just displays the entire contents of A1:

=IF(D1, MID(A1, 21, LEN(A1)), A1)
Richard
  • 6,150