453

Why does the second line of this code throw ArrayIndexOutOfBoundsException?

String filename = "D:/some folder/001.docx";
String extensionRemoved = filename.split(".")[0];

While this works:

String driveLetter = filename.split("/")[0];

I use Java 7.

Sebastian Nielsen
Ali Ismayilov

4 Answers

897

You need to escape the dot if you want to split on a literal dot:

String extensionRemoved = filename.split("\\.")[0];

Otherwise you are splitting on the regex ., which means "any character".
Note the double backslash needed to create a single backslash in the regex.
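A minimal sketch of the difference, using the filename from the question. The class and helper names are hypothetical, and `Pattern.quote` is an alternative way to get a literal dot that is not part of the original answer:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class SplitOnDot {
    // Unescaped: the regex "." matches every character of the input.
    static String[] splitUnescaped(String s) {
        return s.split(".");
    }

    // Escaped: the regex \. matches only a literal dot.
    static String[] splitEscaped(String s) {
        return s.split("\\.");
    }

    // Alternative: let Pattern.quote build the literal pattern.
    static String[] splitQuoted(String s) {
        return s.split(Pattern.quote("."));
    }

    public static void main(String[] args) {
        String filename = "D:/some folder/001.docx";
        // Every character is a delimiter, so all pieces are empty and get dropped.
        System.out.println(splitUnescaped(filename).length);         // 0
        System.out.println(Arrays.toString(splitEscaped(filename))); // [D:/some folder/001, docx]
        System.out.println(Arrays.toString(splitQuoted(filename)));  // [D:/some folder/001, docx]
    }
}
```

Because the unescaped call returns a zero-length array, indexing it with [0] is what throws.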


You're getting an ArrayIndexOutOfBoundsException because, with the unescaped pattern, every character of the input is a delimiter, so every piece of the split is a blank; split(regex) removes all trailing blanks from the result, which leaves an empty array, and indexing it with [0] throws. Even with the dot escaped, the same thing happens when the input string is just a dot, i.e. ".": splitting a dot on a dot leaves only two blanks, and after trailing blanks are removed you're left with an empty array.

To avoid getting an ArrayIndexOutOfBoundsException for this edge case, use the overloaded version of split(regex, limit), which has a second parameter that is the size limit for the resulting array. When limit is negative, the behaviour of removing trailing blanks from the resulting array is disabled:

".".split("\\.", -1) // returns an array of two blanks, ie ["", ""]

That is, when filename is just a dot ".", calling filename.split("\\.", -1)[0] returns a blank, but calling filename.split("\\.")[0] throws an ArrayIndexOutOfBoundsException.
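The effect of the limit argument can be checked with a short sketch. The class and helper names are hypothetical, and the second input, "a.b..", is an extra example added here to show trailing blanks on a longer string:

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class SplitLimitDemo {
    // Default behaviour: trailing empty strings are removed.
    static String[] splitDefault(String s) {
        return s.split("\\.");
    }

    // Negative limit: trailing empty strings are kept.
    static String[] splitKeepTrailing(String s) {
        return s.split("\\.", -1);
    }

    public static void main(String[] args) {
        System.out.println(splitDefault(".").length);                        // 0
        System.out.println(splitKeepTrailing(".").length);                   // 2
        System.out.println(Arrays.toString(splitDefault("a.b..")));          // [a, b]
        System.out.println(Arrays.toString(splitKeepTrailing("a.b..")));     // [a, b, , ]
    }
}
```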

Bohemian
  • 1
    Note that filename can contain multiple dots. One must use the last index of "." and use that to find the substring of the filename. – saurabheights Jun 12 '17 at 14:24
  • 2
    @saurabheights The question was not about a correct regex, but rather why there was an `ArrayIndexOutOfBoundsException`. That said, you are incorrect: You don't need to know where the last dot is; you just need the right regex: `filename.split("\\.(?=[^.]*$)")`. This uses a *look ahead* to assert there are no dots anywhere in the input that follows the matching dot. – Bohemian Jun 12 '17 at 15:10
  • 1
    @emma you can delete them yourself via the “delete” link just beneath the question – Bohemian Aug 31 '19 at 01:17
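The two approaches from the comments above, splitting on the last dot with a look ahead and using the last index of ".", can be compared side by side. This is only a sketch: the class and method names are hypothetical, and "archive.tar.gz" and "README" are sample inputs chosen here, not taken from the thread. The regex variant passes a limit of -1 so that an input of just "." does not produce an empty array.

```java
// Hypothetical demo class; the names are illustrative only.
class ExtensionDemo {
    // Split only on the last dot: the look ahead requires that no
    // further dot follows the one being matched.
    static String stripExtensionRegex(String filename) {
        return filename.split("\\.(?=[^.]*$)", -1)[0];
    }

    // Same result without a regex: cut at the last dot, if there is one.
    static String stripExtensionIndex(String filename) {
        int dot = filename.lastIndexOf('.');
        return dot < 0 ? filename : filename.substring(0, dot);
    }

    public static void main(String[] args) {
        System.out.println(stripExtensionRegex("archive.tar.gz")); // archive.tar
        System.out.println(stripExtensionIndex("archive.tar.gz")); // archive.tar
        System.out.println(stripExtensionRegex("README"));         // README
        System.out.println(stripExtensionIndex("README"));         // README
    }
}
```

Neither variant distinguishes a dot in a directory name from a dot before the extension, so a path whose last dot sits in a folder name would be cut there.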
130

The dot "." is a special character in Java's regex engine, so you have to use "\\." to escape it:

final String extensionRemoved = filename.split("\\.")[0];
Nimantha
aimhaj
  • 27
    It is _not_ a special character in Java. It's a special character in Java's regex engine. – Nic Apr 08 '16 at 12:09
  • 1
    I just wasn't very accurate in my response but I agree with you. thanks for the precision ;) – aimhaj Apr 08 '16 at 13:20
  • 1
    It's a distinction worth making. Also, I just realized that I messed up a bit myself; it is a special char in Java, but that's not why it's causing a problem here. Anyway. – Nic Apr 08 '16 at 13:21
35

This is because . is a reserved character in regular expressions, where it matches any character. To split on a literal dot, use the following statement instead:

String extensionRemoved = filename.split("\\.")[0];
Gabriele Mariotti
20

I believe you should escape the dot. Try:

String filename = "D:/some folder/001.docx";
String extensionRemoved = filename.split("\\.")[0];

Otherwise the dot is interpreted as "any character" in regular expressions.

Ivaylo Strandjev