Does this theory still hold when a data set is not normally distributed?
It depends on what you mean by "does this theory still hold," the nature of your data, and how strict you want to be in identifying outliers.
The frequently used rule you cite was designed to flag about 1% of normally distributed values as potential outliers. It will flag different percentages of values if your data follow different distributions.
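As a side check (mine, not part of the rule's original derivation), the exact fraction of a normal population falling beyond the 1.5 IQR fences can be computed directly; it comes out a bit under 1%:
# theoretical fraction of a normal distribution beyond the 1.5 IQR fences
iqr <- qnorm(0.75) - qnorm(0.25)        # IQR of a standard normal
fence <- qnorm(0.75) + 1.5 * iqr        # upper fence; the lower fence is symmetric
2 * pnorm(fence, lower.tail = FALSE)    # about 0.007, i.e. roughly 0.7%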
Here are some quick examples based on 1000 random draws from each of several well-known distributions: t with 1 degree of freedom (Cauchy), t with 2 degrees of freedom, standard normal, and standard lognormal.
set.seed(20220630)
t1vals <- rt(1000, 1)     # t with 1 df (Cauchy)
t2vals <- rt(1000, 2)     # t with 2 df
nvals <- rnorm(1000)      # standard normal
lnvals <- rlnorm(1000)    # standard lognormal
Do boxplot calculations for these samples.
# boxplot() draws the plot and invisibly returns the box statistics;
# add plot = FALSE if you only want the statistics
boxt1 <- boxplot(t1vals)
boxt2 <- boxplot(t2vals)
boxn <- boxplot(nvals)
boxln <- boxplot(lnvals)
The stats component of a boxplot object "contains the extreme of the lower whisker, the lower hinge, the median, the upper hinge and the extreme of the upper whisker"; the first and last of those are the cutoffs for outliers as you defined them.* With 1000 values and a hoped-for 1% outlier rate, you would expect about 10.
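For instance, you can inspect that component directly for the normal sample:
boxn$stats
# rows: lower whisker end, lower hinge, median, upper hinge, upper whisker end
# (a 5 x 1 matrix here, since only one variable was plotted)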
Here's what you find by adding up how many values in each case are below the lower whisker or above the upper whisker:
Not bad for the normally distributed data, just 15 instead of 10:
sum(nvals < boxn$stats[1,1] | nvals > boxn$stats[5,1])
# [1] 15
But for the others (lognormal, 2-df t and 1-df t in order):
sum(lnvals < boxln$stats[1,1] | lnvals > boxln$stats[5,1])
# [1] 64
sum(t2vals < boxt2$stats[1,1] | t2vals > boxt2$stats[5,1])
# [1] 83
sum(t1vals < boxt1$stats[1,1] | t1vals > boxt1$stats[5,1])
# [1] 150
So instead of flagging about 1% of values as outliers with that rule, you flag roughly 6% to 15% with these distributions.** Yes, the 1-df t (Cauchy) distribution is notorious for extreme behavior, but lognormal data are common in practice.
The question thus comes down to how many true members of the distribution you want to flag as outliers, given the nature of your data.
*Well, pretty close. The "hinges" used to define the box of the boxplot, and thus the whiskers that extend to the most extreme values within 1.5 IQR of the box, aren't exactly at the first and third quartiles, but they are close. Compare things like summary(t1vals) for quartiles against the corresponding boxt1$stats[c(2,4),1] for hinges.
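For example, one quick comparison using the Cauchy sample above (fivenum() gives Tukey's hinges, which is what boxplot() stores):
summary(t1vals)[c(2, 5)]    # 1st and 3rd quartiles
fivenum(t1vals)[c(2, 4)]    # lower and upper hinges (Tukey's five-number summary)
boxt1$stats[c(2, 4), 1]     # the same hinges, as stored in the boxplot object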
**You shouldn't trust single sets of samples like this. Try repeating these types of calculations multiple times to gauge their reliability. Don't forget to set a random seed for reproducibility.
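For example, here is one way to repeat the lognormal calculation many times; this is just a sketch, and the 500 repetitions and the new seed are arbitrary choices, not part of the analysis above:
set.seed(1)
flagged <- replicate(500, {
  x <- rlnorm(1000)
  s <- boxplot.stats(x)$stats    # lower whisker, lower hinge, median, upper hinge, upper whisker
  mean(x < s[1] | x > s[5])      # proportion flagged as outliers
})
summary(flagged)                 # spread of flagged proportions across repetitions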