You wouldn't expect "balance" in these data, so it isn't quite right to call them imbalanced. "Imbalance" is a label that usually signals "problem" to reviewers and audiences who are used to interpreting blocked or randomized trial results.
Logistic regression with categorical covariates is pretty straightforward - it just gives you back what you see. The fitted probabilities equal the empirical proportions, so species 4 would have an estimated proportion of 3/(409+3), which is a very small value. This is called a "saturated model," and there's nothing wrong with saturated models for estimation. If you go on to add other adjustment or stratification variables, however, the model can quickly get out of hand: separation or small-sample bias is likely to make the odds ratios, predictions, etc. unstable.
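To see the "saturated model reproduces the empirical proportions" point concretely, here is a minimal sketch in Python with statsmodels. Only the 3-out-of-412 figure for species 4 comes from the example above; the other species' counts are made-up placeholders.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical cell counts: events / totals per species.
# Species 4 uses the 3/(409 + 3) figure from the text; the rest are placeholders.
counts = pd.DataFrame({
    "species": ["1", "2", "3", "4"],
    "events": [120, 85, 40, 3],
    "total":  [300, 250, 200, 412],
})

# Expand to one row per observation for a plain logistic fit.
rows = []
for _, r in counts.iterrows():
    rows += [{"species": r["species"], "y": 1}] * int(r["events"])
    rows += [{"species": r["species"], "y": 0}] * int(r["total"] - r["events"])
df = pd.DataFrame(rows)

# Saturated model: one parameter per species, nothing else.
fit = smf.logit("y ~ C(species)", data=df).fit(disp=False)

# Fitted probabilities match the raw proportions.
print(fit.predict(counts).round(5))
print((counts["events"] / counts["total"]).round(5))
```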
Alan Agresti proposed some nifty adjustments for small cells that amount to basically "adding something": you can, for instance, just add 1 or 2 (or more) to the cells. In doing so you trade an unbiased estimator for a biased one with lower variance - and, hopefully, lower MSE.
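A minimal sketch of that bias-variance trade-off, to check whether the added count actually pays off for a cell this size. The true proportion is set near the 3/412 figure above, and the added constant c is an arbitrary choice for illustration, not a recommendation.

```python
import numpy as np

rng = np.random.default_rng(42)
n, p_true, c = 412, 3 / 412, 2       # sample size, assumed true proportion, added count
x = rng.binomial(n, p_true, size=100_000)

p_raw = x / n                        # unbiased MLE
p_adj = (x + c) / (n + 2 * c)        # "add something" shrinkage estimator

for name, est in [("raw", p_raw), ("adjusted", p_adj)]:
    bias = est.mean() - p_true
    var = est.var()
    mse = ((est - p_true) ** 2).mean()
    print(f"{name:9s} bias={bias:+.2e}  var={var:.2e}  mse={mse:.2e}")
```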
If you perform inference, simulation studies show that logistic regression is well behaved in terms of power and type I error control even when the sample size is small. So you can test differences among Species 1, 2, 3, and 4: while some cell counts are small, the differences in proportions are striking and the overall $N$ is large for each species, so the comparisons are well powered. Alternatively, you can use Fisher's exact test, which even allows cells to be exactly 0, but it is conservative (it does not exactly attain the nominal alpha level) and tends to have lower power than the Pearson chi-square test.
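For a single pairwise comparison, say species 3 versus species 4, both tests are one-liners in scipy. The 2x2 table below is hypothetical apart from the 3/409 split taken from the example.

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

# Hypothetical 2x2 table: rows = species 3 vs species 4,
# columns = events vs non-events (only the 3 / 409 row is from the example).
table = np.array([[40, 160],
                  [3, 409]])

chi2, p_chi2, dof, _ = chi2_contingency(table)
odds_ratio, p_fisher = fisher_exact(table)

print(f"Pearson chi-square: chi2={chi2:.2f}, p={p_chi2:.2g}")
print(f"Fisher's exact:     OR={odds_ratio:.1f}, p={p_fisher:.2g}")
```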
In more complicated (non-saturated) logistic models, people often speak of "events per variable": the number of events in the less prevalent outcome category should be at least 10 (or 20, or more) per variable. So 4 adjustment variables would require 40 or 80 "events" (and even more non-events) for adequate estimation. This heuristic rarely works out as neatly as stated, and it's easy to run little simulations tailored to your data structure and assess power and precision that way, as in the sketch below.
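A minimal simulation sketch of that last point: generate data under assumed proportions (only the species-4 rate comes from the text), refit the saturated model repeatedly, and look at the rejection rate for the coefficient of interest. The counts, "true" proportions, and number of replicates are all assumptions for illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Assumed design: per-species sample sizes and "true" event proportions
# (only the species-4 rate, 3/412, comes from the example above).
n_per_species = np.array([300, 250, 200, 412])
p_true = np.array([0.40, 0.34, 0.20, 3 / 412])

def one_replicate():
    """Simulate counts, fit the saturated logit, return the species-4 p-value (or NaN)."""
    events = rng.binomial(n_per_species, p_true)
    y, X = [], []
    for i, (n, e) in enumerate(zip(n_per_species, events)):
        row = [1.0] + [float(j == i) for j in range(1, 4)]  # intercept + dummies; species 1 = reference
        y += [1] * e + [0] * (n - e)
        X += [row] * n
    try:
        fit = sm.Logit(np.array(y), np.array(X)).fit(disp=False)
        return fit.pvalues[3]
    except Exception:              # e.g. perfect separation when a cell has 0 events
        return np.nan

pvals = np.array([one_replicate() for _ in range(500)])
ok = ~np.isnan(pvals)
print("fits that failed (separation etc.):", (~ok).mean())
print("rejection rate at alpha = 0.05:    ", (pvals[ok] < 0.05).mean())
```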