Separate the probabilistic modeling step from the decision step.
First, model and predict the conditional probability for a given instance to belong to the target class. This will likely always be a tiny number (and, per the thread Dikran links to, for highly imbalanced data the imprecision of your model may be the dealbreaking aspect here), but it will be slightly higher for certain instances than for others.
Then decide on your decision or action based on the probabilistic class membership predictions. Especially for highly imbalanced situations, we typically have differential actions, or more than one possible decision. If we are looking for a murderer (the murderer incidence is very low) and have someone whose DNA matches that found at the scene, we do not immediately fire up the electrical chair, or even schedule the trial. Instead, we collect more evidence: does the suspect have a motive, or an alibi? If the predicted probability is essentially zero, we look for other suspects; if it is somewhat higher (maybe there was not much DNA to be found), we collect more evidence; if the probability is overwhelming, we arrest the suspect.
At this point, the costs of wrong decisions are relevant. (I don't like the term "misclassification", because as above, there are often more than two possible actions.)
More information here.