I am trying to check whether a result dataset in Spark is empty or contains data. I have tried the following approaches.

1.

dataset.rdd().isEmpty();

2.

try {
    // Note: head(1) returns an empty array for an empty dataset; it does not throw.
    dataset.head(1);
} catch (Exception e) {
    status = "No data";
}

3.

try {
    dataset.first();
} catch (Exception e) {
    status = "No data";
}

4.

dataset.limit(1).count()>0;

All of these take a long time to complete when the dataset is comparatively large. I need a more efficient solution.

Alper t. Turker
Garry Steve
  • Yes, @philantrovert, but that is what I said in the question description: I have tried all of those, and every one of them takes a lot of time when the dataset is not empty. – Garry Steve Jun 08 '18 at 06:16
  • These are the options you get. If `dataset` has complex wide dependencies, then taking even a single element will be expensive. – Alper t. Turker Jun 08 '18 at 09:35
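
To illustrate the point in the last comment, here is a rough analogy using plain Java streams rather than Spark. It is only a sketch: the class `LazyFirstElementDemo` and its methods are hypothetical names, and `sorted()` merely stands in for a wide dependency such as a shuffle. It counts how many source elements have to be evaluated before the first result can be returned.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

// Analogy only: Java streams, not Spark. sorted() plays the role of a
// wide dependency because it must see every input before emitting output.
public class LazyFirstElementDemo {

    // Element-wise pipeline: fetching the first result stops early.
    static int evaluatedForNarrow(int size) {
        AtomicInteger evaluated = new AtomicInteger();
        IntStream.range(0, size)
                .peek(i -> evaluated.incrementAndGet()) // count evaluated inputs
                .mapToObj(i -> size - i)
                .findFirst();
        return evaluated.get();
    }

    // Pipeline with a sort: the whole input is consumed before the
    // first result becomes available.
    static int evaluatedForWide(int size) {
        AtomicInteger evaluated = new AtomicInteger();
        IntStream.range(0, size)
                .peek(i -> evaluated.incrementAndGet()) // count evaluated inputs
                .mapToObj(i -> size - i)
                .sorted()
                .findFirst();
        return evaluated.get();
    }

    public static void main(String[] args) {
        System.out.println("narrow: " + evaluatedForNarrow(1_000_000));
        System.out.println("wide:   " + evaluatedForWide(1_000_000));
    }
}
```

The element-wise pipeline evaluates a single input, while the sorted one evaluates all of them. In the same way, an emptiness check on a dataset built from joins, aggregations, or sorts may have to run most of the upstream work no matter which of the four variants is used.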

0 Answers