
I have written a Java Spark SQL UDF as below.

import org.apache.spark.sql.api.java.UDF1;

public class LowerCase_UDF implements UDF1<String, String>
{
    @Override
    public String call(String t1) throws Exception
    {
        // Guard against null input to avoid a NullPointerException.
        return t1 == null ? null : t1.toLowerCase();
    }
}

What is the process to register this function in Spark? If I run sqlContext.udf.register("LowerCaseUDF", call), it throws the exception "error: not found: value call".

I have added the generated jar file to the spark-client/lib folder, but it does not seem to work. We want the function to be in Java for certain reasons. Any help on this will be appreciated. Thank you.


1 Answer


To register a UDF in Spark SQL using Java, you can use the following code (DataTypes comes from org.apache.spark.sql.types):

sparkSession.udf().register("lowercase_udf", new LowerCase_UDF(), DataTypes.StringType);

And then you can use it like this:

dataset.withColumn("lower", functions.callUDF("lowercase_udf", functions.col("value")));

This will give you output something like this:

+--------+-------+
|value   |lower  |
+--------+-------+
|Michael |michael|
|Andy    |andy   |
|Justin  |justin |
+--------+-------+
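Putting the two lines together, here is a minimal end-to-end sketch. It assumes Spark 2.x's SparkSession running in local mode; the class name LowerCaseExample and the sample values are illustrative, not from the question:

```java
import java.util.Arrays;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.types.DataTypes;

public class LowerCaseExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("LowerCaseUDFExample")
                .master("local[*]")   // local mode, for testing only
                .getOrCreate();

        // Register the UDF instance along with its return type.
        spark.udf().register("lowercase_udf", new LowerCase_UDF(), DataTypes.StringType);

        // Small in-memory dataset standing in for real data.
        Dataset<Row> df = spark
                .createDataset(Arrays.asList("Michael", "Andy", "Justin"), Encoders.STRING())
                .toDF("value");

        // Apply the registered UDF to the "value" column.
        df.withColumn("lower", functions.callUDF("lowercase_udf", functions.col("value")))
          .show(false);

        spark.stop();
    }
}
```

Note the key point for the error in the question: register expects a UDF1 instance (new LowerCase_UDF()), not the bare method name call, which is why sqlContext.udf.register("LowerCaseUDF", call) fails with "not found: value call". On Spark 1.x you would call sqlContext.udf().register(...) with the same three arguments.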

I hope it helps!
