
The ALSModel object doesn't expose the maxIter and regParam parameters, although both exist on the ALS estimator. So when I run a grid search and get the best model, I can't tell which values of these two parameters were optimal.


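from pyspark.ml.recommendation import ALS
from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

als = ALS(userCol="userId", itemCol="movieId", ratingCol="rating",
          coldStartStrategy="drop")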



paramGrid = (ParamGridBuilder()
             .addGrid(als.regParam, [0.01, 0.1])
             .addGrid(als.rank, [5, 10])
             .addGrid(als.maxIter, [5, 10])
             .build())

evaluator = RegressionEvaluator(metricName="rmse", labelCol="rating",
                                predictionCol="prediction")


cv = CrossValidator(estimator=als, estimatorParamMaps=paramGrid, evaluator=evaluator, numFolds=5)


cvModel = cv.fit(training)

finalModel = cvModel.bestModel  # an ALSModel; maxIter and regParam are missing from its params
finalModel.extractParamMap()

Any idea how to find these values without doing grid search explicitly?

  • The PySpark API is (and always has been) incomplete; see https://stackoverflow.com/questions/36697304/how-to-extract-model-hyper-parameters-from-spark-ml-in-pyspark for related information on how to extract those values. – Gaarv Sep 30 '21 at 08:15
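
One possible workaround, sketched here under assumptions rather than taken from the linked thread: CrossValidatorModel keeps the cross-validated score of every grid point in avgMetrics, in the same order as the param maps returned by cv.getEstimatorParamMaps(), so the winning combination can be looked up by index. The helper name best_param_map is hypothetical, and the demo numbers below are made up purely to show the call shape.

```python
def best_param_map(param_maps, avg_metrics, larger_is_better=False):
    """Return (params, metric) for the best-scoring grid point.

    param_maps  -- list of dicts, e.g. cv.getEstimatorParamMaps()
    avg_metrics -- list of floats in the same order, e.g. cvModel.avgMetrics
    larger_is_better -- False for error metrics such as RMSE
    """
    if len(param_maps) != len(avg_metrics):
        raise ValueError("param_maps and avg_metrics must have the same length")
    if not avg_metrics:
        raise ValueError("no grid points to choose from")

    pick = max if larger_is_better else min
    best_idx = pick(range(len(avg_metrics)), key=lambda i: avg_metrics[i])

    # Keys are pyspark Param objects (which carry a .name) in real use;
    # plain string keys are passed through unchanged.
    params = {getattr(p, "name", p): v for p, v in param_maps[best_idx].items()}
    return params, avg_metrics[best_idx]


# Assumed usage with the objects from the question:
#   best_params, best_rmse = best_param_map(
#       cv.getEstimatorParamMaps(),
#       cvModel.avgMetrics,
#       evaluator.isLargerBetter(),
#   )

# Stand-in data (made-up scores, not real results) to show the behaviour.
demo_grid = [
    {"regParam": 0.01, "rank": 5, "maxIter": 5},
    {"regParam": 0.1, "rank": 10, "maxIter": 10},
    {"regParam": 0.1, "rank": 5, "maxIter": 10},
]
demo_rmse = [0.95, 0.91, 0.93]
demo_params, demo_score = best_param_map(demo_grid, demo_rmse)
print(demo_params, demo_score)
```

Passing evaluator.isLargerBetter() instead of hard-coding the direction keeps the lookup correct if the metric is later switched from RMSE to something like r2.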
