
Here is the documentation for the Pipeline constructor from the scikit-learn website:

Sequentially apply a list of transforms and a final estimator. Intermediate steps of the pipeline must be ‘transforms’, that is, they must implement fit and transform methods. The final estimator only needs to implement fit. The transformers in the pipeline can be cached using memory argument.

I do not understand why the final estimator isn't used to transform the input.

1 Answer


The final estimator (i.e. the final step in the pipeline) may be a transformer, but it does not have to be. It could be a classifier instead. Hence the final step may transform the input, but it may just as well predict from it, which is why the pipeline only requires it to implement fit.
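To make this concrete, here is a minimal sketch of a pipeline whose final step is a classifier. The toy data and the step names ("scale", "clf") are made up for illustration; StandardScaler and LogisticRegression are just one possible choice of steps.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data, made up for illustration: 6 samples, 2 features, 2 classes.
X = np.array([[0.0, 1.0], [1.0, 0.0], [0.2, 0.9],
              [0.9, 0.1], [0.1, 1.1], [1.1, 0.2]])
y = np.array([0, 1, 0, 1, 0, 1])

clf_pipe = Pipeline([
    ("scale", StandardScaler()),     # intermediate step: needs fit and transform
    ("clf", LogisticRegression()),   # final step: only needs fit
])

# fit() transforms X through the scaler, then fits the classifier on the result.
clf_pipe.fit(X, y)

# predict() applies the scaler's transform, then the classifier's predict.
predictions = clf_pipe.predict(X)

# The pipeline exposes transform only if its final step has one.
can_transform = hasattr(clf_pipe, "transform")
```

Here the final step consumes the scaled data and produces class labels, so there is nothing for it to "transform".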

If the last step is a transformer, the pipeline does transform the input with it: pipelines support fit_transform (and transform) in that case. See the referenced documentation and linked User Guide for more details.
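As a rough sketch of the transformer case, the pipeline below ends in PCA, so fit_transform is available and runs every step including the last one. The data and step names are again made up for illustration, and PCA is only an example of a final transformer.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data, made up for illustration: 6 samples, 2 features.
X = np.array([[0.0, 1.0], [1.0, 0.0], [0.2, 0.9],
              [0.9, 0.1], [0.1, 1.1], [1.1, 0.2]])

tf_pipe = Pipeline([
    ("scale", StandardScaler()),    # intermediate transformer
    ("pca", PCA(n_components=1)),   # final step is also a transformer
])

# fit_transform() fits and applies every step, including the final one.
X_reduced = tf_pipe.fit_transform(X)

# Equivalent manual chaining of the same two steps, for comparison.
X_manual = PCA(n_components=1).fit_transform(StandardScaler().fit_transform(X))
same_result = np.allclose(X_reduced, X_manual)
```

So the final estimator is used to transform the input whenever it is able to; the pipeline just does not require that ability.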

steffen
  • 10,367