Pipeline¶

Pipeline allows to combine multiple preprocessors and one estimator, such as a classifier, regressor, or clustering algorithm into one estimator. Pipeline comes in especially handy in combination with cross validation, especially GridSearchCV.

Creation¶

A Pipeline is created by providing all of the necessary preprocessors and an estimator.

Pipeline{T<:Estimator}(prepros::Vector{Tuple{ASCIIString, Preprocessor}}, est::Tuple{ASCIIString, T})

Create a new pipeline object. The preprocessors are provided as a list of tuples. Each tuple contains the name of the preprocessor and the preprocessors object itself.

pipe = Pipeline([("mms", MinMaxScaler()), ("ss", StandardScaler())], ("svc", SVC()))

Functions¶

As the Pipeline is an Estimator the common functions for estimators all work on Pipeline as well.

fit!(pipe::Pipeline, X::Matrix{Float64}, y::Vector): Fit the pipeline for regression or classification.

fit!{T<:Cluster}(pipe::Pipeline{T}, X::Matrix{Float64}): Fit the pipeline with an unsupervised estimator, such as a Kmeans.

predict(pipe::Pipeline, X::Matrix{Float64})¶: Preprocess the data and predict values using the estimator’s predict method.

score{T<:AbstractFloat, S<:Union{Classifier, Regressor}}(pipe::Pipeline{S}, X::Matrix{T}, y_true::Vector; scoring::Union{Function, Void}=nothing): Scoring for classification and regression.

score{T<:AbstractFloat, S<:Cluster}(pipe::Pipeline{S}, X::Matrix{T}; scoring::Union{Function, Void}=nothing): Scoring for unsupservised learning.

Example¶

Create a pipeline with a MinMaxScaler, followed by a StandardScaler, and finally an SVC.

pipe = Pipeline([("mms", MinMaxScaler()), ("ss", StandardScaler())], ("svc", SVC()))
fit!(pipe, X, y)
y_pred = predict(pipe, X)
@test_approx_eq f1_score(y, y_pred)["1"] 1.0
@test f1_score(y, y_pred)["0"] == 1.0

Pipeline¶

Creation¶

Functions¶

Example¶

Table Of Contents

Related Topics

This Page