Databricks Certified Machine Learning Professional - 340 Questions

Free Preview

Question 1 of 5

A data scientist is building a SparkML pipeline that includes StringIndexer, OneHotEncoder, VectorAssembler, and LogisticRegression. The pipeline must be reusable for both training and scoring. Which approach correctly constructs this pipeline?

Create a Pipeline with stages=[indexer, encoder, assembler, lr] and call pipeline.fit(train_df) to produce a PipelineModel

Create a Pipeline and call pipeline.addStage(indexer).addStage(encoder).addStage(assembler).addStage(lr) then fit()

Use Pipeline.create(indexer, encoder, assembler, lr).fit(train_df) to build the pipeline

Fit each stage individually and combine the fitted models into a PipelineModel manually

