The `datarobot` package is now dependent on R >= 3.5.
The `datarobot` package now covers the entirety of the DataRobot Public API. It is now dependent on the `datarobot.apicore` package, which provides auto-generated functions to access the Public API. The `datarobot` package provides a number of "API wrapper functions" around the `apicore` package to make it easier to use.
DataRobot recommends starting with package documentation: try `?datarobot` and `?datarobot.apicore` in your R session. Also, take a look at the vignettes.
New API Functions:
- Generated API wrapper functions are organized into categories based on their tags from the OpenAPI specification, which were themselves redone for the entire DataRobot Public API in v2.27.
- These functions use camel-cased argument names, to be consistent with the rest of the package.
- Most function names follow a `VerbObject` pattern based on the OpenAPI specification.
- Some function names match "legacy" functions that existed in v2.18 of the R Client, if they invoked the same underlying endpoint. For example, the wrapper function is called `GetModel`, not `RetrieveProjectsModels`, since the former is what was implemented in the R client for the endpoint `/projects/{mId}/models/{mId}`.
- Similarly, these functions use the same arguments as the corresponding "legacy" functions, to ensure that DataRobot does not break existing code that calls those functions.
Other New Features:
- The R client (both `datarobot` and `datarobot.apicore` packages) will output a warning when you attempt to access certain resources (projects, models, deployments, etc.) that are deprecated or disabled by the DataRobot platform migration to Python 3.
- Added the helper function `EditConfig` that allows you to interactively modify drconfig.yaml.
- Added the `DownloadDatasetAsCsv` function to retrieve a dataset as CSV using catalogId.
- Added the `GetFeatureDiscoveryRelationships` function to get the feature discovery relationships for a project.
- Added support for comprehensive autopilot: use `mode = AutopilotMode$Comprehensive`.
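As a minimal sketch of the comprehensive autopilot option above: the helper name `StartComprehensiveAutopilot` is hypothetical, and the example assumes that the `AutopilotMode` enum is accessed with `$` (as `BlendMethod$...` is elsewhere in these notes) and that `SetTarget` accepts `project`, `target`, and `mode` arguments. The package is referenced through `datarobot::` so that defining the helper does not itself contact DataRobot.

```r
# Hypothetical helper (not part of the datarobot package): starts Autopilot
# in comprehensive mode on a project that has already been created.
StartComprehensiveAutopilot <- function(project, target) {
  datarobot::SetTarget(
    project = project,
    target = target,
    # Assumption: the enum is a list accessed with `$`.
    mode = datarobot::AutopilotMode$Comprehensive
  )
}

# Usage (requires a configured DataRobot connection and an existing project):
# StartComprehensiveAutopilot(project, target = "SalePrice")
```

Wrapping the call in a function keeps the sketch inert until it is given a real project object.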
Enhancements:
- The function `RequestFeatureImpact` now accepts a `rowCount` argument, which will change the sample size used for Feature Impact calculations.
- The internal helper function `ValidateModel` was renamed to `ValidateAndReturnModel` and now works with model classes from the `apicore` package.
Bugfixes:
- The enum `ModelCapability` has been properly exported.
- Fixed `FullAverageDataset` function in the PartialDependence vignette to ignore `NA` when calculating the `min` and `max` of the data range.
- Fixed `RetrieveAutomatedDocuments` function to accept filename argument that is used to specify where to save the automated document.
- Fixed `datarobot.apicore` file upload functions to properly encode the payload as "multipart".
- Fixed `datarobot.apicore` JSON serialization bugs.
- Fixed some tests exercising the `BuildPath` helper function.
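The `NA` handling mentioned for the PartialDependence vignette can be illustrated in base R. This is a sketch of the general technique only, not the vignette's actual `FullAverageDataset` code; the helper name `MakeFeatureGrid` and the sample values are made up for illustration.

```r
# Hypothetical helper: builds an evenly spaced grid over the observed
# range of a feature, ignoring missing values.
MakeFeatureGrid <- function(x, nPoints = 5) {
  # Without na.rm = TRUE, a single NA would make both bounds NA.
  bounds <- range(x, na.rm = TRUE)
  seq(bounds[1], bounds[2], length.out = nPoints)
}

x <- c(2, NA, 10, 6)
grid <- MakeFeatureGrid(x, nPoints = 5)
print(grid)
```

With `na.rm = TRUE` the grid spans the observed minimum and maximum even when some rows are missing.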
API Changes:
- The helper function `isApicoreModel` no longer checks for `datarobot.apicore::FrozenModelRetrieveResponse` since it no longer exists in the Public API. It was replaced by the class `datarobot.apicore::ModelDetailsResponse`.
- The functions `ListProjects` and `as.data.frame.projectSummaryList` no longer return fields related to recommender models, which were removed in v2.5.0.
- The function `SetTarget` now sets autopilot mode to `Quick` by default. Additionally, when `Quick` is passed, the underlying `/aim` endpoint will no longer be invoked with `Auto`.
- The functions `CreateDatasetsFromHDFS`, `CreateDatasetsVersionsFromHDFS`, and `CreateHdfsProjects` have been deprecated.
Deprecated and Defunct:
- The `quickrun` argument is removed from the function `SetTarget`. Users should set `mode = AutopilotMode$Quick` instead.
- Compliance Documentation has been deprecated in favor of the Automated Documentation API.
- The Transferable Models family of functions (`ListTransferableModels`, `GetTransferableModel`, `RequestTransferableModel`, `DownloadTransferableModel`, `UploadTransferableModel`, `UpdateTransferableModel`, `DeleteTransferableModel`) has been removed. The underlying endpoints -- long deprecated -- were removed from the Public API with the removal of the Standalone Scoring Engine (SSE).
- Removed files (code, tests, doc) representing parts of the Public API not present in v2.27-2.29.
Dependency Changes:
- The `datarobot` package is now dependent on R >= 3.5 due to changes in the updated "Introduction to DataRobot" vignette.
- Added dependency on `AmesHousing` package for updated "Introduction to DataRobot" vignette.
- Removed dependency on `MASS` package.
- Removed dependency on `R6` package; it is already a dependency of `datarobot.apicore` but is not used in `datarobot` itself.
- Client documentation is now explicitly generated with Roxygen2 v7.2.1.
Documentation Changes:
- Package-level documentation for both packages has been updated to explain how to use package options.
- Updated "Introduction to DataRobot" vignette to use Ames, Iowa housing data instead of Boston housing dataset.
- Compressed `extdata/Friedman1.csv` and updated vignettes dependent on that dataset.
- Removed `extdata/anomFrame.csv` as it was unused.
This was a Private Preview release of the R API Client. Earlier preview releases of this package were versioned `v3.0.0` but are subsumed by this one.
New Features:
- Added the ability to retrieve and restore features that have been reduced using the time series feature generation and reduction functionality. Discarded features can be retrieved using `RetrieveDiscardedFeaturesInformation` and restored using `RestoreDiscardedFeatures`.
- Added the ability to create and retrieve DataEngineQueryGenerators and create a Dataset from a DataEngineQueryGenerator for time series data prep.
- Added support to upload a prediction dataset from the AI catalog.
- Added the ability to calculate and retrieve datetime trend plots for datetime-aware models. This includes Accuracy over Time, Forecast vs Actual, and Anomaly over Time.
  - Plots can be calculated using a common `ComputeDatetimeTrendPlots` function.
  - Metadata for plots can be retrieved using the `GetAccuracyOverTimePlotsMetadata`, `GetForecastVsActualPlotsMetadata`, and `GetAnomalyOverTimePlotsMetadata` functions.
  - Plots can be retrieved using the `GetAccuracyOverTimePlot`, `GetForecastVsActualPlot`, and `GetAnomalyOverTimePlot` functions.
  - Preview plots can be retrieved using the `GetAccuracyOverTimePlotPreview`, `GetForecastVsActualPlotPreview`, and `GetAnomalyOverTimePlotPreview` functions.
Enhancements:
- `GetProject` now returns all available output from the API.
- Many functions no longer use the internal function `ApplySchema` to filter a response from the DataRobot API to a given schema. This ensures that we return all information the API provides whenever necessary. Only a handful of functions, such as `IsBlenderEligible`, still use this function where it is primarily useful.
Bugfixes:
- The function `GetBlenderModelFromJobId` now returns an object with the field `modelId` rather than `id`, consistent with `GetBlenderModel`.
API Changes:
Deprecated and Defunct:
- The deprecated `BlendMethod$FORECAST_DISTANCE` has been removed. Use `BlendMethod$FORECAST_DISTANCE_ENET` instead.
Dependency Changes:
- Client documentation is now explicitly generated with Roxygen2 v7.1.2.
Documentation Changes:
This release brings the R Client to parity with DataRobot API v2.18 (DataRobot 5.2), but also includes a number of features from API v2.19 (DataRobot 5.3) as well as Anomaly Assessment, a DataRobot 7.1 feature.
New Features:
- Residuals Chart data for models can be retrieved using `GetLiftCharts` and `GetAllLiftCharts` functions. This is valid only for regression models that are not time-aware.
- Added "Average by Forecast Distance" blender for time series projects configured with more than one Forecast Distance. The blender blends the selected models, selecting the best 3 models based on the backtesting score for each Forecast Distance and averaging their predictions. The new blender method `FORECAST_DISTANCE_AVG` has been added as `BlendMethod$FORECAST_DISTANCE_AVG`.
- Added functions for tracking the health and status of a deployment. `GetDeploymentServiceStats` retrieves metrics that track deployment utilization and performance, while `GetDeploymentAccuracy` retrieves metrics that track the accuracy of a deployment's predictions. `GetDeploymentServiceStatsOverTime` and `GetDeploymentAccuracyOverTime` will track changes to those metrics over a specified time interval.
- `SubmitActuals` can now be used to submit data about actual results from a deployed model, which can be used to calculate accuracy metrics.
- Projects can be cloned using `CloneProject`. The clone will be post-EDA1 and ready for setting targets and modeling options.
- `CreateCalendar` now supports series-specific events via the `multiSeriesIdColumn` argument. An example of a series-specific event: some but not all stores being affected by a holiday.
- `GetDeploymentAssociationId` and `UpdateDeploymentAssociationId` can be used to manage a deployment's association ID for use with `SubmitActuals` and the Deployment Accuracy functions.
- `GetDeploymentSettings` can be used to retrieve any and all settings related to a deployed model. `UpdateDeploymentSettings` will allow you to make piecemeal changes as well. The convenience functions `GetDeploymentDriftTrackingSettings` and `GetDeploymentAssociationId` use these methods internally.
- Time series model exports also support prediction intervals: `RequestTransferableModel` now has a `predictionIntervalsSize` parameter.
- Added support for Anomaly Assessment insight. This insight is available for anomaly detection models in time series unsupervised projects which also support calculation of Shapley values. The following functions are available:
  - `InitializeAnomalyAssessment` initializes an anomaly assessment insight for the specified subset
  - `ListAnomalyAssessmentRecords` retrieves records
  - `GetAnomalyAssessmentExplanations` retrieves shap explanations
  - `GetAnomalyAssessmentPredictionsPreview` retrieves predictions preview
  - `DeleteAnomalyAssessmentRecord` deletes records
Enhancements:
- Monotonic constraints are now supported for OTV projects. To that end, the parameters monotonicIncreasingFeaturelistId and monotonicDecreasingFeaturelistId can be specified in calls to `RequestNewDatetimeModel`.
- You can now get the model associated with a model job by getting the `modelId` field on the `GetModelJob` or `ListModelJobs` response objects.
- Added the new field `recommendedFeaturelistId` to the `Blueprint` response object. If absent, there is no recommended feature list for this blueprint.
- The `Model` S3 class now exposes the `modelNumber` field. This field is also exposed in the responses to `GetFrozenModel`, `GetDatetimeModel`, `GetBlenderModel`, and `GetRatingTableModel`.
- The method `GetModelCapabilities` has been extended to return `supportsCodeGeneration`, `supportsShap`, and other newly-added capabilities. See `ModelCapability` for more details.
- `GetFeatureInfo` will now return descriptive statistics on summarized categorical features in the field `keySummary`.
- `ListDeployments` now supports sorting and searching the results using the new `orderBy` and `search` parameters.
- `GetResidualsChart` and `ListResidualsCharts` are now backwards-compatible with DataRobot 5.2, which does not return rowNumber.
- `GetWordCloud` now includes a `variable` field that represents the source of each ngram, as well as a `class` field that represents values of the target class.
- Performance improvements for `GetPredictions` and `Predict` when retrieving probabilities for large prediction datasets on multiclass projects, e.g. `Predict(irisModel, largeDataset, type = "probability")`.
- Unit tests can now be written against testthat edition 3. This is an opt-in feature; all tests are run against edition 2 by default.
Bugfixes:
- Calls to `ListDeployments` will now return more than 20 deployments when available.
- `ListPrimeModels` now returns an empty data frame when the API returns zero results, consistent with its documentation. Previously it would return an empty list. This response is also classed as `dataRobotPrimeModels`.
- `CreateCalendar` now terminates properly when DataRobot is unable to create the calendar. Previously, it would hang due to the R package not checking for the right error response.
- `formatRFC3339Timestamp` now works for vectors of length > 1.
API Changes:
- The first argument of `CreateCalendar` and `CreateRatingTable` is changed from `file` to `dataSource` to reflect that the functions can process data frames as well as CSV files.
- The helper method `ProjectFromAsyncUrl` is replaced with `ProjectFromJobResponse`; this change allowed us to simplify the package's dependency on httr.
Deprecated and Defunct:
- `BlendMethod$FORECAST_DISTANCE` is deprecated and will be removed in 2.19. Use `BlendMethod$FORECAST_DISTANCE_ENET` instead.
Dependency Changes:
- Client documentation is now explicitly generated with Roxygen2 v7.1.1.
- To support new features, `testthat@>3.0.0` and `devtools@>2.4.0` are now required. The test suites are being updated to meet testthat 3e requirements.
- Removed `Suggests: rex` as it is no longer needed for package development.
Documentation Changes:
- This `NEWS` file was renamed to `NEWS.md` and formatted as Markdown.
- Added unit test guidelines for developers to the README.
- Documentation for `GetBlenderModel` and `GetBlenderModelFromJobId` is now more consistent.
- Parameter documentation for `StartAutopilot` and `SetTarget` is clarified.
- Organized some functions into families for easier reference.
- Tweaked documentation related to predictions and time series projects.
- Fixed some spelling mistakes, typos, and Roxygen syntax errors.
- Removed dependency on V8 package by removing code that used the colormap package. The V8 package was flagged by the CRAN maintainers as not building so this removal was necessary to keep the datarobot package on CRAN.
- Removed `curl` from `Imports` since it was causing a NOTE when `devtools::check_win_devel()` was run.
New Features:
- You can now deploy models via the API! Use `CreateDeployment` to create a deployment against a particular prediction server. Use `ListPredictionServers` to list all the available prediction servers. Use `GetDeployment` and `ListDeployments` to see particular deployments that you have. You can delete a deployment with `DeleteDeployment`.
- The model backing a deployment can be replaced with `ReplaceDeployedModel`. Use `ValidateReplaceDeployedModel` first to test that the deployment replacement is valid, if desired.
- Deployments support drift tracking. Use `GetDeploymentDriftTrackingSettings` to get drift tracking settings for a deployment. You can update the drift tracking using `UpdateDeploymentDriftTrackingSettings`.
- Information on feature clustering and the association strength between pairs of numeric or categorical features is now available with `GetFeatureAssociationMatrix`. Relative pairwise feature association statistics can be retrieved with `GetFeatureAssociationMatrixDetails`.
- Multiple feature type transformations can now be executed in a single batch request using `BatchFeaturesTypeTransform`.
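The deployment workflow above might be sketched as follows. The helper name `DeployToFirstServer` is hypothetical, and the argument names `label` and `defaultPredictionServerId` are assumptions that are not confirmed by these notes; check `?CreateDeployment` before relying on them.

```r
# Hypothetical helper: deploys `model` against the first prediction server
# returned for the account.
DeployToFirstServer <- function(model, label) {
  servers <- datarobot::ListPredictionServers()
  if (length(servers) == 0) {
    stop("No prediction servers are available for this account.")
  }
  datarobot::CreateDeployment(
    model,
    label = label,
    # Assumption: the server (or its ID) is passed under this argument name.
    defaultPredictionServerId = servers[[1]]
  )
}

# Usage (requires a configured DataRobot connection and a trained model):
# deployment <- DeployToFirstServer(model, label = "My first deployment")
```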
Enhancements:
- You can now use `doNotDerive` in the `featureSettings` of `CreateDatetimePartition` to disable DataRobot's automatic time series feature engineering for a particular feature (e.g., so you can derive lags yourself manually).
- Users can now embed DataRobot-generated content in compliance doc templates (see `UploadComplianceDocTemplate`) using keyword tags.
- Prediction intervals are now supported for start-end retrained models in a time series project.
- Previously, all backtests had to be run before prediction intervals for a time series project could be requested with predictions. Now, backtests will be computed automatically if needed when prediction intervals are requested.
Bugfixes:
- Calls to `GetPredictionExplanationsRowsAsDataFrame` previously did not work with numeric labels. This has been fixed.
API Changes:
Deprecated and Defunct:
- The `defaultToAPriori` parameter in `CreateDatetimePartitionSpecification` has been renamed to `defaultToKnownInAdvance`. `defaultToAPriori` has now been fully removed.
- The `aPriori` flag in the `featureSettings` parameter in `CreateDatetimePartitionSpecification` has been renamed to `knownInAdvance`. `aPriori` has now been fully removed.
- The deprecated `SetupProjectFromMySQL`, `SetupProjectFromOracle`, and `SetupProjectFromPostgreSQL` have now been removed. Use `SetupProjectFromDataSource` instead.
- `GetTransferrableModel`, `ListTransferrableModels`, `UpdateTransferrableModel`, `DeleteTransferrableModel`, `DownloadTransferrableModel`, and `UploadTransferrableModel` have been removed and replaced with their correctly spelled counterparts (`GetTransferableModel`, `ListTransferableModels`, `UpdateTransferableModel`, `DeleteTransferableModel`, `DownloadTransferableModel`, and `UploadTransferableModel`).
Dependency Changes:
Documentation Changes:
New Features:
- You can now retrieve series accuracy information, showing accuracy metrics for each series for a multiseries project. Use `GetSeriesAccuracy` to retrieve the accuracy. You can also download it as a CSV with `DownloadSeriesAccuracy`.
Enhancements:
- Prediction intervals can now be returned for predictions with datetime models. Use `includePredictionIntervals = TRUE` in calls to `Predict`. For each model, prediction intervals estimate the range of values DataRobot expects actual values of the target to fall within. They are similar to a confidence interval of a prediction, but are based on the residual errors measured during the backtesting for the selected model.
- `ListPredictions` now returns metadata on prediction intervals. `includesPredictionIntervals` is `TRUE` if there are prediction intervals in the predictions and `FALSE` otherwise. `predictionIntervals` specifies the size (in percent) of intervals or is `NULL` if there are no intervals.
- For time series projects, the effective Feature Derivation Window, specifying the full span of historical data required at predict time, is now available through the API. It may be longer than the feature derivation window of the project depending on the differencing settings used.
- More of the project partitioning settings are also available in the metadata for datetime models (see `GetDatetimeModel`). The new attributes are `effectiveFeatureDerivationWindowStart`, `effectiveFeatureDerivationWindowEnd`, `forecastWindowStart`, `forecastWindowEnd`, and `windowsBasisUnit`.
- `DownloadComplianceDocumentation` and `GetSeriesAccuracy` now support a `maxWait` parameter to customize the amount of time to wait before raising a timeout error.
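To make the idea of residual-based prediction intervals concrete, here is a small base R sketch. It illustrates the general technique of building an interval from held-out residuals; it is not DataRobot's implementation, and the helper name `ResidualInterval` and the residual values are invented for illustration.

```r
# Illustrative only: empirical intervals from held-out (backtest) residuals.
# This is NOT DataRobot's algorithm.
ResidualInterval <- function(predictions, residuals, sizePct = 80) {
  # Split the excluded probability mass evenly between the two tails.
  tailProb <- (1 - sizePct / 100) / 2
  offsets <- stats::quantile(residuals,
                             probs = c(tailProb, 1 - tailProb),
                             names = FALSE)
  data.frame(
    prediction = predictions,
    lower = predictions + offsets[1],
    upper = predictions + offsets[2]
  )
}

# Made-up residuals (actual minus predicted) from a hypothetical backtest.
backtestResiduals <- c(-2, -1, 0, 1, 2)
intervals <- ResidualInterval(c(10, 20), backtestResiduals, sizePct = 80)
print(intervals)
```

With the client itself, the notes above indicate that intervals are requested by passing `includePredictionIntervals = TRUE` to `Predict`.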
Deprecated and Defunct:
- `RecommendedModelType$Recommended` type for `GetModelRecommendation` and `GetRecommendedModel` has been removed and replaced with `RecommendedModelType$RecommendedForDeployment`.
Documentation Changes:
- Fixed more spelling mistakes in the documentation.
New Features:
- Advanced tuning can now be done on any model. See `StartTuningSession` for details.
- DataRobot time series now supports calendar files, which allow specifying special events like holidays. See `CreateCalendar`, `GetCalendar`, `ListCalendars`, `UpdateCalendar`, and `DeleteCalendar`.
- Projects now can be shared with other users. See `Share` for details.
- Calendars can be shared with other users. See `Share` for details.
Enhancements:
- `UploadPredictionDataset` and `UploadPredictionDatasetFromDataSource` will now return `dataQualityWarnings` that mention any potential problems with the uploaded dataset.
- `UploadPredictionDataset` and `UploadPredictionDatasetFromDataSource` now have a parameter `relaxKIAFeaturesCheck`. If `TRUE`, uploaded datasets for time series projects will allow missing values for the Known in Advance features in the forecast window at prediction time.
- ROC Curve information retrieval has been extended to contain four new fields (`fractionPredictedAsPositive`, `fractionPredictedAsNegative`, `liftPositive`, and `liftNegative`) with cumulative gains and lift data.
- Added Forecast Distance blender for time series projects configured with more than one Forecast Distance. It blends the selected models creating separate linear models for each Forecast Distance.
- Added a `filter` option to `ListProjects` that supports filtering retrieval of project lists by name using the `projectName` filter.
- `GetCalendarFromProject` can be used to get the calendar associated with a project.
- Data source objects can now be used in `StartProject` to quickly create a project from a data source.
- Data source objects can now be used in addition to data source IDs in `SetupProjectFromDataSource`.
- `crossSeriesGroupByColumns` has been added to datetime partitioning to allow users the ability to indicate how to further split series into related groups.
- The prediction explanations workflow is now ~3x faster for most use cases.
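The `projectName` filter on `ListProjects` could be used roughly as below. The helper name `FindProjectsByName` is hypothetical, and passing the filter as a named list is an assumption about the argument's shape; see `?ListProjects` for the authoritative form.

```r
# Hypothetical helper: lists projects whose names match `nameFragment`.
FindProjectsByName <- function(nameFragment) {
  # Assumption: `filter` takes a named list keyed by `projectName`.
  datarobot::ListProjects(filter = list(projectName = nameFragment))
}

# Usage (requires a configured DataRobot connection):
# timeSeriesProjects <- FindProjectsByName("TimeSeries")
```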
Bugfixes:
- Time series `windowBasisUnit` has been renamed to the correct `windowsBasisUnit`.
Deprecated and Defunct:
- Reason codes have been renamed to Prediction Explanations to provide increased clarity and accessibility. `DeleteReasonCodes`, `DeleteReasonCodesInitialization`, `DownloadReasonCodes`, `GetAllReasonCodesRowsAsDataFrame`, `GetReasonCodesInitialization`, `GetReasonCodesInitializationFromJobId`, `GetReasonCodesMetadata`, `GetReasonCodesMetadataFromJobId`, `GetReasonCodesRows`, `ListReasonCodesMetadata`, `RequestReasonCodes`, and `RequestReasonCodesInitialization` have all been removed and replaced with appropriately renamed functions and a new workflow. See `GetPredictionExplanations` for more.
- `SetupProjectFromMySQL`, `SetupProjectFromOracle`, `SetupProjectFromPostgreSQL`, and `SetupProjectFromHDFS` are now deprecated. They will be removed in v2.17. Use `SetupProjectFromDataSource` instead.
- `RequestPredictionsForDataset` has been renamed to `RequestPredictions`. The original `RequestPredictionsForDataset` has been removed.
- `GetDatetimeModelObject` has been renamed to `GetDatetimeModel`. The original `GetDatetimeModelObject` has been removed.
- The `defaultToAPriori` parameter in `CreateDatetimePartitionSpecification` has been renamed to `defaultToKnownInAdvance`. `defaultToAPriori` is now removed.
- The `aPriori` flag in the `featureSettings` parameter in `CreateDatetimePartitionSpecification` has been renamed to `knownInAdvance`. `aPriori` is now removed.
- `GetTransferrableModel`, `ListTransferrableModels`, `UpdateTransferrableModel`, `DeleteTransferrableModel`, `DownloadTransferrableModel`, and `UploadTransferrableModel` have all been deprecated and replaced with their correctly spelled counterparts (`GetTransferableModel`, `ListTransferableModels`, `UpdateTransferableModel`, `DeleteTransferableModel`, `DownloadTransferableModel`, and `UploadTransferableModel`). The misspelled versions will be removed in v2.17.
- Support for numeric modes in `StartAutopilot` has now been fully removed.
Dependency Changes:
- To support new features, `curl` at version 2.7 or higher is now required.
Documentation Changes:
- Fixed more spelling mistakes in the documentation.
Enhancements:
- Training predictions for multiseries projects will now return the `SeriesID`, `forecastPoint`, and `forecastDistance`.
- `GetDatetimePartition` now returns `isCrossSeries` to indicate whether the datetime partition uses cross-series features.
- `ScoreBacktests` now accepts a parameter `wait = TRUE` to wait for job completion.
- `Predict` and `GetPredictions` no longer return `positiveProbability` for non-binary problems.
- `Predict` and `GetPredictions` no longer return `seriesId` for non-multiseries problems.
Documentation Changes:
- Fixed a typo in how `knownInAdvance` was defined in the `featureSettings` in the "time series" vignette.
Bugfixes:
- Requesting a multiseries project now will work even if the Series ID cannot be automatically inferred by DataRobot.
New Features:
- `DownloadComplianceDocumentation` can be used to download compliance documentation. Compliance documentation also can be created with default or custom templates: use `GetComplianceDocTemplate` to get particular templates and `UploadComplianceDocTemplate` to use your own. See the vignette on "Compliance Documentation" for more information.
- Data sources and data stores can now be shared with other users. Use `Share` to share a data source or data store. Use `ListSharingAccess` to see current access rights. Use `UpdateAccess` for more complex access right modification operations.
- Multiseries projects can now include derived cross series features. Use `useCrossSeries = TRUE` in `CreateDatetimePartitionSpecification` to enable.
- You can now get a feature histogram (a histogram of feature counts and target distribution over bins of values for a particular feature) using `GetFeatureHistogram`.
- Get supported capabilities for a model using `GetModelCapabilities`.
Enhancements:
- Data sources and data stores can be passed into functions directly in addition to being passed as IDs.
- Binary classification for time series is now supported as a project type.
- Calls to `StartProject` and `UpdateProject` that set the worker count can now set it to `"max"`, which uses the maximum available number of workers.
- `fallbackToParentInsights` is now available as a parameter on all insights functions (`GetRocCurve`, `ListRocCurves`, `GetLiftChart`, `ListLiftCharts`, `GetConfusionChart`, `ListConfusionCharts`). When `TRUE`, a frozen model with missing insights will attempt to retrieve the missing insight data from its parent model.
- Time series partitions can now have the forecast window and feature derivation windows defined in a number of rows by using `windowBasisUnit` and setting it to `"ROW"`.
- Time series partitions can now be defined in millisecond intervals.
- Training predictions for datetime partitioned projects now support the new data subset `DataSubset$AllBacktests` for requesting the predictions for all backtest validation folds.
- Training predictions for datetime partitioned projects now return the relevant timestamp associated with the prediction.
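As a minimal sketch of the `fallbackToParentInsights` parameter described above: the helper name `GetRocCurveWithFallback` is hypothetical, and the example assumes that `GetRocCurve` takes the model as its first argument.

```r
# Hypothetical helper: retrieves the ROC curve for a (possibly frozen) model,
# falling back to the parent model's insight data when it is missing.
GetRocCurveWithFallback <- function(model) {
  datarobot::GetRocCurve(model, fallbackToParentInsights = TRUE)
}

# Usage (requires a configured DataRobot connection and a binary
# classification model):
# roc <- GetRocCurveWithFallback(frozenModel)
```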
Bugfixes:
- In cases where a request would return hundreds of responses, sometimes not all those responses would be returned due to improper pagination. This has now been fixed.
- Variables can now be correctly used as tuning parameters for `StartTuningSession`.
- If you use `StartProject` without defining a project name, it now correctly uses the name of the variable passed (like `SetupProject`) rather than erroneously just calling it "dataSource".
Deprecated and Defunct:
- `GetAllLiftCharts` has now been removed (use `ListLiftCharts` instead).
- `GetModelJobs` has now been removed (use `ListModelJobs` instead).
- `GetProjectList` has now been removed (use `ListProjects` instead).
- `GetAllRocCurves` has now been removed (use `ListRocCurves` instead).
- `RecommendedModelType$Recommended` type for `GetModelRecommendation` and `GetRecommendedModel` has been deprecated and replaced with `RecommendedModelType$RecommendedForDeployment`. It will be removed in v2.16.
- `PeriodicityTimeUnits` has been renamed to `TimeUnits`. `PeriodicityTimeUnits` still exists for backwards compatibility.
Documentation Changes:
- Added more documentation for various enums.
- The use of `Predict` in the "Introduction to DataRobot" vignette was previously inaccurate. It has been fixed.
- `RecommendedModelType` and `GetModelRecommendation` now have more documentation about the model recommendation process.
- Vignettes have been updated to use `StartProject` throughout.
- The intro vignette has been cleaned up and now has an example of using feature impact.
- Data used in vignettes has now been added to package data. Broken references to data have been fixed.
New Features:
- An API for advanced tuning is now available, which allows you to manually set model parameters and override the DataRobot default selections. You can get information on available model hyperparameters via `GetTuningParameters` and start a tuning session with `StartTuningSession`. You can interactively iterate through all the parameters for a model using `RunInteractiveTuning`. These advanced tuning features are currently generally available for Eureqa models. To use this feature with other model types, contact your CFDS for more information.
- `Predict` can now be used to create predictions directly from a model and a test dataset, bypassing the need to `UploadPredictionDataset`, `RequestPredictions`, and `GetPredictions`.
- `ListPredictions` can be used to summarize all the different predictions available for a particular project, model, and/or prediction dataset.
- `GetPredictionExplanations` now supports a single workflow to get prediction explanations (previously called reason codes, see "Deprecated and Defunct") for a model and a dataset without needing all the various intermediary steps.
- `GetPredictionDataset` can be used to get metadata on a particular prediction dataset.
- The prediction threshold for binary classification models can now be changed via `SetPredictionThreshold`.
- `GetFeatureImpact` works like `GetFeatureImpactForModel`, but will also request the feature impact if it has not already been requested.
- `GetTrainingPredictionsForModel` retrieves training predictions for a given model object, requesting them in the process.
- Models can now be starred, which highlights them. `StarModel` will star a model. `UnstarModel` will unstar it. `ToggleStarForModel` will toggle the star status. `ListStarredModels` will list all the starred models for a particular project. Model objects will also have an `isStarred` parameter returned to tell whether they are starred or not. (All models are unstarred by default.)
- `DeleteFeaturelist` and `DeleteModelingFeaturelist` can now be used to delete featurelists and modeling featurelists respectively.
- `UpdateFeaturelist` can be used to change the name and description of a featurelist. `UpdateModelingFeaturelist` works for modeling featurelists as well.
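The difference between the multi-step prediction workflow and the newer `Predict` shortcut can be sketched side by side. Both helper names are hypothetical, and the positional arguments shown (project, model ID, dataset ID, prediction job ID) are assumptions about the signatures rather than something these notes confirm.

```r
# Hypothetical helper: the multi-step workflow that `Predict` replaces.
PredictStepByStep <- function(project, model, newData) {
  # 1. Upload the data to score.
  dataset <- datarobot::UploadPredictionDataset(project, newData)
  # 2. Queue a prediction job for the chosen model.
  predictJobId <- datarobot::RequestPredictions(project, model$modelId, dataset$id)
  # 3. Wait for the job and fetch its results.
  datarobot::GetPredictions(project, predictJobId)
}

# Hypothetical helper: the single-call equivalent.
PredictDirectly <- function(model, newData) {
  datarobot::Predict(model, newData)
}

# Usage (requires a configured DataRobot connection and a trained model):
# predictions <- PredictDirectly(model, testData)
```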
Enhancements:
- Using `type = "raw"` in `GetPredictions` (or `Predict`) will return the raw dataframe of predictions metadata.
- `ListModels` now can take an `orderBy` parameter to sort the output list by `metric` or `samplePct`.
- `ListModels` now can take a `filter` parameter to filter output by `samplePct`, `name`, and/or `isStarred`.
- `StartProject` and `SetupProject` can now work without a `projectName`.
- `StartProject` can now take a `workerCount` parameter to set the worker count for the project.
- `StartProject` can now take `wait = TRUE` to automatically wait for the autopilot to complete (thus making an explicit call to `WaitForAutopilot` unnecessary).
- `ProjectStage` can now be used to get a list of the available project stages.
- It is now no longer necessary to call `RequestMultiSeriesDetection` manually for a multiseries project.
- Feature impact now returns not only the impact score for the features but also whether they were detected to be redundant with other high-impact features.
- Jobs now report a parameter called `isBlocked` that specifies whether a job is blocked from execution because one or more dependencies have not yet been met.
- `ListModelJobs` now returns the `trainingRowCount` key.
- `GetPredictions` now can get predictions using a projectId and a predictionId (see `ListPredictions`) in addition to its prior ability to retrieve predictions using a `predictionJobId`.
- Featurelists (see `GetFeaturelist` and `GetModelingFeaturelist`) now return a `created` value with the timestamp, an `isUserCreated` value explaining whether or not the feature was created by a user (as opposed to DataRobot automation), `numModels` showing how many models use the featurelist, and `description` which gives a text description of the featurelist.
- Prediction datasets are now `dataRobotPredictionDataset` class in addition to being `list` class.
- `GetDatetimePartition` now reports information on the number of "known in advance" features.
- `GetDatetimePartition` reports if the partition was drawn from a time series project and/or a multiseries project.
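Several of the `StartProject` and `ListModels` enhancements above can be combined in one call sequence. This is a sketch under stated assumptions: the helper name `RunProjectAndRankModels` is hypothetical, the `dataSource` and `target` argument names are assumed, and the string form `orderBy = "metric"` is a guess at how the sort key is spelled.

```r
# Hypothetical helper: creates a project, waits for Autopilot to finish,
# and returns the models sorted by the project metric.
RunProjectAndRankModels <- function(trainingData, target) {
  project <- datarobot::StartProject(
    dataSource = trainingData,  # no projectName is given; the notes say it is optional
    target = target,
    workerCount = "max",        # use the maximum available number of workers
    wait = TRUE                 # block until Autopilot completes
  )
  # Assumption: the sort key is passed as a string.
  datarobot::ListModels(project, orderBy = "metric")
}

# Usage (requires a configured DataRobot connection):
# rankedModels <- RunProjectAndRankModels(trainingData, target = "SalePrice")
```

Passing `wait = TRUE` removes the need for a separate `WaitForAutopilot` call, as the notes describe.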
Bugfixes:
- In rare instances, requests would fail due to not being able to properly create authentication headers. This has now been resolved.
API Changes:
- `RequestMultiSeriesDetection` (which no longer needs to be invoked directly) now blocks until the multiseries request is complete and returns details about which multiseries columns were detected.
Deprecated and Defunct:
- Reason codes have been renamed to Prediction Explanations. `DeleteReasonCodes`, `DeleteReasonCodesInitialization`, `DownloadReasonCodes`, `GetAllReasonCodesRowsAsDataFrame`, `GetReasonCodesInitialization`, `GetReasonCodesInitializationFromJobId`, `GetReasonCodesMetadata`, `GetReasonCodesMetadataFromJobId`, `GetReasonCodesRows`, `ListReasonCodesMetadata`, `RequestReasonCodes`, and `RequestReasonCodesInitialization` have all been deprecated (and will be removed in v2.15). These functions have been replaced with appropriately renamed functions and a new workflow. See `GetPredictionExplanations` for more.
- RequestPredictionsForDataset is replaced by RequestPredictions and deprecated (and will be removed in v2.15).
- `GetDatetimeModelObject` has been renamed to `GetDatetimeModel`. The original `GetDatetimeModelObject` has been deprecated and will be removed in v2.15.
Documentation Changes:
- Added a vignette explaining the advanced tuning interfaces.
- Vignettes have been updated to use the new `Predict` workflow.
- The vignette on time series and multiseries has been expanded to include more useful information.
- Vignettes now use `GetRecommendedModel` instead of calculating the best model.
- Corrected various typos in docstrings.
Enhancements:
- `plot` on ListOfModels returns an error if the desired percent passed to `pct` is not found.
Bugfixes:
- Fix `as.data.frame` to handle missing featurelist IDs when `simple = FALSE`.
- Fix `as.data.frame` to handle a list of prediction datasets.
- Fix `as.data.frame` to handle a list of models when `samplePct` is not set.
- Fix `plot` on ListOfModels to work with missing featurelist IDs.
- Fix summary methods to work on zero-length lists.
Deprecated and Defunct:
- Passing a non-logical value to the `simple` parameter in `as.data.frame` now produces an error instead of a warning.
New Features:
- A new shorthand, `StartProject`, combines both `SetupProject` and `SetTarget` into one function.
- A report on how a model handles missing values for all features in the project can now be retrieved using `GetMissingValuesReport`.
- You can now use `UploadPredictionDatasetFromDataSource` to create a prediction dataset from a DataRobot data source (introduced in v2.11).
Enhancements:
- If you have enabled monotonic constraints for your project, you can now disable these constraints for training a new model. You do this by using `RequestNewModel` and passing `""` (an empty string) as the value for each monotonic constraint featurelist you wish to override.
Bugfixes:
- In v2.10 and v2.11, the `AutopilotMode$Quick` mode for `SetTarget` was broken and no longer triggered quick mode (it ran the full autopilot instead). This has been fixed.
Deprecated and Defunct:
- GetProjectList is replaced by ListProjects and deprecated (and will be removed in v2.14).
- GetAllRocCurves is replaced by ListRocCurves and deprecated (and will be removed in v2.14).
- GetAllLiftCharts is replaced by ListLiftCharts and deprecated (and will be removed in v2.14).
- GetModelJobs is replaced by ListModelJobs and deprecated (and will be removed in v2.14).
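Renames like the ones above usually keep the old name working for a few releases while warning callers. The following is a minimal sketch of that pattern in base R. `OldListThings` and `NewListThings` are hypothetical names chosen for illustration, and this is not the package's actual implementation.

```r
# Hypothetical replacement function (stands in for a renamed API call).
NewListThings <- function(n) {
  seq_len(n)
}

# Hypothetical legacy name kept as a thin wrapper around the new function.
OldListThings <- function(n) {
  # .Deprecated() is base R; it raises a warning that names the replacement.
  .Deprecated("NewListThings")
  NewListThings(n)
}

# The old name still returns the same result, but emits a deprecation warning
# (suppressed here so the example runs quietly).
result <- suppressWarnings(OldListThings(3))
```

Code that still calls a deprecated function keeps working until the removal version, so the warning is the cue to switch to the new name.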
Dependency Changes:
- To support new features, `httr` version 1.2.0 or higher is now required.
New Features:
- DataRobot now recommends particular models. `ListModelRecommendations` has been added to get all the model recommendations, `GetModelRecommendation` returns a particular recommendation, and `GetRecommendedModel` returns the model object corresponding to a particular recommendation.
- DataRobot now supports "Database Connectivity", allowing databases to be used as the source of data for projects and prediction datasets. The feature works on top of the JDBC standard, so a variety of databases conforming to that standard are available; a list of databases with tested support for DataRobot is available in the user guide in the web application. See `ListDrivers` and `GetDriver` to get available drivers, `CreateDataStore` to create a data store from a driver, and `CreateDataSource` to create a data source from a data store.
- You can also create a project from a specified data source using `SetupProjectFromDataSource`.
- Time series projects support multiseries as well as single series data. See the vignette on time series for details.
- `GetTimeSeriesFeatureDerivationLog` can now be used to retrieve detailed information on how features were derived for time series projects. `DownloadTimeSeriesFeatureDerivationLog` can download the log to a text document.
Enhancements:
- `GetFeatureInfo` and `ListFeatureInfo` now report `targetLeakage`, specifying whether a feature is considered to have target leakage or not.
- Added a helper method to easily cross validate a model. Just call `CrossValidateModel` on your model object.
- `ConnectToDataRobot` now works with environment variables. Set `DATAROBOT_API_ENDPOINT` and `DATAROBOT_API_TOKEN` to connect to DataRobot. Note that previously the R client unofficially used `DataRobot_URL` and `DataRobot_Token` as environment variables to facilitate connecting to DataRobot, but these variables are no longer supported.
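As a rough illustration of the environment-variable setup described above, the two variables can be set and checked from within an R session using base R only. The endpoint and token values below are placeholders, and `MissingDataRobotEnv` is a hypothetical helper that is not part of the package.

```r
# Placeholder values; substitute your own endpoint and API token.
Sys.setenv(
  DATAROBOT_API_ENDPOINT = "https://app.datarobot.com/api/v2",
  DATAROBOT_API_TOKEN = "your-api-token"
)

# Hypothetical helper: report which of the two variables are still unset.
MissingDataRobotEnv <- function() {
  vars <- c("DATAROBOT_API_ENDPOINT", "DATAROBOT_API_TOKEN")
  vars[Sys.getenv(vars) == ""]
}

# Returns an empty character vector when both variables are set.
MissingDataRobotEnv()
```

In practice these variables are more commonly set outside R (for example in `.Renviron` or the shell profile) so that the token does not appear in scripts.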
Bugfixes:
- Fix `as.data.frame` to handle missing featurelist IDs.
API Changes:
- New parameters predictionsStartDate and predictionsEndDate added to `UploadPredictionDataset` to support bulk predictions upload for time series projects.
Bugfixes:
- The Model Deployment interface which was previously visible in the client has been removed to allow the interface to mature.
- Fix `as.data.frame` to handle multiple featurelists.
- Clarified the time series workflow in the time series vignette.
- Fix `partitionKeyCols` parameter in `CreateGroupPartition` to more clearly error if more than one partition key is passed.
- Formatting within vignettes has been cleaned and standardized.
Deprecated and Defunct:
- The following were deprecated and have now been removed: the `quickrun` parameter on SetTarget, the ability to use `GetFeatureInfo` with feature IDs, the `GetRecommendedBlueprints` function, `GetModelObject`, `GetAllModels`, `GetBlueprintDocuments`, and the `RequestPredictions` function.
- The `defaultToAPriori` parameter in `CreateDatetimePartitionSpecification` is being deprecated and has been renamed to `defaultToKnownInAdvance`. `defaultToAPriori` will be fully removed in v2.15.
- The `aPriori` flag in the `featureSettings` parameter in `CreateDatetimePartitionSpecification` is being deprecated and has been renamed to `knownInAdvance`. The `aPriori` flag will be fully removed in v2.15.
New features:
- Models can now be deployed to dedicated prediction servers using the new model monitoring system via the API. Create a deployment via `RequestModelDeployment`, get information on a specific deployment using `GetModelDeployment`, and list information on all deployments across all projects via `ListModelDeployments`. You can also get more information on the service health of a particular deployment using `GetModelDeploymentServiceStatistics` or get the action log for a deployed model using `GetModelDeploymentActionLog`.
Enhancements:
- DataRobot API now supports creating 3 new blender types: Random Forest, TensorFlow, and LightGBM.
- Multiclass projects now support blender creation for the 3 new blender types as well as Average and ENET blenders.
- New attributes `maxTrainRows`, `scaleoutMaxTrainPct`, and `scaleoutMaxTrainRows` have been added to projects retrieved by `GetProject`. `maxTrainRows` specifies the equivalent value to the existing `maxTrainPct` as a row count. The scaleout fields can be used to see how far scaleout models can be trained on projects, which for projects taking advantage of scalable ingest may exceed the limits on the data available to non-scaleout blueprints.
- Models can be trained by requesting a particular row count using the new `trainingRowCount` argument, specifying a desired number of rows instead of a desired percentage of the dataset (via the current `samplePct` parameter). Specifying model size by row count is recommended when the float precision of `samplePct` could be problematic, e.g. when training on a small percentage of the dataset or when training up to partition boundaries. This new approach is available for `RequestNewModel`, `RequestFrozenModel`, and `RequestSampleSizeUpdate`. `RequestFrozenDatetimeModel` already had this feature.
- `GetPredictions` now returns a more informative error message when the async service times out.
- Individual features can now be marked as a priori or not a priori using the new `featureSettings` attribute when setting the target or specifying datetime partitioning settings on time series projects. Any features not specified in the `featureSettings` parameter will be assigned according to the `defaultToAPriori` value.
- Three new options have been made available in the `DatetimePartitioningSpecification` to fine-tune how time-series projects derive modeling features. `treatAsExponential` can control whether data is analyzed as an exponential trend and transformations like log-transform are applied. `differencingMethod` can control which differencing method to use for stationary data. `periodicities` can be used to specify periodicities occurring within the data. All are optional and defaults will be chosen automatically if they are unspecified.
- An error is now raised if you do not pass a valid partition to partitioning in `SetTarget`.
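To see why a row count can be more precise than a percentage, consider the arithmetic sketch below. It assumes, purely for illustration, that a percentage is carried to only two decimal places. The dataset size and target row count are made-up numbers, and the rounding rule is an assumption rather than documented server behavior.

```r
# Made-up numbers for illustration.
totalRows <- 150000
desiredRows <- 1234

# Express the desired size as a percentage of the dataset.
samplePct <- desiredRows / totalRows * 100

# Assumption: the percentage is only kept to two decimal places.
roundedPct <- round(samplePct, 2)

# Converting the rounded percentage back to rows no longer gives 1234.
rowsFromPct <- round(roundedPct / 100 * totalRows)
rowsFromPct
```

Under these assumptions the round trip yields 1230 rows instead of 1234. A small percentage of a large dataset loses a few rows to rounding, which is the situation where passing a row count directly avoids the ambiguity.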
Bugfixes:
- Fixed latency issues in `UploadPredictionDataset` and `GetPredictions`. These functions have now been fully tested to handle data up to 1 GB, and likely can handle more than that. If you run into issues, try increasing the `maxWait` parameter.
- You can now set `ssl_verify: FALSE` in `drconfig.yaml` to not verify SSL when connecting with DataRobot.
- Fixed a typo in the training predictions vignette. It previously read `DownloadRatingTable` when it meant to read `DownloadTrainingPredictions`.
- Fixed a typo in the reason codes docstring examples. It previously read `reasonCodeId <- GetReasonCodesMetadataFromJobId(projectId, jobId)` when it should read `reasonCodeId <- GetReasonCodesMetadataFromJobId(projectId, jobId)$id`.
API Changes:
- `trainingRowCount` is now available on non-datetime models as well as "rowCount" based datetime models. It reports the number of rows used to train the model (equivalent to `samplePct`).
- Added support for retrieving details about the Pareto Front for a Eureqa model. Use `GetParetoFront` to get the Pareto Front details and `AddEureqaSolution` to add a new solution to the leaderboard.
New features:
- A new premium feature, time series, is now available. New projects can be created as time series projects which automatically derive features from past data and forecast the future. See the time series documentation in the web app for more information.
- The DataRobot API supports the creation, training, and predicting of multiclass classification projects. DataRobot, by default, handles a dataset with a numeric target column as regression. If your data has a numeric cardinality of up to 10 classes, you can override this behavior to instead create a multiclass classification project from the data. To do so, use the `SetTarget` function, setting `targetType = TargetType$Multiclass`. If DataRobot recognizes your data as categorical, and it has up to 10 classes, using multiclass will create a project that classifies which label the data belongs to.
- With the introduction of Multiclass Classification projects, DataRobot needed a better way to explain the performance of a multiclass model, so a new Confusion Chart was created. The API now supports retrieving and interacting with confusion charts.
- `GetFeatureInfo` and `ListFeatureInfo` now return the EDA summary statistics (i.e., mean, median, minimum, maximum, and standard deviation) for features where this is available (e.g., numeric, date, time, currency, and length features). These summary statistics will be formatted in the same format as the data they summarize.
- The DataRobot API now includes Rating Tables. A rating table is an exportable CSV representation of a model. Users can influence predictions by modifying them and creating a new model with the modified table.
- You can now set `scaleoutModelingMode` when setting a project target. It can be used to control whether scaleout models appear in the autopilot and/or available blueprints. Scaleout models are only supported in the Hadoop environment with the corresponding user permission set.
- You can now set `accuracyOptimizedBlueprints` when setting a project target. Accuracy optimized blueprints are longer running model blueprints that provide increased accuracy over the normal blueprints that run during autopilot.
- DataRobot now supports retrieving model blueprint charts via `GetModelBlueprintChart` and model blueprint documentation via `GetModelBlueprintDocumentation`. These are like regular blueprint charts and blueprint documentation, except that they describe model blueprints, which are a reduced representation of the blueprint run by the model that includes only the branches actually executed by the model.
- The DataRobot API now supports generating and retrieving training predictions, which are predictions made by the model on out-of-fold training data. Users can start a job which will make training predictions and retrieve them. See the training predictions documentation in the web app for more information on how to use training predictions.
Enhancements:
- `CreateDatetimePartitionSpecification` now includes the optional `disableHoldout` flag that can be used to disable the holdout fold when creating a project with datetime partitioning.
- The advanced options available when setting the target have been extended to include the new parameters `offset` and `exposure` to allow specifying offset and exposure columns to apply to predictions generated by models within the project. See the user guide documentation in the web app for more information on offset and exposure columns.
- The advanced options available when setting the target have been extended to include the new parameter `eventsCount` to allow specifying the events count column. See the user guide documentation in the web app for more information on events count.
- File URIs can now be used as source data when creating a project or uploading a prediction dataset. The file URI must refer to an allowed location on the server, which is configured as described in the user guide documentation.
- If this package is used in RStudio v1.1 or higher, it is possible to use the RStudio Connections UI to open a DataRobot connection.
- When retrieving reason codes on a project using an exposure column, predictions that are adjusted for exposure can be retrieved.
- `ConnectToDataRobot` now supports an option `sslVerify` that turns off SSL verification if set to FALSE.
Bugfixes:
- Fixes a bug that prevented `GetReasonCodesMetadataFromJobId` from being called with a project directly (instead of a project id).
- Fixes a bug that prevented `RequestNewModel` from being called when `options(stringsAsFactors = TRUE)` is set.
- Fixes a bug that prevented more than one blueprint document from being returned by `GetBlueprintDocuments` (now named `GetBlueprintDocumentation`).
Deprecated and Defunct:
- The quickrun parameter on SetTarget, the ability to use GetFeatureInfo with feature IDs, the `GetRecommendedBlueprints` function, and the `RequestPredictions` function were all originally planned to be deprecated in version 3.0. These features and functions will now be deprecated in v2.10 instead.
- GetModelObject is replaced by GetModel and deprecated (and will be removed in v2.10).
- GetAllModels is replaced by ListModels and deprecated (and will be removed in v2.10).
- `GetBlueprintDocuments` is replaced by `GetBlueprintDocumentation` and deprecated (and will be removed in v2.10).
Documentation Changes:
- The `modelwordcloud` package is now available on CRAN, so the documentation has been updated to reflect CRAN installation instructions.
New features:
- Word cloud data for text processing models can be retrieved using the `GetWordCloud` function.
- Scoring code JAR files can be downloaded for models supporting code generation using the `DownloadScoringCode` function.
- Lift Chart data can be retrieved using the `GetLiftCharts` and `GetAllLiftCharts` functions.
- ROC curve data for binary classification projects can be retrieved using `GetRocCurve` and `GetAllRocCurves`.
- Status and information about individual jobs can be retrieved using the `GetPredictJob`, `GetModelJob`, and `GetJob` functions. Any job can be retrieved via `GetJob`, which is less specific. Only prediction jobs can be retrieved with `GetPredictJob` and only modeling jobs can be retrieved with `GetModelJob`.
Enhancements:
- `GetModelParameters` now includes an additional key showing the coefficients for individual stages of multistage models (e.g. Frequency-Severity models).
- When training a `DatetimeModel` on a window of data, a `timeWindowSamplePct` can be specified to take a uniform random sample of the training data instead of using all data within the window.
Bugfixes:
- Fixed a bug where depending on what version of the R curl library was installed, the client could hang after requesting certain DataRobot jobs.
- DownloadTransferrableModel now correctly handles HTTP errors.
Dependency Changes:
- To support new features, `jsonlite` version 1.0 or higher and `curl` version 1.1 or higher are now required.
Deprecated and Defunct:
- Semi-automatic autopilot mode is removed. Quick or manual mode can be used instead to get a sparser autopilot.
New features:
- Function CreateDerivedFeatureIntAsCategorical has been added. It creates a new categorical feature based on a parent numerical feature while truncating numerical values to integers. (All of the data in the column should be considered categorical in its string form when cast to an int by truncation. For example, the value `3` will be cast as the string `3` and the value `3.14` will also be cast as the string `3`. Further, the value `-3.6` will become the string `-3`. Missing values will still be recognized as missing.)
- Reason Codes, a new feature in DataRobot, is fully supported in the package through several new functions.
- Functions that provide access to blueprint charts and documentation have been added.
- Model parameters can now be retrieved using the GetModelParameters function.
- A new partitioning method (datetime partitioning) has been added. The recommended workflow is to preview the partitioning by creating a `DatetimePartitioningSpecification` using the CreateDatetimePartition and CreateBacktestSpecification functions and passing it into GenerateDatetimePartition, inspect the results and adjust as needed for the specific project dataset by adjusting the `DatetimePartitioningSpecification` and re-generating, and then set the target by passing the final `DatetimePartitioningSpecification` object to the partitioning_method parameter of SetTarget.
Enhancements:
- The default value of the maxWait parameter used to control how long asynchronous routes are polled has been changed from 1 minute to 10 minutes.
API Changes:
- projectId has been added to the Feature schema.
- The UnpauseQueue function will no longer set the autopilot mode of a project to full autopilot. This means that projects using the (deprecated) SemiAuto autopilot mode will require the autopilot to be advanced via the web app.
New features:
- Functions RequestFrozenModel, GetFrozenModel, and GetFrozenModelFromJobId have been added. They allow users to create a model with the same tuning parameters as the parent model but with a different data sample size, and to get information about frozen models in a project.
- Functions RequestBlender, GetBlenderModelFromJobId, and GetBlenderModel have been added. They allow users to create blender models and get information about blender models in a project.
- Projects created via the API can now use smart downsampling when setting the target by passing smartDownsampled and majorityDownsamplingRate into the SetTarget function.
Enhancements:
- Meaningful error messages have been added when the DataRobot endpoint is incorrectly specified in a way that causes redirects (e.g. specifying http for an https endpoint).
- Previously it was not possible to use user partition columns with cross-validation without specifying a holdout level using the API. This can now be done by either omitting the cvHoldoutLevel parameter or providing it as `NA`.
Bugfixes:
API Changes:
Deprecated and Defunct:
- Support for recommender models has been removed from the DataRobot API. The package has been updated to remove functionality that formerly used this feature.
Documentation Changes:
New features:
- The premium feature DataRobot Prime has been added. You can now approximate a model on the leaderboard and download executable code for it. Talk to your account representative if the feature is not available on your account. The new related functions are GetPrimeEligibility, RequestApproximation, ListPrimeModels, GetPrimeModel, GetRulesets, RequestPrimeModel, GetPrimeModelFromJobId, CreatePrimeCode, GetPrimeFileFromJobid, ListPrimeFiles, GetPrimeFile, DownloadPrimeCode
- A utility function, WaitForJobToComplete, has been added. It will block until the specified job finishes, or raise an error if it does not finish within a specified timeout.
- Functions SetupProjectFromMySQL, SetupProjectFromOracle, SetupProjectFromPostgreSQL and SetupProjectFromHDFS have been added. They allow users to create DataRobot projects from MySQL, Oracle, PostgreSQL and HDFS data sources.
- Functions RequestTransferrableModel, DownloadTransferrableModel, UploadTransferrableModel, GetTransferrableModel, ListTransferrableModels, UpdateTransferrableModel, and DeleteTransferrableModel have been added. They allow users to download models from the modeling server and transfer them to a special dedicated prediction server. (These functions are only useful to users with an on-premise environment.)
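The blocking behavior described for WaitForJobToComplete follows a common poll-until-done pattern. The sketch below shows that general pattern in base R. `WaitUntil` and its `isDone` callback are hypothetical stand-ins for a real job-status check, and the package's own implementation may differ.

```r
# Hypothetical polling helper: call isDone() repeatedly until it returns TRUE,
# and raise an error if that does not happen within maxWait seconds.
WaitUntil <- function(isDone, maxWait = 60, interval = 1) {
  deadline <- Sys.time() + maxWait
  while (!isDone()) {
    if (Sys.time() > deadline) {
      stop("Job did not finish within ", maxWait, " seconds")
    }
    Sys.sleep(interval)
  }
  invisible(TRUE)
}

# Usage with a fake job that reports completion on the third status check.
checks <- 0
fakeJobDone <- function() {
  checks <<- checks + 1
  checks >= 3
}
WaitUntil(fakeJobDone, maxWait = 5, interval = 0.01)
```

Raising an error on timeout, rather than returning silently, makes it harder for a script to continue with results from a job that never finished.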
Enhancements:
- An optional maxWait parameter has been added to GetModelFromJobId and GetFeatureImpactForJobId, to allow users to specify an amount of time to wait for the job to complete other than the default 60 seconds.
- Projects can now be run in quickrun mode (which skips some autopilot stages and longer-running models) by passing "quick" as the mode parameter, in the same way "auto" and "manual" modes can be specified.
- The client will now check the API version offered by the server specified in configuration, and issue a warning if the client version is newer than the server version. The DataRobot server is always backwards compatible with old clients, but new clients may have functionality that is not implemented on older server versions. This issue mainly affects users with on-premise deployments of DataRobot.
- SetupProject and UploadPredictionDataset now accept a URL as the dataSource parameter.
Bugfixes:
- If a model job errors, GetModelFromJobId will now immediately raise an exception, rather than waiting for the timeout.
- The maxWait parameter on UploadPredictionDataset will now be correctly applied.
API Changes:
Deprecated and Defunct:
- The quickrun parameter on SetTarget is deprecated (and will be removed in 3.0). Pass "quick" as the mode parameter instead.
Documentation Changes:
Enhancements:
- When project creation using SetupProject times out, the error message now includes a URL to use with the new ProjectFromAsyncUrl function to resume waiting for the project creation.
- GetFeatureInfo now supports retrieving features by feature name. (For backwards compatibility, feature IDs are still supported until 3.0.)
- The package no longer relies on a particular version of the methods package. (This dependency was too strict and required some users to unnecessarily upgrade R.)
- The projectName argument of SetupProject no longer defaults to the string 'None'. (The new default is not to send a name, which results in the name 'Untitled Project'.)
- The maxWait argument for SetupProject now controls the timeout for the initial POST request and has a larger default value. The reason for this is that for large project creation file uploads, the server may take a longer-than-normal amount of time to respond, and waiting longer than the default timeout may be necessary.
Deprecated and Defunct:
- The ability to use GetFeatureInfo with feature IDs is deprecated (and will be removed in 3.0). Use feature names instead.
- GetRecommendedBlueprints is replaced by ListBlueprints and deprecated (and will be removed in 3.0).
- RequestPredictions is deprecated and replaced by RequestPredictionsForDataset. RequestPredictionsForDataset will be renamed to RequestPredictions in 3.0.
- DeletePendingJobs is removed; use DeleteModelJob instead
- GetFeatures is removed; use ListModelFeatures instead
- GetPendingJobs is removed; use GetModelJobs instead
- StartAutopilot is removed; use SetTarget instead
- parameter url is removed from ConnectToDataRobot
- parameter jobStatus is removed from GetModelJobs
- parameters saveFile and csvExtension are removed from RequestPredictions
- parameters saveFile and csvExtension are removed from SetupProject
- "semi" mode option (functions SetTarget, StartNewAutopilot) is deprecated (and will be removed in 3.0).
New features:
- The API now supports the new Feature Impact feature. Use RequestFeatureImpact to start a job to compute FeatureImpact, and GetFeatureImpactForModel or GetFeatureImpactForJobId to retrieve the completed Feature Impact results.
- The new functions CreateDerivedFeatureAsCategorical, CreateDerivedFeatureAsText, CreateDerivedFeatureAsNumeric can be used to create derived features as type transforms of existing features.
- The API now supports uploading (UploadPredictionDataset), listing (ListPredictionDatasets), and deleting (DeletePredictionDataset) datasets for prediction as well as requesting predictions (RequestPredictionsForDataset) against such datasets.
Bugfixes
- as.data.frame fixed for empty listOfBlueprints, listOfFeaturelists, listOfModels
- The documentation for SetTarget incorrectly referred to the 'semiauto' (rather than 'semi') autopilot setting. This is fixed.
- GetPredictions previously used a maxWait of 60, regardless of what maxWait the user specified. This is fixed.
Bugfixes
- GetModelJobFromId was broken by v2.2.32 and is now fixed.
- CreateFeaturelist was broken by v2.2.32 and is now fixed.
API Changes
- Package renamed to `datarobot`.
New features:
- ListJobs and DeleteJob functions added. ListJobs lists the jobs in the project queue (of any type). DeleteJob can be used to cancel one of these jobs.
- ListFeatureInfo (for all features) and GetFeatureInfo (for one feature) have been added for retrieving feature details.
Enhancements:
- In line with new functionality in version 2.2 of the DataRobot API, CreateUserPartition now allows `holdoutLevel` to be NULL (which results in not sending the holdout level, in line with backend API changes to allow user partitions to be created without a holdout level).
- Slices using `[` from objects of type listOfBlueprints, listOfFeaturelists, and listOfModels will now retain the appropriate type.
- Several functions (e.g. ConnectToDataRobot, DeleteModel, PauseQueue, etc.) used to return TRUE as their only possible return value. Now they return nothing instead.
- GetValidMetrics no longer has special-casing for the situation when the project is not yet ready to give you the valid metrics for a potential target. In this case, an error will now be returned from the server.
- Error messages from the server now include additional detail.
- To improve error messages, in several places error messages no longer reference the top-level function the user called.
- The SetTarget function will now properly block execution until the server indicates the project has finished initializing and is ready to build models
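For readers unfamiliar with how slicing can preserve a list's type in R, the sketch below shows one common way to do it with an S3 `[` method. The class name `listOfThings` is hypothetical, and the code only illustrates the general technique, not the package's internal implementation.

```r
# Hypothetical S3 class wrapping a plain list.
things <- structure(
  list(list(id = "a"), list(id = "b"), list(id = "c")),
  class = "listOfThings"
)

# Subset the underlying list, then restore the original class attribute.
`[.listOfThings` <- function(x, i) {
  structure(unclass(x)[i], class = class(x))
}

# The slice keeps the class, so class-specific methods still apply to it.
firstTwo <- things[1:2]
class(firstTwo)
```

Because the slice keeps its class, generic functions such as `summary` or `as.data.frame` continue to dispatch to the class-specific methods.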
Deprecated and Defunct:
- GetFeatures has been deprecated and renamed to ListModelFeatures (for more clarity and consistency in naming, and to avoid confusion with the new GetFeatureInfo and ListFeatureInfo).
- Support for authenticating via username/password has been removed. Use an API Token instead
- Removed broken UpdateDefaultPartition. To use one of the default partition methods with updated settings, please use CreateRandomPartition or CreateStratifiedPartition.
Enhancements
- Use of the WaitForAutopilot function will no longer trigger deprecation warnings
Bugfixes
- Due to a dependency on the methods package (which is loaded by default in interactive sessions but not when running Rscript), RequestPredictions did not work when invoked with Rscript. This is fixed. The methods package is now in 'depends' instead of 'imports' to prevent this problem from ever occurring again.
Deprecated & Defunct
- Removed broken UpdateDefaultPartition. Please use the other partition-creating functions.
Bugfixes
- Due to a dependency on the methods package (which is loaded by default in interactive sessions but not when running Rscript), some functions did not work when invoked with Rscript. This is fixed.
- SetupProject and GetPredictions now check for and display errors in project creation (previously they would keep waiting and time out if there were errors).
- Previously errors would sometimes appear missing a space between two words. This is fixed.
Bugfixes
- Fixed a problem that caused an error when getting predictions if the installed version of the httr package was 1.0 or older.
Enhancements:
- HTTP requests now include User-Agent headers for logging purposes, e.g. "DataRobotRClient/2.0.25 (Darwin 14.5.0 x86_64)".
- We now provide a more informative error message after receiving HTML from the server when we expected JSON.
- We avoid httr encoding warning messages by specifying UTF-8.
- It is now possible to not specify the desired jobStatus in GetPendingJobs (by passing NULL for the jobStatus argument, which is now the default).
- GetPredictions now checks whether a prediction job has errored or been canceled and will error right away in that case (instead of waiting until the timeout)
- When specifying the data source as a dataframe (in RequestPredictions or SetupProject), the class may now be a subclass of dataframe (it need not be equal to dataframe).
- Previously GetModelJobs returned a dataframe when there are jobs but an empty list when there are none. Now it consistently returns a dataframe (with zero rows if there are no jobs) either way.
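A User-Agent string in the format quoted above could be assembled from base R's `Sys.info()`, as in the sketch below. `BuildUserAgent` is a hypothetical helper, and the choice of `Sys.info()` fields is an assumption made to match the example string, not a description of the client's actual code.

```r
# Hypothetical helper: combine a client version with OS name, OS release,
# and machine architecture as reported by Sys.info().
BuildUserAgent <- function(clientVersion) {
  info <- Sys.info()
  sprintf(
    "DataRobotRClient/%s (%s %s %s)",
    clientVersion, info[["sysname"]], info[["release"]], info[["machine"]]
  )
}

# The part in parentheses depends on the machine running the code.
userAgent <- BuildUserAgent("2.0.25")
userAgent
```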
New features:
- ConnectToDataRobot can now read from a YAML config file.
- On package startup, we look for a config file in the default location, so the user does not need to call ConnectToDataRobot explicitly
- WaitForAutopilot function added. This function periodically checks whether Autopilot is finished and returns only after it is.
- SetupProject and RequestPredictions now default to using a tempfile instead of placing the file to be uploaded into the current working directory.
- New function StartNewAutopilot can be used to restart autopilot on a specific featurelist if it was previously running on a different one.
- New function SetTarget provides the functionality that StartAutopilot used to be responsible for. StartAutopilot is now deprecated, and SetTarget should be used instead. This function can now take a featurelistId argument, specifying which featurelist to use.
Bugfixes:
- GetPendingJobs (now deprecated in favor of GetModelJobs) was broken and is now fixed.
- GetValidMetrics was broken and is now fixed.
- GetProjectList no longer errors when there are no projects. It now returns an object whose structure matches the returned object when there are projects.
Deprecated and Defunct:
- The arguments controlling where the tempfile goes (in SetupProject and RequestPredictions) are now deprecated
- DeletePendingJob is deprecated (use DeleteModelJob instead)
- GetPendingJob is deprecated (use GetModelJob instead)
- jobStatus argument to GetModelJob/GetPendingJob is deprecated (use status instead)
- StartAutopilot is deprecated (use SetTarget instead).
API Changes Summary:
- Support for the experimental date partitioning has been removed in DataRobot API, so it is being removed from the client immediately - the CreateDatePartition function has been removed.
Enhancements:
- Codebase cleaned of many lint violations.
New Features:
- DeletePredictJob, GetPredictJobs, GetPredictions, RequestPredictions all added to control the prediction functionality created in v2.0 featureset of the API.
- "quickrun" parameter added to StartAutopilot. This boolean enables use of the quickrun autopilot feature of DataRobot.
Bugfixes: None
Deprecated and Defunct: None
API Changes: None
- fixes the maxWait parameter that was unsuccessfully introduced in 0.2.23
- maxWait parameter added to SetupProject to allow for datasets that take very long to initialize on the DataRobot server
- Documentation structure changed to use Roxygen2