
Loads a large QuartzBio EDP dataset and returns an R data frame containing all records.

Usage

Dataset_load(
  id = NULL,
  full_path = NULL,
  get_schema = FALSE,
  filter_expr = NULL,
  ...
)

Arguments

id

(character) The ID of a QuartzBio EDP dataset.

full_path

(character) A valid dataset full path, including the account, vault, and path to the EDP dataset.

get_schema

(boolean) If TRUE, also retrieves the schema of the loaded QuartzBio EDP dataset. Default value: FALSE

filter_expr

(character) An arrow Expression used to filter the scanned rows, or NULL (default) to keep all rows. See arrow::Scanner()

...

Arguments passed on to arrow::read_parquet

col_select

A character vector of column names to keep, as in the "select" argument to data.table::fread(), or a tidy selection specification of columns, as used in dplyr::select().

as_data_frame

Should the function return a tibble (default) or an Arrow Table?

Value

A tibble (the default), or an Arrow Table if as_data_frame is FALSE. If get_schema is TRUE, the function returns a list containing both the tibble and its schema.
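
Examples

The following is a minimal sketch of typical calls, not output from a real session. It assumes the package that provides Dataset_load() is already attached and that the session is authenticated against QuartzBio EDP. The dataset ID, the full path (including its "account:vault:/path" layout), and the column names are hypothetical placeholders to be replaced with real values.

```r
# Placeholder dataset ID; substitute the ID of a dataset you can access.
dataset_id <- "1234567890"

# Load every record into a tibble (the default return type).
df <- Dataset_load(id = dataset_id)

# Load by full path instead of ID and keep only two columns.
# The path layout and the column names are assumptions for illustration.
# col_select is forwarded to arrow::read_parquet() through `...`.
df_small <- Dataset_load(
  full_path = "my_account:my_vault:/path/to/dataset",
  col_select = c("gene", "score")
)

# Request the schema as well; the result is then a list holding
# both the data and its schema rather than a bare tibble.
res <- Dataset_load(id = dataset_id, get_schema = TRUE)
str(res, max.level = 1)

# Return an Arrow Table instead of a tibble.
tbl <- Dataset_load(id = dataset_id, as_data_frame = FALSE)
```

Selecting columns with col_select is usually worthwhile for large datasets, since only the requested columns need to be materialized in memory.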