SOMADataFrame is a multi-column table that must contain a column called soma_joinid of type int64, which contains a unique value for each row and is intended to act as a join key for other objects, such as SOMASparseNDArray. (lifecycle: experimental)

Methods


Method create()

Create (lifecycle: experimental)

Usage

SOMADataFrame$create(
  schema,
  index_column_names = c("soma_joinid"),
  platform_config = NULL,
  internal_use_only = NULL
)

Arguments

schema

An arrow::schema.

index_column_names

A vector of column names to use as user-defined index columns. All named columns must exist in the schema, and at least one index column name is required.

platform_config

A platform configuration object.

internal_use_only

Character value to signal this is a 'permitted' call, as create() is considered internal and should not be called directly.
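
As a minimal sketch of the schema and index_column_names arguments, the following builds a schema with the arrow R package and checks that every requested index column is present in it. The label column and the check_index_columns() helper are hypothetical, introduced only for illustration; they are not part of SOMADataFrame.

```r
library(arrow)

# Schema with the required soma_joinid column (int64) plus one
# hypothetical attribute column, `label`.
sch <- arrow::schema(
  soma_joinid = arrow::int64(),
  label = arrow::utf8()
)

# Hypothetical helper mirroring the documented constraint: at least one
# index column is required, and each one must exist in the schema.
check_index_columns <- function(schema, index_column_names) {
  stopifnot(length(index_column_names) >= 1)
  missing_cols <- setdiff(index_column_names, schema$names)
  if (length(missing_cols) > 0) {
    stop("index columns not in schema: ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

check_index_columns(sch, c("soma_joinid"))
```

A check like this, run before creation, surfaces a misspelled index column name early.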


Method write()

Write (lifecycle: experimental)

Usage

SOMADataFrame$write(values)

Arguments

values

An arrow::Table or arrow::RecordBatch containing all columns, including any index columns. The schema for values must match the schema for the SOMADataFrame.
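
One way to prepare values so that its schema lines up with the target is to build the table against that schema directly. This is a sketch that assumes a two-column schema with a hypothetical label column; sdf in the final comment stands for an already opened SOMADataFrame and is not defined here.

```r
library(arrow)

# Target schema (the `label` column is hypothetical).
sch <- arrow::schema(
  soma_joinid = arrow::int64(),
  label = arrow::utf8()
)

# Build the table against the schema so the R integers are stored as int64.
df <- data.frame(soma_joinid = 0:2, label = c("a", "b", "c"))
tbl <- arrow::arrow_table(df, schema = sch)

# Confirm that column names and the join-key type match the target schema.
stopifnot(identical(tbl$schema$names, sch$names))
stopifnot(tbl$schema$GetFieldByName("soma_joinid")$type$ToString() == "int64")

# With an open SOMADataFrame `sdf` (hypothetical), the call would then be:
# sdf$write(tbl)
```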


Method read()

Read (lifecycle: experimental). Reads a user-defined subset of data, addressed by the dataframe indexing column and optionally filtered.

Usage

SOMADataFrame$read(
  coords = NULL,
  column_names = NULL,
  value_filter = NULL,
  result_order = "auto",
  iterated = FALSE,
  log_level = "auto"
)

Arguments

coords

Optional named list of indices specifying the rows to read; each (named) list element corresponds to a dimension of the same name.

column_names

Optional character vector of column names to return.

value_filter

Optional string containing a logical expression that is used to filter the returned values. See tiledb::parse_query_condition for more information.

result_order

Optional order of read results. This can be one of "ROW_MAJOR", "COL_MAJOR", or "auto" (default).

iterated

Optional boolean indicating whether data is read in one call (when FALSE, the default value) or in several iterated steps.

log_level

Optional logging level. The default is "auto", as shown in the usage above.

Returns

An arrow::Table, or a TableReadIter when iterated is TRUE.
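
The shape of the read arguments can be sketched in base R as follows. The attribute columns n_genes and label, and the object sdf in the final comment, are hypothetical. The numeric type expected for the coordinates is not stated in this page, so the plain numeric vector is an assumption.

```r
# Rows to read, keyed by index column name (soma_joinid is the default index).
coords <- list(soma_joinid = c(0, 1, 2))

# A filter expression and a column subset over hypothetical attribute columns.
value_filter <- "n_genes > 500"
column_names <- c("soma_joinid", "label")

# Every element of coords must be named after a dimension.
stopifnot(!is.null(names(coords)), all(nzchar(names(coords))))

# With an open SOMADataFrame `sdf` (hypothetical), the call would look like:
# tbl <- sdf$read(
#   coords = coords,
#   column_names = column_names,
#   value_filter = value_filter
# )
```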


Method update()

Update (lifecycle: experimental)

Usage

SOMADataFrame$update(values, row_index_name = NULL)

Arguments

values

A data.frame, arrow::Table, or arrow::RecordBatch.

row_index_name

An optional scalar character. If provided, and if the values argument is a data.frame with row names, then the row names will be extracted and added as a new column to the data.frame prior to performing the update. The name of this new column will be set to the value specified by row_index_name.

Details

Update the existing SOMADataFrame to add or remove columns based on the input:

  • columns present in the current SOMADataFrame but absent from the new values will be dropped

  • columns absent from the current SOMADataFrame but present in the new values will be added

  • any columns present in both will be left alone, with the exception that if values has a different type for the column, the entire update will fail because attribute types cannot be changed.

Furthermore, values must contain the same number of rows as the current SOMADataFrame.
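
The column rules above can be illustrated with a small base R sketch. plan_update() is a hypothetical helper, not part of the package: it only models which columns would be dropped or added, given column names and type names, and it stops if a shared column changes type.

```r
# Hypothetical helper: given named character vectors mapping column name to
# type name, report which columns an update would drop or add.
plan_update <- function(current_types, new_types) {
  shared <- intersect(names(current_types), names(new_types))
  changed <- shared[current_types[shared] != new_types[shared]]
  if (length(changed) > 0) {
    # Mirrors the rule that a type change makes the entire update fail.
    stop("column types cannot be changed: ", paste(changed, collapse = ", "))
  }
  list(
    drop = setdiff(names(current_types), names(new_types)),
    add = setdiff(names(new_types), names(current_types))
  )
}

current <- c(soma_joinid = "int64", label = "string", score = "double")
updated <- c(soma_joinid = "int64", label = "string", batch = "string")

plan <- plan_update(current, updated)
# plan$drop is "score"; plan$add is "batch"
```

The sketch leaves out the row-count requirement, which would be a separate check on the number of rows in values.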


Method clone()

The objects of this class are cloneable with this method.

Usage

SOMADataFrame$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.