readwritesqlite is an R package to enhance reading and writing to SQLite databases.
The first task after loading
readwritesqlite is to create an object of class SQLiteConnection. Below we create one in memory although in general the user will want to specify a path.
Individual data frames, environments or named lists of data frames can be written to a connection using
In the case of a data frame the default table name is the name of the object (an alternative name can be specified using the
The fact that
rws_write() accepts environments means that the user can easily write all the data frames in the current environment to a SQLiteConnection.
Objects which are not data frames are silently filtered from environments but cause an error with lists.
By default the
exists argument to the
rws_write() function is
TRUE. This means that only existing tables (which were presumably created by the database designer with appropriate checks and foreign keys) can be written to. If the user wishes to automatically create new tables (if they don’t exist) before writing then they should set
exists = NA. If the user wishes to only write to previously non-existent tables then they should set
exists = FALSE.
rws_write() function ensures that writing the new data frame(s) to the database does not violate foreign keys. If any data does the database is left unchanged.
By default, if no check or key violations occur the data frame(s) are written to the database. Otherwise an error message is issued and all changes are rolled back. If the user wishes to simply check whether data could be written to a database without actually making any changes then they can call
commit = FALSE.
Meta data is recorded if the user uses
rws_write() to write a data frame to an empty table. In order to change or add meta data the user should read the existing data from the table (using
rws_read_sqlite()), modify it accordingly and then re-write it using
delete = TRUE. The only exception is for factors and ordered factors. If the existing factor levels are already recorded in the meta data then the user can pass data with additional or rearranged levels for factor and with additional levels for ordered factors without deleting the existing data.
If writing data violates a unique or primary key an error message is returned and the table is unaltered. The only exception to this is if
replace = TRUE in which case existing rows which cause unique or primary key violations are removed.
When passing data frames to
rws_write() in the form of an environment or named list, each table must be represented by just one data frame if
unique = TRUE (the default). Duplicates are also not permitted if
delete = TRUE (because the first data to be written will be overwritten) or
exists = FALSE (because the table will exist when the duplicate is written).
When passing data frames to
rws_write() in the form of an environment or named list, if
all = TRUE (the default) and
exists is not
FALSE then each existing tables must be represented at least once. This option is useful for checking all the tables in a data frame are populated when transferring data from an old to new database.
strict = TRUE and extra columns in an input data frame cause an error to be thrown. If
strict = FALSE the error is replaced by a warning and the additional columns are automatically removed from the data. When writing environments or list of data with
exists = TRUE, the
strict argument also determines if extra data frames cause an informative error or are automatically removed with a warning.
If the user wishes to suppress package specific messages or warnings then they should set
silent = TRUE. As the default value is
silent = getOption("rws.silent", FALSE) the user can silence the package session-wide with
options(silent = TRUE).
Data can be read using
rws_read() which either takes a vector of table names or the connection as the first argument.
rws_read() returns a named list of tibbles. If the connection is the first argument then the named list consists of all tables in the data base.
tables <- rws_read(conn) names(tables) #>  "a_table" "another_table" "rws_data" tables$rws_data #> # A tibble: 3 x 6 #> logical date factor ordered posixct units #> <lgl> <date> <fct> <ord> <dttm> [m] #> 1 TRUE 2000-01-01 x x 2001-01-02 03:04:05 10.0 #> 2 FALSE 2001-02-03 y y 2006-07-08 09:10:11 11.5 #> 3 NA NA <NA> <NA> NA NA
The table names can of course be manipulated and
list2env() used to assign the data frames to the current environment.
If the user wishes to read a single data frame they can use
rws_read_log() allow the user to read the meta and log tables.
rws_read_meta(conn) #> # A tibble: 10 x 4 #> TableMeta ColumnMeta MetaMeta DescriptionMeta #> <chr> <chr> <chr> <chr> #> 1 ANOTHER_TABLE FACTOR factor: 'x', 'y' <NA> #> 2 ANOTHER_TABLE ORDERED ordered: 'y', 'x' <NA> #> 3 A_TABLE DATE class: Date <NA> #> 4 A_TABLE LOGICAL class: logical <NA> #> 5 RWS_DATA DATE class: Date <NA> #> 6 RWS_DATA FACTOR factor: 'x', 'y' <NA> #> 7 RWS_DATA LOGICAL class: logical <NA> #> 8 RWS_DATA ORDERED ordered: 'y', 'x' <NA> #> 9 RWS_DATA POSIXCT tz: Etc/GMT+8 <NA> #> 10 RWS_DATA UNITS units: m <NA>
rws_read_log(conn) #> # A tibble: 6 x 5 #> DateTimeUTCLog UserLog TableLog CommandLog NRowLog #> <dttm> <chr> <chr> <chr> <int> #> 1 2019-07-07 16:09:59 joe RWS_DATA CREATE 0 #> 2 2019-07-07 16:10:00 joe RWS_DATA INSERT 3 #> 3 2019-07-07 16:10:00 joe A_TABLE CREATE 0 #> 4 2019-07-07 16:10:00 joe A_TABLE INSERT 3 #> 5 2019-07-07 16:10:00 joe ANOTHER_TABLE CREATE 0 #> 6 2019-07-07 16:10:00 joe ANOTHER_TABLE INSERT 3