The goal of tab2xml
is to convert spreadsheet files
(.xlsx or .ods) into structured XML documents using a predefined
template. The package processes the spreadsheet data, replacing template
tokens with corresponding values, and manages foreign key relationships
automatically.
You can install the released version of tab2xml
from CRAN with:
install.packages("tab2xml")
And the development version from GitHub with:
# install.packages("pak")
::pak("josesamos/tab2xml") pak
This example demonstrates how to use tab2xml
to convert
an Excel or ODS file into an XML file, based on a schema example from
the Mondrian
Documentation.
library(tab2xml)
# Define file paths
<- system.file("extdata", "schema_template.xml", package = "tab2xml")
source_xml <- system.file("extdata", "schema.xlsx", package = "tab2xml")
source_xlsx <- tempfile(fileext = ".xml")
temp_file
# Show spreadsheet contents
<- readxl::excel_sheets(source_xlsx)
sheet_names for (sheet in sheet_names) {
cat("\n## Sheet:", sheet)
<- readxl::read_excel(source_xlsx, sheet = sheet)
data print(knitr::kable(data))
}#>
#> ## Sheet: Cube
#>
#> | cube_pk|name | table_fk|
#> |-------:|:-----|--------:|
#> | 1|Sales | 1|
#>
#> ## Sheet: Table
#>
#> | table_pk|name |
#> |--------:|:---------------|
#> | 1|sales_fact_1997 |
#> | 2|customer |
#> | 3|time_by_day |
#>
#> ## Sheet: Dimension
#>
#> | dimension_pk|name |foreignKey | cube_fk|
#> |------------:|:------|:-----------|-------:|
#> | 1|Gender |customer_id | 1|
#> | 2|Time |time_id | 1|
#>
#> ## Sheet: Hierarchy
#>
#> | hierarchy_pk|name |hasAll |allMemberName |primaryKey | dimension_fk| table_fk|
#> |------------:|:------|:------|:---------------------------|:-----------|------------:|--------:|
#> | 1|Gender |true |allMemberName="All Genders" |customer_id | 1| 2|
#> | 2|Time |false |NA |time_id | 2| 3|
#>
#> ## Sheet: Level
#>
#> |name |column |type |uniqueMembers | hierarchy_fk|
#> |:-------|:-------------|:-------|:-------------|------------:|
#> |Gender |gender |String |true | 1|
#> |Year |the_year |Numeric |true | 2|
#> |Quarter |quarter |Numeric |false | 2|
#> |Month |month_of_year |Numeric |false | 2|
#>
#> ## Sheet: Measure
#>
#> |name |column |aggregator |formatString | cube_fk|
#> |:-----------|:-----------|:----------|:------------|-------:|
#> |Unit Sales |unit_sales |sum |#,### | 1|
#> |Store Sales |store_sales |sum |#,###.## | 1|
#> |Store Cost |store_cost |sum |#,###.00 | 1|
#>
#> ## Sheet: CalculatedMember
#>
#> | calculatedmember_pk|name |dimension |formula | cube_fk|
#> |-------------------:|:------|:---------|:--------------------------------------------------|-------:|
#> | 1|Profit |Measures |[Measures].[Store Sales] - [Measures].[Store Cost] | 1|
#>
#> ## Sheet: CalculatedMemberProperty
#>
#> |name |value | calculatedmember_fk|
#> |:-------------|:---------|-------------------:|
#> |FORMAT_STRING |$#,##0.00 | 1|
# Convert spreadsheet to XML
<- sheet2xml(source_xlsx, source_xml, temp_file)
file
# Check output
cat(readLines(file), sep = "\n")
#> <Schema>
#> <Cube name="Sales">
#> <Table name="sales_fact_1997" />
#> <Dimension name="Gender" foreignKey="customer_id">
#> <Hierarchy name="Gender" hasAll="true" allMemberName="All Genders" primaryKey="customer_id">
#> <Table name="customer" />
#> <Level name="Gender" column="gender" type="String" uniqueMembers="true" />
#> </Hierarchy>
#> </Dimension>
#> <Dimension name="Time" foreignKey="time_id">
#> <Hierarchy name="Time" hasAll="false" primaryKey="time_id">
#> <Table name="time_by_day" />
#> <Level name="Year" column="the_year" type="Numeric" uniqueMembers="true" />
#> <Level name="Quarter" column="quarter" type="Numeric" uniqueMembers="false" />
#> <Level name="Month" column="month_of_year" type="Numeric" uniqueMembers="false" />
#> </Hierarchy>
#> </Dimension>
#> <Measure name="Unit Sales" column="unit_sales" aggregator="sum" formatString="#,###" />
#> <Measure name="Store Sales" column="store_sales" aggregator="sum" formatString="#,###.##" />
#> <Measure name="Store Cost" column="store_cost" aggregator="sum" formatString="#,###.00" />
#> <CalculatedMember name="Profit" dimension="Measures" formula="[Measures].[Store Sales] - [Measures].[Store Cost]">
#> <CalculatedMemberProperty name="FORMAT_STRING" value="$#,##0.00" />
#>
#> </CalculatedMember>
#> </Cube>
#> </Schema>
In this way, we can organize and work with the data in tabular form and generate XML documents directly using the provided templates.