[Bioc-devel] Adding additional validity checks when calling setter methods of inherited S4Vectors::DataFrame class

Nanda, Pariksheet PAN79 at pitt.edu
Fri Jan 3 01:17:02 CET 2025


Hello S4-class boffins,

How bad an idea is it to inherit from the S4Vectors::DataFrame / DFrame S4 class in order to impose additional constraints on it? I'm writing a lightweight wrapper around the ToppGene web API (github com/ImmuSystems-Lab/toppgene/blob/main/R/categories.R) [1] and, while it's functional, it currently only runs JSON web queries using default values. To pass non-default values, each queried category needs associated parameters that fall within certain bounds.

While it's intuitive for a Bioconductor user to see and manipulate DataFrames containing the parameters, the trouble I'm seeing is that validObject() is of course not automagically run by the dispatched S4Vectors setters, and I don't know how to inject validObject() into the process without rewriting or repeating a lot of the S4Vectors method internals; callNextMethod() does not seem like it would work? Currently, validObject() is only called when invoking the constructor, CategoriesDataFrame(). Because code is worth a thousand words, see below the "---" line for what I mean.

Pariksheet

[1] Yes, I'm trying to keep the GitHub URL from being mangled into illegible horrors by removing the protocol prefix and the dot before the domain, so you'll have to add at least the latter back in to visit the GitHub page.

---

> devtools::load_all()
[...]

> cats <- CategoriesDataFrame()

> cats
ToppGene CategoriesDataFrame with 19 enabled categories
                              PValue MinGenes MaxGenes MaxResults Correction Enabled
Coexpression                    0.05        2     1500         50        FDR    TRUE
CoexpressionAtlas               0.05        2     1500         50        FDR    TRUE
Computational                   0.05        2     1500         50        FDR    TRUE
Cytoband                        0.05        2     1500         50        FDR    TRUE
Disease                         0.05        2     1500         50        FDR    TRUE
Domain                          0.05        2     1500         50        FDR    TRUE
Drug                            0.05        2     1500         50        FDR    TRUE
GeneFamily                      0.05        2     1500         50        FDR    TRUE
GeneOntologyBiologicalProcess   0.05        2     1500         50        FDR    TRUE
GeneOntologyCellularComponent   0.05        2     1500         50        FDR    TRUE
GeneOntologyMolecularFunction   0.05        2     1500         50        FDR    TRUE
HumanPheno                      0.05        2     1500         50        FDR    TRUE
Interaction                     0.05        2     1500         50        FDR    TRUE
MicroRNA                        0.05        2     1500         50        FDR    TRUE
MousePheno                      0.05        2     1500         50        FDR    TRUE
Pathway                         0.05        2     1500         50        FDR    TRUE
Pubmed                          0.05        2     1500         50        FDR    TRUE
TFBS                            0.05        2     1500         50        FDR    TRUE
ToppCell                        0.05        2     1500         50        FDR    TRUE
------------------------------
Values allowed by ToppGene are:
  PValue: [0, 1] <numeric>
  MinGenes: [1, 5000] <integer>
  MaxGenes: [2, 5000] <integer>
  MaxResults: [1, 5000] <integer>
  Correction: {None, FDR, Bonferroni} <character>

## This next line should not complete without an error!  But it does.
> cats[, "PValue"] <- 2

## Explicitly calling validObject() will point out the problem post-hoc,
## but not prevent the above assignment.
> validObject(cats)
Error in validObject(cats) : 
  invalid class “CategoriesDataFrame” object: column PValue must contain values <= 1
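
The same gap can be reproduced with nothing but the base methods package, which may make the question easier to reason about. This is only a minimal sketch: the class BoundedParams and the PValue<- setter are hypothetical stand-ins, not part of toppgene or S4Vectors. Whether the same "assign, then revalidate" step can be layered onto the inherited DataFrame `[<-` methods is exactly the open question above.

```r
library(methods)

# Hypothetical stand-in for a constrained parameter container; this is
# NOT the CategoriesDataFrame class and does not use S4Vectors.
setClass("BoundedParams",
         representation(PValue = "numeric"),
         validity = function(object) {
             if (any(object@PValue < 0 | object@PValue > 1))
                 "PValue must contain values in [0, 1]"
             else
                 TRUE
         })

# new() runs the validity method, so the constructor is protected.
params <- new("BoundedParams", PValue = 0.05)

# Direct slot assignment only checks the class of the value, not the
# validity method, so an out-of-range value slips through.
unchecked <- params
unchecked@PValue <- 2
post_hoc <- tryCatch(validObject(unchecked),
                     error = function(e) conditionMessage(e))

# A hand-written replacement method can revalidate before returning.
setGeneric("PValue<-", function(x, value) standardGeneric("PValue<-"))
setReplaceMethod("PValue", "BoundedParams", function(x, value) {
    x@PValue <- value
    validObject(x)   # signals an error if the new value is out of range
    x
})

# The failed assignment leaves params untouched.
checked <- tryCatch({
    PValue(params) <- 2
    "assignment succeeded"
}, error = function(e) "assignment rejected")
```

Here post_hoc holds the validity error message raised after the fact, while checked records that the revalidating setter refused the assignment. For a DataFrame subclass the analogous step would have to sit inside each inherited replacement method, which is the part that seems to require touching S4Vectors internals.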


