Title: | Searching for Optimal MDS Procedure for Metric and Interval-Valued Data |
Version: | 0.7-7 |
Date: | 2025-06-26 |
Depends: | R (≥ 3.6.0), smacof, clusterSim, symbolicDA |
Imports: | animation, plotrix, spdep |
Suggests: | testthat, R.rsp |
VignetteBuilder: | R.rsp |
Description: | Selecting the optimal multidimensional scaling (MDS) procedure for metric data via metric MDS (ratio, interval, mspline) and nonmetric MDS (ordinal). Selecting the optimal multidimensional scaling (MDS) procedure for interval-valued data via metric MDS (ratio, interval, mspline). Selecting the optimal multidimensional scaling procedure for interval-valued data by varying all combinations of normalization and optimization methods. Selecting the optimal MDS procedure for statistical data referring to the evaluation of tourist attractiveness of Lower Silesian counties. (Borg, I., Groenen, P.J.F., Mair, P. (2013) <doi:10.1007/978-3-642-31848-1>, Walesiak, M. (2016) <doi:10.15611/ekt.2016.2.01>, Walesiak, M. (2017) <doi:10.15611/ekt.2017.3.01>). |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Encoding: | UTF-8 |
NeedsCompilation: | no |
Packaged: | 2025-06-27 09:59:17 UTC; andrzej |
Author: | Marek Walesiak |
Maintainer: | Andrzej Dudek <andrzej.dudek@ue.wroc.pl> |
Repository: | CRAN |
Date/Publication: | 2025-06-27 12:10:02 UTC |
The evaluation of tourist attractiveness of Lower Silesian counties
Description
The empirical study uses the statistical data presented in the article (Gryszel, Walesiak, 2014) and referring to the attractiveness level of 31 objects (29 Lower Silesian counties, pattern and antipattern object). The evaluation of tourist attractiveness of Lower Silesian counties was performed using 16 metric variables (measured on a ratio scale): x1 – beds in hotels per 1 km2 of a county area, x2 – number of nights spent daily by resident tourists per 1000 inhabitants of a county, x3 – number of nights spent daily by foreign tourists per 1000 inhabitants of a county, x4 – gas pollution emission in tons per 1 km2 of a county area, x5 – number of criminal offences and crimes against life and health per 1000 inhabitants of a county, x6 – number of property crimes per 1000 inhabitants of a county, x7 – number of historical buildings per 100 km2 of a county area, x8 – x9 – x10 – number of events as well as cultural and tourist ventures in a county, x11 – number of natural monuments calculated per 1 km2 of a county area, x12 – number of tourist economy entities per 1000 inhabitants of a county (natural and legal persons), x13 – expenditure of municipalities and counties on tourism, culture and national heritage protection as well as physical culture per 1 inhabitant of a county in PLN, x14 – viewers in cinemas per 1000 inhabitants of a county, x15 – museum visitors per 1000 inhabitants of a county, x16 – number of construction permits (hotels and accommodation buildings, commercial and service buildings, transport and communication buildings, civil and water engineering constructions) issued in a county in the years 2011-2012 per 1 km2 of a county area. The statistical data were collected in 2012 and come from the Local Data Bank of the Central Statistical Office of Poland; only the data for variable x7 were obtained from the regional conservation officer.
Format
data.frame: 31 objects (29 counties, pattern and antipattern object), 16 variables. The coordinates of the pattern object are the most preferred values of the preference variables (stimulants, destimulants, nominants). The coordinates of the antipattern object are the least preferred values of the preference variables.
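How a pattern and an antipattern object can be derived from the variable types may be sketched as follows. This is a minimal base R illustration on made-up data, not the package's data preparation: the toy values, the variable names and the is_stimulant flags are assumptions, and nominants are left out for brevity.

```r
# Hypothetical toy data: 3 objects, 3 variables (not data_lower_silesian)
x <- data.frame(beds      = c(1.2, 0.4, 2.5),  # stimulant: higher is better
                pollution = c(3.0, 1.0, 2.0),  # destimulant: lower is better
                events    = c(10, 25, 5))      # stimulant: higher is better
is_stimulant <- c(beds = TRUE, pollution = FALSE, events = TRUE)

# Pattern: most preferred value of every variable
pattern <- mapply(function(col, stim) if (stim) max(col) else min(col),
                  x, is_stimulant)
# Antipattern: least preferred value of every variable
antipattern <- mapply(function(col, stim) if (stim) min(col) else max(col),
                      x, is_stimulant)

# Append both artificial objects as the last two rows
x_ext <- rbind(as.matrix(x), pattern = pattern, antipattern = antipattern)
print(x_ext)
```

With this layout the antipattern is the last row and the pattern the second to last, which matches how the examples in this manual index the configuration with nrow(z) and nrow(z)-1.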
Source
Gryszel, P., Walesiak, M. (2014), Zastosowanie uogólnionej miary odległości GDM w ocenie atrakcyjności turystycznej powiatów Dolnego Śląska [The Application of the General Distance Measure (GDM) in the Evaluation of Lower Silesian Districts’ Attractiveness], Folia Turistica, 31, 127-147.
Examples
library(mdsOpt)
metnor<-c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a")
metscale<-c("ratio","interval")
metdist<-c("euclidean","GDM1")
data(data_lower_silesian)
res<-optSmacofSym_mMDS(data_lower_silesian,normalizations=metnor,
distances=metdist,mdsmodels=metscale)
print(findOptimalSmacofSym(res))
Draw a series of isoquants
Description
Draws a series of isoquants (an isoquant is a contour line drawn through the set of points at which the same quantity of output is produced while changing the quantities of two or more inputs)
Usage
drawIsoquants(x,y=NULL,number=6,steps=NULL)
Arguments
x |
two dimensional point (center) |
y |
optional - second point, used for calculations of step size if |
number |
number of isoquants |
steps |
distance between successive isoquants, starting from x, if length of this argument is lower than |
Value
This is a plotting function, so it does not return a value.
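A minimal base R sketch of what such a plot amounts to, assuming the isoquants are circles of equal distance around the centre point x. The helper circle_points and all numeric values are hypothetical; the sketch does not reproduce the internals of drawIsoquants.

```r
# Hypothetical helper (not part of mdsOpt): points on a circle of a given radius
circle_points <- function(center, radius, n = 200) {
  theta <- seq(0, 2 * pi, length.out = n)
  cbind(x = center[1] + radius * cos(theta),
        y = center[2] + radius * sin(theta))
}

center <- c(0.5, -0.2)        # e.g. coordinates of the pattern object
radii  <- seq_len(6) * 0.25   # 6 isoquants with a constant step of 0.25

# Empty plot with equal axis scaling so that circles are not distorted
plot(NA, xlim = center[1] + c(-2, 2), ylim = center[2] + c(-2, 2),
     asp = 1, xlab = "Dimension 1", ylab = "Dimension 2")
points(center[1], center[2], pch = 19)
for (r in radii) lines(circle_points(center, r), lty = 2)
```

Setting asp=1 matters here: without equal scaling on both axes, points at the same distance from the centre would not appear on a circle.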
Author(s)
Marek Walesiak marek.walesiak@ue.wroc.pl, Andrzej Dudek andrzej.dudek@ue.wroc.pl
Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Poland
References
Walesiak, M. (2016), Visualization of Linear Ordering Results for Metric Data with the Application of Multidimensional Scaling, Ekonometria, 2(52), 9-21. Available at: doi:10.15611/ekt.2016.2.01.
Walesiak, M. (2017), The application of multidimensional scaling to measure and assess changes in the level of social cohesion of the Lower Silesia region in the period 2005-2015, Ekonometria, 3(57), 9-25. Available at: doi:10.15611/ekt.2017.3.01.
Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, STATISTICS IN TRANSITION new series, September, Vol. 18, No. 3, pp. 521-540.
Examples
#Example 1
library(mdsOpt)
library(smacof)
library(clusterSim)
data(data_lower_silesian)
z<-data.Normalization(data_lower_silesian, type="n1")
d<-dist.GDM(z, method="GDM1")
res <- smacofSym(delta=d,ndim=2,type="interval")
print("Objects configuration", quote=FALSE)
plot(res, plot.type="confplot")
r1<-res$conf[nrow(z),1]
r2<-res$conf[nrow(z),2]
r3<-res$conf[nrow(z)-1,1]
r4<-res$conf[nrow(z)-1,2]
arrows(r1,r2,r3,r4,length=0.1,col="black")
res_up<-as.matrix(dist(res$conf,method="euclidean"))
drawIsoquants(res$conf[nrow(z)-1,],steps=max(res_up)/6)
# or
# drawIsoquants(res$conf[nrow(z)-1,],steps=c(0.3,0.2),number=8)
#Example 2
library(mdsOpt)
library(smacof)
library(clusterSim)
data(data_lower_silesian)
z<-data.Normalization(data_lower_silesian, type="n1")
d<-dist.GDM(z, method="GDM1")
res<-smacofSym(delta=d,ndim=2,type="interval")
res1<-res$conf
#write.table(res1,"conf_2d.csv",dec=",",sep=";",col.names=NA,row.names=TRUE)
alfa<- 1.05*pi
a<- cos(alfa)
b<- -sin(alfa)
c<- sin(alfa)
d<- cos(alfa)
D<-array(c(a,b,c,d), c(2,2))
#res1<-read.csv2("conf_2d.csv", header=TRUE, row.names=1)
res1<-as.matrix(res1)
res2<-res1%*%D # rotate the configuration by the angle alfa
plot(res2, xlab="Dimension 1",ylab="Dimension 2",main="",asp=1)
points(res2[1:31,],pch=1,font=2)
text(res2[c(1:31),],pos=3,cex=0.7,row.names(z[c(1:31),]))
r1<-res2[nrow(z),1]
r2<-res2[nrow(z),2]
r3<-res2[nrow(z)-1,1]
r4<-res2[nrow(z)-1,2]
arrows(r1,r2,r3,r4,length=0.1,col="black")
res_up<-as.matrix(dist(res2,method="euclidean"))
drawIsoquants(res2[nrow(z)-1,],steps=max(res_up)/6)
Selecting the optimal multidimensional scaling (MDS) procedure
Description
Selecting the optimal multidimensional scaling procedure - metric MDS (by varying all combinations of normalization methods, distance measures, and metric MDS models) and nonmetric MDS (by varying all combinations of normalization methods and distance measures)
Usage
findOptimalSmacofSym(table,
critical_stress=(max(as.numeric(gsub(",",".",table[,"STRESS 1"],fixed=TRUE)))+
min(as.numeric(gsub(",",".",table[,"STRESS 1"],fixed=TRUE))))/2,
critical_HHI=NA)
Arguments
table |
result from
|
critical_stress |
threshold value of Kruskal's Stress-1 fit measure. Default - mid-range of Kruskal's Stress-1 fit measures calculated for all MDS procedures |
critical_HHI |
threshold value of Hirschman-Herfindahl HHI index. Only one of the parameters critical_stress or critical_HHI can be set, and the function finds the optimal procedure among those for which the selected measure is less than or equal to the threshold value |
Value
Nr |
number of row in |
Normalization_method |
normalization method used for optimal multidimensional scaling procedure |
MDS_model |
MDS model used for optimal multidimensional scaling procedure |
Spline_degree |
Additional spline.degree value for optimal procedure, if mspline model is used for simulation. For other models there is no value for this field |
Distance_measure |
distance measure used for optimal multidimensional scaling procedure |
STRESS_1 |
value of Kruskal Stress-1 fit measure for optimal multidimensional scaling procedure |
HHI_spp |
Hirschman-Herfindahl HHI index, calculated based on stress per point, for optimal multidimensional scaling procedure |
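The selection rule can be illustrated on a toy results table. The sketch below is base R only and rests on an assumption: that with a stress threshold the optimal procedure is the admissible one (Stress-1 not above the threshold) with the lowest HHI. The table values and the helper to_num are made up for illustration and do not come from the package.

```r
# Toy results table (hypothetical values), decimals written with a comma
tab <- data.frame(
  "Normalization method" = c("n1", "n2", "n3", "n5"),
  "STRESS 1" = c("0,081", "0,095", "0,140", "0,210"),
  "HHI spp"  = c("520,10", "410,35", "390,00", "350,75"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Convert comma-decimal strings to numbers, as done in the Usage default
to_num <- function(v) as.numeric(gsub(",", ".", v, fixed = TRUE))
stress <- to_num(tab[, "STRESS 1"])
hhi    <- to_num(tab[, "HHI spp"])

# Default threshold: mid-range of the Stress-1 values
critical_stress <- (max(stress) + min(stress)) / 2

# Admissible procedures, then the one with the lowest HHI (assumed rule)
admissible <- which(stress <= critical_stress)
best <- admissible[which.min(hhi[admissible])]
print(tab[best, ])
```

Here the mid-range is 0.1455, so rows 1 to 3 are admissible and row 3 is selected; row 4 has the lowest HHI overall but is excluded by its stress value.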
Author(s)
Marek Walesiak marek.walesiak@ue.wroc.pl, Andrzej Dudek andrzej.dudek@ue.wroc.pl
Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Poland
References
Borg, I., Groenen, P.J.F. (2005), Modern Multidimensional Scaling. Theory and Applications, 2nd Edition, Springer Science+Business Media, New York. ISBN: 978-0387-25150-9. Available at: https://link.springer.com/book/10.1007/0-387-28981-X.
Borg, I., Groenen, P.J.F., Mair, P. (2013), Applied Multidimensional Scaling, Springer, Heidelberg, New York, Dordrecht, London. Available at: doi:10.1007/978-3-642-31848-1.
De Leeuw, J., Mair, P. (2015), Shepard Diagram, Wiley StatsRef: Statistics Reference Online, John Wiley & Sons Ltd.
Dudek, A., Walesiak, M. (2020), The Choice of Variable Normalization Method in Cluster Analysis, pp. 325-340, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.
Herfindahl, O.C. (1950), Concentration in the Steel Industry, Doctoral thesis, Columbia University.
Hirschman, A.O. (1964), The Paternity of an Index, The American Economic Review, Vol. 54, 761-762.
Walesiak, M. (2014), Przegląd formuł normalizacji wartości zmiennych oraz ich własności w statystycznej analizie wielowymiarowej [Data Normalization in Multivariate Data Analysis. An Overview and Properties], Przegląd Statystyczny, tom 61, z. 4, 363-372. Available at: doi:10.5604/01.3001.0016.1740.
Walesiak, M. (2016a), Wybór grup metod normalizacji wartości zmiennych w skalowaniu wielowymiarowym [The Choice of Groups of Variable Normalization Methods in Multidimensional Scaling], Przegląd Statystyczny, tom 63, z. 1, 7-18. Available at: doi:10.5604/01.3001.0014.1145.
Walesiak, M. (2016b), Visualization of Linear Ordering Results for Metric Data with the Application of Multidimensional Scaling, Ekonometria, 2(52), 9-21. Available at: doi:10.15611/ekt.2016.2.01.
Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, STATISTICS IN TRANSITION new series, September, Vol. 18, No. 3, pp. 521-540.
Walesiak, M., Dudek, A. (2020), Searching for an Optimal MDS Procedure for Metric and Interval-Valued Data using mdsOpt R package, pp. 307-324, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.
See Also
data.Normalization, dist.GDM, dist, smacofSym
Examples
library(mdsOpt)
metnor<-c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a")
metscale<-c("ratio","interval")
metdist<-c("euclidean","manhattan","maximum","seuclidean","GDM1")
data(data_lower_silesian)
res<-optSmacofSym_mMDS(data_lower_silesian,normalizations=metnor,
distances=metdist,mdsmodels=metscale,outDec=".")
print(findOptimalSmacofSym(res))
Selecting the optimal multidimensional scaling procedure for interval-valued data
Description
Selecting the optimal multidimensional scaling procedure by varying all combinations of normalization methods, distance measures for interval-valued data, and metric MDS models.
Usage
optSmacofSymInterval(x,dataType="simple",normalizations=NULL,
distances=NULL,mdsmodels=NULL,spline.degrees=c(2),outputCsv="",
outputCsv2="",y=NULL,outDec=",",
stressDigits=6,HHIDigits=2,...)
Arguments
x |
interval-valued data table or matrix or dataset |
dataType |
Type of symbolic data table passed to function: 'sda' - full symbolicDA format object; 'simple' - three dimensional array with lower and upper bound of intervals in third dimension; 'separate_tables' - lower bound of intervals in 'rows' - lower and upper bound of intervals in neighbouring rows; 'columns' - lower and upper bound of intervals in neighbouring columns |
normalizations |
optional, vector of normalization methods that should be used in procedure |
distances |
optional, vector of distance measures (Hausdorff, Ichino-Yaguchi) that should be used in procedure |
mdsmodels |
optional, vector of MDS models (ratio, interval, mspline) that should be used in procedure |
spline.degrees |
optional, vector (e.g. 2:4) of spline.degree parameter values that should be used in procedure for mspline model |
outputCsv |
optional, name of csv file with results |
outputCsv2 |
optional, name of csv (comma as decimal point sign) file with results |
y |
matrix or dataset with upper bounds of intervals if argument |
outDec |
decimal sign used in returned table |
stressDigits |
Number of decimal digits for displaying Stress 1 value |
HHIDigits |
Number of decimal digits for displaying HHI spp value |
... |
arguments passed to smacofSym, like ndim, itmax, eps and others |
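The 'simple' data type can be pictured with a small hand-built array. The sketch below is base R only and assumes objects in the first dimension, variables in the second, and the lower and upper bounds in the third; the object names, variable names and values are made up.

```r
# Hypothetical lower bounds for 3 objects and 2 interval-valued variables
lower <- matrix(c(1, 2, 0, 10, 12, 8), nrow = 3,
                dimnames = list(c("A", "B", "C"), c("v1", "v2")))
# Upper bounds: lower bounds plus non-negative interval widths
upper <- lower + matrix(c(0.5, 1, 2, 3, 1, 4), nrow = 3)

# Three-dimensional array: objects x variables x (lower, upper)
x <- array(c(lower, upper), dim = c(3, 2, 2),
           dimnames = list(rownames(lower), colnames(lower),
                           c("lower", "upper")))

# Sanity check: every interval must satisfy lower <= upper
stopifnot(all(x[, , "lower"] <= x[, , "upper"]))
print(x["B", "v2", ])   # interval of object B on variable v2
```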
Details
Parameter normalizations may be a subset of the following values:
"n1","n2","n3","n3a","n4","n5","n5a","n6","n6a",
"n7","n8","n9","n9a","n10","n11","n12","n12a","n13"
(e.g. normalizations=c("n1","n2","n3","n5","n5a",
"n8","n9","n9a","n11","n12a")).
If normalizations is set to "n0", no normalization is applied.
Parameter distances may be a subset of the following values:
"H_q1","H_q2","U_2_q1","U_2_q2" (in the following order: Hausdorff distance with q=1, Euclidean Hausdorff distance with q=2, Ichino-Yaguchi distance with q=1, Euclidean Ichino-Yaguchi distance with q=2)
(e.g. distances=c("H_q1","U_2_q1")).
Parameter mdsmodels may be a subset of the following values (metric MDS):
"ratio","interval","mspline" (e.g. c("ratio","interval")).
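To make the two Hausdorff variants concrete, the following base R sketch computes the Hausdorff distance between two objects described by intervals, using the common textbook form in which each variable contributes the larger of the lower-bound and upper-bound differences. The function hausdorff_interval and the data are hypothetical; the package itself delegates distance calculation to dist.Symbolic, whose implementation may differ in detail.

```r
# Hypothetical helper: Hausdorff distance between two interval-valued objects
hausdorff_interval <- function(a_low, a_up, b_low, b_up, q = 1) {
  # per-variable Hausdorff distance between the intervals [a_low, a_up] and [b_low, b_up]
  per_var <- pmax(abs(a_low - b_low), abs(a_up - b_up))
  sum(per_var^q)^(1 / q)
}

# Two objects, two interval-valued variables
a_low <- c(1, 10); a_up <- c(3, 14)
b_low <- c(2, 7);  b_up <- c(6, 12)

d1 <- hausdorff_interval(a_low, a_up, b_low, b_up, q = 1)  # 6
d2 <- hausdorff_interval(a_low, a_up, b_low, b_up, q = 2)  # sqrt(18), about 4.24
print(c(q1 = d1, q2 = d2))
```

Both variables contribute 3 here, so q=1 gives their sum and q=2 gives the Euclidean combination.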
Value
Data frame ordered by increasing value of Stress-1 fit measure with columns:
Normalization method |
normalization method used for p-th multidimensional scaling procedure |
MDS model |
MDS model used for p-th multidimensional scaling procedure |
Spline degree |
Additional spline.degree value if mspline model is used for simulation, for other models there is no value in this cell |
Distance measure |
distance measures for interval-valued data used for p-th multidimensional scaling procedure |
STRESS 1 |
value of Kruskal Stress-1 fit measure for p-th multidimensional scaling procedure |
HHI spp |
Hirschman-Herfindahl HHI index calculated based on stress per point for p-th multidimensional scaling procedure |
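The HHI column can be read as a concentration measure of the error across objects. A minimal base R sketch, assuming stress per point is expressed as percentage shares that sum to 100 and that HHI is the sum of their squares (so it ranges from 10000/n for equal shares up to 10000); the shares below are made up.

```r
# Hypothetical stress-per-point shares (in percent) for two procedures
spp_even   <- rep(100 / 5, 5)        # every object contributes 20%
spp_uneven <- c(60, 10, 10, 10, 10)  # one object dominates the error

# HHI as the sum of squared percentage shares (assumed definition)
hhi <- function(spp) {
  stopifnot(abs(sum(spp) - 100) < 1e-8)  # shares must sum to 100%
  sum(spp^2)
}

hhi(spp_even)    # 2000, i.e. 10000 / 5
hhi(spp_uneven)  # 4000
```

A lower value means the misfit is spread more evenly over the objects, which is why two procedures with similar Stress-1 can still differ in quality.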
Author(s)
Marek Walesiak marek.walesiak@ue.wroc.pl, Andrzej Dudek andrzej.dudek@ue.wroc.pl
Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Poland
References
Borg, I., Groenen, P.J.F. (2005), Modern Multidimensional Scaling. Theory and Applications, 2nd Edition, Springer Science+Business Media, New York. ISBN: 978-0387-25150-9. Available at: https://link.springer.com/book/10.1007/0-387-28981-X.
Borg, I., Groenen, P.J.F., Mair, P. (2013), Applied Multidimensional Scaling, Springer, Heidelberg, New York, Dordrecht, London. Available at: doi:10.1007/978-3-642-31848-1.
De Leeuw, J., Mair, P. (2015), Shepard Diagram, Wiley StatsRef: Statistics Reference Online, John Wiley & Sons Ltd.
Dudek, A., Walesiak, M. (2020), The Choice of Variable Normalization Method in Cluster Analysis, pp. 325-340, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.
Herfindahl, O.C. (1950), Concentration in the Steel Industry, Doctoral thesis, Columbia University.
Hirschman, A.O. (1964), The Paternity of an Index, The American Economic Review, Vol. 54, 761-762.
Walesiak, M. (2014), Przegląd formuł normalizacji wartości zmiennych oraz ich własności w statystycznej analizie wielowymiarowej [Data Normalization in Multivariate Data Analysis. An Overview and Properties], Przegląd Statystyczny, tom 61, z. 4, 363-372. Available at: doi:10.5604/01.3001.0016.1740.
Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, STATISTICS IN TRANSITION new series, September, Vol. 18, No. 3, pp. 521-540.
Walesiak, M., Dudek, A. (2020), Searching for an Optimal MDS Procedure for Metric and Interval-Valued Data using mdsOpt R package, pp. 307-324, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.
See Also
data.Normalization, interval_normalization, dist.Symbolic, smacofSym
Examples
library(mdsOpt)
library(clusterSim)
data(data_symbolic_interval_polish_voivodships)
metnor<-c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a")
metscale<-c("ratio","interval","mspline")
metdist<-c("H_q1","H_q2","U_2_q1","U_2_q2")
res<-optSmacofSymInterval(data_symbolic_interval_polish_voivodships,dataType="simple",
normalizations=metnor,distances=metdist,mdsmodels=metscale,spline.degrees=c(2,3),outDec=".")
stress<-as.numeric(gsub(",",".",res[,"STRESS 1"],fixed=TRUE))
hhi<-as.numeric(gsub(",",".",res[,"HHI spp"],fixed=TRUE))
t<-findOptimalSmacofSym(res)
cs<-(min(stress)+max(stress))/2 # critical stress
plot(stress[-t$Nr],hhi[-t$Nr], xlab="Stress-1", ylab="HHI",type="n",font.lab=3)
text(stress[-t$Nr],hhi[-t$Nr],labels=(1:nrow(res))[-t$Nr])
abline(v=cs,col="red")
points(stress[t$Nr],hhi[t$Nr], cex=5,col="red")
text(stress[t$Nr],hhi[t$Nr],labels=(1:nrow(res))[t$Nr],col="red")
print(t)
Selecting the optimal multidimensional scaling procedure - metric MDS
Description
Selecting the optimal multidimensional scaling procedure by varying all combinations of normalization methods, distance measures, and metric MDS models
Usage
optSmacofSym_mMDS(x,normalizations=NULL,distances=NULL,
mdsmodels=NULL,weights=NULL,spline.degrees=c(2),
outputCsv="",outputCsv2="",outDec=",",
stressDigits=6,HHIDigits=2,...)
Arguments
x |
matrix or dataset |
normalizations |
optional, vector of normalization methods that should be used in procedure |
distances |
optional, vector of distance measures (Manhattan, Euclidean, Chebyshev, squared Euclidean, GDM1) that should be used in procedure |
mdsmodels |
optional, vector of MDS models (ratio, interval, mspline) that should be used in procedure |
spline.degrees |
optional, vector (e.g. 2:4) of spline.degree parameter values that should be used in procedure for mspline model |
weights |
optional, variable weights used in distance calculation. Each weight takes value from interval [0; 1] and sum of weights equals one |
outputCsv |
optional, name of csv file with results |
outputCsv2 |
optional, name of csv (comma as decimal point sign) file with results |
outDec |
decimal sign used in returned table |
stressDigits |
Number of decimal digits for displaying Stress 1 value |
HHIDigits |
Number of decimal digits for displaying HHI spp value |
... |
arguments passed to smacofSym, like ndim, itmax, eps and others |
Details
Parameter normalizations may be a subset of the following values:
"n1","n2","n3","n3a","n4","n5","n5a","n6","n6a",
"n7","n8","n9","n9a","n10","n11","n12","n12a","n13"
(e.g. normalizations=c("n1","n2","n3","n5","n5a",
"n8","n9","n9a","n11","n12a")).
If normalizations is set to "n0", no normalization is applied.
Parameter distances may be a subset of the following values:
"euclidean","manhattan","maximum","seuclidean","GDM1"
(e.g. distances=c("euclidean","manhattan")).
Parameter mdsmodels may be a subset of the following values (metric MDS):
"ratio","interval","mspline" (e.g. c("ratio","interval")).
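The number of procedures compared grows multiplicatively with these vectors. The base R sketch below enumerates the combinations under one assumption: each spline degree is combined only with the mspline model, while ratio and interval ignore it. The short input vectors are illustrative, and the grid is not the package's internal representation.

```r
normalizations <- c("n1", "n2", "n3")
distances      <- c("euclidean", "GDM1")
mdsmodels      <- c("ratio", "interval", "mspline")
spline.degrees <- 2:3

# Full Cartesian product of all settings
grid <- expand.grid(normalization = normalizations, distance = distances,
                    model = mdsmodels, spline_degree = spline.degrees,
                    stringsAsFactors = FALSE)

# Spline degree only matters for mspline: blank it elsewhere, drop duplicates
grid$spline_degree[grid$model != "mspline"] <- NA
grid <- unique(grid)

nrow(grid)  # 24 = 3 normalizations * 2 distances * (ratio + interval + 2 mspline variants)
```

With the ten normalizations, five distances, three models and two spline degrees used in the example for this function, the same counting gives 10 * 5 * 4 = 200 procedures, so run time is worth keeping in mind.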
Value
Data frame ordered by increasing value of Stress-1 fit measure with columns:
Normalization method |
normalization method used for p-th multidimensional scaling procedure |
MDS model |
MDS model used for p-th multidimensional scaling procedure |
Spline degree |
Additional spline.degree value if mspline model is used for simulation, for other models there is no value in this cell |
Distance measure |
distance measure used for p-th multidimensional scaling procedure |
STRESS 1 |
value of Kruskal Stress-1 fit measure for p-th multidimensional scaling procedure |
HHI spp |
Hirschman-Herfindahl HHI index calculated based on stress per point for p-th multidimensional scaling procedure |
Author(s)
Marek Walesiak marek.walesiak@ue.wroc.pl, Andrzej Dudek andrzej.dudek@ue.wroc.pl
Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Poland
References
Borg, I., Groenen, P.J.F. (2005), Modern Multidimensional Scaling. Theory and Applications, 2nd Edition, Springer Science+Business Media, New York. ISBN: 978-0387-25150-9. Available at: https://link.springer.com/book/10.1007/0-387-28981-X.
Borg, I., Groenen, P.J.F., Mair, P. (2013), Applied Multidimensional Scaling, Springer, Heidelberg, New York, Dordrecht, London. Available at: doi:10.1007/978-3-642-31848-1.
De Leeuw, J., Mair, P. (2015), Shepard Diagram, Wiley StatsRef: Statistics Reference Online, John Wiley & Sons Ltd.
Dudek, A., Walesiak, M. (2020), The Choice of Variable Normalization Method in Cluster Analysis, pp. 325-340, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.
Herfindahl, O.C. (1950), Concentration in the Steel Industry, Doctoral thesis, Columbia University.
Hirschman, A.O. (1964), The Paternity of an Index, The American Economic Review, Vol. 54, 761-762.
Walesiak, M. (2014), Przegląd formuł normalizacji wartości zmiennych oraz ich własności w statystycznej analizie wielowymiarowej [Data Normalization in Multivariate Data Analysis. An Overview and Properties], Przegląd Statystyczny, tom 61, z. 4, 363-372. Available at: doi:10.5604/01.3001.0016.1740.
Walesiak, M. (2016a), Wybór grup metod normalizacji wartości zmiennych w skalowaniu wielowymiarowym [The Choice of Groups of Variable Normalization Methods in Multidimensional Scaling], Przegląd Statystyczny, tom 63, z. 1, 7-18. Available at: doi:10.5604/01.3001.0014.1145.
Walesiak, M. (2016b), Visualization of Linear Ordering Results for Metric Data with the Application of Multidimensional Scaling, Ekonometria, 2(52), 9-21. Available at: doi:10.15611/ekt.2016.2.01.
Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, STATISTICS IN TRANSITION new series, September, Vol. 18, No. 3, pp. 521-540.
Walesiak, M., Dudek, A. (2020), Searching for an Optimal MDS Procedure for Metric and Interval-Valued Data using mdsOpt R package, pp. 307-324, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.
See Also
data.Normalization, dist.GDM, dist, smacofSym
Examples
library(mdsOpt)
metnor<-c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a")
metscale<-c("ratio","interval","mspline")
metdist<-c("euclidean","manhattan","seuclidean","maximum","GDM1")
data(data_lower_silesian)
res<-optSmacofSym_mMDS(data_lower_silesian,normalizations=metnor,distances=metdist,
mdsmodels=metscale, spline.degrees=c(2:3),outDec=".")
stress<-as.numeric(gsub(",",".",res[,"STRESS 1"],fixed=TRUE))
hhi<-as.numeric(gsub(",",".",res[,"HHI spp"],fixed=TRUE))
cs<-(min(stress)+max(stress))/2 # critical stress
t<-findOptimalSmacofSym(res,critical_stress=cs)
print(t)
plot(stress[-t$Nr],hhi[-t$Nr], xlab="Stress-1", ylab="HHI",type="n",font.lab=3)
text(stress[-t$Nr],hhi[-t$Nr],labels=(1:nrow(res))[-t$Nr])
abline(v=cs,col="red")
points(stress[t$Nr],hhi[t$Nr], cex=5,col="red")
text(stress[t$Nr],hhi[t$Nr],labels=(1:nrow(res))[t$Nr],col="red")
Selecting the optimal multidimensional scaling procedure - nonmetric MDS
Description
Selecting the optimal multidimensional scaling procedure by varying all combinations of normalization methods and distance measures
Usage
optSmacofSym_nMDS(x,normalizations=NULL,distances=NULL,
mdsmodels=c("ordinal"),weights=NULL,
outputCsv="",outputCsv2="",outDec=",",
stressDigits=6,HHIDigits=2,...)
Arguments
x |
matrix or dataset |
normalizations |
optional, vector of normalization methods that should be used in procedure |
distances |
optional, vector of distance measures (Manhattan, Euclidean, Chebyshev, squared Euclidean, GDM1) that should be used in procedure |
mdsmodels |
"ordinal" (nonmetric MDS) |
weights |
optional, variable weights used in distance calculation. Each weight takes value from interval [0; 1] and sum of weights equals one |
outputCsv |
optional, name of csv file with results |
outputCsv2 |
optional, name of csv (comma as decimal point sign) file with results |
outDec |
decimal sign used in returned table |
stressDigits |
Number of decimal digits for displaying Stress 1 value |
HHIDigits |
Number of decimal digits for displaying HHI spp value |
... |
arguments passed to smacofSym |
Details
Parameter normalizations may be a subset of the following values:
"n1","n2","n3","n3a","n4","n5","n5a","n6","n6a",
"n7","n8","n9","n9a","n10","n11","n12","n12a","n13"
(e.g. normalizations=c("n1","n2","n3","n5","n5a",
"n8","n9","n9a","n11","n12a")).
If normalizations is set to "n0", no normalization is applied.
Parameter distances may be a subset of the following values:
"euclidean","manhattan","maximum","seuclidean","GDM1"
(e.g. distances=c("euclidean","manhattan")).
Parameter mdsmodels takes the value "ordinal" (nonmetric MDS).
Value
Data frame ordered by increasing value of Stress-1 fit measure with columns:
Normalization method |
normalization method used for p-th multidimensional scaling procedure |
MDS model |
"ordinal" MDS model (nonmetric MDS) for p-th multidimensional scaling procedure |
Distance measure |
distance measure used for p-th multidimensional scaling procedure |
STRESS 1 |
value of Kruskal Stress-1 fit measure for p-th multidimensional scaling procedure |
HHI spp |
Hirschman-Herfindahl HHI index calculated based on stress per point for p-th multidimensional scaling procedure |
Author(s)
Marek Walesiak marek.walesiak@ue.wroc.pl, Andrzej Dudek andrzej.dudek@ue.wroc.pl
Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Poland
References
Borg, I., Groenen, P.J.F. (2005), Modern Multidimensional Scaling. Theory and Applications, 2nd Edition, Springer Science+Business Media, New York. ISBN: 978-0387-25150-9. Available at: https://link.springer.com/book/10.1007/0-387-28981-X.
Borg, I., Groenen, P.J.F., Mair, P. (2013), Applied Multidimensional Scaling, Springer, Heidelberg, New York, Dordrecht, London. Available at: doi:10.1007/978-3-642-31848-1.
De Leeuw, J., Mair, P. (2015), Shepard Diagram, Wiley StatsRef: Statistics Reference Online, John Wiley & Sons Ltd.
Dudek, A., Walesiak, M. (2020), The Choice of Variable Normalization Method in Cluster Analysis, pp. 325-340, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.
Herfindahl, O.C. (1950), Concentration in the Steel Industry, Doctoral thesis, Columbia University.
Hirschman, A.O. (1964), The Paternity of an Index, The American Economic Review, Vol. 54, 761-762.
Walesiak, M. (2014), Przegląd formuł normalizacji wartości zmiennych oraz ich własności w statystycznej analizie wielowymiarowej [Data Normalization in Multivariate Data Analysis. An Overview and Properties], Przegląd Statystyczny, tom 61, z. 4, 363-372. Available at: doi:10.5604/01.3001.0016.1740.
Walesiak, M. (2016a), Wybór grup metod normalizacji wartości zmiennych w skalowaniu wielowymiarowym [The Choice of Groups of Variable Normalization Methods in Multidimensional Scaling], Przegląd Statystyczny, tom 63, z. 1, 7-18. Available at: doi:10.5604/01.3001.0014.1145.
Walesiak, M. (2016b), Visualization of Linear Ordering Results for Metric Data with the Application of Multidimensional Scaling, Ekonometria, 2(52), 9-21. Available at: doi:10.15611/ekt.2016.2.01.
Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, STATISTICS IN TRANSITION new series, September, Vol. 18, No. 3, pp. 521-540.
Walesiak, M., Dudek, A. (2020), Searching for an Optimal MDS Procedure for Metric and Interval-Valued Data using mdsOpt R package, pp. 307-324, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.
See Also
data.Normalization, dist.GDM, dist, smacofSym
Examples
library(mdsOpt)
metnor<-c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a")
metscale<-"ordinal"
metdist<-c("euclidean","manhattan","maximum","seuclidean","GDM1")
data(data_lower_silesian)
res<-optSmacofSym_nMDS(data_lower_silesian,normalizations=metnor,
distances=metdist,mdsmodels=metscale)
stress<-as.numeric(gsub(",",".",res[,"STRESS 1"],fixed=TRUE))
hhi<-as.numeric(gsub(",",".",res[,"HHI spp"],fixed=TRUE))
cs<-(min(stress)+max(stress))/2 # critical stress
t<-findOptimalSmacofSym(res,critical_stress=cs)
print(t)
plot(stress[-t$Nr],hhi[-t$Nr], xlab="Stress-1", ylab="HHI",type="n",font.lab=3)
text(stress[-t$Nr],hhi[-t$Nr],labels=(1:nrow(res))[-t$Nr])
abline(v=cs,col="red")
points(stress[t$Nr],hhi[t$Nr], cex=5,col="red")
text(stress[t$Nr],hhi[t$Nr],labels=(1:nrow(res))[t$Nr],col="red")
Creates a video with FFmpeg showing an animation of a rotated dataset
Description
This function opens a graphics device to record the images produced in the code expr, then uses FFmpeg to convert these images to a video.
Usage
rotation2dAnimation(conf2d,
ani.interval=0.2,
ani.nmax=361,
ani.width=500,
ani.height=500,
ani.video.name="mds_rotate.mp4",
angle.start=-pi,
angle.stop=pi,
angle.step=pi/180)
Arguments
conf2d |
two-dimensional dataset or matrix |
ani.video.name |
the file name of the output video (e.g. ‘animation.mp4’ or ‘animation.avi’) |
ani.interval |
interval between animation frames |
ani.nmax |
maximal number of frames |
ani.width |
width of movie |
ani.height |
height of movie |
angle.start |
starting angle for animation |
angle.stop |
end angle for animation |
angle.step |
step of animation in radians |
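The frames of such an animation can be thought of as the same configuration multiplied by a rotation matrix for a sequence of angles. The base R sketch below uses the same matrix layout as the rotation matrix D in the drawIsoquants example; the helper rotate2d and the toy configuration are hypothetical, and no video is produced.

```r
# Toy 2D configuration: 3 points (hypothetical, not an MDS result)
conf <- matrix(c( 1, 0,
                  0, 1,
                 -1, 0), ncol = 2, byrow = TRUE)

# Hypothetical helper: rotate row-vector coordinates by 'angle' radians
rotate2d <- function(conf, angle) {
  D <- matrix(c(cos(angle), -sin(angle),
                sin(angle),  cos(angle)), nrow = 2)  # filled column by column
  conf %*% D
}

# 361 angles from -pi to pi, i.e. a step of pi/180 (one degree) per frame
angles <- seq(-pi, pi, length.out = 361)
frames <- lapply(angles, function(a) rotate2d(conf, a))

length(frames)               # 361, matching the default ani.nmax
round(frames[[271]], 10)     # configuration rotated by pi/2
```

Rotation leaves all interpoint distances unchanged, so every frame is an equally valid MDS solution; the animation only helps to pick an orientation that is easy to interpret.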
Details
This function uses system to call FFmpeg to convert the images to a single video. The command line used in this function is:
ffmpeg -y -r <1/interval> -i <img.name>%d.<ani.type> other.opts video.name
where interval comes from ani.options('interval'), and ani.type is from ani.options('ani.type'). For more details on the numerous options of FFmpeg, please see the reference.
Some Linux systems may use the alternative software 'avconv' instead of 'ffmpeg'. The package will attempt to determine which command is present and set ani.options('ffmpeg') to an appropriate default value. This can be overridden by passing in the ffmpeg argument.
Value
An integer indicating failure (-1) or success (0) of the conversion (refer to system).
Author(s)
Marek Walesiak marek.walesiak@ue.wroc.pl, Andrzej Dudek andrzej.dudek@ue.wroc.pl
Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Poland
References
Walesiak, M. (2016), Visualization of Linear Ordering Results for Metric Data with the Application of Multidimensional Scaling, Ekonometria, 2(52), 9-21. Available at: doi:10.15611/ekt.2016.2.01.
Walesiak, M. (2017), The application of multidimensional scaling to measure and assess changes in the level of social cohesion of the Lower Silesia region in the period 2005-2015, Ekonometria, 3(57), 9-25. Available at: doi:10.15611/ekt.2017.3.01.
Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, STATISTICS IN TRANSITION new series, September, Vol. 18, No. 3, pp. 521-540.
https://yihui.org/animation/example/savevideo/
http://ffmpeg.org/documentation.html
See Also
Other utilities: im.convert, saveGIF, saveHTML, saveLatex, saveSWF
Examples
library(mdsOpt)
library(smacof)
library(animation)
library(spdep)
library(clusterSim)
data(data_lower_silesian)
z<-data.Normalization(data_lower_silesian, type="n1")
d<-dist.GDM(z, method="GDM1")
res<-smacofSym(delta=d,ndim=2,type="interval")
konf<-as.matrix(res$conf)
#Uncomment only if ffmpeg is properly installed for animation package
#see: https://yihui.org/animation/example/savevideo/
#oopts = if (.Platform$OS.type == "windows") {
# ani.options(ffmpeg = "D:/Installer/ffmpeg/bin/ffmpeg.exe")
#}
#rotation2dAnimation(conf2d=konf,angle.start=-0,angle.stop=2*pi)