Title:  Using the Real Dimension of the Data 
Authors:  Christian Zirkelbach 
Series:  Linköping Electronic Articles
in Computer and Information Science, ISSN 1401-9841 
Issue:  Vol. 5 (2000), No. 004 
URL:  http://www.ep.liu.se/ea/cis/2000/004/ 
Abstract:  This paper presents a method for extracting the real dimension
of a large data set in a high-dimensional data cube and indicates its use
for visual data mining. A similarity measure structures a data set in a
general, but weak sense. If the elements are part of a high-dimensional
host space (primary space), for instance a data warehouse cube, the resulting
structure doesn't necessarily reflect the real dimension of the embedded
(secondary) space. We show that a metric-structured set has, in general,
a fractal dimension. This means that the data set is a finite subset of
a fractal secondary space of lower dimension.
Mapping the set into the secondary space of lower dimension will not result in loss of information with regard to the semantics defined by the measure. However, it helps to reduce storage and computing efforts. Additionally, the secondary space itself reveals much about the set's structure and can facilitate data mining. The main problem with the secondary space is that it is unknown, and
if it is not a linear subspace of
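
As an illustration of the idea that a data set in a high-dimensional host space can have a much lower intrinsic dimension, the following minimal sketch estimates a correlation dimension from pairwise distances alone. This is a standard textbook-style estimate and is not claimed to be the method of the paper; the function name `correlation_dimension`, the choice of radii, and the synthetic test data are assumptions made purely for illustration.

```python
import math
import random


def correlation_dimension(points, radii):
    """Estimate a correlation dimension from pairwise Euclidean distances.

    For each radius r, compute the fraction C(r) of point pairs closer
    than r, then fit the slope of log C(r) against log r.
    (Illustrative sketch only; not the paper's algorithm.)
    """
    n = len(points)
    pair_d = [
        math.dist(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    xs, ys = [], []
    for r in radii:
        frac = sum(d < r for d in pair_d) / len(pair_d)
        if frac > 0:  # skip radii with no pairs; log(0) is undefined
            xs.append(math.log(r))
            ys.append(math.log(frac))
    # Ordinary least-squares slope of log C(r) versus log r.
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical data: 400 points on a straight segment embedded in a
# 10-dimensional host space (the "primary space" in the abstract's terms).
rng = random.Random(0)
scale = 1.0 / math.sqrt(10)
line_points = [[rng.random() * scale] * 10 for _ in range(400)]

# Radii spanning one decade, from 0.01 to 0.1 (an arbitrary choice).
radii = [0.01 * 10 ** (k / 7) for k in range(8)]

dim_estimate = correlation_dimension(line_points, radii)
print(f"estimated dimension: {dim_estimate:.2f}")
```

Although the points have ten coordinates, the estimate should come out close to 1, since only distances between points are used and the points lie along a one-dimensional segment.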


First posting 2000-03-08 
In ETAI area "Concept Based Knowledge Representation" 

Original publication 2000-12-05 
This article was first posted on the Internet as specified under "First posting", and appeared on the E-press server on the date specified under "Original publication".