Data Regions
Sphinx models external data as structured units called data regions, each of which holds both the core data and its descriptive metadata. A data region consists of a central value region bounded by label regions. The label regions sit at the top, bottom, left, or right of the value region and supply the positional information that maps multidimensional data onto rows and columns.
Key Components of How We Think About Data
- Data Region: The primary structural unit of a dataset, composed of a Value Region surrounded by Label Regions. The data region integrates core data and associated metadata, facilitating the organization of information within the dataset.
- Value Region: The core of the data region, where the primary data (quantitative or qualitative) is stored. The surrounding label regions shape the structure of the value region, which ranges from matrix-like (symmetrically structured) to table-like (asymmetrically structured).
- Label Region: The label region describes the value region by providing the necessary contextual information, such as categorical labels or metadata, that assigns meaning to the stored data. These labels are critical for interpreting the relationships between data points within the value region.
- Varied Size: A data region can exhibit arbitrary size variations, depending on the nature of the dataset and its instances. Data regions must extend in orthogonal directions relative to the label regions. When size variation occurs, the expansion of the data region is governed by whether the variation crosses over a label region, which determines the need for more complex handling.
- Data Layout: The number of label regions, their positions, and whether the region may vary in size together define the data layout. A table is a layout with one top label region. A matrix is a layout with one or more top label regions and one or more left label regions. A place is a special case of a matrix with exactly one top label region and exactly one left label region.
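The layout definitions above can be sketched as a small classifier. This is only an illustration, not Sphinx's implementation: the `DataLayout` class, its field names, and the `classify_layout` function are hypothetical, and bottom and right label regions are left out because the definitions above mention only top and left regions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataLayout:
    """Hypothetical description of a data layout (not a Sphinx type)."""
    top_label_regions: int = 0    # number of label regions above the value region
    left_label_regions: int = 0   # number of label regions left of the value region
    varied_size: bool = False     # whether the data region may vary in size


def classify_layout(layout: DataLayout) -> str:
    """Name the layout using the table / matrix / place definitions."""
    top, left = layout.top_label_regions, layout.left_label_regions
    if top == 1 and left == 1:
        return "place"    # special case of a matrix: exactly one top, one left
    if top >= 1 and left >= 1:
        return "matrix"   # one or more top and one or more left label regions
    if top == 1 and left == 0:
        return "table"    # a single top label region, no left label region
    return "unclassified"  # combinations the definitions do not cover


print(classify_layout(DataLayout(top_label_regions=1)))
print(classify_layout(DataLayout(top_label_regions=2, left_label_regions=1)))
print(classify_layout(DataLayout(top_label_regions=1, left_label_regions=1)))
```

The place check runs before the matrix check because every place also satisfies the matrix condition.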
You can see examples of the data regions that Sphinx can import in the Data Import section. Sphinx uses this data region model to structure data and convert it into Tidy Data.
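To make the idea concrete, here is a minimal sketch of how a data region with one top and one left label region could be reshaped into tidy records, one observation per row. It is an assumption-laden illustration rather than Sphinx's import logic: the function name `region_to_tidy`, the column names, and the sample data are all invented for the example.

```python
def region_to_tidy(top_labels, left_labels, values):
    """Pair every value with the left label of its row and the top label of its column.

    top_labels:  labels from the top label region, one per column
    left_labels: labels from the left label region, one per row
    values:      the value region as a list of rows
    """
    if len(values) != len(left_labels):
        raise ValueError("expected one left label per row of values")
    records = []
    for left, row in zip(left_labels, values):
        if len(row) != len(top_labels):
            raise ValueError("expected one top label per column of values")
        for top, value in zip(top_labels, row):
            records.append({"left_label": left, "top_label": top, "value": value})
    return records


# Invented sample region: regions down the left, years across the top.
tidy = region_to_tidy(
    top_labels=["2023", "2024"],
    left_labels=["North", "South"],
    values=[[10, 12], [7, 9]],
)
for record in tidy:
    print(record)
```

Each cell of the value region becomes one record, so a region with two rows and two columns yields four records.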