Tidy data is a standardized way of organizing data that makes it easier to analyze. Many applications expect tidy data so that they can create plots and run analyses in a standardized way. Sphinx helps you create tidy data both during the data import process and by applying transformations to your data in an analysis.

The key principles of tidy data are:

Each variable forms a column: Each measured variable is placed in its own column.
Each observation forms a row: Each observation, with the values of all its variables, is placed in its own row.
Each type of observational unit forms a table: Each kind of thing being observed is organized into its own table.

Messy vs Tidy Data

This table contains messy data, because the observations for each wavelength and sample type are split across multiple columns. The experimental variables here, wavelength and condition, are encoded in the column names instead of being stored in columns of their own.

ID	ABS_280_Control	ABS_320_Control	ABS_280_Sample	ABS_320_Sample
1	0.77
2		0.62
3			0.55
4				0.53

Here are the data in a tidy format. Notice that the variables encoded in column names like ABS_280_Control are now represented in their own columns, obtained by splitting each name on the underscore (_).

ID	Condition	Wavelength	Absorbance
1	Control	280	0.77
2	Control	320	0.62
3	Sample	280	0.55
4	Sample	320	0.53
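
The reshaping above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Sphinx performs the transformation. The function name tidy_rows and the use of None for empty cells are assumptions made for this example.

```python
# Messy rows from the table above; None marks an empty cell (an assumption
# made for this sketch).
messy_rows = [
    {"ID": 1, "ABS_280_Control": 0.77, "ABS_320_Control": None,
     "ABS_280_Sample": None, "ABS_320_Sample": None},
    {"ID": 2, "ABS_280_Control": None, "ABS_320_Control": 0.62,
     "ABS_280_Sample": None, "ABS_320_Sample": None},
    {"ID": 3, "ABS_280_Control": None, "ABS_320_Control": None,
     "ABS_280_Sample": 0.55, "ABS_320_Sample": None},
    {"ID": 4, "ABS_280_Control": None, "ABS_320_Control": None,
     "ABS_280_Sample": None, "ABS_320_Sample": 0.53},
]


def tidy_rows(rows, id_column="ID"):
    """Turn one-column-per-condition rows into one record per observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            # Skip the identifier column and cells with no measurement.
            if column == id_column or value is None:
                continue
            # "ABS_280_Control" -> measurement prefix, wavelength, condition
            _, wavelength, condition = column.split("_")
            tidy.append({
                "ID": row[id_column],
                "Condition": condition,
                "Wavelength": int(wavelength),
                "Absorbance": value,
            })
    return tidy


tidy = tidy_rows(messy_rows)
for record in tidy:
    print(record)
```

Each printed record corresponds to one row of the tidy table: the wavelength and condition that were hidden in the column name become values in their own columns.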