Data and File Layouts
The SphinxBio team appreciates that not all data are in the form of a simple table. Instruments, data pipelines, and collaborators may use other formats that make it hard to use common analysis tools.
This page presents an overview of formats we have seen and how to upload them to Sphinx. Our reference data for these examples is this tidy table that presents measurements for two genes across multiple samples.
We believe we can upload many formats of data. If you don’t see an example that matches your data, please contact Support and we will work with you to get your data into Sphinx.
Table
A table is the most common unit of data. Tables are how Datasets are presented in Sphinx and are the preferred mode of data import because they are the fastest and simplest to upload.
Transposed
Transposed tables have the measured variables as fields in the first, leftmost column.
How to handle
When uploading data in this format, simply select the “unpivot” option upon upload.
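As a rough illustration of the reshaping involved, here is a minimal sketch in plain Python. It assumes that unpivoting a transposed table amounts to swapping its rows and columns. The sample names, gene names, values, and the `unpivot_transposed` function are all hypothetical, and this is not Sphinx's implementation.

```python
# Hypothetical transposed table: each row is a variable, each column a sample.
transposed = [
    ["sample", "S1", "S2", "S3"],
    ["gene_a", 1.2, 3.4, 5.6],
    ["gene_b", 0.7, 0.9, 1.1],
]


def unpivot_transposed(rows):
    """Swap rows and columns so each variable becomes a column header."""
    return [list(column) for column in zip(*rows)]


table = unpivot_transposed(transposed)
for row in table:
    print(row)
```

After the swap, the first row holds the variable names and each following row holds one sample.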
Stacked column (horizontal)
Stacked columns are a set of tables placed next to each other. They do not form one table because the rows and columns contain different observations and variables.
How to handle
When uploading data in this format, you will need to select each data region individually. Once the regions are selected, you can define whether they need to be stacked on top of each other or joined together. Data in this format are often unrelated across stacked regions and require a stack.
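To make the idea of selecting regions and stacking them concrete, here is a minimal sketch in plain Python. The sheet contents, the column bounds, and the `select_region` and `stack` helpers are hypothetical. Filling missing columns with `None` is an assumption about how unrelated regions could be combined, not a description of what Sphinx does internally.

```python
# Hypothetical sheet: two unrelated tables side by side, separated by a blank column.
sheet = [
    ["sample", "gene_a", None, "standard", "conc_ng_ul"],
    ["S1", 1.2, None, "STD1", 10],
    ["S2", 3.4, None, "STD2", 20],
]


def select_region(rows, first_col, last_col):
    """Cut out a block of columns (inclusive bounds) and return it as records."""
    header = rows[0][first_col:last_col + 1]
    return [dict(zip(header, row[first_col:last_col + 1])) for row in rows[1:]]


def stack(*regions):
    """Place regions on top of each other; columns a record lacks become None."""
    columns = []
    for region in regions:
        for record in region:
            for key in record:
                if key not in columns:
                    columns.append(key)
    return [
        {column: record.get(column) for column in columns}
        for region in regions
        for record in region
    ]


left = select_region(sheet, 0, 1)
right = select_region(sheet, 3, 4)
stacked = stack(left, right)
for record in stacked:
    print(record)
```

A join would instead match records from the two regions on a shared key, which only makes sense when the regions describe the same observations.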
Stacked row (vertical)
Stacked rows are a set of transposed tables stacked on top of each other. They do not form one transposed table because the rows and columns contain different observations and variables.
How to handle
When uploading data in this format, you will need to select each data region individually. After selection, you will need to unpivot the data to put them into a normal tabular format. Once the regions are selected, you can define whether they need to be stacked on top of each other or joined together. Data in this format are often unrelated across stacked regions and require a stack.
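The sequence of selecting, unpivoting, and stacking can be sketched in plain Python as follows. The sheet contents, the row bounds, and the `select_rows` and `unpivot` helpers are hypothetical, and the sketch assumes that unpivoting a transposed region means swapping its rows and columns. It is an illustration only, not Sphinx's implementation.

```python
# Hypothetical sheet: two transposed tables on top of each other,
# separated by a blank row.
sheet = [
    ["sample", "S1", "S2"],
    ["gene_a", 1.2, 3.4],
    [None, None, None],
    ["standard", "STD1", "STD2"],
    ["conc_ng_ul", 10, 20],
]


def select_rows(rows, first_row, last_row):
    """Cut out a block of rows (inclusive bounds)."""
    return rows[first_row:last_row + 1]


def unpivot(region):
    """Swap rows and columns, then use the first row as the header."""
    header, *body = [list(column) for column in zip(*region)]
    return [dict(zip(header, row)) for row in body]


top = unpivot(select_rows(sheet, 0, 1))
bottom = unpivot(select_rows(sheet, 3, 4))

# Stack: the two regions are unrelated, so their records are simply appended.
stacked = top + bottom
for record in stacked:
    print(record)
```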
Matrix
A matrix has the variables for the experiment in the leftmost column and in the top row. Often these are the positions on a well plate, but they may be values for a time course, kinetic measurement settings, or other varying settings for the experiment.
How to handle
When uploading data in this format, you will need to indicate that these data are a plate.
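For readers who want to see what turning a plate matrix into a table looks like, here is a minimal sketch in plain Python. The plate values, the output field names (`row`, `column`, `well`, `value`), and the `matrix_to_records` function are hypothetical. The fields Sphinx actually produces may differ.

```python
# Hypothetical 2 x 3 plate: row letters down the left, column numbers across the top.
plate = [
    [None, 1, 2, 3],
    ["A", 0.11, 0.12, 0.13],
    ["B", 0.21, 0.22, 0.23],
]


def matrix_to_records(matrix, value_name="value"):
    """Turn a matrix into one record per cell, keyed by its row and column labels."""
    column_labels = matrix[0][1:]
    records = []
    for row in matrix[1:]:
        row_label, *values = row
        for column_label, value in zip(column_labels, values):
            records.append({
                "row": row_label,
                "column": column_label,
                "well": f"{row_label}{column_label}",
                value_name: value,
            })
    return records


records = matrix_to_records(plate)
for record in records:
    print(record)
```

Each cell of the matrix becomes one row of the resulting table, so a 2 x 3 plate yields six records.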
Stacked matrix
A stacked matrix is a special case of a matrix, where multiple matrices are placed sequentially on top of each other or side by side.
How to handle
When uploading data in this format, you will need to select each data region individually. After selection, you will need to indicate that these data are a plate.
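The sketch below shows one way to think about this in plain Python for vertically stacked matrices. It assumes that the matrices are separated by fully blank rows, which stands in for selecting each region by hand. The sheet values, the `plate` field, and the `split_on_blank_rows` and `melt` helpers are hypothetical, and this is not Sphinx's implementation.

```python
# Hypothetical sheet: two 2 x 2 plate matrices stacked vertically,
# separated by a blank row.
sheet = [
    [None, 1, 2],
    ["A", 0.11, 0.12],
    ["B", 0.21, 0.22],
    [None, None, None],
    [None, 1, 2],
    ["A", 0.31, 0.32],
    ["B", 0.41, 0.42],
]


def split_on_blank_rows(rows):
    """Group consecutive non-blank rows into regions."""
    regions, current = [], []
    for row in rows:
        if all(cell is None for cell in row):
            if current:
                regions.append(current)
                current = []
        else:
            current.append(row)
    if current:
        regions.append(current)
    return regions


def melt(matrix, plate):
    """Turn one matrix into one record per cell, tagged with its plate number."""
    columns = matrix[0][1:]
    return [
        {"plate": plate, "row": row[0], "column": column, "value": value}
        for row in matrix[1:]
        for column, value in zip(columns, row[1:])
    ]


records = [
    record
    for plate, region in enumerate(split_on_blank_rows(sheet), start=1)
    for record in melt(region, plate)
]
for record in records:
    print(record)
```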
Multi column or row matrix
A multi-column or multi-row matrix is one where multiple experimental variables are in the rows or columns of the matrix. This is often seen with instrument outputs. You can identify this format by the presence of multiple rows or columns surrounding the measured data. In this example the top two rows show it is a multi-row matrix.
In this example the left two columns show it is a multi-column matrix.
How to handle
When uploading data in this format, you will need to select the data region as a matrix, and then indicate whether it has multiple left, right, top, or bottom headers.
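As an illustration of how header counts drive the reshaping, here is a minimal sketch in plain Python that handles top and left headers only. The matrix values, the variable names (`time`, `wavelength`, `well`), and the `multi_header_to_records` function with its parameters are hypothetical. Right and bottom headers would need analogous handling, and this is not Sphinx's implementation.

```python
# Hypothetical multi-row matrix: two header rows on top (time, wavelength)
# and one header column on the left (well).
matrix = [
    [None, "0 min", "0 min", "10 min", "10 min"],
    [None, "450 nm", "600 nm", "450 nm", "600 nm"],
    ["A1", 0.10, 0.20, 0.15, 0.25],
    ["A2", 0.30, 0.40, 0.35, 0.45],
]


def multi_header_to_records(rows, n_top, n_left, top_names, left_names):
    """Produce one record per measured cell, labelled by every header that applies."""
    header_rows = rows[:n_top]
    records = []
    for row in rows[n_top:]:
        left = dict(zip(left_names, row[:n_left]))
        for j in range(n_left, len(row)):
            top = {name: header_rows[i][j] for i, name in enumerate(top_names)}
            records.append({**left, **top, "value": row[j]})
    return records


records = multi_header_to_records(
    matrix,
    n_top=2,
    n_left=1,
    top_names=["time", "wavelength"],
    left_names=["well"],
)
for record in records:
    print(record)
```

Every measured value ends up in its own row, with one column per header level.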
Unsupported Formats
Formats listed here are currently not supported in a single upload. Contact Support and we will help you build the correct import template.
Multi index column
A multi index column has multiple rows that identify information in a column. These can be in a normal or a transposed table.
Multi index row
A multi index row has multiple columns that identify information in a row. These can be in a normal or a transposed table.
Matrix with top meta
A matrix with top meta is a combination of a key:value block or a table with one or more matrices sequentially stacked on top.