Same Same Blog: Merging Datasets into a matrix for analysis

I have n samples (datasets), each of which gives me a set of variables and their associated values.
In the case of mass spec data, each variable is a formula (a string) and its value is the relative abundance of that formula (a float).

Each sample may contain a thousand or more formulae.

Across all the samples, some formulae will be common to some or all samples, and others will be unique to individual samples.

Examples shown at end of post.

Is there an easy way to merge all of these into a single table that can then be imported into R or Python for analysis?

(Sidenote: I wrote a script to do this with masses and intensities, but it's not particularly efficient, and it's not easily modified to work on strings.)

Help!

Thanks :)

Example:
From four CSVs with a structure like this:

To one data table with a structure like this:

Sample ID	C10H20O2	C11H22O2	C10H20O3	C11H22O3
Sample1	0	15	26	3
Sample2	19	88	29	0
Sample3	54	0	66	0
Sample4	30	32	0	0
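One possible way to build a table like the one above is sketched here in plain Python. It is only a sketch under some assumptions: each sample CSV is assumed to have a header row followed by two columns (formula, relative abundance), and the names `read_sample`, `merge_samples` and `sample_csvs` are made up for illustration. The in-memory CSV text stands in for real files and reuses the numbers from the table. Columns come out sorted alphabetically by formula, so their order differs from the table above, and a formula missing from a sample gets 0.

```python
import csv
import io
import sys


def read_sample(handle):
    """Read one sample's CSV into a {formula: abundance} dict.

    Assumes a header row, then rows of (formula, relative abundance).
    """
    reader = csv.reader(handle)
    next(reader)  # skip the assumed header row
    return {formula: float(abundance) for formula, abundance in reader}


def merge_samples(samples):
    """Turn {sample_id: {formula: abundance}} into a header and rows.

    Every formula seen in any sample becomes a column; a formula that a
    sample does not contain is filled with 0.0.
    """
    formulae = sorted({f for values in samples.values() for f in values})
    header = ["Sample ID"] + formulae
    rows = [
        [sample_id] + [values.get(f, 0.0) for f in formulae]
        for sample_id, values in samples.items()
    ]
    return header, rows


# Hypothetical stand-ins for four CSV files, using the numbers from the table.
sample_csvs = {
    "Sample1": "formula,abundance\nC11H22O2,15\nC10H20O3,26\nC11H22O3,3\n",
    "Sample2": "formula,abundance\nC10H20O2,19\nC11H22O2,88\nC10H20O3,29\n",
    "Sample3": "formula,abundance\nC10H20O2,54\nC10H20O3,66\n",
    "Sample4": "formula,abundance\nC10H20O2,30\nC11H22O2,32\n",
}

samples = {
    sample_id: read_sample(io.StringIO(text))
    for sample_id, text in sample_csvs.items()
}
header, rows = merge_samples(samples)

# Write the merged matrix as tab-separated text.
writer = csv.writer(sys.stdout, delimiter="\t", lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
```

With real files, the `io.StringIO(text)` part would be replaced by an opened file, for example `open(path, newline="")`. Because formulae are used as dictionary keys, the same approach works for strings as well as for masses.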


Friday, 4 March 2016
