Tutorial: data analysis

In this tutorial, we are going to create a pipeline for processing data stored in an Experiment.

The tutorial project

In this tutorial we use the same dataset as in the first tutorial (see Tutorial: data management). We then suppose that we already have a created experiment with the imported and annotated data.

In the following of this tutorial, we are going to process the data in 3 steps.

Image deconvolution on each image to ease the spot segmentation
Auto thresholding and particle analysis on the deblurred and denoised images
Statistical testing with the Wilcoxon test to conclude if the two populations have significantly different number of spots

The 3 proposed steps are just one possible way to analyse the data. The purpose here is just to illustrate how to use the BioImageIT app. Many other processing pipeline are possible to analyse this dataset, but it is not the purpose of this tutorial.

First, we open the BioImageIT application:

Image deconvolution

To ease the spot segmentation, we chose to preprocess the data with a deconvolution algorithm. The selected algorithm is the Spitfire2D. It is a c++ implementation of a sparse variation deconvolution method.

Click on the Toolboxes button main application top bar:

Then click on the Deconvolution toolbox:

And click on the Spitfire 2D tool open button:

We can now run the tool on the tutorial Experiment. Select the tutorial experiment in the “Experiment” field. When the experiment is recognised, the field Input image is automatically filed with the data dataset which is the only dataset we have in our experiment.

Then we need to setup the deconvolution parameters. This task can be done by trial and error. In this example we previously selected the best parameters as:

Press Run and wait the process to finish:

We can now open the experiment from the home page:

And select the spitfiredeconv2d dataset

We can visualize the obtained result by clicking the View button of an image:

we can see that the sports are now easy to distinguish from the background. The metadata button of each image show the metadata of the image and the details of it origin (raw data annotations and run information).

Spot detection

After deconvolution, the spots are easy to detect on the images. We can simply threshold the image and count the number of independent component in the binary map. BioImageIT wrap a Fiji macro that runs an auto-threshold and the analyse particles tool. This is exactly what we need here.

Open the Toolboxes:

Click on the Spots detection toolbox.

Open the Count particles tool:

In the experiment field, select the tutorial experiment, and for the input image field select the deconvolution image from the previous process: spitfiredeconv2d:Denoised image

Press Run and wait for the process to finish:

We can now go back to the experiment editor tab, and press the refresh button for the new dataset threshold particles to appear.

We can see that we have 3 new data per image: count, measure, draw. count is the number of spot in the image. It is the output of interest for our problem. measure is a table with properties of the spots and draw is a representation of the spot localisation.

If we click on the view button of the count data, the viewer shows the number of spot for this image:

And clicking on the view button of the count data shows the localization of the detected spots:

Statistical testing

In the previous processing step, we extracted the number of spots for each image. This number is contained in the count data file for each image. In this step we are going to run a statistical testing on these number in order to measure if the Population1 and Population2 data have significant different numbers of spots.

To illustrate the use of statistical testing with BioImageIT, we chose in this tutorial to run a Wilcoxon rank test. This is not the best test for such statistical analysis, but the purpose of the tutorial is to show how to run tools, and Wilcoxon rank test is a simple easy to use example.

Go back to the toolboxes tab of the BioImageIT app,

and select the statistics toolbox:

Open the Wilcoxon tool:

Select the tutorial experiment in the Experiment field.

The Wilcoxon tool have too inputs: Population1 and Population2. These two inputs are in fact arrays of values corresponding to the two populations we want to process. In most of the existing applications, to construct such arrays, we need to write a script that read the values (number of spot) for each image, create the two arrays and run the statistical test.

Because in BioImageIT, we annotated the data, we can simply use Filter to automatically generate the data arrays.

For the Population1 and Population2, select the line threshold_particles:Number Of Particles (see figure above).

Now, we need to specify that for Population1 we want to select the images with the corresponding key-value pair: Population=population1. Click on the Filter button at the right of the Population1 input. It opens a popup window where you can tune a filter. Here we select the data where the key Population equals “population1”

When we validate, the filters status changes to ON.

Then, we do the same for the second population:

and validate:

Press the Run button:

We can now go to the experiment editor tab, press Refresh on the to toolbar and select the Wilcoxon dataset:

We can now see the Wilcoxon dataset contains 2 data:

t: the Wilcoxon statistic
p: the p-value

Click the view button of the p-value data:

We can read that the p-value equals 0.0075. This means that we can reject the null hypothesis saying that the 2 populations have the same number of spots.

Note

During the step, we mention that BioImageIT created two arrays from the dataset threshold_particles:Number Of Particles using the Filters that we tuned with the experiment annotations. In fact, these arrays are stored in the output dataset. Thus, if we open the directory path/to/tutorial/Wicoxon/ we can find the file x.csv and y.csv that actually contain these two arrays.

Conclusion

In this tutorial we saw how to use the BioImageIT app, to build step by step an image analysis pipeline without writing a single line of code.

All the data we generated are stored in an Experiment database with automatically generated metadata. This means that for every data in the Experiment database, we can track it origin and the parameters of each processing tool used to generate it.