Today I am going to show one of the functions that I use quite extensively in my work. It's plotLineHist, and this is what it produces:
Well, what the heck is that? plotLineHist takes a cell array C, a matrix M, a function handle F, and several optional arguments. The cell array C is n-by-m: think of each column as one condition of an experimental factor (the horizontal one in the plot), and of each row as one condition of a second, optional factor. The matrix M specifies the values of the horizontal factor (λ in the figure), one per column of C. In the figure, each row's condition is represented by a different colour.
plotLineHist executes the function F (for example, @mean) on each cell of the cell array, plots the resulting values, and connects the values belonging to the same row (continuous lines in the figure). It also calculates the standard error for each cell (vertical line on each marker).
However, the novelty is that it also plots the frequency distribution of each condition, aligned vertically at the condition's value of the horizontal factor. The distribution's values now correspond to the vertical axis of the figure.
The distributions of each row have a different colour. The first row's distributions are plotted on the left side, the others on the right (of course, having more than 3 rows will make the plot difficult to read, but it may still be useful in those cases).
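To make the input convention concrete, here is a minimal sketch of the kind of per-cell summary the function plots (illustrative only, not plotLineHist's actual implementation; the data are made up):

    % Illustrative only: per-cell summary and standard error, as described above.
    C = {randn(1,20)+1, randn(1,20)+2, randn(1,20)+3;   % row 1: first level of the colour factor
         randn(1,20)+2, randn(1,20)+3, randn(1,20)+4};  % row 2: second level
    M = [100 200 300];                                  % x values, one per column of C
    F = @mean;                                          % summary statistic

    vals = cellfun(F, C);                               % one summary value per cell
    sem  = cellfun(@(x) std(x)/sqrt(numel(x)), C);      % standard error per cell

    figure; hold on;
    for r = 1:size(C,1)
        errorbar(M, vals(r,:), sem(r,:), '-o');         % one line per row, as plotLineHist draws
    end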
If this still sounds convoluted, I am sure the worked example below will give you a better idea.
Let's say that we are measuring the effect of a drug on cortisol levels in a sample of 8 participants. We have just one factor, the horizontal one, which is drug dose: let's say 100 mg, 200 mg and 300 mg. The cortisol levels are found to be:
Patient N. | 100mg | 200mg | 300mg |
---|---|---|---|
1 | 10 | 15 | 20 |
2 | 12 | 16 | 23 |
3 | 13 | 14 | 15 |
4 | 13 | 14 | 15 |
5 | 20 | 17 | 20 |
6 | 50 | 30 | 21 |
7 | 12 | 15 | 19 |
8 | 14 | 14 | 23 |
Since we have only one factor, we need to put all the data in the first row of the cell array fed into the plotLineHist function:
    %example 1
    p{1,1}=[10 12 13 20 50 12 14 12];
    p{1,2}=[15 16 14 17 30 15 14 16];
    p{1,3}=[20 23 15 20 21 19 23 23];
    drugDose=[100 200 300];
    fun=@nanmean;
    plotLineHist(p, drugDose, fun);
The blue line with the markers indicates the mean response for each condition (with standard error). But we also get a nice plot of the distribution for each condition! This allows us to spot some clear outliers in the first two distributions.
Now, let's say that we have two factors. For example, sometimes the drug is taken together with another substance, and sometimes it is not. The drug is still given in 3 different dosages: 100, 200 and 300 mg.
    %example 2
    %without substance A
    p{1,1}=[10 12 13 20 50 12 14 12];
    p{1,2}=[15 16 14 17 30 15 14 16];
    p{1,3}=[20 23 15 20 21 19 23 23];
    %with substance A
    p{2,1}=[12 12 12 21 60 13 11 15];
    p{2,2}=[16 16 14 18 31 19 18 17];
    p{2,3}=[22 23 17 20 21 20 23 23];
    drugDose=[100 200 300];
    fun=@nanmean;
    plotLineHist(p, drugDose, fun)
You can see how the distributions of the second row's conditions are positioned on the right side, so that the two distributions can be easily compared. Note how most of the nitty-gritty details, such as bin width and scaling factors, are calculated automatically by the function. However, you have the chance to set them manually through the optional arguments.
For me, this has been a very useful plot to summarize different types of information at the same time, without having a lot of figures.
Try it out and let me know!
This is a little thing that I did some time ago and never really got the time to polish and publish the way I wanted, but it is still nice to put it out there.
This script generates X diffusion decision processes with some parameters and transforms them into music. It answers the question that, for sure, every one of you is asking: what does a diffusion process sound like? And the answer is: horrible! As expected 😉
To code it, I used Ken Schutte's nice library to read and write MIDI in MATLAB.
In the code, you can change the number of "voices" used (the number of diffusion processes), or the distribution of drift rate, starting tone, starting time, and length for each voice. Play around with these parameters and see if you can come up with anything reasonable. I couldn't!
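To give an idea of the mapping, here is a rough sketch (not the actual script) of how a single diffusion process can be turned into notes with Ken Schutte's toolbox. I am assuming the usual note-matrix layout of that library, [track channel pitch velocity start end]; check its documentation if your version differs.

    % Rough sketch: one diffusion process mapped to a sequence of MIDI notes.
    drift     = 0.5;                                % drift rate of the process
    nSteps    = 32;                                 % number of notes for this voice
    startTone = 60;                                 % starting MIDI pitch (middle C)
    noteDur   = 0.25;                               % duration of each note, in seconds

    walk  = cumsum(drift + randn(1,nSteps));        % the diffusion process
    pitch = round(startTone + walk);                % map the process value to a pitch
    pitch = min(max(pitch,1),127);                  % keep pitches in the valid MIDI range

    onset = (0:nSteps-1)' * noteDur;                % note start times
    N = [ones(nSteps,1), ones(nSteps,1), pitch(:), ...
         80*ones(nSteps,1), onset, onset+noteDur];  % [track channel pitch velocity start end]

    midi = matrix2midi(N);                          % build the MIDI structure (Ken Schutte's toolbox)
    writemidi(midi, 'diffusion.mid');               % write it to disk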
The script also generates a figure representing the resulting voices (again thanks to Ken's library), which may look like this:
As usual, you can download the file from here.
In this post I will present the code I've written to generate Quantile Probability Plots. You can download the code from the MATLAB File Exchange website.
What are these Quantile Probability Plots? They are plots widely used in psychology, especially in RT experiments, when we want to analyse several distributions from several subjects in different conditions. First of all, let's see an example of a quantile probability plot from the literature:
(from Ratcliff and Smith, 2011)
As explained in Simen et al., 2009:
"Quantile Probability Plots (QPP) provide a compact form of representation for RT and accuracy data across multiple conditions. In a QPP quantiles of a distribution of RTs of
a particular type (e.g., crrect responses) are plotted as a function of proportion of responses of that type: thus, a vertical column of N markers would be centered above the position 0.8 if N quantiles were computed from the correct RTs in a task condition in which accuracy was 80%). The ith quantile of each distribution is then connected by a line to the ith quantiles of othe distribution."
For example, in the graph presented we have four conditions. From the graph we can read the proportion of correct responses to be around 0.7, 0.85, 0.9 and 0.95 (look at the right side). For each one of these distributions we compute 5 quantiles, plotted against the y axis (in ms). On the left side we have the distributions from the same conditions, but for the error responses!
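To make the construction concrete, here is a minimal sketch of how the coordinates of the correct-response side of a QPP can be computed (illustrative only, with simulated data; this is not the code of quantProbPlot, and quantile requires the Statistics Toolbox):

    % Illustrative only: QPP coordinates (correct responses) from simulated data.
    probs   = [0.1 0.3 0.5 0.7 0.9];                     % the 5 quantiles to compute
    pCorr   = [0.70 0.85 0.90 0.95];                     % accuracy in the four conditions
    nTrials = 500;

    qCorr = zeros(numel(probs), numel(pCorr));
    figure; hold on;
    for c = 1:numel(pCorr)
        correct = rand(nTrials,1) < pCorr(c);            % simulated accuracy
        rt = 300 + 200*abs(randn(nTrials,1));            % simulated RTs (ms)
        qCorr(:,c) = quantile(rt(correct), probs);       % quantiles of the correct RTs
        plot(mean(correct)*ones(size(probs)), qCorr(:,c), 'ko'); % column of markers at x = P(correct)
    end
    plot(pCorr, qCorr', 'b-');                           % connect the ith quantile across conditions
    xlabel('response proportion'); ylabel('RT quantile (ms)');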
With my code you can easily generate this kind of graph and even something more. Let's see how to use it.
First of all, you need to organize the data in this way: the first column has to be the dependent variable (for example, reaction times in our case); the second column the correct or incorrect label (1 for correct, 0 for incorrect); the third column the condition (any float/integer number, it does not need to be in order). This could be enough, but most of the time you want to calculate the average across more than one subject. If this is the case, you need to add a fourth column with the subject number.
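For example, a small dataset with two subjects and two conditions could be assembled like this (toy numbers, just to make the column order explicit; a real dataset would of course have many more trials):

    % Columns: [RT(ms), correct(1/0), condition, subject]
    data = [ 412 1 1 1;
             530 0 1 1;
             488 1 2 1;
             397 1 1 2;
             602 0 2 2;
             451 1 2 2 ];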
The classic way to use it is to just call it with the data:
quantProbPlot(data) will generate:
We may want to add some labels to indicate the conditions. In this case we can use the optional argument 'condLabel': quantProbPlot(data,'condLabel',1) will generate:
(for this and the following plots, the distributions will always look different simply because each time I generated different sample distributions!)
This is only the beginning. In some papers the classic QPP is plotted with a superimposed scatter plot of the individual RTs in each condition. Random noise is added on the x axis to improve readability. Example: quantProbPlot(data, 'scatterPlot',1):
Nice, isn't it? I also worked out some strategies to better compare error responses with correct responses, through two optional parameters. One is "separate" and the other one is "reverse". Separate can take 0, 1 or 2, whereas reverse can only take 0 or 1 and works only if separate is >0.
Generally, separate=1 separates the correct responses from the incorrect ones in two subplots, whereas separate=2 plots the correct and incorrect responses as two different lines. On its own this is usually quite useless, for example:
However, it becomes much more interesting when you combine it with reverse. In fact, if you call quantProbPlot(quantData,'separate',2,'reverse',1) you will obtain this nice graph:
which allows you to easily compare correct and incorrect responses!
With all these options, you can play around with scatter plots, separation, labels, etc. in order to easily analyse your data.
In the zip file I also include the Drift Diffusion Model file that I proposed last time. I used it to generate the dataset with which I test the Quantile Probability Plot code. For some examples, open "testQuantProbPlot", also included in the zip file.
In psychology we often deal with data from several subjects, each subject having several datapoints across more than one condition. Showing the results of an experiment for individual subjects, in order to get a meaningful understanding of the data, can be quite tricky. A classic way of organizing the data consists of plotting each subject's results in a subplot, in order to have all the subjects' data in a single figure. The MATLAB function subplot is quite good for the task, even if more and more often I find myself using the tight_subplot function. Anyway, one problem is that the subplot axes do not have the same scale by default, and this decreases the readability of the results.
MATLAB allows you to link different axes (namely different subplots) so that you can scale all of them at the same time (linkaxes). However, once all the subplots have the same scale, showing the axis labels on each subplot seems a little bit redundant. It would be sufficient to show the axes of the bottom and left-most panels only. I coded a function that does exactly that.
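For reference, this is roughly what you would otherwise have to do by hand, and what scaleSubplot automates (a sketch under my own assumptions about the figure layout, not the function's actual code):

    % Sketch of what scaleSubplot automates (not the function's actual code).
    fig = gcf;
    ax  = findobj(fig, 'Type', 'axes');          % all the subplot axes in the figure
    linkaxes(ax, 'xy');                          % give every subplot the same x and y scale

    % Suppress tick labels on axes that are not in the bottom row or left-most column.
    pos = reshape([ax.Position], 4, [])';        % [left bottom width height], one row per axes
    for k = 1:numel(ax)
        if pos(k,2) > min(pos(:,2)) + 1e-3       % not in the bottom row
            set(ax(k), 'XTickLabel', []);
        end
        if pos(k,1) > min(pos(:,1)) + 1e-3       % not in the left-most column
            set(ax(k), 'YTickLabel', []);
        end
    end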
scaleSubplot(fig, varargin) allows you to scale all the subplots of a figure at the same time. You can specify a custom scale or let the code decide a nice common scale for all the subplots. You can decide to scale both the horizontal and vertical axes, or only one of them. Let's look at some examples.
The easiest way of using the function is: scaleSubplot(gcf). In this case you will scale both the x and y axes, and the script will find the best x and y limits for all the subplots. The code will make sure that all the data remain visible in the subplots.
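Presumably the common limits are chosen along these lines, by taking the union of the current limits of every subplot (just a guess on my part, not scaleSubplot's actual code):

    % A guess at how limits containing all the data can be found.
    ax = findobj(gcf, 'Type', 'axes');
    xl = [Inf -Inf]; yl = [Inf -Inf];
    for k = 1:numel(ax)
        xk = xlim(ax(k)); yk = ylim(ax(k));              % current limits of this subplot
        xl = [min(xl(1), xk(1)), max(xl(2), xk(2))];
        yl = [min(yl(1), yk(1)), max(yl(2), yk(2))];
    end
    set(ax, 'XLim', xl, 'YLim', yl);                     % apply the same limits everywhere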
Note how the redundant x/ylabels have been suppressed:
We can have more control over the image by specifying some optional arguments. For example, if we want to have the same scale only on the x axis, we specify 'sameScale',{'x'}. Notice how the x labels on the intermediate subplots are suppressed, but all the y labels are still there, since they do not have the same scale. In the figure below we also specify a custom x limit. If you decide to specify a custom limit, make sure that all the datapoints in all the subplots fall within your limits.
We can also specify only one of the boundaries of an axis limit, and let the code decide the other one. For example, we can write scaleSubplot(gcf,'xlimit',[-Inf 40]) or scaleSubplot(gcf,'ylimit',[0 Inf]).
My function is especially useful when we want to have a tight subplot. In this case, eliminating the intermediate axes can make the subplots much more readable. For example, I generated the image on the left side with tight_subplot(2,4,0.05), where 0.05 is the gap between subplots. However, with all the axis labels, the image is a mess. After a simple scaleSubplot(gcf), the figure looks much better.
Hope you enjoy it.
You can find some information about the Drift Diffusion Model here. There are probably several versions of the Drift Diffusion Model coded in MATLAB. I coded my own for two purposes:
1) If I code something, I understand it better;
2) I wanted to have a flexible version that I can easily modify.
I attach to this post my MATLAB code for the DDM. It is optimized to the best of my skills. It can be used to simulate a model with two choices (as usual) or one choice, with or without across-trial variability (so it can actually be used to simulate a Pure DDM or an Extended DDM). The code is highly commented.
The file can be downloaded here or, if you have a MATLAB account, here.
If you take a look at the code, you could get confused by the simultaneous presence of a for loop and the cumsum function. cumsum is a really fast way in MATLAB to compute the cumulative sum of a vector. However, in this case I have to sum an undefined number of points (since the process stops when it hits the threshold, and it is not possible to know when beforehand). I could just use an incredibly high number of points and hope the process hits the threshold at some point. Or I could use a for loop to keep summing points until it hits a threshold. Both these methods are computationally expensive. So I used a compromise: the software runs cumsum for the first maxWalk points and, if the threshold has not been hit, runs cumsum again (starting from the end of the previous run) within a loop (it repeats the loop up to 100 times, every time for maxWalk points). After some testing, this version is generally much faster than a version with only cumsum or only a for loop.
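The idea can be sketched like this (a simplified illustration of the chunked-cumsum strategy, not the exact code in the file; the parameter names here are just for the sketch):

    % Simplified sketch of the chunked cumsum random walk.
    drift   = 0.1;                                 % drift per time step
    noise   = 1;                                   % noise standard deviation per step
    thresh  = 30;                                  % absolute decision threshold
    maxWalk = 1e4;                                 % points simulated per cumsum chunk

    x = 0; rt = NaN; choice = NaN;                 % current evidence and outputs
    for chunk = 1:100                              % at most 100 chunks
        path = x + cumsum(drift + noise*randn(1, maxWalk));  % vectorised accumulation
        hit  = find(abs(path) >= thresh, 1);       % first threshold crossing, if any
        if ~isempty(hit)
            rt     = (chunk-1)*maxWalk + hit;      % decision time, in steps
            choice = sign(path(hit));              % +1 upper boundary, -1 lower boundary
            break                                  % stop as soon as the threshold is hit
        end
        x = path(end);                             % otherwise continue from where we stopped
    end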
This is an example of the resulting RT distribution with a high drift rate (e.g., the correct stimulus is easily identifiable):
I will soon post an alternative version of this file. It is less efficient, but allows you to plot the process as it accumulates, which is quite cool.