Wednesday, August 24, 2016

Interested in the Poldrack Lab for graduate school?

This is the time of year when I start getting lots of emails asking whether I am accepting new grad students for next year.  The answer is almost always going to be yes (unless I am moving, and I don’t plan on doing that again for a long time!), because I am always on the lookout for new superstars to join the lab.  If you are interested, here are some thoughts and tips that I hope will help make you more informed about the process.  These are completely my own opinions, and some of them may be totally inaccurate regarding other PIs or graduate programs, so please take them for what they are worth and no more.

Which program should I apply to? I am affiliated with three graduate programs at Stanford: Psychology, Neuroscience, and Biomedical Informatics. In choosing a program, there are several important differences:

  • Research: While most of these programs are fairly flexible, there are generally some expectations regarding the kind of research you will do, depending on the specific program.  For example, if you are joining the BMI program then your work is expected to have at least some focus on novel data analysis or informatics methods, whereas if you are joining Psychology your work is expected to make some contact with psychological function. Having said that, most of what we do in our lab could be done by a student in any of these programs.
  • Coursework: Perhaps the biggest difference between programs is the kind of courses you are required to take. Each program has a set of core requirements.  In psychology, you will take a number of core courses in different areas of psychology (cognitive, neuroscience, social, affective, developmental).  In the neuroscience program you will take a set of core modules spanning different areas of neuroscience (including one on cognitive neuroscience that Justin Gardner and I teach), whereas in BMI you take core courses around informatics-related topics.  In each program you will also take elective courses (often outside the department) that establish complementary core knowledge that is important for your particular research; for example, you can take courses in our world-class statistics department regardless of which program you enroll in. One way to think about this is:  What do I want to learn about that is outside of my specific content area? Take a look at the core courses in each program and see which ones interest you the most.
  • First-year experience: In Psychology, students generally jump straight into a specific lab (or a collaboration between labs), and spend their first year doing a first-year project that they present to their area meeting at the end of the year. In Neuroscience and BMI, students do rotations in multiple labs in their first year, and are expected to pick a lab by the end of their first year. 
  • Admissions: All of these programs are highly selective, but each differs in the nature of its admissions process.  At one end of the spectrum is the Psychology admissions process, where initial decisions for who to interview are made by the combined faculty within each area of the department.  At the other end is the Neuroscience program, where initial decisions are made by an admissions committee.  As a generalization, I would say that the Psychology process is better for candidates whose interests and experience fit very closely with a specific PI or set of PIs, whereas the committee process caters towards candidates who may not have settled on a specific topic or PI.
  • Career positioning: I think that the specific department that one graduates from matters a lot less than people think it does.  For example, I have been in psychology departments that have hired people with PhDs in physics, applied mathematics, and computer science. I think that the work that you do and the skills that you acquire ultimately matter a lot more than the name of the program that is listed on your diploma.  

What does it take to get accepted? There are always more qualified applicants than there are spots in our graduate programs, and there is no way to guarantee admission to any particular program.  On the flipside, there are also no absolute requirements: A perfect GRE score and a 4.0 GPA are great, but we look at the whole picture, and other factors can sometimes outweigh a weak GRE score or GPA.  There are a few factors that are particularly important for admission to my lab:

  • Research experience: It is very rare for someone to be accepted into any of the programs I am affiliated with at Stanford without significant research experience.  Sometimes this can be obtained as an undergraduate, but more often successful applicants to our program have spent at least a year working as a research assistant in an active research laboratory.  There are a couple of important reasons for this.  First, we want you to understand what you are getting into; many people have rosy ideas of what it’s like to be a scientist, which can fall away pretty quickly in light of the actual experience of doing science.  Spending some time in a lab helps you make sure that this is how you want to spend your life. In addition, it provides you with someone who can write a recommendation letter that speaks very directly to your potential as a researcher.  Letters are a very important part of the admissions process, and the most effective letters are those that go into specific detail about your abilities, aptitude, and motivation.
  • Technical skills: The research that we do in my lab is highly technical, requiring knowledge of computing systems, programming, and math/statistics.  I would say that decent programming ability is a pretty firm prerequisite for entering my lab; once you enter the lab I want you to be able to jump directly into doing science, and this just can’t happen if you have to spend a year teaching yourself how to program from scratch. More generally, we expect you to be able to pick up new technical topics easily; I don’t expect students to necessarily show up knowing how a reinforcement learning model works, but I expect them to be able to go and figure it out on their own by reading the relevant papers and then implement it on their own. The best way to demonstrate programming ability is to show a specific project that you have worked on. This could be an open source project that you have contributed to, or a project that you did on the side for fun (for example, mine your own social media feed, or program a cognitive task and measure how your own behavior changes from day to day). If you don’t currently know how to program, see my post on learning to program from scratch, and get going!
  • Risk taking and resilience: If we are doing interesting science then things are going to fail, and we have to learn from those failures and move on.  I want to know that you are someone who is willing to go out on a limb to try something risky, and can handle the inevitable failures gracefully.  Rather than seeing a statement of purpose that only lists all of your successes, I find it very useful to also know about risks you have taken (be they physical, social, or emotional), challenges you have faced, failures you have experienced, and most importantly what you learned from all of these experiences.
What is your lab working on? The ongoing work in my lab is particularly broad, so if you want to be in a lab that is deeply focused on one specific question then my lab is probably not the right place for you.  There are a few broad questions that encompass much of the work that we are doing:
  • How can neuroimaging inform the structure of the mind?  My general approach to this question is outlined in my Annual Review chapter with Tal Yarkoni.  Our ongoing work on this topic is using large-scale behavioral studies (both in-lab and online) and imaging studies to characterize the underlying structure of the concept of “self-regulation” as it is used across multiple areas of psychology.  This work also ties into the Cognitive Atlas project, which aims to formally characterize the ontology of psychological functions and their relation to cognitive tasks. Much of the work in this domain is discovery-based and data-driven, in the sense that we aim to discover structure using multivariate analysis techniques rather than testing specific existing theories. 
  • How do brains and behavior change over time?  We are examining this at several different timescales. First, we are interested in how experience affects value-based choices, and particularly how the exertion of cognitive control or response inhibition can affect representations of value (Schonberg et al., 2014). Second, we are studying dynamic changes in both resting state and task-related functional connectivity over the seconds/minutes timescale (Shine et al, 2016), in order to relate network-level brain function to cognition.  Third, we are mining the MyConnectome data and other large datasets to better understand how brain function changes over the weeks/months timescale (Shine et al, 2016, Poldrack et al., 2015).  
  • How can we make science better?  Much of our current effort is centered on developing frameworks for improving the reproducibility and transparency of science.  We have developed the OpenfMRI and Neurovault projects to help researchers share data, and our Center for Reproducible Neuroscience is currently developing a next-generation platform for analysis and sharing of neuroimaging data.  We have also developed the Experiment Factory infrastructure for performing large-scale online behavioral testing.  We are also trying to do our best to make our own science as reproducible as possible; for example, we now pre-register all of our studies, and for discovery studies we try when possible to validate the results using a held-out validation sample.

These aren’t the only topics we study, and we are always looking for new and interesting extensions to our ongoing work, so if you are interested in other topics then it’s worth inquiring to see if they would fit with the lab’s interests.   At present, roughly half of the lab is engaged in basic cognitive neuroscience questions, and the other half is engaged in questions related to data analysis/sharing and open science.  This can make for some interesting lab meetings, to say the least. 

What kind of adviser am I? Different advisers have different philosophies, and it’s important to be sure that you pick an advisor whose style is right for you.  I would say that the most important characteristic of my style is that I aim to foster independent thinking in my trainees.  Publishing papers is important, but not as important as developing one’s ability to conceive novel and interesting questions and ask them in a rigorous way. This means that beyond the first year project, I don’t generally hand my students problems to work on; rather, I expect them to come up with their own questions, and then we work together to devise the right experiments to test them.  Another important thing to know is that I try to motivate by example, rather than by command.  I rarely breathe down my trainees’ necks about getting their work done, because I work on the assumption that they will work at least as hard as I work without prodding.  On the other hand, I’m fairly hands-on in the sense that I still love to get deep in the weeds of experimental design and analysis code.  I would also add that I am highly amenable to joint mentorship with other faculty.



If you have further questions about our lab, please don’t hesitate to contact me by email.  Unfortunately I don’t have time to discuss ongoing research with everyone who is interested in applying, but I try to do my best to answer specific questions about our lab’s current and future research interests. 

Sunday, August 21, 2016

The principle of assumed error

I’m going to be talking at the Neurohackweek meeting in a few weeks, giving an overview of issues around reproducibility in neuroimaging research.  In putting together my talk, I have been thinking about what general principles I want to convey, and I keep coming back to the quote from Richard Feynman in his 1974 Caltech commencement address: “The first principle is that you must not fool yourself and you are the easiest person to fool.”  In thinking about how we can keep from fooling ourselves, I have settled on a general principle, which I am calling the “principle of assumed error” (I doubt this is an original idea, and I would be interested to hear about relevant prior expressions of it).

The principle is that whenever one finds something using a computational analysis that fits with one’s predictions or seems like a “cool” finding, one should assume that it’s due to an error in the code rather than reflecting reality.  Having made this assumption, one should then do everything possible to find out what kind of error could have resulted in the effect.  This is really no different from the strategy that experimental scientists use (in theory), in which upon finding an effect they test every conceivable confound in order to rule them out as a cause of the effect.  However, I find that this kind of thinking is much less common in computational analyses. Instead, when something “works” (i.e. gives us an answer we like) we run with it, whereas when the code doesn’t give us a good answer we dig around for different ways to do the analysis that give a more satisfying answer.  Because confirmation bias makes us more likely to accept errors that fit our hypotheses than errors that do not, this procedure is guaranteed to increase the overall error rate of our research.

If this sounds a lot like p-hacking, that’s because it is; as Gelman & Loken pointed out in their Garden of Forking Paths paper, one doesn't have to be on an explicit fishing expedition in order to engage in practices that inflate error due to data-dependent analysis choices and confirmation bias.  Ultimately I think that the best solution to this problem is to always reserve a validation dataset to confirm the results of any discovery analyses, but before one burns their only chance at such a validation, it’s important to make sure that the analysis has been thoroughly vetted.

Having made the assumption that there is an error, how does one go about finding it?  I think that standard software testing approaches offer a bit of help here, but in general it’s going to be very difficult to find complex algorithmic errors using basic unit tests.  Instead, there are a couple of strategies that I have found useful for diagnosing errors.

Parameter recovery
If your model involves estimating parameters from data, it can be very useful to generate data with known values of those parameters and test whether the estimates match the known values.  For example, I recently wrote a python implementation of the EZ-diffusion model, which is a simple model for estimating diffusion model parameters from behavioral data.  In order to make sure that the model is correctly estimating these parameters, I generated simulated data using parameters randomly sampled from a reasonable range (using the rdiffusion function from the rtdists R package), and then estimated the correlation between the parameters used to generate the data and the model estimates. I set an arbitrary threshold of 0.9 for the correlation between the estimated and actual parameters; since there will be some noise in the data, we can't expect them to match exactly, but this seems close enough to consider successful.  I set up a test using pytest, and then added CircleCI automated testing for my Github repo (which automatically runs the software tests any time a new commit is pushed to the repo)1. This shows how we can take advantage of software testing tools to do parameter recovery tests to make sure that our code is operating properly.  I would argue that whenever one implements a new model fitting routine, this is the first thing that should be done. 
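To make this recipe concrete, here is a minimal sketch of what such a parameter recovery test might look like; note that simulate_ddm_data and fit_ez_diffusion are hypothetical stand-ins for whatever simulation and fitting code is being tested, not the actual functions from my repository:

import numpy

def test_parameter_recovery(nsims=100, recovery_threshold=0.9):
    # simulate_ddm_data and fit_ez_diffusion are hypothetical stand-ins
    true_drift = numpy.zeros(nsims)
    est_drift = numpy.zeros(nsims)
    for i in range(nsims):
        # sample a "true" parameter value from a reasonable range
        true_drift[i] = numpy.random.uniform(0.5, 2.0)
        # generate simulated data using the known parameter value
        rt, accuracy = simulate_ddm_data(drift=true_drift[i])
        # estimate the parameter back from the simulated data
        est_drift[i] = fit_ez_diffusion(rt, accuracy)['drift']
    # estimates are noisy, so require a high but imperfect correlation
    r = numpy.corrcoef(true_drift, est_drift)[0, 1]
    assert r > recovery_threshold

A function named like this will be picked up automatically by pytest, and a continuous integration service such as CircleCI can then run it on every push.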

Imposing the null hypothesis
Another approach is to generate data for which the null hypothesis is true, and make sure that the results come out as expected under the null.  This is a good way to protect one from cases where the error results in an overly optimistic result (e.g. as I discussed here previously). One place I have found this particularly useful is in checking to make sure that there is no data peeking when doing classification analysis.  In this example (Github repo here), I show how one can use random shuffling of labels to test whether a classification procedure is illegally peeking at test data during classifier training. In the following function, there is an error in which the classifier is trained on all of the data, rather than just the training data in each fold:

# imports used by all of the snippets below
import numpy
from sklearn.cross_validation import StratifiedKFold  # scikit-learn 0.17-era API
from sklearn.neighbors import KNeighborsClassifier

def cheating_classifier(X,y):
    skf=StratifiedKFold(y,n_folds=4)
    pred=numpy.zeros(len(y))
    knn=KNeighborsClassifier()
    for train,test in skf:
        knn.fit(X,y) # this is training on the entire dataset!
        pred[test]=knn.predict(X[test,:])
    return numpy.mean(pred==y)

Fit to a dataset with a true relation between the features and the outcome variable, this classifier predicts the outcome with about 80% accuracy.  In comparison, the correct procedure (separating training and test data):

def crossvalidated_classifier(X,y):
    skf=StratifiedKFold(y,n_folds=4)
    pred=numpy.zeros(len(y))
    knn=KNeighborsClassifier() 
    for train,test in skf:
        knn.fit(X[train,:],y[train])
        pred[test]=knn.predict(X[test,:])
    return numpy.mean(pred==y)

predicts the outcome with about 68% accuracy.  How would we know that the former is incorrect?  What we can do is to perform the classification repeatedly, each time shuffling the labels.  This is basically making the null hypothesis true, and thus accuracy should be at chance (which in this case is 50% because there are two outcomes with equal frequency).  We can assess this using the following:

def shuffle_test(X,y,clf,nperms=10000):
    acc=[]
    y_shuf=y.copy()

    for i in range(nperms):
        numpy.random.shuffle(y_shuf)
        acc.append(clf(X,y_shuf))
    return acc

This shuffles the data 10,000 times and assesses classifier accuracy.  When we do this with the crossvalidated classifier, we see that accuracy is now about 51% - close enough to chance that we can feel comfortable that our procedure is not biased.  However, when we submit the cheating classifier to this procedure, we see mean accuracy of about 69%; thus, our classifier will exhibit substantial classification accuracy even when there is no true relation between the labels and the features, due to overfitting of noise in the test data.
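Putting the pieces together, a run of this check might look something like the following; here sklearn's make_classification is used as a stand-in for the dataset in the linked repository, so the exact numbers will differ from those above:

from sklearn.datasets import make_classification

# synthetic dataset with a true relation between features and labels
X, y = make_classification(n_samples=200, n_features=10, n_informative=5,
                           random_state=1)

# accuracy on the true labels for each procedure
print('cheating:', cheating_classifier(X, y))
print('crossvalidated:', crossvalidated_classifier(X, y))

# accuracy under the imposed null (shuffled labels); an unbiased procedure
# should come out near chance (0.5)
print('cheating under null:', numpy.mean(shuffle_test(X, y, cheating_classifier, nperms=1000)))
print('crossvalidated under null:', numpy.mean(shuffle_test(X, y, crossvalidated_classifier, nperms=1000)))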

Randomization is not perfect; in particular, one needs to make sure that the samples are exchangeable under the null hypothesis.  This will generally be true when the samples were acquired through random sampling, but can fail when there is structure in the data (e.g. when the samples are individual subjects, but some sets of subjects are related). However, it’s often a very useful strategy when this assumption holds.

I’d love to hear other ideas about how to implement the principle of assumed error for computational analyses.  Please leave your comments below!

1 This should have been simple, but I hit some snags that point to just how difficult it can be to build truly reproducible analysis workflows. Running the code on my Mac, I found that my tests passed (i.e. the correlation between the estimated parameters using EZ-diffusion and the actual parameters used to generate the data was > 0.9), confirming that my implementation seemed to be accurate. However, when I ran it on CircleCI (which runs the code within an Ubuntu Linux virtual machine), the tests failed, showing much lower correlations between estimated and actual values. Many things differed between the two systems, but my hunch was that it was due to the R code that was used to generate the simulated data (since the EZ diffusion model code is quite simple). I found that when I updated my Mac to the latest version of the rtdists package used to generate the data, I reproduced the poor results that I had seen on the CircleCI test. (It turns out that the parameterization of the function I was using had changed, leading to bad results with the previous function call.) My interim solution was to simply install the older version of the package as part of my CircleCI setup; having done this, the CircleCI tests now pass as well.

Friday, July 22, 2016

Having my cake and eating it too?

Several years ago I blogged about some of the challenges around doing science in a field with emerging methodological standards.  Today, a person going by the handle "Student" posted a set of pointed questions to this post, which I am choosing to respond to here as a new post rather than burying them in the comments on the previous post. Here are the comments:

Dr. Poldrack has been at the forefront of advocating for increased rigor and reproducibility in neuroimaging and cognitive neuroscience. This paper provides many useful pieces of advice concerning the reporting of fMRI studies, and my comments are related to this paper and to other papers published by Dr. Poldrack. One of the sections in this paper deals specifically with the reporting of methods and associated parameters related to the control of type I error across multiple tests. In this section, Dr. Poldrack and colleagues write that "When cluster-based inference is used, this should be clearly noted and both the threshold used to create the clusters and the threshold for cluster size should be reported". I strongly agree with this sentiment, but find it frustrating that in later papers, Dr. Poldrack seemingly disregards his own advice with regard to the reporting of extent thresholds, opting to report only that data were cluster-corrected at P<0.05 (e.g. http://cercor.oxfordjournals.org/content/20/3/524.long, http://cercor.oxfordjournals.org/cgi/content/abstract/18/8/1923, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2876211/). In another paper (http://www.ncbi.nlm.nih.gov/pmc/articles/pmid/19915091/), the methods report that "Z (Gaussianised T ) statistic images were thresholded using cluster-corrected statistics with a height threshold of Z > 2.3 (unless otherwise noted) and a cluster probability threshold of P < 0.05, whole- brain corrected using the theory of Gaussian random fields", although every figure presented in the paper notes that the statistical maps shown were thresholded at Z>1.96, P<0.05, corrected. This last instance is particularly confusing, and borders on being misleading. While these are arguably minor omissions, I find it odd that I am thus far unable to find a paper where Dr. Poldrack actually follows his own advice here.  
In another opinion paper regarding fMRI analyses and reporting (http://www.ncbi.nlm.nih.gov/pubmed/21856431), Dr. Poldrack states “Some simple methodological improvements could make a big difference. First, the field needs to agree that inference based on uncorrected statistical results is not acceptable (cf. Bennett et al., 2009). Many researchers have digested this important fact, but it is still common to see results presented at thresholds such as uncorrected p<.005. Because such uncorrected thresholds do not adapt to the data (e.g., the number of voxels tests or their spatial smoothness), they are certain to be invalid in almost every situation (potentially being either overly liberal or overly conservative).” This is a good point, but given the fact that Dr. Poldrack has published papers in high impact journals that rely heavily on inferences from data using uncorrected thresholds (e.g. http://www.ncbi.nlm.nih.gov/pubmed/16157284), and does not appear to have issued any statements to the journals regarding their validity, one wonders whether Dr. Poldrack wants to have his cake and eat it too, so to say. A similar point can be made regarding Dr. Poldrack’s attitude regarding the use of small volume correction. In this paper, he states “Second, I have become increasingly concerned about the use of “small volume corrections” to address the multiple testing problem. The use of a priori masks to constrain statistical testing is perfectly legitimate, but one often gets the feeling that the masks used for small volume correction were chosen after seeing the initial results (perhaps after a whole-brain corrected analysis was not significant). In such a case, any inferences based on these corrections are circular and the statistics are useless”. While this is also true, one wonders whether Dr. Poldrack only trusts his group to use this tool correctly, since it is frequently employed in his papers. 
In a third opinion paper (http://www.ncbi.nlm.nih.gov/pubmed/20571517), Dr. Poldrack discusses the problem of circularity in fMRI analyses. While this is also an important topic, Dr. Poldrack’s group has also published papers using circular analyses (e.g. http://www.jneurosci.org/content/27/14/3743.full.pdf, http://www.jneurosci.org/content/26/9/2424, http://www.ncbi.nlm.nih.gov/pubmed/17255512). 
I would like to note that the reason for this comment is not to malign Dr. Poldrack or his research, but rather to attempt to clarify Dr. Poldrack’s opinion of how others should view his previous research when it fails to meet the rigorous standards that he persistently endorses. I am very much in agreement with Dr. Poldrack that rigorous methodology and transparency are important foundations for building a strong science. As a graduate student, it is frustrating to see high-profile scientists such as Dr. Poldrack call for increased methodological rigor by new researchers (typically while, rightfully, labeling work that does not meet methodological standards as being unreliable) when they (1) have benefited (and arguably continue to benefit) from the relatively lower barriers to entry that come from having entered a research field before the emergence of a rigid methodological framework (i.e. in having Neuron/PNAS/Science papers on their CV that would not be published in a low-tier journal today due to their methodological problems) , and (2) not applying the same level of criticism or skepticism to their own previous work as they do to emerging work when it does not meet current standards of rigor or transparency. I would like to know what Dr. Poldrack’s opinions are on these issues. I greatly appreciate any time and/or effort spent reading and/or replying to this comment. 

I appreciate these comments, and in fact I have been struggling with exactly these same issues myself, and my realizations about the shortcomings of our past approaches to fMRI analysis have shaken me deeply. Student is exactly right that I have been a coauthor on papers using methods or reporting standards that I now publicly claim to be inappropriate. S/he is also right that my career has benefited substantially from papers published in high-profile journals using methods that I now claim to be inappropriate.  I'm not going to either defend or denounce the specific papers that the commentator mentions.  I am in agreement that some of my papers in the past used methods or standards that we would now find problematic, but I am actually heartened by that: If we were still satisfied with the same methods that we had been using 15 years ago, then that would suggest that our science had not progressed very far.  Some of those results have been replicated (at least conceptually), which is also heartening, but that's not really a defense.

I also appreciate Student's frustration with the fact that someone like myself can become prominent doing studies that are seemingly lacking according to today's standards, but then criticize the field for doing the same thing.  But at the same time I would ask: Is there a better alternative?  Would you rather that I defended those older techniques just because they were the basis for my career?  Should I lose my position in the field because I followed what we thought were best practices at the time but which turned out to be flawed? Alternatively, should I spend my entire career re-analyzing my old datasets to make sure that my previous claims withstand every new methodological development?  My answer to these questions has been to try to use the best methods I can, and to be as open and transparent as possible.  Here I'd like to outline a few of the ways in which we have tried to do better.

First, I would note that if someone wishes to look back at the data from our previous studies and reanalyze them, almost all of them are available openly through openfmri.org, and in fact some of them have been the basis for previous analyses of reproducibility.  My lab and I have also spent a good deal of time and effort advocating for and supporting data sharing by other labs, because we think that ultimately this is one of the best ways to address questions about reproducibility (as I discussed in the recent piece by Greg Miller in Science).

Second, we have done our best to weed out questionable research practices and p-hacking.  I have become increasingly convinced regarding the utility of pre-registration, and I am now committed to pre-registering every new study that our lab does (starting with our first registration committed this week).  We are also moving towards the standard use of discovery and validation samples for all of our future studies, to ensure that any results we report are replicable. This is challenging due to the cost of fMRI studies, and it means that we will probably do less science, but that's part of the bargain.

Third, we have done our best to share everything.  For example, in the MyConnectome study, we shared the entire raw dataset, as well as putting an immense amount of work into sharing a reproducible analysis workflow.  Similarly, we now put all of our analysis code online upon publication, if not earlier.  

None of this is a guarantee, and I'm almost certain that in 20 years, either a very gray (and probably much more crotchety) version of myself or someone else will come along and tell us why the analyses we were doing in 2016 were wrong in some way that seems completely obvious in hindsight.  That's not something that I will get defensive about, because it means that we are progressing as a science.  But it also doesn't mean that we weren't justified in doing what we are doing now, trying to follow the best practices that we know of.  





Saturday, May 21, 2016

Scam journals will literally publish crap

In the last couple of years, researchers have started to experience an onslaught of invitations to attend scam conferences and submit papers to scam journals.  Many of these seem to emanate from the OMICS group of Henderson, NV and its various subsidiaries.  A couple of months ago I decided to start trolling these scammers, just to see if I could get a reaction.  After sending many of these, I finally got a response yesterday, which speaks to the complete lack of quality of these journals.  

This was the solicitation:
On May 20, 2016, at 12:55 AM, Abnormal and Behavioural Psychology <behaviouralpsychol@omicsinc.com> wrote: 
Dear Dr. Russell A. Poldrack,Greetings from the Journal of Abnormal and Behavioural Psychology
Journal of Abnormal and Behavioural Psychology is successfully publishing quality articles with the support of eminent scientists like you.
We have chosen selective scientists who have contributed excellent work, Thus I kindly request you to contribute a (Research, Review, Mini Review, Short commentary) or any type of article.
The Journal is indexed in with EBSCO (A-Z), Google Scholar, SHERPA-Romeo, Open J-gate, Journal Seek, Electronic Journals Library, Academic Keys, Safety Lit and many more reputed indexing databases.
 
We publish your manuscript within seven days of Acceptance. For your Excellent Research work we are offering huge discount in the publishing fee (70%). So, we will charge you only 300 USD. This huge offer we are giving in this month only. 
...
With kind regards
Sincerely,
Joyce V. Andria

I had previously received exactly this same solicitation about a month ago, to which I had responded like this:
Dear Ms Andria, 
Thanks for your message.  I just spent three minutes reading and thinking about your email.  My rate for commercial consulting is $500/hour.  Can you please remit your payment of $25 to me at the address below?  I’m sure you can understand that the messages from your organization take valuable time away from scientists, and that you would agree that it’s only fair to renumerate us for this time.
I look forward to receiving your payment promptly.  If you do remit within 30 days I will be forced to send this invoice out for collection.
Sincerely,
Russ Poldrack
I got no response to that message.  So when I received the new message, I decided to step up my troll-fu:
Dear Ms. Andria,
Many thanks for your message soliciting a (Research, Review, Mini Review, Short commentary) or any type of article for your journal. I have a paper that I would like to submit but I am not sure what kind of article it qualifies as. The title is "Tracking the gut microbiome". The paper does not include any text; it is composed entirely of photos of my bowel movements taken every morning for one year. Please let me know if your journal has the capability to publish such a paper; I have found that many other journals are not interested.
Sincerely,
Russell Poldrack
Within 12 hours, I had a response:
From: Abnormal and Behavioural Psychology <behaviouralpsychol@omicsinc.com>
Subject: RE: Appreciated your Excellent Research work
Date: May 20, 2016 at 9:47:28 PM PDT
To: "'Russell Alan Poldrack'" <russpold@stanford.edu>
Dear Dr. Russell A. Poldrack,

Greetings from the Journal of Abnormal and Behavioural Psychology

Thank you for your reply.

I hereby inform you that your article entitled: “Tracking the gut microbiome” is an image type article.

We are happy to know that you want to publish your manuscript with us.

We are waiting for your  earliest submission.

We want to introduce your research work in this month to our Journal. We will be honored to be a part of your scientific journey.

Kindly submit your article on before 26th may, 2016.


Awaiting your response.,

With kind regards
Sincerely,
Anna Watson
Journal Coordinator
Journal of Advances in Automobile Engineering
There you have it: These journals will literally publish complete crap. I hope the rest of you will join me in trolling these parasites - post your trolls and any results in the comments.

Friday, May 20, 2016

Advice for learning to code from scratch

I met this week with a psychology student who was interested in learning to code but had absolutely no experience.  I personally think it’s a travesty that programming is not part of the basic psychology curriculum, because doing novel and interesting research in psychology increasingly requires the ability to collect and work with large datasets and build new analysis tools, which are almost impossible without solid coding skills.  

Because it’s been a while since I learned to code (back when programs were stored on cassette tapes), I decided to ask my friends on the interwebs for some suggestions.  I got some really great feedback, which I thought I would synthesize for others who might be in the same boat.  

Some of the big questions that one should probably answer before getting started are:

  1. Why do you want to learn to code?  For most people who land in my office, it’s because they want to be able to analyze and wrangle data, run simulations, implement computational models, or create experiments to collect data.  
  2. How do you learn best?  I can’t stand watching videos, but some people swear by them.  Some people like to just jump in and start doing, whereas others like to learn the concepts and theories first.  Different strokes...
  3. What language should you start with?  This is the stuff of religious wars.  What’s important to realize, though, is that learning to program is not the same as learning to use a specific language.  Programming is about how to think algorithmically to solve problems; the specific language is just an expression of that thinking.  That said, languages differ in lots of ways, and some are more useful than others for particular purposes.  My feeling is that one should start by learning a first-class language, because it will be easier to learn good practices that are more general.  Your choice of a general purpose language should probably be driven by the field you are in; neuroscientists are increasingly turning to Python, whereas in genomics it seems that Java is very popular.  I personally think that Python offers a nice mix of power and usability, and it’s the language that I encourage everyone to start with.  However, if all you care about is performing statistical analyses, then learning R might be your first choice, whereas if you just want to build experiments for mTurk, then Javascript might be the answer.  There may be some problem for which MATLAB is the right answer, but I’m no longer sure what it is. A caveat to all of this is that if you have friends or colleagues who are programming, then you should strongly consider using whatever language they are using, because they will be your best source of help.
  4. What problem do you want to solve?  Some people can learn for the sake of learning, but I find that I need a problem in order to keep me motivated.  I would recommend thinking of a relevant problem that you want to solve and then targeting your learning towards that problem.  One good general strategy is to find a paper in your area of research interest, and try to implement their analysis. Another (suggested by Christina van Heer) is to take some data output from an experiment (e.g. in an Excel file), read it in, and compute some basic statistics (see the short sketch just after this list).  If you don't have your own data, another alternative is to take a large open dataset (such as health data from NHANES or an openfmri dataset from openfmri.org) and try to wrangle the data into a format that lets you ask an interesting question.
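For example, a first pass at the kind of analysis suggested above might look something like this in Python (the file name and column names are made up for illustration):

import pandas

# read in trial-by-trial output from an experiment (hypothetical file and columns)
data = pandas.read_csv('experiment_output.csv')

# look at the overall distribution of each variable
print(data.describe())

# compute mean response time and accuracy for each condition
print(data.groupby('condition')[['rt', 'correct']].mean())

Even a tiny script like this exercises most of the basic skills (reading data, inspecting it, summarizing it) that the rest of your coding will build on.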
OK then, so where do you look for help in getting started?

The overwhelming favorite in my social media poll was Codecademy.  It offers interactive exercises in lots of different languages, including Python.  Another Pythonic suggestion was http://learnpythonthehardway.org/book/, which looks quite good. 

For those of you who prefer video courses, there were also a number of votes for online courses, including those from Coursera and FutureLearn; if you like that format, these would be a good option.


Finally, it’s also worth keeping an eye out for local Software Carpentry workshops.

If you have additional suggestions, please leave them in the comments!

Monday, April 18, 2016

How folksy is psychology? The linguistic history of cognitive ontologies

I just returned from a fabulous meeting on Rethinking the Taxonomy of Psychology, hosted by Mike Anderson, Tim Bayne, and Jackie Sullivan.  I think that in another life I must have been a philosopher, because I always have so much fun hanging out with them, and this time was no different.  In particular, the discussions at this meeting moved from simply talking about whether there is a problem with our ontology (which is old hat at this point) to specifically how we can think about using neuroscience to revise the ontology.  I was particularly excited to see all of the interest from a group of young philosophers whose work is spanning philosophy and cognitive neuroscience, who I am counting on to keep the field moving forward!

I have long made the point that the conceptual structure of current psychology is not radically different from that of William James in the 19th century.  This seems plausible on its face if you look at some of the section headings from his 1890 Principles of Psychology:
  • “To How Many Things Can We Attend At Once?”
  • “The Varieties Of Attention.”
  • “The Improvement Of Discrimination By Practice”
  • “The Perception Of Time.”
  • “Accuracy Of Our Estimate Of Short Durations”
  • “To What Cerebral Process Is The Sense Of Time Due?”
  • “Forgetting.”
  • “The Neural Process Which Underlies Imagination”
  • “Is Perception Unconscious Inference?”
  • “How The Blind Perceive Space.”
  • “Emotion Follows Upon The Bodily Expression In The Coarser Emotions At Least.”
  • “No Special Brain-Centres For Emotion”
  • “Action After Deliberation”:
Beyond the sometimes flowery language, these are all topics that one could imagine being the subject of research papers today, but for my talk I wanted to see if there was more direct evidence that the psychological ontology has changed less from its pre-scientific roots (and is thus more "folksy") than the ontologies of other sciences.  To address this, I did a set of analyses that looked at the linguistic history of terms in the contemporary psychological ontology (as defined in the Cognitive Atlas) as compared to terms from contemporary biology (as enshrined in the Gene Ontology).  I started (with a bit of help from Vanessa Sochat) by examining the proportion of terms from the Cognitive Atlas that were present in James' Principles (from the full text available here).  This showed that 22.9% of the terms in our current ontology were present in James's text (some examples are: goal, deductive reasoning, effort, false memory, object perception, visual attention, task set, anxiety, mental imagery, unconscious perception, internal speech, primary memory, theory of mind, judgment).
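The analysis itself is conceptually simple; here is a minimal sketch of the idea (the file names are placeholders, and a real version would need to worry about word boundaries, plurals, and multi-word phrase matching):

# What proportion of ontology terms appear in a historical text?
# File names below are placeholders for the term list and the full text.
def term_overlap(term_file, text_file):
    terms = [line.strip().lower() for line in open(term_file) if line.strip()]
    text = open(text_file).read().lower()
    present = [t for t in terms if t in text]
    return len(present) / float(len(terms)), present

proportion, matches = term_overlap('cognitive_atlas_terms.txt',
                                   'james_principles_fulltext.txt')
print('proportion of terms present: %0.3f' % proportion)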

How does this compare to biology?  To ask this, I obtained two biology textbooks published around the same time as James' Principles (T. H. Huxley's Course of Elementary Instruction in Practical Biology from 1892, and T. J. Parker's Lessons in Elementary Biology from 1893), which are both available in full text from Google Books.  In each of these books I assessed the presence of each term from the Gene Ontology, separately for each of the GO subdomains (biological processes, molecular functions, and cellular components).  Here are the results:

GO subdomain (# of terms)        Huxley        Parker        Overlap
biological process (28,566)      0.09% (26)    0.1% (32)     20
molecular functions (10,057)     0             0             -
cellular components (3,903)      1.05% (41)    1.01% (40)    25

The percentages of overlap are much lower, perhaps not surprisingly since the number of GO terms is so much larger than the number of Cognitive Atlas terms.  But even the absolute numbers are substantially lower, and there is not one mention of any of the GO molecular functions (striking but completely unsurprising, since molecular biology would not be developed for many more decades).

These results were interesting, but it could be that they are specific to these particular books, so I generalized the analysis using the Google N-Gram corpus, which indexes the presence of individual words and phrases across more than 3 million books.  Using a python package that accesses the ngram viewer API, I estimated the presence of all of the Cognitive Atlas terms as well as randomly selected subsets of each of the GO subdomains in the English literature between 1800 and 2000; I'm planning to rerun the analysis on the full corpus using the downloaded version of the N-grams corpus, but the throttling required by this API prevented me from running the full sets of GO terms.  Here are the results for the Cognitive Atlas:
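The logic of this computation is roughly as follows (a sketch, not the actual code I used; the CSV file and its columns are made up, standing in for per-year frequencies pulled from the n-gram corpus):

import pandas

# hypothetical file: one row per term/year, with the n-gram frequency for that year
freqs = pandas.read_csv('term_year_frequencies.csv')  # columns: term, year, frequency

def proportion_in_use(freqs, year):
    # a term counts as "in use" if it has nonzero frequency at or before the given year
    in_use = freqs[(freqs.year <= year) & (freqs.frequency > 0)].term.unique()
    return len(in_use) / float(freqs.term.nunique())

for year in (1800, 1850, 1900, 1950, 2000):
    print(year, proportion_in_use(freqs, year))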

It is difficult to imagine stronger evidence that the ontology of psychology is relying on pre-scientific concepts; around 80% of the one-word terms in the ontology were already in use in 1800! Compare this to the Gene Ontology terms (note that there were not enough single-word molecular function terms to get a reasonable estimate):




It's clear that while a few of the terms in these ontologies were in use prior to the development of the biosciences, the proportion is much smaller than what one sees for psychology. In my talk, I laid out two possibilities arising from this:

  1. Psychology has special access to its ontology that obviates the need for a rejection of folk concepts
  2. Psychology is due for a conceptual revolution that will leave behind at least some of our current concepts
My guess is that the truth lies somewhere in between these.  The discussions that we had at the meeting in London provided some good ideas about how to conceptualize the kinds of changes that neuroscience might drive us to make to this ontology. Perhaps the biggest question to come out of the meeting was whether a data-driven approach can ever overcome the fact that the data were collected from experiments that are based on the current ontology. I am guessing that it can (given, e.g. the close relations between brain activity present in task and rest), but this remains one of the biggest questions to be answered.  Fortunately there seems to be lots of interest and I'm looking forward to great progress on these questions in the next few years.

Friday, February 26, 2016

Reproducibility and quantitative training in psychology

We had a great Town Hall Meeting of our department earlier this week, focused on issues around reproducibility, which Mike Frank has already discussed in his blog.  A number of the questions that were raised by both faculty and graduate students centered around training, and this has gotten many of us thinking about how we should update our quantitative training to address these concerns.  Currently the graduate statistics course is fairly standard, covering basic topics in probability and statistics including basic probability theory, sampling distributions, null hypothesis testing, general(ized) linear models (regression, ANOVA), and mixed models, with exercises done primarily using R.  While many of these topics remain essential for psychologists and neuroscientists, it's equally clear that there are a number of other topics that we might want to cover that are highly relevant to issues of reproducibility:

  • the statistics of reproducibility (e.g., implications of power for predictive validity; Ioannidis, 2005)
  • Bayesian estimation and inference
  • bias/variance tradeoffs and regularization
  • generalization and cross-validation
  • model-fitting and model comparison
There are also a number of topics that are clearly related to reproducibility but fall more squarely under the topic of "software hygiene":
  • data management
  • code validation and testing
  • version control
  • reproducible workflows (e.g., virtualization/containerization)
  • literate programming
I would love to hear your thoughts about what a 21st century graduate statistics course in psychology/neuroscience should cover; please leave comments below!