Reflections on Research

In which I struggle not to complain about bureaucracy

Aug 29, 2023

So this is going to be one of those “personal reflection” posts I do from time to time, so there won’t be a lot of citing of research. Rather this is going to be a reflection on my journey so far in regards to putting together a research project within a university context.

Just to be crystal clear: while there are common standards and practices between universities and departments, there is also a wide range of variation between them. Also, these standards can vary over time, and for individuals within the same department, and for the same individual depending on their topic and specific methods. So, while I’m striving to be as accurate as possible, be cautious about generalising this.

I’m putting this out there for a couple of reasons. First, I genuinely think it’s interesting. By knowing more about how research is done, we’re better able to understand and engage with it in a useful fashion. We can decide better how much certain strengths and weaknesses of a given study actually matter. Many people really don’t understand how research actually happens, which makes it hard for them to understand or critically evaluate any given finding. I had to explain the replication crisis to someone a while ago, and it became apparent that they didn’t have the first clue how research actually was done, which made explaining the ways it went wrong quite challenging. All they knew was that researchers make publications, which were the thing that those guys showed were actually meaningless. And while that and other cases do a good job of highlighting serious problems with the publication process, you can’t really properly understand the context without more knowledge of the field and tensions and incentives.

Second, there’s an annoying tendency for psychology to be viewed as clinical psychology by default, with other approaches being fringe or outright ignored. I’ve taught undergraduate classes who honestly didn’t know that non-clinical psychology was even an option, despite being more than halfway through their degrees. I’ve had intelligent, sensible people be fundamentally incapable of understanding that my work has little to no clinical application, that I’m not trying to help clinicians (although if that happens by accident I have no objection, of course) – that I’m seeking merely to understand, not necessarily help, how people’s minds work. This tendency is exacerbated by a lot of research being justified through how it can influence clinical practice, and how it’s reported in the mainstream media; “This finding will help reduce radicalization”, and the like. So talking about the research process as the research process is, I think, useful information.

I, personally, kind of had to work out this researcher thing on my own, with little to no direct help or information, and frankly my path has suffered because of it. I am a worse researcher because I had to work out what it is and how it works. People think I’m nuts when I say that, but it’s true. Maybe things have changed, or maybe I’m just not good at picking up on opportunities, but my path has been the most roundabout, indirect, downright bizarre process out of any researcher I’ve heard from, bar none, mostly because there was no information available to me. So if I can help even one person think about this stuff, I’d consider that a substantial win.

Third, this is what’s eating my brain right now, and writing this helps me process things.

Context

So I’ve mentioned that I’m a research student in psychology. Without going into detail, I’ve done research broadly similar to this before in previous programs, although it didn’t end up getting published (partially due to weaknesses in the work, partially because it wasn’t a sexy topic, and partially because of life stuff getting in the way on my part including poor organisation). But that was at a different university a few years ago. For again complicated reasons which I won’t go into, I was outside of academia and doing no research beyond the odd hobby project for a few years, with me re-entering academia over the last year or so. So while I do have some degree of perspective, I can’t promise to be super-experienced in this area. Anyone who is, feel free to chime in in the comments, especially if what I say doesn’t fit with your understanding and experience.

Something that was true before and is even more true now is the scarcity of time for a research student. Oh, I’ve heard from industry people that university research moves comparatively slowly, and I’d believe that, but you never have enough time to do a proper job of the parts which need it (for reasons that will become apparent). So when I talk about how long a program took, bear in mind those times are coming out of a very small pot.

So when I began this program, I was told (inaccurately, as it turns out) that I had to attend this orientation event. It would take up half a day, but we’d be fed and caffeine-d, so eh, OK. It’d been a while and I didn’t know this university well, and figured some orientation would be helpful. I’d actually been struggling a bit to figure out the system and figured any insight would be helpful.

I won’t go into a blow-by-blow retelling, but in short the orientation event was totally worthless. And I mean totally worthless. Of a 3-hour talk, they spent at least 2 hours talking about how much they cared about us, how excited they were about our work, how diverse and progressive they were [1], how many supportive services they had for our mental health… and from what I can remember (and from what messages I was sending to a friend of mine during the talk) not a single word about the processes I needed to go through. There was a document I was working on at the time – basically beefing up a research proposal I’d already written – and they didn’t even mention it despite it being a common (I think university-wide?) part of the research student process and definitely the focus of every student there at the moment. This became a running theme; “we’re very supportive of you provided it doesn’t actually help you in any meaningful way”.

No, this whole post isn’t going to be me complaining about bureaucracy, but I’m illustrating right up front how bureaucratic and disconnected from actual reality these processes are. I kept thinking, for months, that there must be some aspect I’m missing, some purpose served that I’m not seeing. Maybe there is, but after nearly a year of working on this project and dealing with the bureaucratic hoops, I’m no closer to seeing it. So if a process I’m talking about doesn’t seem to serve any obvious purpose, that’s because it often doesn’t.

The project

I won’t go into detail (obviously), but I will say that it involves multiple studies (pretty common), the first of which involves interviewing people in such a way that while I’m not specifically looking for sensitive information, it’s definitely foreseeable that it’ll happen. So I’m obviously taking care to try to protect my participants as much as possible, privacy-wise. I’ll go through the steps I’m trying to implement as we go through, but while some of them are probably excessive (in such a way that puts more work on me, which I’m fine with), I don’t think any are outside the realm of reasonableness within a research context. The findings from the interviews will form the basis of later studies within the broader program of research.

Sampling

One of the first things I wanted to address was sampling and representation. The issues with sampling bias are well-established at this point, but to be brief: if I only interview e.g. men in their mid-20s from a particular neighbourhood on their drug use, I will need to be super-careful generalising any findings beyond men in their mid-20s from that neighbourhood. It’s possible that other groups may show similar patterns, but it’s entirely plausible that they won’t. And if you build later work on that limited foundation, those limits can influence the later work, even if you do that later work more completely. To a point, this issue is inevitable – you’re never interviewing everyone, just your sample, which consists entirely of people who agreed to be interviewed. So I wanted to address this – I wanted to make a point of recruiting people from a wide range of backgrounds – gender, region, culture, age, etc. My topic is one where it is entirely feasible that people from different backgrounds will have different experiences and understandings, so building that in from the very start is just good research. We often hear about work that skips this step, and we’re baffled – yes, OK, there are pragmatic limitations, but it often seems they don’t even try.

Yeah, there’s a reason for that. One group I wanted to try to recruit was people descended from the indigenous peoples of my country – like many indigenous peoples, they’ve had a historically bad run (and arguably still do in some ways, although as always it’s complicated) which may well impact relevant experiences. So building that in:

is just good research, and;
might even potentially help address gaps in our understanding allowing better solutions to common problems we all share, and;
makes it easier to get published because my work stands out more.

Alas, no. See, because of the perceived greater vulnerability of this population to exploitation, as soon as you do that a whole extra level of ethical burden and requirement is triggered. You need to work with specific authorities and experts, put specific protections and dynamics in place which will substantially impact your methods, and fill out even more paperwork, all within the same time frame (which, I remind you, is already hilariously short and already full of bureaucratic processes). It has a way of taking over your whole project – rather than doing work on Topic X including this population, you’re doing work on this population, with Topic X being basically flavouring [2]. I was advised, in strong terms, by my supervisors (whom I respect and trust a great deal) to run as far away from that idea as possible. Despite that, I tried to reach out to the university-defined authority on the topic, asking if they had any general advice or suggestions.

That was three months ago. I’ve yet to hear back. I don’t know if they’re swamped or useless, or maybe that department just doesn’t exist anymore and the information just hasn’t been updated yet, but it is what it is.

So I dropped that angle. I might encounter those people incidentally (although statistically I’m doubtful, they’re a pretty small minority), but it’s pretty unlikely to be in significant numbers. Similar dynamics with migrants or GSM people or people with a criminal history or anyone else who plausibly might have a different perspective which would improve the work. I’d love to include them! I wanted to! But doing so would create so many roadblocks I can’t afford, and necessarily divert the research in ways I can’t afford, that I would be basically shooting myself in the foot. Heck, I’ve had to fight pretty hard to include non-“first-year-psychology university” students.

Actually, that’s a topic worth speaking to in itself.

Psychology students

It’s a trope (and also true) that first-year psychology university students are grossly over-represented in psychology studies, to the point where a lot of findings are based basically entirely on that group. Which sometimes isn’t an issue – their visual systems probably work broadly the same as everyone else’s – but sometimes definitely is. University students in general, and psychology students in particular, tend to be young, middle-class, white, relatively well-educated, from urban areas, politically progressive (insofar as early adults have a coherent political view, anyway), and in psychology at least are something like 70% women [3]. Society – never mind humanity in general – is a lot more varied than that, and some of those differences matter. Politically conservative people experience disgust differently than politically progressive people, for example, so we already have issues. They’re younger, so developmentally they’re just at a different stage than most of society. They’re at university, so they’re at least potentially open to the idea that academia and professional careers are worthwhile endeavours (rather than, say, more practical work like the trades), which not everyone is – up until quite recently, very few people went to university.

The reasons for relying on them are pretty simple. It’s basically a trade – students get course credit and experience participating in research, which does give useful (albeit limited) insight into what these studies actually consist of, which allows them to better interpret and understand papers talking about them, while researchers get participants and data, which allows them to do research, get published, and stay employed. It’s not an unreasonable trade, and for a lot of research it’s not necessarily a problem – as I said earlier, their visual cortex probably works more or less the same.

So including participants who aren’t from this group is obviously much needed. But there’s a problem. If you don’t offer participants some motivation to participate, you won’t get many people. And if you include some university students (which I will, just for sheer convenience), they get rewarded with course credit, which means if you have non-students and they don’t get some kind of reward, you’re committing an ethical violation since participants are getting different rewards in ways outside their control (also you won’t get people because people are busy). So the most common way around this is to pay them – you rarely give them cash, but usually vouchers or gift cards. This is part of where grant money goes.

But let’s take a lead from the rationalist community for a moment and shut up and multiply. Let’s say we’re interviewing 15 people who we need to pay, and we need to pay them $20 each for their time. That’s $300 right there, not counting incidental costs like recording devices, licenses to use certain tests [4], travel costs, maybe transcription costs if you’re not doing it yourself, and so forth. And if you’re in a situation where your grant is also supposed to be effectively your salary, that’s money directly out of your personal pocket. And 15 people for $20 each is not a large number, nor is it an especially high pay rate. You can see how this grows fast with more people. And the cost-to-me/benefit-to-individual-participants ratio is pretty heavily skewed – if I pay another $5 per participant, that’ll cost me an extra $75, but participants are only going to gain the marginal motivation that $5 buys.
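For anyone who wants to play with the numbers, here is a minimal sketch of that multiplication in Python. The function name and the extra sample sizes are my own illustrative assumptions; only the 15 people, $20 and $5 figures come from the example above, and incidental costs are deliberately left out.

```python
def payment_total(n_participants: int, payment_each: int) -> int:
    """Total participant payments in dollars (incidental costs not included)."""
    return n_participants * payment_each


# Figures from the example above: 15 paid participants at $20 each.
base_total = payment_total(15, 20)      # 300
raised_total = payment_total(15, 25)    # 375, after adding $5 per participant
extra_cost = raised_total - base_total  # 75

print(f"Base cost: ${base_total}")
print(f"Extra cost of +$5 each: ${extra_cost}")

# Hypothetical larger samples, to show how quickly the total grows.
for n in (15, 30, 60):
    print(f"{n} participants at $20 each: ${payment_total(n, 20)}")
```

The point is just that my cost scales with the whole sample, while each participant only ever sees their own $5.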

I can, in theory, throw out an almost unlimited amount of course credit. Oh, I have a “budget”, but that amount is almost entirely decided by me – the limits are pretty loose. Usually students can only gain up to some amount of their course credit this way, but if I wanted to offer that entire amount for doing just one 30 minute interview, in theory I could probably do that. I’m not limited in the way I am with actual money. So I’m pretty heavily incentivized to lean on the student pool, which leads to problems.

Privacy

Psychology often involves potentially sensitive information. Sometimes this can be obvious like medical issues, but also things like participation in criminal or socially taboo activities, membership of certain groups, etc. Even outside of those, if you want people to be honest with you, then you need to try to provide as much anonymity as possible. Sometimes this is easy – filling out an online survey is pretty simple to at least somewhat anonymise – but with things like interviews it’s much harder, especially face-to-face ones. So a lot of our practices involve being able to assure the participant of confidentiality – whatever they tell us stays between us, and while parts might be used in publications to clarify the broader findings drawn from a collection of people, at no point will that be done in such a way that it can be linked to them, at least not easily. There are certain common practices – we don’t record interviewees by name, but more often by number. We don’t store transcripts as “Jordan_Smith_Interview.odt”, we store them as “45.odt”. We refer to “Participant 45”. Maybe we keep a key somewhere, especially if we’re doing longitudinal work and we need to be able to connect information over time, but if we do we keep that seriously locked down – as in “on an encrypted hard drive, in a locked filing cabinet, in a locked office, and you tell nobody, and you delete it as soon as humanly possible” locked down. At least, that’s how it was – at this point I’m about 60% sure that I’m technically not allowed to have my own copy of the data (a rule I shall be ignoring).
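As a rough illustration of the numbering practice described above, here is a small Python sketch. Everything in it is hypothetical: the function names, the second participant name and the 1 to 999 number range are my own assumptions, not any institution’s actual procedure. I’ve also assumed random rather than sequential numbers, so the number doesn’t reveal interview order.

```python
import secrets


def assign_participant_numbers(names, low=1, high=999):
    """Give each distinct name a unique random number in [low, high].

    The returned dict is the "key" linking names to numbers. It is the
    sensitive part, and would be stored separately from the transcripts.
    """
    unique_names = list(dict.fromkeys(names))  # drop duplicates, keep order
    if len(unique_names) > high - low + 1:
        raise ValueError("not enough numbers for this many participants")
    key, used = {}, set()
    for name in unique_names:
        number = low + secrets.randbelow(high - low + 1)
        while number in used:  # redraw until the number is unused
            number = low + secrets.randbelow(high - low + 1)
        used.add(number)
        key[name] = number
    return key


def transcript_filename(number, extension="odt"):
    """File name that carries the number only, never the name."""
    return f"{number}.{extension}"


def participant_label(number):
    """How the participant is referred to in notes and publications."""
    return f"Participant {number}"


key = assign_participant_numbers(["Jordan Smith", "Alex Doe"])
print(transcript_filename(45), participant_label(45))
```

Note that the transcripts and labels never touch the names; only the key does, which is why the key is the thing to lock down and delete early.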

A guiding principle of privacy is that “you can’t leak what you don’t know”. If you don’t tell me a secret, I can’t accidentally blurt it out. Which is why zero-knowledge solutions are ideal for many (not all) cases – my e-mail provider doesn’t need to know what my e-mails say in order to know if I am authorised to see them or not. They just need to verify my password or whatever authentication we have set up. So if they set it up so that they physically cannot access the content of my e-mails, I’m going to be a lot more confident that they’re not secretly reading them. So I wanted to implement this idea in my own research.

Yes, sensitive information might come up, but let’s be honest – it’s a small study done by a nobody. Nobody is going to bother trying to identify my participants – the realistic threat is very small in any pragmatic sense. But I sincerely believe privacy is important, so I wanted to act in accordance with that value, regardless of any actual chance of harm.

But it turns out that’s a lot harder, pragmatically speaking, than I anticipated. The first hurdle is institutional policy. See, this university has a policy that all research data must be stored for X time – this is very normal and typical, the idea being if someone in a year or so wants to check your work, in theory they can. This helps disincentivise falsifying data, since it’s much easier to detect if people can see your data. It also helps detect honest mistakes – maybe I made a mistake in the statistics which led to a material change in outcome. In general, it’s good policy, and I approve. In order to ensure I’m doing this, some institutions require you to store a copy of your research data on their servers, which (in theory) are secure, and if you’ve used proper anonymising techniques participants are reasonably safe. Some institutions even have rules against other copies – which, given how most people treat security, I have difficulty disagreeing with.

However, this has problems as well. See, if the only copy you have is on this server, which is often only accessible either with the institutional VPN or if you’re connected directly to the network, your ability to access your data to work on it becomes sharply limited. In addition, if the server is not accessible – either temporarily due to maintenance, or more long-term because someone clicked a phishing link – then you’re in trouble. Also, you’re trusting the server is managed and set up securely, which might be the case, but since cybersecurity is a whole subdiscipline in itself apart from the kind of server-management skills required, the possibilities become more concerning, especially these days.

Also, while I completely understand that most people are painfully ignorant of cybersecurity and wouldn’t know encryption from an echidna, this means that I – who while no expert, am more than capable of encrypting a hard drive with LUKS and managing that – am effectively bound to a less secure, less usable option. Because policies don’t only apply sometimes. To quote House MD:

The rules exist because 95% of the time, for 95% of the people, they’re the right thing to do.
And the other 5%?
Have to live by the same rules. Because everybody thinks they’re in that 5%.

So I had to ask two questions: “do I gather demographic information about my participants?” and “if I do, do I link Participant 54 with this collection of demographic information which might, theoretically, be used to identify them?”, given the innate risks of both, and bearing in mind it may be stored on a system I cannot control or necessarily vouch for. If I don’t gather the information, then I can’t describe my sample, which means people reading my research can’t replicate it as easily. But if I do gather it, I’m potentially opening up privacy weaknesses for my participants, especially if it’s linked.

Now, I can mitigate these. I planned to record participants’ ages not as specific ages, but as age bins – that is, I don’t know Participant 87 is 23 years old, I just know they fall within the “21 – 25” age bin. Thus, working backward is much harder, since we don’t have precise information. However, then we’re potentially giving up the ability to do nuanced analysis – is there a difference between 21 year olds and 25 year olds? Potentially! I can see circumstances where knowing whether a particular participant quoted in a publication is 21 or 25 would affect my interpretation; those two ages indicate somewhat different life stages, probably different situations etc. Does that possible value outweigh the possible risk? Unsure, honestly. And these two concerns trade off against each other – the more you blur the ages, the more protection you grant participants, but the less analysis you can do.
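To make the binning idea concrete, here is a minimal Python sketch. The function name, the bin boundaries (bins starting at 1, 6, 11 and so on, so that “21 – 25” falls out as one bin) and the adjustable width are my own assumptions for illustration; the only thing taken from the example above is that a 23 year old lands in the 21 to 25 bin.

```python
def age_bin(age: int, width: int = 5) -> str:
    """Return the age bin label for an exact age, e.g. 23 -> "21-25".

    Bins are assumed to start at 1, 1 + width, 1 + 2 * width, and so on.
    Only the label would be stored; the exact age is discarded.
    """
    if age < 1:
        raise ValueError("age must be a positive whole number")
    lower = ((age - 1) // width) * width + 1
    upper = lower + width - 1
    return f"{lower}-{upper}"


print(age_bin(23))            # 21-25
print(age_bin(21))            # 21-25, same label as a 25 year old
print(age_bin(23, width=10))  # 21-30, a wider (blurrier) bin
```

The `width` parameter is the trade-off in one number: a wider bin makes working backward harder, but 21 and 25 year olds (and eventually 21 and 30 year olds) become indistinguishable in the stored data.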

Since all ethical clearances must show that the value of the work being done outweighs the harm/risk, this is a non-trivial question. If you’re dealing with a potentially vulnerable population where remaining anonymous is important (e.g. criminals), it might well be worthwhile. But if you’re dealing with less vulnerable people where the actual risk is pretty trivial in pragmatic terms, it’s harder to justify the lost potential insight.

Take-aways

Pragmatically speaking, most of this is not going to be directly relevant to most people. You’re almost certainly not a researcher, much less a psychology researcher, much less one at my institution. That’s part of why I’ve glossed over the specific details – they don’t matter.

But the reason I think this is important is that understanding the context in which research is done is extremely helpful in understanding why it’s done the way it is. If you understand that researchers are limited in funding and that recruiting people is really hard, you’re better placed to understand why psychology students are so over-represented. If you understand the ethical concerns about working with vulnerable populations, you’re better able to understand the concessions and weaknesses that result.

I’ve deliberately held off on talking about solutions here, mostly because I’m not placed to effect any, and because I sincerely think the issues are of sufficient scope and complexity that solutions are going to have to be either deeply radical, fragmented and individualised, or both. And which solutions you favour will naturally be influenced by circumstances individual to you – a radical communist a hundred miles away from academia is likely to be more in favour of ripping institutions out entirely and rebuilding them from first principles than an incremental capitalist whose whole economic, social and personal identities are built on those systems – one is going to be affected much more directly and substantially by a given solution than the other. So offering any specific suggestion, or even waving at them vaguely, is just not helpful for this format, and likely to start huge arguments.

Arguments that, to be blunt, I do not have time for.

(Post image sourced from here: https://eufunds.me/how-is-a-european-research-council-erc-proposal-evaluated)

[1] Just to address it, yes, universities are, by and large, extremely progressive spaces, or at least progressive-performing. No, I won’t talk about it except insofar as it impacted me directly in material ways (which were almost entirely cases where the actual work was discarded in order to give lip service to progressive talking points), which I think we can all agree is a pain, especially when time is so short, and I would be just as annoyed if it was any other political view regardless of whether I agreed with it or not. I’m not there to hear political views, I’m there to learn about some technique. If I agree with it, it’s a waste of my limited time. If I don’t agree with it, I’m distracted from the actual topic by irrelevant points and it’s a waste of my time.

[2] I think this is part of why work around so-called “vulnerable populations” is often kind of segregated from other populations, and often kind of… bad. Not because the people doing the work are fixated or bad researchers, although that definitely happens too, but because of other processes interfering. Ostensibly those processes are to protect those populations, and I’m not getting into whether they’re necessary or effective, but they do, self-evidently, cause effects on the research process as a direct result of forcing different ways of engaging in that research.

[3] Based entirely on my personal experience, not based on actual statistics.

[4] If I come up with a measure of some construct – let’s say anxiety – I can (and many will, especially if they came up with it while working for a private research foundation) charge people to use my measure, most often an amount per test. This can be useful – the tighter control means we don’t need to worry about it leaking as much, which means people don’t prepare ahead of time to game the test to get the results they want, and you can ensure it’s being used properly, and some of these instruments or protocols are actually pretty sensitive and tricky to implement correctly. It’s one of the main ways these research foundations can make money to fund grants etc.

Psyvacy