Case study: The “Linux for scientists” course
Basic Linux command line skills are essential for researchers working in omics science and other fields that revolve around big data. In collaboration with the Genetic Epidemiology group of the Erasmus University Medical Centre in Rotterdam, the Netherlands, we developed a course called “Linux for scientists”, which has been running since 2011.
In this course we teach practical skills to people with little or no previous knowledge of Linux and the command line. After completing the course, students should be able to:
- manage project data from the command line (i.e. without having to copy files back and forth to the server);
- reformat output from previous analyses and use this as input for a subsequent analysis step;
- write scripts that automate repetitive tasks;
- efficiently run time-consuming analyses like a GWAS without overloading the server.
To this end, the following topics are covered:
- the directory structure and how to find your way around the file system;
- working with a text editor;
- text processing tools like AWK, sed and sort;
- optimising your workflow by scripting in Bash;
- effectively running jobs on a batch queue system like SGE, for those who work on larger servers and clusters.
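To give a flavour of the text-processing topics listed above, here is a minimal sketch of how sed, AWK and sort can be chained to reformat the output of one analysis into input for the next. It is not taken from the course book: the file name `results.txt`, its three columns (SNP, chromosome, p-value), the `chr` prefix and the 0.05 threshold are all assumptions made purely for illustration.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: the file layout and threshold below are
# hypothetical and not taken from the course material.
set -euo pipefail

# Create a small example results file in a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/results.txt" <<'EOF'
SNP CHR P
rs1 chr1 0.20
rs2 chr2 0.001
rs3 chr1 0.03
rs4 chr3 0.0004
EOF

# sed  : strip the "chr" prefix from the chromosome column
# awk  : skip the header, keep rows with P < 0.05, print tab-separated
# sort : order the remaining rows by p-value, smallest first
top_hits=$(sed 's/chr//' "$workdir/results.txt" \
  | awk 'NR > 1 && $3 < 0.05 { print $1 "\t" $2 "\t" $3 }' \
  | LC_ALL=C sort -k3,3n)

printf '%s\n' "$top_hits"

# Clean up the temporary directory.
rm -r "$workdir"
```

Each tool does one small job and passes its output on through a pipe, so individual steps can be swapped or extended without rewriting the rest. The same pipeline could be placed in a Bash script and run on many result files in a loop.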
The course focuses on providing hands-on experience, so those who have already been using a Linux system for some time can skip the parts they feel comfortable with and move on to more advanced concepts like regular expressions, version control and advanced use of a text editor.
For this course we wrote our own course book.
We currently teach this course at the Netherlands Institute for Health Sciences (NIHES) and the Research Summer School in Statistical Omics.