In a nutshell

  • Learn how bisulfite sequencing works
  • Understand how bisulfite-treated reads are mapped to a reference genome
  • Perform basic analyses (call methylated regions, perform basic downstream analyses)
  • Use shell scripting to create reusable data pipelines
  • Visualize results (ready-to-publish)

29.11.2017 - 1.12.2017

Berlin, Germany

The purpose of this workshop is to provide a deeper understanding of how bisulfite-treated DNA is used to analyze the epigenetic layer of DNA methylation. The advantages and disadvantages of bisulfite sequencing and its implications for data analysis will be covered. Participants will be trained to understand bisulfite-treated NGS data, to detect potential problems and errors, and finally to implement their own pipelines. After this course they will be able to analyze DNA methylation and create ready-to-publish graphics.

By the end of this workshop the participants will:

  • be familiar with the Illumina sequencing method
  • understand how bisulfite sequencing works
  • be aware of the mapping problem of bisulfite-treated data
  • understand how bisulfite-treated reads are mapped to a reference genome
  • be familiar with common data formats and standards
  • know relevant tools for data processing
  • automate tasks with shell scripting to create reusable data pipelines
  • perform basic analyses (call methylated regions, perform basic downstream analyses)
  • plot and visualize results (ready-to-publish)
  • be able to reuse all analyses

This workshop has been redesigned and adapted to the needs of beginners in the field of NGS bioinformatics and comprises the following three course modules:

  1. NGS Technologies:
    Different NGS methods will be explained and compared, together with their consequences for data analysis. The most important notations and an overview of various applications will be given.
  2. Practical Bioinformatics (with Linux):
    This module will introduce the essential tools and file formats required for NGS data analysis. It helps participants overcome the first hurdles of working with Linux, an operating system that is unavoidable for NGS analyses.
  3. DNA Methylation Analysis:
    Important first NGS analysis tasks will be explained and performed. This module covers essential knowledge for analyzing bisulfite-treated DNA-Seq data.

Detailed Course Program

NGS Technologies

  • Introduction to sequencing technologies from a data analyst's view
  • Raw sequence files (FASTQ format)
  • Preprocessing of raw reads: Idea of adapter clipping and quality trimming
  • Mapping output (SAM/BAM format)
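
To give a feel for the FASTQ and quality-trimming topics listed above, here is a minimal sketch in Python. It assumes Phred+33 quality encoding (as used by current Illumina instruments); the read, the function names, and the threshold of 20 are made up for illustration and are not taken from the course material.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumes Phred+33 encoding unless another offset is given.
    """
    return [ord(ch) - offset for ch in quality_line]


def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove low-quality bases from the 3' end of a read.

    Bases are dropped from the end until one with a score of at
    least min_q is found. This is a simplified illustration, not
    the algorithm of any particular trimming tool.
    """
    scores = phred_scores(qual, offset)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


# A hypothetical four-line FASTQ record: header, sequence, separator, qualities.
record = ["@read1", "ACGTTGCA", "+", "IIIIII#!"]
header, seq, _, qual = record

trimmed_seq, trimmed_qual = trim_3prime(seq, qual)
print(header, trimmed_seq, trimmed_qual)
```

In this example the last two bases carry very low scores and are trimmed, leaving six high-quality bases.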

Practical Bioinformatics (with Linux)

  • Introduction to the command line and important commands
  • Combining commands by piping and redirection
  • Introduction to bioinformatics file formats (e.g. FASTA, BED) and databases (e.g. UCSC)

Introduction to NGS data analysis

  • Introduction to Bisulfite Sequencing
  • Read Mapping (special alignment method for bisulfite-treated reads)
  • Quality Control
  • Data Formats (e.g. VCF, BED, bedGraph, bigWig)
  • Overview Statistics
  • Tools and Databases (e.g. UCSC tools, BEDtools, UCSC Genome Browser)
  • Visualizing DNA methylation genome-wide (e.g. Circos Plot, R) or in specific regions/genes (e.g. UCSC, IGV)
  • From positions to regions: advantages and disadvantages of segmentation, windowing, and smoothing
  • Identification of Differentially Methylated Regions (DMRs)
  • Non-CpG Analysis (How to find methylated non-CpGs)
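
The mapping problem mentioned in the program can be illustrated with a small Python sketch. Bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines remain C, so a treated read no longer matches the reference directly. The code below is only an illustration under simplifying assumptions: it considers the forward strand only (reads from the reverse strand show G-to-A changes instead), the sequences and function names are hypothetical, and comparing fully C-to-T converted sequences is one common idea rather than a description of how any specific mapping tool works.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite conversion on the forward strand.

    Every C becomes T unless its 0-based position is listed as
    methylated, in which case it is protected and stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def methylation_level(c_count, t_count):
    """Methylation level at one cytosine: C reads / (C reads + T reads)."""
    total = c_count + t_count
    return c_count / total if total else None


reference = "ACGTCCGA"                      # hypothetical reference fragment
read = bisulfite_convert(reference, {1})    # only the C at position 1 is methylated

# Direct comparison: the unmethylated Cs show up as mismatches.
mismatches = sum(a != b for a, b in zip(read, reference))

# After converting both sequences completely, they agree again.
same_after_conversion = bisulfite_convert(read) == bisulfite_convert(reference)

print(read, mismatches, same_after_conversion)
print(methylation_level(c_count=3, t_count=1))
```

Once reads are placed, the methylation level of a cytosine is estimated from the number of reads showing C versus T at that position, which is what `methylation_level` computes.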


Dr. Helene Kretzmer (University Leipzig)
Helene has been working on DNA methylation analyses using high-throughput sequencing since 2011. She is responsible for the bioinformatic analysis of the MMML-Seq study of the International Cancer Genome Consortium (ICGC).

Dr. Christian Otto (Semless NGS)
Christian is one of the developers of the bisulfite read mapping tool segemehl and is an expert on implementing efficient algorithms for HTS data analyses.

Dr. Mario Fasold (ecSeq Bioinformatics)
Mario has worked in the analysis of microarray data since 2007 and has developed several bioinformatics tools such as the Bioconductor package AffyRNADegradation and the Larpack program package. Since 2011 he has specialized in the field of NGS data analysis and has helped analyze sequencing data for several large consortium projects.


The target audience is biologists or data analysts with little or no experience in analyzing NGS data. A fundamental understanding of molecular biology (DNA, RNA, gene expression, PCR, ...) is assumed.

Basic knowledge of Linux and bioinformatics (command-line usage, common commands and tools) is beneficial, but not required. You can prepare with the "Learning the Shell" tutorial.

