

X-ray crystallography is one of the most important tools in structural biology, responsible for over 80% of the biomolecular structures solved to date and deposited in the Protein Data Bank (Berman et al., 2003). The first hard X-ray free-electron lasers (XFELs) capable of high-resolution serial femtosecond crystallography (SFX) measurements came online only in 2009 (Chapman et al., 2011). The recent development of serial crystallography (SX) has brought new capabilities for obtaining time-resolved and static structures of macromolecules, potentially outrunning radiation damage and removing the need for cryogenic cooling. First demonstrated at XFEL facilities, serial crystallography involves the collection of single-snapshot diffraction patterns from individual crystals, at rates limited only by the frequency of the X-ray pulses or the frame rate of the detectors.

A peak-finding algorithm for serial crystallography (SX) data analysis based on the principle of 'robust statistics' has been developed. Methods which are statistically robust are generally less sensitive to departures from model assumptions and are particularly effective when analysing mixtures of probability distributions. For example, these methods enable the discretization of data into a group comprising inliers (i.e. the background noise) and another group comprising outliers (i.e. the Bragg peaks).

Our robust statistics algorithm has two key advantages, which are demonstrated through testing using multiple SX data sets. First, it is relatively insensitive to the exact value of the input parameters and hence requires minimal optimization. This is critical for the algorithm to be able to run unsupervised, allowing for automated selection or 'vetoing' of SX diffraction data. Secondly, the processing of individual diffraction patterns can be easily parallelized. This means that it can analyse data from multiple detector modules simultaneously, making it ideally suited to real-time data processing. These characteristics mean that the robust peak finder (RPF) algorithm will be particularly beneficial for the new class of MHz X-ray free-electron laser sources, which generate large amounts of data in a short period of time.
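The separation of data into inliers (background) and outliers (candidate peaks) can be illustrated with a minimal sketch. It is not the RPF algorithm itself: it substitutes a generic robust estimator, the median and the median absolute deviation (MAD), for whatever estimator RPF uses. The function names `robust_outlier_mask` and `find_outliers_per_module`, the threshold `k` and the toy intensity values are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def robust_outlier_mask(pixels, k=6.0):
    """Flag pixels more than k robust sigmas above the median.

    pixels: flat list of intensities from one detector module.
    k: hypothetical threshold in units of the robust scale estimate.
    """
    centre = median(pixels)
    # Median absolute deviation; the factor 1.4826 makes it comparable
    # to a standard deviation when the background is Gaussian.
    mad = median(abs(p - centre) for p in pixels)
    scale = 1.4826 * mad
    if scale == 0:
        # Degenerate case: at least half of the pixels share one value.
        return [p > centre for p in pixels]
    return [(p - centre) / scale > k for p in pixels]


def find_outliers_per_module(modules, k=6.0):
    """Process each detector module independently.

    Threads are used only to show that modules do not depend on one
    another; a process pool or compiled code would be needed for a
    real speed-up in CPython.
    """
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda m: robust_outlier_mask(m, k), modules))


if __name__ == "__main__":
    # Toy data (assumed): one module with a single bright pixel,
    # one module containing background only.
    module_a = [10, 11, 9, 10, 12, 10, 9, 11, 250, 10]
    module_b = [5, 6, 5, 4, 5, 6, 5, 4, 5, 6]
    masks = find_outliers_per_module([module_a, module_b])
    print([i for i, flag in enumerate(masks[0]) if flag])  # [8]
    print(any(masks[1]))  # False
```

Because the median and MAD are barely affected by the single bright pixel, the background estimate stays close to that of the peak-free pixels, and a wide range of `k` values gives the same classification in this toy case. A mean and standard deviation computed from the same data would be pulled upwards by the outlier.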
