Data analysis at the European XFEL
Provided on a best-effort basis, and may go out of date!
Links to useful resources:
Maxwell cluster (offline analysis)
Offline analysis
Log on to the Maxwell cluster: max-exfl.desy.de
User notes on remote access, installed software and related topics: Maxwell cluster
Data storage and computing are physically located at DESY. Data must be moved to DESY before it is available on Maxwell.
Subscribe to the maxwell-user mailing list for updates on things such as cluster status and file system problems
Users can self-subscribe here: https://lists.desy.de/sympa/info/maxwell-user
Data location: /gpfs/exfel/exp/<instrument>/<proposal_cycle>/<proposal_id>
Example: /gpfs/exfel/exp/SPB/201701/p002012/
/raw = raw data
/scratch = temporary data (really scratch - may be wiped as needed)
/usr = where to put your scripts (synchronised between online and offline, limited space)
/proc = location for data output by XFEL calibration pipeline
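The directory layout above can be wrapped in a small helper so that analysis scripts do not hard-code paths. This is a minimal Python sketch, not an official tool: the name offline_proposal_dir is made up for illustration, and the six-digit zero padding of the proposal number is inferred only from the example path above.

```python
from pathlib import PurePosixPath

# Root of the offline (Maxwell) data tree, as given above.
OFFLINE_ROOT = PurePosixPath("/gpfs/exfel/exp")

# The four sub-directories listed above.
SUBDIRS = ("raw", "scratch", "usr", "proc")


def offline_proposal_dir(instrument, cycle, proposal, subdir=None):
    """Return the offline path for a proposal, optionally one sub-directory.

    Assumption: the proposal directory is 'p' plus the proposal number
    zero-padded to six digits, as in the example p002012.
    """
    path = OFFLINE_ROOT / instrument / str(cycle) / f"p{int(proposal):06d}"
    if subdir is not None:
        if subdir not in SUBDIRS:
            raise ValueError(f"unknown sub-directory: {subdir!r}")
        path = path / subdir
    return path


print(offline_proposal_dir("SPB", 201701, 2012))         # /gpfs/exfel/exp/SPB/201701/p002012
print(offline_proposal_dir("SPB", 201701, 2012, "raw"))  # /gpfs/exfel/exp/SPB/201701/p002012/raw
```

Rejecting unknown sub-directory names catches typos early, before a job is submitted to the queue.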
The batch queue is managed by Slurm; the queue (Slurm partition) for XFEL analysis is upex
Instructions: Getting started and more detailed
Example: > sbatch -p upex --wrap hostname
Submitted batch job 1516
> cat slurm-1516.out
max-exfl001.desy.de
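The same submit-and-check cycle can be driven from a script. The following Python sketch assumes it is run on a Maxwell login node where sbatch is on the PATH; the function names are illustrative. It relies only on the "Submitted batch job N" message and the slurm-N.out file name shown in the example above.

```python
import re
import subprocess


def parse_job_id(sbatch_output):
    """Extract the job number from sbatch's 'Submitted batch job N' line."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"unexpected sbatch output: {sbatch_output!r}")
    return int(match.group(1))


def submit_wrapped(command, partition="upex"):
    """Submit a one-line command with 'sbatch --wrap'.

    Returns the job id and the name of the output file that Slurm
    writes by default in the submission directory.
    """
    result = subprocess.run(
        ["sbatch", "-p", partition, "--wrap", command],
        capture_output=True, text=True, check=True,
    )
    job_id = parse_job_id(result.stdout)
    return job_id, f"slurm-{job_id}.out"
```

For example, submit_wrapped("hostname") would submit the job from the example above and return its job id together with the log file name to read once the job has finished.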
Don’t run big jobs on the login nodes; request an interactive node instead
Example: > salloc -p upex -t 10:00:00
then ssh directly to the node allocated to you
Docker containers are used to distribute Karabo for use on Maxwell: Instructions
Note: you first need to request access to Docker containers by sending an email to maxwell.service@desy.de,
otherwise you will get the error “cannot connect to the docker daemon. is the docker daemon running on this host?”
Remote access to the cluster is possible through ssh://bastion.desy.de
Graphical connections are available using a FastX client.
In an emergency, there is also a web browser interface but performance is not as good.
Transferring large data back home is best done with Globus Online
Available software on Maxwell
An extensive software stack is available on Maxwell
See the list of installed software and photon science specific packages
The standard modules installed on Maxwell can be listed with
> module avail
This includes python3, IDL, Matlab and many other common programs
A public version of the CFEL software stack is available:
> source /gpfs/cfel/cxi/common/public/cfelsoft-rh7-public/conda-setup.sh
> conda info --envs
# conda environments:
#
base
ana-1.4.2
conda_build
crystallography
> conda activate crystallography
The normal CFEL stack is also available for CFEL people.
To use the public stack, CFEL people first have to unload the internal CFEL version:
> unset MODULEPATH   # to start from a clean slate
Rapid disk-based analysis (“online”) (at Schenefeld facility)
Data sits on the online cluster until it is moved to the offline cluster at DESY (which must be done manually)
Machines are on the private control network; access is only available from certain machines physically at XFEL
Shared access to 6 nodes. Usage starts and stops with your 12-hour experiment shift (!!!)
(exflonc06, exflonc07, exflonc08, exflonc09, exflonc10, exflonc11)
Off-shift access to one machine for installation and testing
SPB/SFX: exflonc05
FXE: exflonc12
Data is located in /gpfs/p<proposal_id>/(raw|usr|proc|scratch)
Note different path to the offline cluster (!!!)
/usr folder synchronised between online and offline, limited space so use for scripts and software but not data.
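Because the data path differs between the two clusters, a script that runs in both places needs to know where it is. One hedged way to do this in Python is to look at the hostname. The sketch below assumes that online nodes are named like exflonc06 and Maxwell nodes like max-exfl001, as in the examples on this page; the function names are illustrative and the path templates are copied verbatim with their placeholders left unfilled.

```python
import re

# Path templates copied from this page; <...> placeholders are not filled in.
TEMPLATES = {
    "online": "/gpfs/p<proposal_id>",
    "offline": "/gpfs/exfel/exp/<instrument>/<proposal_cycle>/<proposal_id>",
}


def cluster_for_host(hostname):
    """Guess the cluster from a hostname: 'online', 'offline' or 'unknown'.

    Assumption: online nodes are named exfloncNN and Maxwell nodes
    max-exfl or max-exflNNN, as in the examples on this page.
    """
    short = hostname.split(".")[0].lower()
    if re.fullmatch(r"exflonc\d+", short):
        return "online"
    if re.fullmatch(r"max-exfl\d*", short):
        return "offline"
    return "unknown"


def data_template(hostname):
    """Return the data path template for the host, or None if unknown."""
    return TEMPLATES.get(cluster_for_host(hostname))


print(cluster_for_host("exflonc06"))            # online
print(cluster_for_host("max-exfl001.desy.de"))  # offline
```

In a real script the hostname would typically come from socket.gethostname().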
Real time analysis
Needs to be integrated with Karabo operation. Experts only at the moment. See documentation from XFEL
OnDA is running but in development mode at the moment
Reading data files
EuXFEL data is saved in highly structured HDF5 files.
Example data can be found in /gpfs/exfel/data/scratch/example_data/
A collection of file readers is available from this repository:
https://stash.desy.de/projects/ELBE/repos/euxfel/browse
mirrored here for push access for those without a stash account
https://github.com/antonbarty/EuXFEL
Contributions welcome - there is no point reinventing the wheel here. Please email for push access.
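Because the files are deeply nested, a useful first step with any new file is to list what it contains. The sketch below is one way to do that in Python with the h5py package, which is assumed to be installed. It is not taken from the repositories above, the helper names are illustrative, and no particular group or dataset names are assumed.

```python
def iter_datasets(group, prefix=""):
    """Yield (path, shape, dtype) for every dataset below a group.

    Works on an open h5py.File or h5py.Group; anything that has an
    items() method is treated as a group and descended into.
    """
    for name, node in group.items():
        path = f"{prefix}/{name}"
        if hasattr(node, "items"):
            # A group: recurse into it.
            yield from iter_datasets(node, path)
        else:
            # A dataset: report its shape and data type.
            yield path, getattr(node, "shape", None), getattr(node, "dtype", None)


def list_datasets(filename):
    """Open an HDF5 file read-only and list all datasets in it."""
    import h5py  # imported here so iter_datasets can be used without h5py

    with h5py.File(filename, "r") as f:
        return list(iter_datasets(f))
```

Calling list_datasets on one of the example files and printing each tuple gives a quick overview of the structure before writing a dedicated reader.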
Detector data
AGIPD is a complex detector in which each pixel has 352 memory cells, each with a 3-stage dynamically switching gain.
It is effectively 352 detectors in one. Data is saved directly from the detector and converted to linearised units in
software post-processing. This lossless approach allows improved calibrations and photon conversion to be
applied after the data is taken, but requires a script to be run in order to perform that conversion.
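To illustrate what converting to linearised units means for a detector with three gain stages, here is a toy Python sketch. It is not the XFEL calibration pipeline: it assumes a generic offset-and-scale model with one offset and one scale factor per gain stage, and all constants are made up. How the active gain stage is identified in the raw data is not covered here.

```python
def linearise(raw_value, gain_stage, offsets, gains):
    """Convert one raw value to linearised units (toy model).

    raw_value  : raw analogue value for one pixel and memory cell
    gain_stage : 0, 1 or 2 (which of the three gain stages was active)
    offsets    : per-stage offsets for this pixel and memory cell
    gains      : per-stage scale factors for this pixel and memory cell
    """
    if gain_stage not in (0, 1, 2):
        raise ValueError("gain_stage must be 0, 1 or 2")
    return (raw_value - offsets[gain_stage]) * gains[gain_stage]


# Made-up constants for a single pixel and memory cell.
offsets = (100.0, 200.0, 300.0)
gains = (1.0, 10.0, 100.0)

print(linearise(150.0, 0, offsets, gains))  # 50.0
print(linearise(250.0, 1, offsets, gains))  # 500.0
```

Since the raw values are kept, the constants can be improved later and the conversion simply rerun.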
Example data can be found in /gpfs/exfel/data/scratch/example_data/
Sample calibration script is /gpfs/exfel/data/scratch/example_data/calibrate.py
python calibrate.py --input /gpfs/exfel/data/scratch/example_data/r0283/RAW-R0283-AGIPD*0.h5 \
--output ../offline_data/calibrated_agipd/ \
--local-cal-store ../offline_ana/agipd_store.h5 \
--mem-cells 30 --cores 64 \
--instance SPB_DET_AGIPD1M-1 \
--type correct --nodes 5 \
--partition upex
Running this script requires access to docker containers (see note above in offline analysis section)
Notes from XFEL: offline_calibration.pdf
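When the calibration has to be run for many runs, it can help to assemble the command line in a script. The Python sketch below only rebuilds the command shown above from its parts; it uses no options beyond those in that example, the function name is illustrative, and the defaults simply repeat the example values. Passing several files after --input mirrors what the shell does when it expands the wildcard in the example.

```python
def calibrate_command(input_files, output_dir, cal_store, mem_cells=30,
                      cores=64, instance="SPB_DET_AGIPD1M-1", nodes=5,
                      partition="upex", script="calibrate.py"):
    """Return the calibrate.py command line as a list of arguments."""
    return [
        "python", script,
        "--input", *input_files,
        "--output", output_dir,
        "--local-cal-store", cal_store,
        "--mem-cells", str(mem_cells),
        "--cores", str(cores),
        "--instance", instance,
        "--type", "correct",
        "--nodes", str(nodes),
        "--partition", partition,
    ]
```

The input list can be produced with glob.glob using the same pattern as in the example, and the resulting list can be passed to subprocess.run or printed with shlex.join for inspection.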
Note:
The first experiments at EuXFEL are in mid-September 2017. Things are changing very quickly and this page may go out of date. Please be patient regarding any errors.