Alpos is an object-oriented data-to-theory comparison and fitting tool
The project homepage is found at http://www.desy.de/~britzger/alpos/
Alpos is an object-oriented data-to-theory comparison tool. The program is ideally suited for fits of theory parameters, for statistical analyses of theory predictions, and for data combinations. The modular, object-oriented architecture of the code allows for an easy implementation of new data sets and theory predictions, as well as of new analysis tools or extensions of existing ones. The concept of Alpos involves a modular and transparent implementation of theory predictions, which enables a high level of consistency between the different components of a prediction while still providing a very user-friendly interface.
The interface for new contributions is clearly defined in an object-oriented manner through inheritance. New contributions to Alpos may be so-called functions, tasks and datasets.
New theory functions are theoretical predictions which take single parameters and/or other functions as input. Typically these are predictions for particular measurements, but they may also be, for instance, PDF or alpha-s evolution codes. A function calculates its output values from the current values of all its inputs in the virtual function Update().
Tasks are classes which provide an Execute() routine and may perform any kind of operation on the theory predictions or datasets. Tasks are allowed to access all present theory values and also the datasets. Typical tasks are fitting routines or statistical analyses; other tasks may be, for instance, print-out routines, plotting tools, or the write-out of results to disk in a particular format.
Within Alpos, datasets are input data files which represent a measurement with all its uncertainties. The data card specifies how the uncertainties are treated in the tasks, and different phase-space regions can easily be selected for detailed analyses.
More details are found on the main homepage http://www.desy.de/~britzger/alpos/ . Developers may mirror the project homepage to their own webspace, which may in parts provide more up-to-date documentation.
Mail to daniel.britzger@desy.de
The source code can be checked out from a public svn repository. Please ask for the repository url by mail: daniel.britzger@desy.de
Check out the package from svn using svn co <package-url>. For the installation, all packages mentioned above have to be installed first.
QCDNUM v17 needs to be compiled with the option -fPIC, i.e. add this option in the four makelib files, like: gfortran -c -Wall -O2 -fPIC -Iinc src/*.f.
For installation, please use cmake:
$ cmake .
Add further arguments for your local installation:
+ -DCMAKE_INSTALL_PREFIX=<your-dir>
+ -DQCDNUM_PREFIX=<your-dir>
+ -DLHAPDF_PREFIX=<your-dir>
+ -DFNLO_PREFIX=<your-dir>
+ -DAPFEL_PREFIX=<your-dir>
or, if everything is installed in the same “prefix” directory:
+ -DPREFIX=<your-dir>
It may be helpful to define these paths in a dedicated simple shell script.
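Such a script could look like the following sketch; all paths are placeholders for your local installation, and the final cmake call is only echoed here so you can inspect it first:

```shell
#!/bin/sh
# configure-alpos.sh -- collect the local install prefixes in one place.
# All paths below are placeholders; adapt them to your system.
PREFIX=$HOME/local
CMAKE_ARGS="-DCMAKE_INSTALL_PREFIX=$PREFIX"
CMAKE_ARGS="$CMAKE_ARGS -DQCDNUM_PREFIX=$PREFIX"
CMAKE_ARGS="$CMAKE_ARGS -DLHAPDF_PREFIX=$PREFIX"
CMAKE_ARGS="$CMAKE_ARGS -DFNLO_PREFIX=$PREFIX"
CMAKE_ARGS="$CMAKE_ARGS -DAPFEL_PREFIX=$PREFIX"
echo cmake . $CMAKE_ARGS   # replace 'echo' by the real call once the paths are correct
```

Keeping the prefixes in one script makes reconfiguration after a fresh checkout a one-line operation.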
For Eigen, either copy the directory ‘Eigen’ into the directory apccpp or create a suitable symbolic link there. Alternatively, you can define the path to your Eigen directory in CMakeLists.txt, like:
include_directories(${EIGEN_INCLUDE_DIR})
or, for example:
include_directories(/cvmfs/atlas.cern.ch/repo/sw/software/x86_64-slc6-gcc48-opt/20.1.2/AtlasCore/20.1.2/External/AtlasEigen/x86_64-slc6-gcc48-opt/pkg-build-install-eigen)
Now continue with the compilation:
$ make
$ make install
There are known compilation problems on macOS which are not yet fixed.
It may happen that cmake picks up undesired default compilers. In that case, set the variables CC and CXX, e.g. like:
$ export CC=`which gcc`
$ export CXX=`which g++`
or directly specify their locations.
This produces the executable ./src/alpos. To run Alpos, stay in the trunk directory and type:
$ ./src/alpos tutorial/1.welcome.str
This executes Alpos with a very simple steering card. This first example prints the welcome message and executes the welcome task list of the steering. A few info and warning statements can be ignored.
A typical Alpos steering is based on one main steering file, which is also the input to the Alpos class. In the main steering file, further steering files (.dat) are specified for the datasets. Using the option '>>' from read_steer, the main steering file may also be subdivided into multiple files. Furthermore, the steering parameters for the Alpos theory have to be specified in the steering. The main steering file contains:
- Some global Alpos settings
- The specification of the data files with their respective theory predictions (DataTheorySets)
- The tasks to be executed (Tasks)
- The parameters for the tasks (mind: these are not (Alpos) theory parameters)
- The functions to be initialized (InitFunctions); the predictions for the datasets, which are also functions, are given together with the datasets
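Schematically, a main steering file combines these blocks. The fragment below is only a sketch: the column headers and entries are made-up placeholders, not a working configuration.

```
# Sketch of a main steering file (placeholder values only)

# some global alpos settings
GlobalVerbosity   Info

DataTheorySets {{
   # one line per dataset (.dat file) together with its theory prediction
}}

Tasks {{
   # the tasks to be executed, e.g. a fitting or statistics task
}}

InitFunctions {{
   # functions needed in addition to the dataset predictions
}}
```

Complete and documented examples of the actual syntax are provided in ./tutorial.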
Well documented example steering files are available in the directory ./tutorial.
A brief summary and collection of comments on the available data files.
An Alpos dataset consists of several different parts and is typically stored in a single file with the extension .dat. The constituents are:
- Some general description
- Alpos parameters for the Alpos theory function(s)
- The Data table with values and errors
- The specification of the Errors
- Optionally also Subsets, Cuts, TheoryFactors and correlation matrices may be defined
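Put together, a minimal data card could be sketched as follows; all names, columns and numbers below are placeholders for illustration, not taken from a real dataset.

```
# Some general description of the measurement
Description "Example dataset (placeholder)"

# Alpos parameters for the theory function(s)
TheoryFunction "MyPrediction"          # placeholder function name

# Data table with values and errors (columns are placeholders)
Data {{
   q2min   ptmin   Sigma   stat.(%)
   150     7       12.3    2.1
}}

# Specification of the errors
ErrorUnit "Percent"
Errors {{
   ErrorName   Column       Type   Nature
   "Stat"      "stat.(%)"   SC     P
}}
```

The Errors table and the optional blocks are described in detail below.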
Within alpos, each error is handled as an instance of the AError class with a unique name. This name is composed of the ErrorSet and ErrorName (as specified in the table ‘Errors’) as <ErrorSet>_<ErrorName>. Errors with identical (full) name are finally assumed to be correlated among different datasets, taking their specified correlation coefficient into account.
All errors have to be specified in the table Errors. Therein, the columns ErrorName, Column and Type must be specified. An example could look like:
ErrorSet "ATLAS Run-I"
ErrorUnit "Percent"
Errors {{
ErrorName Column Type Nature
"Stat" "stat.(%)" SC P
"JES" "JES(up):JES(dn)" 0.5 M
RCES "EHFS(up):EHFS(dn)" "" M
Lumi 2.5 1 M
Trig 1.2 EY1 M
}}
In Errors, each row specifies an error source. The value ErrorName can be chosen freely (use quotation marks if it contains spaces).
The column Column may specify one or two (separated by :) columns of the Data table which contain the size of the errors. If only one column is given, the error is assumed to be symmetric; otherwise, the first column specifies the up and the second column the down uncertainty. Alternatively, the value of this error source may be specified directly (e.g. 2.5 for a 2.5% luminosity uncertainty, if ErrorUnit is set to Percent).
The units in the Data-table are specified for all error sources with the key ErrorUnit (options are: Percent, Relative, Absolute).
If no type specification is given (i.e. if the table entry is "" or some specifier is missing), the default error type EYM1 is used, which means it is an “experimental multiplicative systematic uncertainty with a correlation coefficient of 1, where no correlation matrix is specified”. This means that essentially only the letters S, T, C and A and correlation coefficients smaller than 1 have to be given explicitly.
The column Nature is free for user extensions and anything may be specified therein.
Important
For backward compatibility, the column Type may also be named Correlation.
Important
If the first error source contains the substring Stat, it is assumed to be a statistical uncertainty (unless Y is explicitly specified in Type).
If the type specifier of an error contains the key letter M, then a covariance or correlation matrix has to be specified. Therefore, the three keys <ErrorName>_Matrix_Format, <ErrorName>_Matrix_Type and <ErrorName>_Matrix have to be given, where <ErrorName> denotes the name of the error as given in the Errors table.
For instance, the correlation coefficients of the statistical uncertainty Stat may be given like:
Stat_Matrix_Format "SingleValues" # "Matrix" or "SingleValues" or specify single value only
Stat_Matrix_Type "Correlation" # 'Covariance', 'Correlation' or 'CorrelationPercent'
Stat_Matrix {{
q2min ptmin q2min ptmin values
150 7 150 11 -0.224
[...]
}}
The header of the table Stat_Matrix must list enough columns to uniquely identify a row, and each of these columns is given twice. The values in each row must match the values in Data. The column values is mandatory and contains the correlation coefficients or the covariances.
Alternatively, the matrix may be specified directly in either full or half-matrix notation. Mind that the first row of the table Stat_Matrix needs to be left empty. The values may specify either the correlation coefficients or the covariances. For example:
Stat_Matrix_Format "Matrix" # "Matrix" or "SingleValues" or specify single value only
Stat_Matrix_Type "Correlation" # 'Covariance', 'Correlation' or 'CorrelationPercent'
Stat_Matrix {{
# here an empty line is important !
1
-0.5 1
0.1 -0.5 1
[...]
}}
Todo
Specify theory factors:
TheoryFactors { }
Specify subsets:
Subsets { }
Specify cuts:
Cuts { }
The main() function instantiates a single Alpos object, which takes the steering file as input parameter. In the Alpos constructor the TheoryHandler is initialized. Afterwards, Alpos simply executes all tasks as specified in the steering by calling their Init() and Execute() functions one after the other.
The Alpos class reads in the steering and prepares and executes the initialization of the TheoryHandler. Then the tasks are executed by Alpos one after the other.
The TheoryHandler is a global singleton class which provides access to the instances and values of all functions and parameters. The TheoryHandler can be accessed via:
static TheoryHandler* TheoryHandler::Handler();
In order to allow simplified access to parameters and functions, precompiler functions (preprocessor macros) are defined in ATheory.h. These serve two purposes: access to functions/parameters from within functions, and from within tasks.
These precompiler functions are:
Use within functions:

   Precompiler function   Return value     Usage
   PAR(parname)           double           Access the value of an input parameter, which is a requirement of the function (the first element of VALUES(parname)).
   PAR_S(parname)         std::string      Access the value of a (string) parameter, which is a requirement of the function.
   VALUES(parname)        vector<double>   Access the values of a parameter, which is a requirement of the function.
   UPDATE(parname)        void             Identical to PAR(parname); used to update a parameter, e.g. to later use QUICK within the Update() function.
   CHECK(parname)         bool             Check if a parameter has changed and this function thus obtained the IsOutdated from that parameter.
   CONST(parname)         void             Set a parameter to constant, i.e. the parameter is not allowed to change any longer.
   SET(parname,V,E)       void             Set the value of any input parameter.
   SET_S(parname,V,E)     void             Set the value of any input (string) parameter.
   QUICK(X,A)             vector<double>   Quick access to the values of an input function.
   QUICK_VAR(par,n,...)   vector<double>   Quick access to the values of an input function.
   QUICK_VEC(X,Y)         vector<double>   Quick access to the values of an input function.

Use within tasks:

   Precompiler function   Return value     Usage
   PAR_ANY("X")           double           Access any parameter/function value. The argument X is passed by value (i.e. may require quotation marks).
   PAR_ANY_S("X")         string           Access any string parameter. The argument X is passed by value.
   VALUES_ANY("X")        vector<double>   Access any parameter/function values. The argument X is passed by value.
   QUICK_ANY("X",val)     vector<double>   Quick access to any parameter/function value. The argument X is passed by value.
   SET_ANY("X",V,E)       void             Set the value of any parameter. The argument X is passed by value.
   SET_ANY_S("X",V,E)     void             Set the value of any string parameter. The argument X is passed by value.
The full access to the parameters is available through TheoryHandler::GetParmD(string) or TheoryHandler::GetFuncD(string)
Define your own member functions for easily readable and maintainable code and call them from Init() or Update().
Use singleton functions to wrap Fortran routines.
read_steer is a tiny tool to read steering values from one or more steering files. The class reads in values which are stored in a file; new variables can be declared, read in and included without recompilation. Its main features are:
- Multiple files can be read in and handled individually or together.
- Namespaces can be defined (e.g. the same variable name in different namespaces).
- Variables within a steering file can be defined similarly to shell scripts.
- Command-line arguments can be parsed and can supersede values from files.
- Easy access via pre-processor commands.
- Other files can be included into a steering file.
The full documentation and examples are found in fastnlotk/read_steer.h.
This HTML documentation is built on sphinx using rst (reStructuredText) input. The source files of it are stored in the Alpos root-directory at docs/sphinx and docs/sphinx/source.
To (re-)build this documentation you need sphinx to be installed and then type:
cd docs/sphinx
make html
You find the index page in html/index.html.
Details about sphinx can be found at http://www.sphinx-doc.org . Details about the rst markup syntax can be found at http://docutils.sourceforge.net/rst.html . Details about the alabaster theme are found at https://pypi.python.org/pypi/alabaster .
The most useful summary page of commands is at rstdemo.html .
- Ongoing developments
- Contributors: D. Britzger, D. Reichelt, D. Savoiu, K. Rabbertz, A. Kaur, G. Sieber
- New sphinx Docu
- Function for APPLGrid (not committed)
- fastNLO interface more flexible
- Full APFEL functionality
- Switches for ‘threshold-corrections’ for fastNLO
- Lots of minor bugfixes
- Analytic calculation of nuisance parameters
- Updated interface for errors with column ‘type’ and specifiers like ‘EYN1’ or ‘TSM0.5’
- ...
- ...
- ...
- ...
- v0.4, 25. Sep 2015, contact: daniel.britzger@desy.de
- Contributors: D. Britzger, D. Reichelt, K. Bjoerke, K. Rabbertz
- Enable PDF fits: Tested HERAPDF1.0 and HERAPDF2.0 against HERAFitter
- New chisq definitions with analytic calculation of nuisance parameters
- Write out PDF root-files for plotting (SavePDFTGraph)
- 2D contour scans (Contour)
- 1D chisq scans (Chi2Scan)
- Apply cuts on data
- Enable to exclude datapoints
- access PDF uncertainties from LHAPDF6
- Specify uncertainties directly as numerical value
- Pass steering parameters in command line
- Enable to pass error-‘nature’ through code
- bugfix for covariance matrix of subsets
- Clearer getters for errors (uncorr, stat, matrix-type)
- Dummies for APC, Bi-log PDF parameterization
- Full interface to EPRC (EPRC)
- Print error summary (PrintErrors)
- v0.3, 24. Jul 2015, contact: daniel.britzger@desy.de
- version for summer students
- Tested and verified inclusive jet fits
- ...
- v0.2, 15. Feb 2015, contact: daniel.britzger@desy.de
- Update with relevant feature to exclude datapoints (to come) and study ranges of data points
- Verbosity steerable
- error averaging steerable
- Subsets of datapoints for each datatheory sets
- AStatAnalysis: (chisq, pulls, p-value) also for ‘subsets’
- Improved printout and verbosity-level
- One minor bugfix in ARegisterRequirements() (default values no longer needed)
- Calculation of Pull values
- New Chisq’s for uncor and stat uncertainties only
- Simpler access to dataTheorysets from TheoryHandler: i.e. return map<name,pair<AData*,AFuncD*>> // for 'full datasets' and return map<name,map<name<pair<AData*,AFuncD*>>> // for subsets
- Init subsets in TheoryHandler