Fit Package
The Fit package provides parameter estimates from Unbinned Maximum Likelihood (UML) fits, where the likelihood is described in terms of PDFs from the Pdf package. Chi-square fits are also provided. Function descriptions from the Funct package can also be used.

Conventions
  • The examples below assume the directive:
    using namespace Fit;
    Otherwise, the namespace Fit should be specified where appropriate.
Definition of data samples
  • Data samples have to be described before fitter classes can use them for parameter estimates. Two types of samples are provided: unbinned event samples and binned data samples.
  • Unbinned event samples (defined in Fit/Sample.h)
    • An unbinned event sample is a collection of sets of variables. Each entry in the collection represents an event, and for each event a set of characteristic variables is stored. Variables can be of any type (most commonly double and int). The unbinned sample implementation is based on the tuple class from the boost C++ library. The interface is inspired by the STL (the C++ Standard Library) and can be easily understood from the header file.
    • The class Sample<T> is a template for an unbinned event sample. The template argument T has to behave as a type sequence class, as defined by the boost MPL library. Most users will never have to deal with that detail, since the list of types can be derived directly from the PDF class used as the model for the sample. A code example makes this clearer:
// each event has one double variable
Sample< Gaussian::types > s1;
 
typedef Independent< Poissonian, Gaussian > MyPdf;
// each event has two variables:
// one int and one double
Sample< MyPdf::types > s2;
Sample< MyPdf::types >::tuple & ntp = s2.extend();
ntp.get<0>() = 1000; // set the first variable (int)
ntp.get<1>() = 3.14; // set the second variable (double)
  • Binned data samples (defined in Fit/Sample.h)
    • A binned data sample is a sample of data accumulated into bins. For each bin, an object of type Measurement<T> is stored (T is double by default). It has two data members of type T, content and error, similar to std::pair but with names better suited to a statistics application.
    • Example:
// create a binned sample with 10 bins
SampleErr< double > s( 10 );
for( SampleErr< double >::iterator i = s.begin(); i != s.end(); ++i ) {
  i->content = 10;
  i->error = sqrt( 10.0 );
}
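
As a rough picture of the two sample kinds described above, the following minimal sketch uses only the C++ standard library. It is not the package's implementation: Event, UnbinnedSample, Measurement, BinnedSample, extend and makePoissonBins are hypothetical names chosen for illustration, std::tuple stands in for the boost tuple, and the Poisson error (square root of the content) is an assumption that matches the example above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-in for an unbinned sample:
// each event stores one int and one double.
typedef std::tuple<int, double> Event;
typedef std::vector<Event> UnbinnedSample;

// Append a default-initialized event and return a reference to it,
// so that the caller can fill its variables in place.
// The reference is only valid until the next call to extend.
inline Event& extend(UnbinnedSample& sample) {
  sample.push_back(Event());
  return sample.back();
}

// Hypothetical stand-in for the content/error pair stored per bin.
template <typename T = double>
struct Measurement {
  T content;
  T error;
};

typedef std::vector<Measurement<> > BinnedSample;

// Build a binned sample from raw counts, assuming Poisson errors
// (error = sqrt(content)).
inline BinnedSample makePoissonBins(const std::vector<double>& counts) {
  BinnedSample bins;
  bins.reserve(counts.size());
  for (std::size_t i = 0; i < counts.size(); ++i) {
    assert(counts[i] >= 0.0);
    Measurement<> m;
    m.content = counts[i];
    m.error = std::sqrt(counts[i]);
    bins.push_back(m);
  }
  return bins;
}
```

An event is filled with std::get<0>( e ) and std::get<1>( e ) on the reference returned by extend, in the same spirit as ntp.get<0>() and ntp.get<1>() above.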

Likelihood functions
  • Given n measurements of m variables each, stored in a Sample<...> object, a Likelihood class evaluates the log of the likelihood function for a given PDF type. The variable types of the sample and of the PDF have to match or be convertible.
  • Example:
typedef Independent<Flat, Gaussian> MyPdf;
MyPdf pdf( ... ); // instantiate the pdf
Sample< MyPdf::types > sample;
// fill the sample here somehow

Likelihood< MyPdf > likelihood( pdf );
double logLike = likelihood.log( sample );
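
For reference, the log of a likelihood is the sum over the events of the log of the PDF evaluated at each event. The sketch below shows that sum for a one-variable sample. It is independent of the package: GaussianPdf and logLikelihood are hypothetical names, and the sample is simplified to a vector of double.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical one-dimensional Gaussian PDF (not the Pdf package class).
struct GaussianPdf {
  double mean;
  double sigma;
  double operator()(double x) const {
    const double pi = 3.14159265358979323846;
    const double z = (x - mean) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
  }
};

// Log-likelihood: sum over the events of log f(x_i).
template <typename Pdf>
double logLikelihood(const Pdf& pdf, const std::vector<double>& sample) {
  double sum = 0.0;
  for (std::size_t i = 0; i < sample.size(); ++i) {
    sum += std::log(pdf(sample[i]));
  }
  return sum;
}
```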

Extended likelihood functions
  • Given n measurements of m variables each, stored in a Sample<...> object, an ExtendedLikelihood class evaluates the log of an extended likelihood function built from several PDF types. The variable types of the sample and of the PDFs have to match or be convertible. Up to four PDFs can be passed as template arguments in the current version.
  • Example:
typedef Independent<Gaussian, Gaussian> Signal;
typedef Independent<Flat, Flat> Background;
Signal sig( ... ); // instantiate the signal pdf
Background bkg( ... ); // instantiate the background pdf
Sample< Signal::types > sample;
// fill the sample here somehow

ExtendedLikelihood2< Signal, Background > like( sig, bkg ) ;
// set the signal and background yield values
double yields[ 2 ] = { 10, 100 };
like.setYields( yields );
double logLike = like.log( sample );
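
A common form of the extended log-likelihood for two components is log L = -(n_sig + n_bkg) + sum_i log( n_sig f_sig(x_i) + n_bkg f_bkg(x_i) ), up to a constant that does not depend on the yields. The sketch below evaluates this expression for a one-variable sample. Whether the package uses exactly this normalization is an assumption, and FlatPdf and extendedLogLikelihood are hypothetical names, not package classes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical flat PDF on [min, max] (not the Pdf package class).
struct FlatPdf {
  double min;
  double max;
  double operator()(double x) const {
    return (x >= min && x <= max) ? 1.0 / (max - min) : 0.0;
  }
};

// Extended log-likelihood for a signal and a background component:
//   -(n_sig + n_bkg) + sum_i log( n_sig * f_sig(x_i) + n_bkg * f_bkg(x_i) )
// yields[0] is the signal yield, yields[1] the background yield.
template <typename Sig, typename Bkg>
double extendedLogLikelihood(const Sig& sig, const Bkg& bkg,
                             const double yields[2],
                             const std::vector<double>& sample) {
  assert(yields[0] >= 0.0 && yields[1] >= 0.0);
  double sum = -(yields[0] + yields[1]);
  for (std::size_t i = 0; i < sample.size(); ++i) {
    const double x = sample[i];
    sum += std::log(yields[0] * sig(x) + yields[1] * bkg(x));
  }
  return sum;
}
```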

Chi-square functions
  • A Chi2 class evaluates the chi-square of a binned sample, given a function model and a partition model.
  • Partition classes
    • A partition class defines a partition of an interval into bins. The prototype is UniformPartition (header Fit/UniformPartition.h). Different partitions (e.g. with variable bin width) can be implemented respecting the same interface.
  • Example:
// declare a partition of [ 0, 1 ] into 10 bins
UniformPartition partition( 10, 0.0, 1.0 );


// a line chi-square
Line line( 0, 1 );
Chi2<Line> chi2line( line, partition );
 
// a parabolic chi-square
Parabola para( 0, 1, 0 );
Chi2<Parabola> chi2para( para, partition );

SampleErr< > s;
// fill the sample here

double chi2_line = chi2line( s.begin(), s.end() );
double chi2_para = chi2para( s.begin(), s.end() );
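
The quantity computed above can be pictured as the sum over the bins of ( ( content - f(x) ) / error )^2. The following sketch is a standalone illustration, not the package code: Bin, LineFn and chiSquare are hypothetical names, the line is taken as a + b x, and evaluating the function at the bin center of a uniform partition is an assumption (the package may, for instance, integrate the function over each bin instead).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical content/error pair for one bin.
struct Bin {
  double content;
  double error;
};

// Hypothetical straight line f(x) = a + b * x.
struct LineFn {
  double a;
  double b;
  double operator()(double x) const { return a + b * x; }
};

// Chi-square of a binned sample on a uniform partition of [min, max],
// with the model evaluated at each bin center (an assumption).
template <typename F>
double chiSquare(const F& f, const std::vector<Bin>& bins,
                 double min, double max) {
  assert(!bins.empty() && max > min);
  const double width = (max - min) / static_cast<double>(bins.size());
  double chi2 = 0.0;
  for (std::size_t i = 0; i < bins.size(); ++i) {
    assert(bins[i].error > 0.0);
    const double center = min + (static_cast<double>(i) + 0.5) * width;
    const double pull = (bins[i].content - f(center)) / bins[i].error;
    chi2 += pull * pull;
  }
  return chi2;
}
```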
Fitters
  • Fitter classes estimate parameters by numerical minimization, currently performed with the CERN Minuit package. Each fitter is based on a fit function type, typically a maximum likelihood or a chi-square.
  • Before using fitters, it is necessary to specify which minimization engine implementation should be used. This toolkit is not intended to re-implement minimization algorithms that are already well implemented in other packages. Currently, only CERN MINUIT, through its ROOT library wrapper, is supported; more may be added in the future. The external packages are encapsulated internally, and a default must be chosen by the user. This is specified in the header file Fit/Ext/Defaults.h.
  • Maximum Likelihood Fitters
    • Parameter fitters
      • UMLParameterFitter performs parameter estimates with an unbinned maximum likelihood fit. Parameters are added with the addParameter function, which takes the address of each parameter. Parameter boundaries can be specified as optional arguments.
      • The following example estimates the mean and sigma of a Gaussian distribution from a given sample (this is a well-known example of bias in the estimate of sigma):
double mean = 0, sigma = 1;
Gaussian pdf( mean, sigma );
typedef Likelihood<Gaussian> Like;
Like like( pdf );

UMLParameterFitter<Like> fitter( like );
fitter.addParameter( "mean", & pdf.mean );
fitter.addParameter( "sigma", & pdf.sigma );
Sample< Gaussian::types > sample;
// fill the sample here
double par[ 2 ] = { mean, sigma };
double err[ 2 ] = { 1, 1 };
double logLike;
logLike = fitter.fit( par, err, sample );
double pull1 = ( par[ 0 ] - mean  ) / err[ 0 ];
double pull2 = ( par[ 1 ] - sigma ) / err[ 1 ];
    • Yield fitters
      • UMLYieldFitter performs yield estimates with an unbinned extended maximum likelihood fit.
      • Example:
const int sig = 10, bkg = 5;
typedef Independent<Gaussian, Gaussian> Sig;
typedef Independent<Flat, Flat> Bkg;
Gaussian g1( 0, 1 ), g2( 0, 0.5 );
Sig pdfSig( g1, g2 );
Flat f1( -5, 5 ), f2( -5, 5 );
Bkg pdfBkg( f1, f2 );
typedef ExtendedLikelihood2<Sig, Bkg> Like;
Like like( pdfSig, pdfBkg );
UMLYieldFitter<Like> fitter( like );
Sample< Like::types > sample;
// fill the sample here
double s[] = { sig, bkg };
double err[] = { 1, 1 };
double logLike = fitter.fit( s, err, sample );
double pull1 =( s[ 0 ] - sig ) / err[ 0 ];
double pull2 =( s[ 1 ] - bkg ) / err[ 1 ];

    • Parameter and yield fitters
      • UMLYieldAndParameterFitter performs simultaneous yield and parameter estimates with an unbinned extended maximum likelihood fit. It implements the interfaces of both the yield and the parameter fitter classes. In the list of fitted parameters, the yields come first, followed by the parameters declared with the addParameter method.
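
The bias in sigma mentioned for the UMLParameterFitter example can be made explicit without a minimizer, because the maximum likelihood estimates of a Gaussian have a closed form: the mean estimate is the sample average, and the variance estimate is the average squared deviation, which divides by n rather than n - 1. Its expectation is therefore (n - 1) / n times the true variance. The sketch below computes both variance estimates. GaussianEstimate and estimateGaussian are hypothetical names, and the closed form replaces the Minuit-based fit used by the package.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Closed-form estimates for a Gaussian sample.
struct GaussianEstimate {
  double mean;              // sample average (ML estimate of the mean)
  double varianceML;        // sum of squared deviations / n (biased)
  double varianceUnbiased;  // sum of squared deviations / (n - 1)
};

inline GaussianEstimate estimateGaussian(const std::vector<double>& x) {
  assert(x.size() >= 2);
  const double n = static_cast<double>(x.size());

  double sum = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) sum += x[i];
  const double mean = sum / n;

  double squares = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double d = x[i] - mean;
    squares += d * d;
  }

  GaussianEstimate e;
  e.mean = mean;
  e.varianceML = squares / n;
  e.varianceUnbiased = squares / (n - 1.0);
  return e;
}
```

For small samples the two variance estimates differ visibly, which is the effect a pull distribution on sigma would show.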
  • Chi-square fitters
    • Chi2Fitter classes determine parameter estimates with a minimum chi-square fit. The interface is similar to that of the unbinned likelihood fitters.
    • Example:
// declare a partition of [ 0, 1 ] into 10 bins
UniformPartition partition( 10, 0.0, 1.0 );

Line line( 0, 1 );
Chi2<Line> chi2line( line, partition );
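// constructor argument assumed by analogy with UMLParameterFitter
Chi2Fitter<Chi2<Line> > fitter( chi2line );
fitter.addParameter( "a", &line.a );
// second coefficient; the member name b is assumed
fitter.addParameter( "b", &line.b );

SampleErr< > s;
// fill the sample here
double par[] = { 0, 1 },
       err[] = { 1, 1 };
fitter.fit( par, err, s.begin(), s.end() );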
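
For a straight line, the minimum of the chi-square can also be written in closed form, which gives a convenient cross-check of a numerical fit. The sketch below uses the standard weighted least-squares solution for f(x) = a + b x, not the Minuit minimization the package relies on. Point, LineFit and fitLine are hypothetical names, and each point is assumed to hold a bin center, the bin content and its error.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One measurement: position x, value y, and the error on y.
struct Point {
  double x;
  double y;
  double error;
};

// Fitted coefficients of f(x) = a + b * x with their errors.
struct LineFit {
  double a;
  double b;
  double errA;
  double errB;
};

// Weighted least-squares line fit: minimizes
//   sum_i ( (y_i - a - b * x_i) / error_i )^2
inline LineFit fitLine(const std::vector<Point>& pts) {
  assert(pts.size() >= 2);
  double S = 0.0, Sx = 0.0, Sy = 0.0, Sxx = 0.0, Sxy = 0.0;
  for (std::size_t i = 0; i < pts.size(); ++i) {
    assert(pts[i].error > 0.0);
    const double w = 1.0 / (pts[i].error * pts[i].error);
    S += w;
    Sx += w * pts[i].x;
    Sy += w * pts[i].y;
    Sxx += w * pts[i].x * pts[i].x;
    Sxy += w * pts[i].x * pts[i].y;
  }
  const double D = S * Sxx - Sx * Sx;
  assert(D > 0.0);  // requires at least two distinct x values

  LineFit fit;
  fit.a = (Sxx * Sy - Sx * Sxy) / D;
  fit.b = (S * Sxy - Sx * Sy) / D;
  fit.errA = std::sqrt(Sxx / D);
  fit.errB = std::sqrt(S / D);
  return fit;
}
```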

Examples