Chapter 2: Design

2.1: the main() function

The main function defines the long options in its global g_longOpts array. The g_longOpts and g_longEnd global variables are not declared for general use in a header file but are declared in class specific headers when needed. Currently that's only scenario/scenario.ih.

    int main(int argc, char **argv)
    try
    {
        Arg const &arg = Arg::initialize("a:B:C:D:hi:n:oP:R:S:t:vV",
                                          g_longOpts, g_longEnd, argc, argv);
        arg.versionHelp(usage, Icmake::version, arg.option('o') ? 0 : 1);
    
        Simulator simulator;            // construct the simulator
        simulator.run();                // and run it.
    }

The simulations themselves are controlled by a Simulator, having only one public member: run, performing the simulations.

2.2: Simulator

The Simulator is constructed by main. Its constructor determines the source of the analysis specifications (setAnalysisSource).

The class defines the following data members:

        bool d_next = false;                        // true if 'run()' should run
                                                    // an analysis
    
        uint16_t d_lineNr = 1;                      // updated by 'fileAnalysis'
        std::ifstream d_ifstream;                   // multiple analysis specs
    
        std::string (Simulator::*d_nextSpecs)();    // ptr to function handling 
                                                    // the (next) analysis spec.

If the command-line option -o was specified then analysis specifications may be provided as command-line arguments, handled by the member cmdLineAnalysis.

Otherwise a specification file must be provided as first command-line argument. This file is read until a line begins with analysis:, which may then be followed by specifications, read by the member fileAnalysis.

In both cases the Simulator's member d_next is set to true.

The member d_nextSpecs is set to either cmdLineAnalysis or fileAnalysis. Eventually these members set d_next to false, ending the analyses.

2.2.1: run

The run member runs all analyses. At the construction d_next may be set to true indicating that an analysis must be performed:

    void Simulator::run()
    {
        while (d_next)
        {
            uint16_t lineNr = d_lineNr;        
                                                // read the next analysis specs
            string spec = (this->*d_nextSpecs)();   
    
            Analysis analysis{ istringstream{ spec }, lineNr };
            analysis.run();                     // run the simulation
        }
    }

If a simulation must be performed then the non-default specifications are provided by the member to which d_nextSpec points.

The actual simulation analysis is then performed by an Analysis object, which also has a run public member performing the actual analysis.

2.3: Analysis

Analysis class objects handle one simulation analysis. Since multiple analyses are performed independently from each other each Analysis object initializes its own error count (class Error) and option handling (class Options.

Since the options may specify the name of the file containing the analysis-parameters Analysis objects also define a configuration file object (ConfFile).

In simrisc's scientific context simulation parameters are also known as `scenarios'. Scenarios contain information like the number of iterations to perform and the number of cases to simulate. The analysis's Scenario object is initialized by Analysis's constructor, and used by its run member.

Here is an overview of Analysis's data members:

Figure 2: The Analysis data members

                                        // for this analysis:
        Error d_error;                  //  error handling
        Options d_options;              //  options 
        ConfFile d_confFile;            //  configuration (parameters) file
    
        Scenario d_scenario;            // performs one (1) simulation

The class Analysis uses the classes Scenario, Loop, ConfFile, Options and Error, which are covered in separate sections.

2.3.1: run

The run member may immediately end if errors were encountered in the specifications of the Scanario and/or ConfigFile parameters.

One of the options defines the base directory into which the output is written. The member requireBase checks whether the base directory exists, and if not creates it. If that fails the program ends, showing the message


    Cannot create base directory ...
    
where ... is replaced by the name of the base directory provided by the Options object.

If all's well, the actual simulation is then performed by a Loop object.

2.4: Scenario

Scenario class objects are defined by Analysis class objects and contain parameter values of the simulation to perform. For each separate analysis a new Scenario object is constructed.

Most of its members are accessors: const members returning values of scenario parameters.

Some parameter values are stored in the Scenario object itself:

Configuration parameters start with identifying names, like costs: or screeningRounds:. Those names are then followed by the parameter's specifications. Those specifications are made available by the members

2.5: Loop

The class Loop is the `workhorse' class. It performs the simulation as specified by the Scenario object which is passed to its objects when they are constructed. The class Loop uses many other classes, most of which encapsulate parameters specified in the configuration file. Those classes read configuration file parameters and prepare these parameters for their use by Loop.

The constructor of a Loop object defines the following objects:

It also defines a Random object (d_random), generating random numbers (cf. section 4.1).

2.5.1: iterations

The iterate member performs the number of iterations (scenario: iterations). At each iteration

2.5.2: Womenloop

The member womenLoop performs one complete iteration. The option nCases may be used to analyze a specific number of cases, and then only write the data of the final case to file. By default the number of cases specified in `scenario nWomen' are simulated.

womenLoop initializes the random number generator with the current scenario seed. Then it iterates over all cases. For each case:

2.5.3: Pre-screening

In the original code the following condition determines whether prescreening is used:

    if (Nscr > 0 && (naturalDeathAge < 1st screening age || (tumor present
        && tumor.selfDetectAge() < 1st screening age)))
    

No prescreening is therefore performed by complementing this expression:


    not (Nscr > 0 && (naturalDeathAge < 1st screening age || (tumor present
        && tumor.selfDetectAge() < 1st screening age)))
    

or (using De Morgan's rule: a && b == !a || !b):


    not (Nscr > 0) or 
        not (naturalDeathAge < 1st screening age || (tumor present
             && tumor.selfDetectAge() < 1st screening age))
    

So no screening is performed if there are no screening rounds (not (Nscr > 0)) and also if


        not (naturalDeathAge < 1st screening age || (tumor present
             && tumor.selfDetectAge() < 1st screening age))
    
is true.

Applying the not-operator to the condition and using De Morgan's rule !(a || b) == !a && !b results in


        naturalDeathAge >= 1st screening age and 
            not (tumor present && 
                 tumor.selfDetectAge() < 1st screening age)
    

Once more aplying De Morgan's rule finally results in:


        naturalDeathAge >= 1st screening age and 
            (not tumor present or 
                 tumor.selfDetectAge() >= 1st screening age)
    

This condition is tested and if true no pre-screening is performed.

When pre-screening is used:

If there's no tumor or if there is a tumor, but the case's natural death happens before the death cause by the tumor, then there is a natural induced death in the pre-screening phase.

Alternatively, there is a self-detected tumor. In that case death may happen before the tumor causes the case's dath (a natural death at the case's natural death age) or the death caused by the tumor may be before the case's natural dath, resulting in a tumor caused pre-screen death at the simulated death age caused by the tumor.

2.5.4: Screening

Screening is performed for each of the screening rounds.

The screening phase ends if the case is no alive anymore. At each round, if the case is alive, then it may be that death occurs.

To determine whether death occurs the following checks are performed:

If death has not occurred so far, then the screening continues for the current screening age. At this point the modalities are considered. Each of the modalities configured for the current screening age is considered in turn. They are considered in their order of specification in the configuration file. E.g., when specifying


    screeningRound:     50  Mammo MRI
    screeningRound:     52  MRI Mammo 
    
Mammo is considered before MRI at screening age 50, and MRI is considered before Mammo at screening age 52.

For each modality it may be that it is not considered. Whether it is considered is determined by comparing the configured attendance rate (e.g., .80) with the random value round nr * modality nr from the pool of attendance random values

note: this computation also results in multiple uses of random values for different round and modality combinations. E.g., round 1 and modality 2 use the same random value as round 2 and modality 1. So far multiple modalities haven't been used, but here, too, the used attendance pool index should avoid multiply using random values

If the screening rate's value is less than the random value then screening is not used for the current case at this screening age

note: this looks a bit strange: I would have expected that an attendance rate of .80 would mean that there is a probability of .80 that a screening round is used. This implementation does the opposite: 20% of the screening rounds are used. In the original code I read

            if ( 
                (rndPoolAttendance[ screeningRound*(modality+1) ]
                    <= attendanceRate) && 
                (( (modality == 0) && (scr[screeningRound].T0)) || 
                ( (modality == 1) && (scr[screeningRound].T1))) 
            )
        
in which case screening is performed. Thus no screening is performed when applying the not-operator to this condition:

        not ( 
            (rndPoolAttendance[ screeningRound*(modality+1) ]
                <= attendanceRate) && 
            (( (modality == 0) && (scr[screeningRound].T0)) || 
            ( (modality == 1) && (scr[screeningRound].T1))) 
        )
    

The checks for used modality have already been performed, so they can be omitted, resulting in:


        rndPoolAttendance[ screeningRound*(modality+1) ] > attendanceRate 
    

or:


        attendanceRate  < rndPoolAttendance[ screeningRound*(modality+1) ]
    

resulting in not using this screening round: here too a high attendance rate results in low probabilities that the screening round is used.

If a round is used then its costs are added to the costs so far, and if there is a tumor then its characteristics are determined.

Then, if there is a tumor and the current screening age is at least equal to the tumor's detectable age it may be detected; otherwise there may be a false positive.

Maybe detecting the tumor uses the betafunction. It returns true if the random sensitivity pool value (again: collisions are possible) is at most equal to the computed beta function value. As a reference, here is the beta function's implementation:

    bool Loop::betaFunction(uint16_t modalityNr)
    {
        double diameter = d_tumor.diameter();
        double mValue = d_screening.m().value();
    
        double expBeta = 
            exp(
                d_screening.beta(0).value() +
                d_screening.beta(1).value() * diameter +
                d_screening.beta(2).value() * mValue +
                d_screening.beta(3).value() * mValue / (diameter * diameter)
            );
        
        return d_random.uniform() <= 0.9 * expBeta / (1 + expBeta);
    }

If the beta function returns true then if the case's natural death age is less than the tumor cause dying age the case dies naturally.

note: which looks weird to me. The original code specifies naturalDeathAge < tumorDeathAge, but if the tumor caused death age clearly exceeds the case's natural death age why would the case then die? Consider the situation where the screening age is 50, and the natural death age is 83.5: why would the case die at the screening round for age 50?

When the beta function returns true and the case's natural death age exceeds the tumor cause dying age then death is observed during this screening round, and a false positive occurs if the pool's false positives random value for the current round and modality exceeds the modality's specificity (and again: multiply used random values may occur).

2.5.5: Post-screening

Post screening is skipped if the case is not alive anymore.

If there is a tumor ad the tumor self-detection ages is earlier than the case's natural death then it is concluded that there either is a natural death or the tumor's characteristics may have caused death before the natural death age.

If there is no tumor or if the tumor's self-detection age is after the case's natural death then there is a natural death at the case's natural death age. In this latter case (there is a tumor, but it hasn't caused death) the tumor's characteristics are also determined (cf. section 3.5.2).

2.5.6: Pre- and post-screen death characteristics

During the pre-screening phase a tumor may be self detected. Likewise, during the post-screening phase a death during treatment may be observed. In those cases the following actions are performed: