Do not store instance variables with values that can be calculated from the values of other instance variables.

The DataSet class below is based on the prompt from the Class writing exercise.

The version of the DataSet class below inappropriately stores average as an instance variable. The implementation works correctly, but demonstrates a poor choice of instance variables.

public class DataSet
{
    private int sum;
    private int numberOfValues;
    private double average; // duplicate data: can be calculated from sum and numberOfValues
    
    public DataSet()
    {
        sum = 0;
        numberOfValues = 0;
        average = 0;
    }

    public DataSet(int initialValue)
    {
        sum = initialValue;
        numberOfValues = 1;
        average = sum;
    }

    public void addValue(int value)
    {
        sum += value;
        numberOfValues++;
        average = sum / (double) numberOfValues;
    }

    public int getSum()
    {
        return sum;
    }

    public double getAverage()
    {
        return average;
    }
}

The Class writing exercise solution shows an appropriate design for the DataSet class. The average can be, and should be, calculated from the values of sum and numberOfValues.
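As a minimal sketch of that design (not necessarily identical to the exercise solution), the class below stores only sum and numberOfValues and calculates the average on demand. The class name ComputedAverageDataSet is hypothetical and is used only to avoid clashing with DataSet. Returning 0 for an empty data set is an assumption that matches the value stored by the no-argument constructor above.

```java
// Sketch only: ComputedAverageDataSet is a hypothetical name.
class ComputedAverageDataSet
{
    private int sum;
    private int numberOfValues;

    public ComputedAverageDataSet()
    {
        sum = 0;
        numberOfValues = 0;
    }

    public ComputedAverageDataSet(int initialValue)
    {
        sum = initialValue;
        numberOfValues = 1;
    }

    public void addValue(int value)
    {
        sum += value;
        numberOfValues++;
    }

    public int getSum()
    {
        return sum;
    }

    // The average is calculated each time it is requested, so it can
    // never disagree with sum and numberOfValues.
    public double getAverage()
    {
        // Assumption: an empty data set reports an average of 0.
        if (numberOfValues == 0)
            return 0;

        return sum / (double) numberOfValues;
    }
}
```

With no average instance variable, there is nothing to keep synchronized when sum or numberOfValues changes.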

Storing average as an instance variable makes it possible for the instance variables to store inconsistent data. If sum and numberOfValues are updated but average is not, average is said to contain stale data.
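The sketch below illustrates one way stale data can arise. The class name StaleDataSet and the clear method are hypothetical. The example assumes clear was added later by a programmer who forgot that average must also be updated.

```java
// Sketch only: StaleDataSet and clear are hypothetical.
class StaleDataSet
{
    private int sum;
    private int numberOfValues;
    private double average;

    public StaleDataSet(int initialValue)
    {
        sum = initialValue;
        numberOfValues = 1;
        average = sum;
    }

    public void addValue(int value)
    {
        sum += value;
        numberOfValues++;
        average = sum / (double) numberOfValues;
    }

    // Deliberate bug: sum and numberOfValues are reset, but average is not.
    public void clear()
    {
        sum = 0;
        numberOfValues = 0;
    }

    public int getSum()
    {
        return sum;
    }

    public double getAverage()
    {
        return average;
    }

    public static void main(String[] args)
    {
        StaleDataSet data = new StaleDataSet(10);
        data.addValue(20);
        data.clear();

        System.out.println(data.getSum());     // 0
        System.out.println(data.getAverage()); // 15.0, stale
    }
}
```

After clear runs, getSum reports an empty data set while getAverage still reports the average of the values that were removed. The class compiles and runs without error, which is what makes this kind of bug easy to miss.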

Caching computed values

What follows is beyond the scope of an AP CS A course. It is presented here in response to a particularly common reason for storing duplicate data as instance variables.

In some cases, it is advantageous to store a computed value for repeated access. Computing the value may be expensive (in terms of processing time and/or memory). Such values can be cached.

Caching a computed value involves more than just storing it as an instance variable. A cache must also track whether the stored value is still valid and ensure that the value is recomputed when the data it depends on changes.

The version of the DataSet class below uses a very simple mechanism to cache average.

public class DataSet
{
    private int sum;
    private int numberOfValues;
    
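    private double average;          // cached value of sum / numberOfValues
    private boolean averageIsValid;  // true if average matches the current sum and numberOfValues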

    public DataSet()
    {
        sum = 0;
        numberOfValues = 0;
        
        average = 0;
        averageIsValid = true;
    }

    public DataSet(int initialValue)
    {
        sum = initialValue;
        numberOfValues = 1;
        
        average = sum;
        averageIsValid = true;
    }

    public void addValue(int value)
    {
        sum += value;
        numberOfValues++;
        
        averageIsValid = false;
    }

    public int getSum()
    {
        return sum;
    }

    public double getAverage()
    {
        if( ! averageIsValid )
        {
            average = sum / (double) numberOfValues;
            averageIsValid = true;
        }
        
        return average;
    }
}

averageIsValid stores true if the current value of average actually represents the average; otherwise, averageIsValid stores false.

The value of average is valid immediately after construction, so averageIsValid is initialized to true in each constructor. It would also be possible to initialize averageIsValid to false and allow the average to be computed on the first run of getAverage.
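The alternative of initializing averageIsValid to false might look like the sketch below. The class name LazyDataSet is hypothetical. One caveat: with no values added, the first call to getAverage would evaluate 0 / (double) 0, which is NaN in Java rather than 0. The guard for an empty data set is an addition that is not in the version above. It assumes that an empty data set should still report an average of 0.

```java
// Sketch only: LazyDataSet is a hypothetical name.
class LazyDataSet
{
    private int sum;
    private int numberOfValues;

    private double average;
    private boolean averageIsValid;

    public LazyDataSet()
    {
        sum = 0;
        numberOfValues = 0;

        // average is not computed until getAverage is first run
        averageIsValid = false;
    }

    public LazyDataSet(int initialValue)
    {
        sum = initialValue;
        numberOfValues = 1;

        averageIsValid = false;
    }

    public void addValue(int value)
    {
        sum += value;
        numberOfValues++;

        averageIsValid = false;
    }

    public double getAverage()
    {
        if( ! averageIsValid )
        {
            // Added guard (an assumption): avoids NaN for an empty data set.
            if (numberOfValues == 0)
                average = 0;
            else
                average = sum / (double) numberOfValues;

            averageIsValid = true;
        }

        return average;
    }
}
```

This keeps the constructors slightly simpler at the cost of one computation on the first call to getAverage.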

The value of average is invalidated by running addValue. The addValue method does not compute the new average, but rather marks the existing average as invalid by setting averageIsValid to false. Marking the existing average as invalid, rather than recomputing it within addValue, allows multiple values to be added without repeated unnecessary recomputations of the average.

The getAverage method checks averageIsValid to determine whether the cached value stored in average can be used. If averageIsValid is false, getAverage computes the new average, caches it (stores it in average), and marks it as valid (sets averageIsValid to true). The correct average is returned in all cases, whether it was just computed or remained valid from a previous computation.
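To make the effect of the caching mechanism visible, the sketch below adds a counter to a copy of the cached class. The class name CountingDataSet, the recomputations variable, and the getRecomputations method are hypothetical instrumentation added for illustration. They are not part of the DataSet class.

```java
// Sketch only: CountingDataSet and its counter are hypothetical.
class CountingDataSet
{
    private int sum;
    private int numberOfValues;

    private double average;
    private boolean averageIsValid;

    // Counts how many times getAverage actually performs the division.
    private int recomputations;

    public CountingDataSet(int initialValue)
    {
        sum = initialValue;
        numberOfValues = 1;

        average = sum;
        averageIsValid = true;

        recomputations = 0;
    }

    public void addValue(int value)
    {
        sum += value;
        numberOfValues++;

        averageIsValid = false;
    }

    public double getAverage()
    {
        if( ! averageIsValid )
        {
            average = sum / (double) numberOfValues;
            averageIsValid = true;
            recomputations++;
        }

        return average;
    }

    public int getRecomputations()
    {
        return recomputations;
    }

    public static void main(String[] args)
    {
        CountingDataSet data = new CountingDataSet(2);
        data.addValue(4);
        data.addValue(6);

        System.out.println(data.getAverage());        // 4.0
        System.out.println(data.getAverage());        // 4.0, from the cache
        System.out.println(data.getRecomputations()); // 1
    }
}
```

Two values are added and the average is requested twice, yet the division runs only once. The second call to getAverage returns the cached value.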

getAverage is technically a mutator method. It changes the state (values of instance variables) of the implicit parameter (the object on which it is run). Because those changes never affect the values returned to client code (code that uses the DataSet class), the caching mechanism allows client code to treat getAverage as an accessor method.
