Statistics – Understanding Percentiles

Understanding Statistics – Summary Statistics: Percentiles

Data for DAE are presented on an Interval Scale. Each value is the Age at Attainment (AaA) of a Tooth Development Stage (TDS), expressed as an age in decimal years.

[ There are up to 256 sets of Tooth Development Stage data, comprising 8 TDS per Tooth Morphology Type (TMT). That is 8 Stages, by 16 Tooth Types, by 2 Genders; thus 8 × 16 × 2 = 256 sets of data. ]

The following applies to each of these data sets. In practice there are slightly fewer than 256, as some stages are not represented: children of 1, 2, and 3 years do not have DPT radiographs.

Summary statistics are the result of mathematical processing of a data set to provide values that allow an observer to understand the essential characteristics of the data. These summary data are the key to understanding and interpreting the way in which Ages at Attainment (AaA) data are presented.

The exemplar for the whole of this DAE web is the data set for LL8Gf.

ll8gf

14.95688

15.20602

15.59206

15.75907

15.86858

15.92334

15.98905

16.0794

16.22998

16.27379

16.282

16.31211

16.34223

16.50924

16.55578

16.56947

16.59138

16.65435

16.65982

16.67899

16.76934

16.78029

16.82957

16.84052

16.8898

16.9117

16.91444

17.02669

17.06776

17.07324

17.09788

17.14442

17.17728

17.21834

17.32238

17.39083

17.45927

17.4757

17.48118

17.48392

17.53046

17.53867

17.59343

17.61533

17.67283

17.68378

17.74127

17.74401

17.77687

17.81793

17.859

17.92745

17.96851

17.99042

18.02875

18.05339

18.10815

18.15743

18.21492

18.21766

18.2423

18.27515

18.31622

18.33539

18.36824

18.37098

18.42847

18.44764

18.45585

18.48871

18.49692

18.52704

18.5681

18.57906

18.68309

18.68583

18.70226

18.73511

18.79261

18.79808

18.81999

18.85558

18.86379

18.88022

18.94867

19.00616

19.0308

19.04449

19.20055

19.24435

19.50171

19.50719

19.50992

19.5729

19.7755

19.98084

20.18891

20.21629

20.2245

20.37509

20.53114

20.62423

20.64339

20.86242

20.87611

20.95551

21.1937

21.22108

21.59617

21.72485

21.87817

22.05613

22.26968

22.77892

~ END ~

The dataset for LL8Gf is listed above. The data are in the form of decimal years. Each value is derived by subtracting the Reference Data Subject’s date of birth (dob) from the Reference Data Subject’s date of radiograph (dor). Thus (dor − dob), in days, ÷ 365.25 gives the Chronological Age (CA) in years.

It is clear that, to make any sense of these data, they need to be summarised. The usual format of summary data is based on the Normal distribution (see Normal Distribution Summary Data).

An alternative format that is widely used is to divide up the data set into PERCENTILES. Presentation of Percentile data relates solely to the Sample Data and does not assume any distribution (e.g. the Normal Distribution).

Percentiles are the set of divisions that split a continuous series of ordered values into exactly 100 equal parts. Thus a person on the 4th percentile has an age such that only 4% of the sample (actually 3.999%) lies below this value. The corollary is that 96% of the sample have ages above this value.

It is conventional to use a set of ordered and predetermined percentile values. In DAE these are:

0.5th %-ile [0.5% below this value and 99.5% above]

5th %-ile [5% below this value and 95% above]

10th %-ile [10% below this value and 90% above]

25th %-ile [25% below this value and 75% above]

50th %-ile [50% below this value and 50% above]

75th %-ile [75% below this value and 25% above]

90th %-ile [90% below this value and 10% above]

95th %-ile [95% below this value and 5% above]

99.5th %-ile [99.5% below this value and 0.5% above]
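The age calculation described above (date of radiograph minus date of birth, divided by 365.25) can be sketched in a few lines of Python. This is only an illustration: the function name `chronological_age` and the example dates are invented here and do not come from the DAE reference data.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # average year length, as used in the text


def chronological_age(dob: date, dor: date) -> float:
    """Return the age in decimal years at the date of radiograph (dor)."""
    if dor < dob:
        raise ValueError("date of radiograph precedes date of birth")
    # Subtracting two dates gives a timedelta; .days is the whole-day difference.
    return (dor - dob).days / DAYS_PER_YEAR


# Hypothetical subject, invented for illustration only.
age = chronological_age(date(2000, 1, 1), date(2018, 1, 1))
print(f"Chronological Age: {age:.5f} years")
```

Dividing by 365.25 averages out leap years, so a subject radiographed exactly on a birthday will usually have a decimal age very slightly above or below a whole number.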

~~~~~~~~~~ END~~~~~~~~~~

LL8Gf Percentiles (Years)

Percentile   Age (years)
0.5th        15.10
5th          15.97
10th         16.32
25th         17.07
50th         18.13
75th         18.99
90th         20.64
95th         21.35
99.5th       22.49
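The source does not say which percentile method was used to produce these values. As a minimal sketch, the code below assumes linear interpolation between neighbouring ordered values (the convention used by common spreadsheet and numerical packages) and applies it to the LL8Gf ages listed earlier. The helper name `percentile` is introduced here for illustration. Under that assumption the results appear to agree with the tabulated values to two decimal places.

```python
import math

# LL8Gf Ages at Attainment in decimal years, copied from the listing in the text.
LL8GF = [
    14.95688, 15.20602, 15.59206, 15.75907, 15.86858, 15.92334,
    15.98905, 16.0794, 16.22998, 16.27379, 16.282, 16.31211,
    16.34223, 16.50924, 16.55578, 16.56947, 16.59138, 16.65435,
    16.65982, 16.67899, 16.76934, 16.78029, 16.82957, 16.84052,
    16.8898, 16.9117, 16.91444, 17.02669, 17.06776, 17.07324,
    17.09788, 17.14442, 17.17728, 17.21834, 17.32238, 17.39083,
    17.45927, 17.4757, 17.48118, 17.48392, 17.53046, 17.53867,
    17.59343, 17.61533, 17.67283, 17.68378, 17.74127, 17.74401,
    17.77687, 17.81793, 17.859, 17.92745, 17.96851, 17.99042,
    18.02875, 18.05339, 18.10815, 18.15743, 18.21492, 18.21766,
    18.2423, 18.27515, 18.31622, 18.33539, 18.36824, 18.37098,
    18.42847, 18.44764, 18.45585, 18.48871, 18.49692, 18.52704,
    18.5681, 18.57906, 18.68309, 18.68583, 18.70226, 18.73511,
    18.79261, 18.79808, 18.81999, 18.85558, 18.86379, 18.88022,
    18.94867, 19.00616, 19.0308, 19.04449, 19.20055, 19.24435,
    19.50171, 19.50719, 19.50992, 19.5729, 19.7755, 19.98084,
    20.18891, 20.21629, 20.2245, 20.37509, 20.53114, 20.62423,
    20.64339, 20.86242, 20.87611, 20.95551, 21.1937, 21.22108,
    21.59617, 21.72485, 21.87817, 22.05613, 22.26968, 22.77892,
]


def percentile(values, p):
    """Percentile p (0 to 100), assuming linear interpolation between ordered values."""
    if not values:
        raise ValueError("empty data set")
    if not 0 <= p <= 100:
        raise ValueError("p must lie between 0 and 100")
    ordered = sorted(values)
    pos = p / 100 * (len(ordered) - 1)        # fractional rank, counted from 0
    lower = math.floor(pos)
    upper = min(lower + 1, len(ordered) - 1)  # stay inside the list when p = 100
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


DAE_PERCENTILES = [0.5, 5, 10, 25, 50, 75, 90, 95, 99.5]
summary = {p: round(percentile(LL8GF, p), 2) for p in DAE_PERCENTILES}

for p, value in summary.items():
    print(f"{p:>5}th  {value:.2f}")
```

Other percentile conventions exist and can give slightly different answers, particularly for the extreme percentiles (0.5th and 99.5th), where only the first or last two data values contribute.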

The percentiles selected for DAE provide a summary of the distribution of the sample.

The two extremes are the 0.5th and 99.5th percentiles.

The values for these are 15.10 years and 22.49 years.

An important pairing of the percentile values is the 25th and 75th percentiles. These two values exclude the lowest 25% and the highest 25% of the sample.

This leaves the middle 50%, which extends from the 25th to the 75th percentile.

This is a crucially important interval in relation to DAE, as an estimate of the ‘average’ age can be given by using the 50th %-ile. This is also called the median and is often used as the average of a set of sample data.

Thus in civil legal terms, where the determining weight of the evidence is on the ‘Balance of Probability’, this middle 50% encompasses the median or middle value [18.13 yrs] and extends below to 17.07 yrs and above to 18.99 yrs.

This is interpreted by saying that there is a 50% probability that the subject of unknown age is between 17.07 yrs and 18.99 yrs, a range of 1.92 yrs.

This concept is at the heart of DAE.
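As a small worked check of the figures quoted above, the width of the middle 50% is simply the 75th percentile minus the 25th percentile. The variable names below are illustrative only; the three values are those given in the text for LL8Gf.

```python
# Percentile values for LL8Gf as quoted in the text (years).
P25, P50, P75 = 17.07, 18.13, 18.99

# Width of the interval holding the middle 50% of the sample.
width = P75 - P25

print(f"middle 50%: {P25:.2f} to {P75:.2f} yrs (median {P50:.2f}), width {width:.2f} yrs")
# prints: middle 50%: 17.07 to 18.99 yrs (median 18.13), width 1.92 yrs
```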

~~~||~~~