# Statistics – Understanding Percentiles

Understanding Statistics – Summary Statistics: Percentiles

Data for DAE are presented on an Interval Scale. Every value is an age in decimal years, recording the Age at Attainment (AaA) of a Tooth Development Stage (TDS).

[ There are up to 256 sets of Tooth Development Stage data, comprising 8 TDS per Tooth Morphology Type (TMT). That is 8 Stages by 16 Tooth Types by 2 Genders, thus 8 × 16 × 2 = 256 sets of data. ]

The following applies to each of these data sets. In practice there are slightly fewer than 256, because some stages are not represented: children of 1, 2, and 3 years do not have DPT radiographs.

Summary statistics are the result of mathematical processing of a data set to provide values that allow an observer to understand the essential characteristics of the data. These summary data are the key to understanding and interpreting the way in which Age at Attainment (AaA) data are presented.

The exemplar used throughout this DAE website is the data for LL8Gf.

LL8Gf Age at Attainment data (decimal years):

14.95688 15.20602 15.59206 15.75907 15.86858 15.92334 15.98905 16.0794 16.22998 16.27379
16.282 16.31211 16.34223 16.50924 16.55578 16.56947 16.59138 16.65435 16.65982 16.67899
16.76934 16.78029 16.82957 16.84052 16.8898 16.9117 16.91444 17.02669 17.06776 17.07324
17.09788 17.14442 17.17728 17.21834 17.32238 17.39083 17.45927 17.4757 17.48118 17.48392
17.53046 17.53867 17.59343 17.61533 17.67283 17.68378 17.74127 17.74401 17.77687 17.81793
17.859 17.92745 17.96851 17.99042 18.02875 18.05339 18.10815 18.15743 18.21492 18.21766
18.2423 18.27515 18.31622 18.33539 18.36824 18.37098 18.42847 18.44764 18.45585 18.48871
18.49692 18.52704 18.5681 18.57906 18.68309 18.68583 18.70226 18.73511 18.79261 18.79808
18.81999 18.85558 18.86379 18.88022 18.94867 19.00616 19.0308 19.04449 19.20055 19.24435
19.50171 19.50719 19.50992 19.5729 19.7755 19.98084 20.18891 20.21629 20.2245 20.37509
20.53114 20.62423 20.64339 20.86242 20.87611 20.95551 21.1937 21.22108 21.59617 21.72485
21.87817 22.05613 22.26968 22.77892

The dataset for LL8Gf is given above. The data are in the form of decimal years. Each value is derived by subtracting the Reference Data Subject’s date of birth (dob) from the Reference Data Subject’s date of radiograph (dor). Thus (dor minus dob), counted in days, ÷ 365.25 gives the Chronological Age (CA) in years.

It is clear that, to make any sense of this data, it needs to be summarised. The usual format of summary data is based on the Normal distribution (see Normal Distribution Summary Data).

An alternative format that is widely used is to divide the data set into PERCENTILES. Presentation of Percentile data relates solely to the Sample Data and does not assume any distribution (e.g. the Normal Distribution).

Percentiles are the set of divisions that produce exactly 100 equal parts in a continuous series of values. Thus a person on the 4th percentile has an age such that only 4% of the sample (actually 3.999%) are less than this value.
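The Chronological Age calculation described above, (dor minus dob) ÷ 365.25, can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `chronological_age` and the two example dates are hypothetical and are not taken from the DAE reference data.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # average year length quoted in the text


def chronological_age(dob: date, dor: date) -> float:
    """Return the age in decimal years at the date of radiograph (dor)."""
    if dor < dob:
        raise ValueError("date of radiograph precedes date of birth")
    # Subtracting two dates gives a timedelta; .days is the whole-day count.
    return (dor - dob).days / DAYS_PER_YEAR


# Hypothetical subject, for illustration only (not from the reference data).
example_dob = date(2000, 1, 1)
example_dor = date(2018, 1, 1)
example_age = chronological_age(example_dob, example_dor)
print(f"{example_age:.5f}")
```

Because 365.25 is an average year length, an age calculated on an exact birthday can fall marginally above or below a whole number of years.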
The corollary of the 4th percentile example is that 96% of the sample have ages above that value.

It is conventional to use a set of ordered and predetermined percentile values. In DAE these are:

- 0.5th %-ile [0.5% below this value and 99.5% above]
- 5th %-ile [5% below this value and 95% above]
- 10th %-ile [10% below this value and 90% above]
- 25th %-ile [25% below this value and 75% above]
- 50th %-ile [50% below this value and 50% above]
- 75th %-ile [75% below this value and 25% above]
- 90th %-ile [90% below this value and 10% above]
- 95th %-ile [95% below this value and 5% above]
- 99.5th %-ile [99.5% below this value and 0.5% above]

LL8Gf Percentiles (Years)

| Percentile | Age (years) |
|---|---|
| 0.5th | 15.10 |
| 5th | 15.97 |
| 10th | 16.32 |
| 25th | 17.07 |
| 50th | 18.13 |
| 75th | 18.99 |
| 90th | 20.64 |
| 95th | 21.35 |
| 99.5th | 22.49 |

The percentiles selected for DAE provide a summary of the distribution of the sample. The two extremes are the 0.5th and 99.5th percentiles, whose values are 15.10 years and 22.49 years.

An important pairing of the percentile values is the 25th and 75th percentiles. These two values exclude the lowest 25% and the highest 25%. This leaves the middle 50%, which extends from the 25th to the 75th percentile. This is a crucially important interval in relation to DAE, as an estimate of the ‘average’ age can be given by using the 50th %-ile. This is also called the median and is often used as the average of a set of sample data.

Thus in civil legal terms, where the determining weight of the evidence is on the ‘Balance of Probability’, this middle 50% encompasses the median or middle value [18.13 yrs] and extends below to 17.07 yrs and above to 18.99 yrs. This is interpreted by saying that there is a 50% probability that the subject of unknown age is between 17.07 yrs and 18.99 yrs, a range of 1.92 yrs. This concept is at the heart of DAE.
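The LL8Gf percentile values quoted above can be recomputed from the raw ages. The sketch below assumes linear interpolation between ranked values, a convention used by many spreadsheet and statistics packages. The source does not state which method was used, so this is one plausible reading rather than the documented DAE procedure. The names `AGES`, `DAE_PERCENTILES` and `percentile` are illustrative.

```python
# LL8Gf ages in decimal years, transcribed from the dataset in this section.
AGES = [
    14.95688, 15.20602, 15.59206, 15.75907, 15.86858, 15.92334, 15.98905, 16.0794, 16.22998, 16.27379,
    16.282, 16.31211, 16.34223, 16.50924, 16.55578, 16.56947, 16.59138, 16.65435, 16.65982, 16.67899,
    16.76934, 16.78029, 16.82957, 16.84052, 16.8898, 16.9117, 16.91444, 17.02669, 17.06776, 17.07324,
    17.09788, 17.14442, 17.17728, 17.21834, 17.32238, 17.39083, 17.45927, 17.4757, 17.48118, 17.48392,
    17.53046, 17.53867, 17.59343, 17.61533, 17.67283, 17.68378, 17.74127, 17.74401, 17.77687, 17.81793,
    17.859, 17.92745, 17.96851, 17.99042, 18.02875, 18.05339, 18.10815, 18.15743, 18.21492, 18.21766,
    18.2423, 18.27515, 18.31622, 18.33539, 18.36824, 18.37098, 18.42847, 18.44764, 18.45585, 18.48871,
    18.49692, 18.52704, 18.5681, 18.57906, 18.68309, 18.68583, 18.70226, 18.73511, 18.79261, 18.79808,
    18.81999, 18.85558, 18.86379, 18.88022, 18.94867, 19.00616, 19.0308, 19.04449, 19.20055, 19.24435,
    19.50171, 19.50719, 19.50992, 19.5729, 19.7755, 19.98084, 20.18891, 20.21629, 20.2245, 20.37509,
    20.53114, 20.62423, 20.64339, 20.86242, 20.87611, 20.95551, 21.1937, 21.22108, 21.59617, 21.72485,
    21.87817, 22.05613, 22.26968, 22.77892,
]

# The nine percentile points used in DAE.
DAE_PERCENTILES = [0.5, 5, 10, 25, 50, 75, 90, 95, 99.5]


def percentile(values, p):
    """Percentile p (0 to 100) by linear interpolation between ranked values.

    Assumed method: the source does not say how its percentiles were derived.
    """
    ordered = sorted(values)
    rank = (p / 100) * (len(ordered) - 1)  # fractional 0-based position
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


table = {p: round(percentile(AGES, p), 2) for p in DAE_PERCENTILES}
for p, age in table.items():
    print(f"{p:>5}th  {age:.2f}")

# Middle 50% of the sample: from the 25th to the 75th percentile.
middle_width = round(table[75] - table[25], 2)
print(f"middle 50%: {table[25]:.2f} to {table[75]:.2f} yrs (width {middle_width:.2f})")
```

Under this interpolation rule the computed values agree with the published LL8Gf percentiles to two decimal places, and the width of the middle 50% comes out as the 1.92 yrs quoted in the text.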