Chapter 5 Setting up a Glossary


In this chapter, you learn how to:

  • Set up csv files for a glossary
  • List terms and definitions by chapter
  • List terms by the chapter in which they are first defined
  • Use GitHub to store the project, make changes, seek feedback and facilitate collaboration

A glossary serves as a quick reference when the reader needs to recall a term’s definition or where it appears in a book. This chapter is a guide to setting up a glossary in the R Bookdown environment. Please refer here for a refresher on R Bookdown if necessary.

5.1 csv File Setup

The following outlines how we currently store our glossary:

  1. Set up an Excel file with three columns: Term, Definition, Chapter first defined.
  2. We suggest using one Excel file for all chapters for easy reference and sharing. However, you will then have to save each chapter separately as its own csv file. Alternatively, you may use one Excel file per chapter rather than splitting a central file.
  3. Save the Excel file as a csv file.
  4. If you make changes, make them in the Excel file, not the csv file, and then save the csv file again.
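
To make the file layout concrete, here is a minimal sketch in base R that writes a two-row glossary with the three columns described above and reads it back with the same `read.csv` arguments used later in this chapter. The two rows and the temporary file are illustrative assumptions; in a real project the file would be exported from Excel and stored as, for example, csv/Chapter1.csv.

```r
# Illustrative two-row glossary in the three-column layout (toy data).
glossary <- data.frame(
  Term = c("analytics", "policy limit"),
  Definition = c("Analytics is the process of using data to make decisions.",
                 "A policy limit is the maximum value covered by a policy."),
  Chapter = c(1, 1),
  stringsAsFactors = FALSE
)
names(glossary)[3] <- "Chapter first defined"

# A temporary file stands in for the csv exported from Excel.
csv_path <- tempfile(fileext = ".csv")
write.csv(glossary, csv_path, row.names = FALSE)

# Read it back with the same arguments used in this chapter.
chapter1 <- read.csv(csv_path, header = TRUE,
                     na.strings = c("."), stringsAsFactors = FALSE)
str(chapter1)
```

By default `read.csv` rewrites the third header as `Chapter.first.defined`, which is one reason the code in this chapter selects columns by position rather than by name.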

5.2 Terms and Definitions by Chapter

This section describes how to create a list of important terms and their definitions, grouped by the chapter in which they appear.

We use an example from the glossary of Loss Data Analytics. First, we display the list of terms and definitions for Chapter 1 of the book, then the code used to generate it.

5.2.1 Chapter 1 Introduction to Loss Data Analytics

Term Definition
loss adjustment expenses Loss adjustment expenses are costs to the insurer that are directly attributable to settling a claim; for example, the cost of an adjuster who assesses the claim cost or a lawyer who becomes involved in settling an insurer’s legal obligation on a claim
unallocated loss adjustment expenses Unallocated loss adjustment expenses are costs that can only be indirectly attributed to claim settlement; for example, the cost of an office to support claims staff
allocated loss adjustment expenses Allocated loss adjustment expenses, sometimes known by the acronym ALAE, are costs that can be directly attributed to settling a claim; for example, the cost of an adjuster
indemnification Indemnification is the compensation provided by the insurer.
insurance claim An insurance claim is the compensation provided by the insurer for incurred hurt, loss, or damage that is covered by the policy.
loss amount The loss amount is the size of the loss incurred by the policyholder for incurred hurt, loss, or damage that is covered by the policy.
analytics Analytics is the process of using data to make decisions.
renters insurance Renters insurance is an insurance policy that covers the contents of an apartment or house that you are renting.
homeowners insurance Homeowners insurance is an insurance policy that covers the contents and property of a building that is owned by you or a friend.
automobile insurance An insurance policy that covers damage to your vehicle, damage to other vehicles in the accident, as well as medical expenses of those injured in the accident.
property insurance Property insurance is a policy that protects the insured against loss or damage to real or personal property. The cause of loss might be fire, lightning, business interruption, loss of rents, glass breakage, tornado, windstorm, hail, water damage, explosion, riot, civil commotion, rain, or damage from aircraft or vehicles.
nonlife insurance Nonlife insurance is any type of insurance where payments are not based on the death (or survivorship) of a named insured. Examples include automobile, homeowners, and so on. Also known as property and casualty or general insurance.
casualty insurance Casualty insurance is a form of liability insurance providing coverage for negligent acts and omissions. Examples include workers compensation, errors and omissions, fidelity, crime, glass, boiler, and various malpractice coverages.
valuation date A valuation date is the date at which a company summarizes its financial position, typically quarterly or annually.
underwriting Underwriting is the process where the company makes a decision as to whether or not to take on a risk.
ratemaking
reinsurer A reinsurer is an insurance company that offers insurance to an insurer.
loss reserve A loss reserve is an estimate of liability indicating the amount the insurer expects to pay for claims that have not yet been realized. This includes losses incurred but not yet reported (IBNR) and those claims that have been reported but haven’t been paid (known by the acronym RBNS for reported but not settled).
technical provisions Technical provisions is another name for loss reserves.
experience rating
merit rating
risk classification Risk classification is the process of grouping policyholders into categories, or classes, where each insured in the class has a risk profile that is similar to others in the class.
cream skimming
claims triage
pure premium Pure premium is the total severity divided by the number of claims. It does not include insurance company expenses, premium taxes, contingencies, nor an allowance for profits. Also called loss costs. Some definitions include allocated loss adjustment expenses (ALAE).
loss cost Loss cost is the total severity divided by the number of claims. It does not include insurance company expenses, premium taxes, contingencies, nor an allowance for profits. Also called pure premium. Some definitions include allocated loss adjustment expenses (ALAE).
rating variables
coinsurance Coinsurance is an arrangement whereby the insured and insurer share the covered losses. Typically, a coinsurance parameter specified means that both parties receive a proportional share, e.g., 50%, of the loss.
deductible A deductible is a parameter specified in the contract. Typically, losses below the deductible are paid by the policyholder whereas losses in excess of the deductible are the insurer’s responsibility (subject to policy limits and coinsurance).
policy limit A policy limit is the maximum value covered by a policy.
personal lines
dividend A dividend is the refund of a portion of the premium paid by the insured from insurer surplus.
bonus
retrospective premiums The process of determining the cost of an insurance policy based on the actual loss experience determined as an adjustment to the initial premium payment.
prospective premiums
claims adjustment Claims adjustment is the process of determining coverage, legal liability, and settling claims.
Commercial line Commercial line is insurance purchased by commercial ventures (businesses)
line of business A line of business is a classification of business written by insurers.
claims leakage Claims leakage represents money lost through claims management inefficiencies.
fraud detection
case reserve A case reserve is an estimate of the insurer’s future liability made by the claims adjuster.
adjuster An adjuster is a person who investigates claims and recommends settlement options based on estimates of damage and insurance policies held.
life Insurance Life insurance is a contract where the insurer promises to pay upon the death of an insured person. The person being paid is the beneficiary.
capital allocation

Here is the code used for producing the list of terms and definitions:

### Chapter 1 Introduction to Loss Data Analytics

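{r}
library(pander)

# Read the Chapter 1 glossary; cells containing "." are treated as missing (NA)
chapter1 <- read.csv("csv/Chapter1.csv", header=TRUE,
                       na.strings=c("."), stringsAsFactors=FALSE)

# Keep the first two columns (term and definition) and label them
table1.1 <- cbind(chapter1[, 1], chapter1[, 2])
final.table1.1 <- as.data.frame(table1.1)
names(final.table1.1) <- c("Term", "Definition")

# Print the table
pander(final.table1.1)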
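
Because the same steps repeat for every chapter, one option is to wrap them in a small function. The sketch below is only an illustration: `read_glossary_terms` is a hypothetical helper name, and the demonstration file is a temporary stand-in for csv/Chapter1.csv with made-up rows. It renders with pander when that package is installed and falls back to `print` otherwise.

```r
# Hypothetical helper: read one chapter's csv and keep the first two columns.
read_glossary_terms <- function(path) {
  chapter <- read.csv(path, header = TRUE,
                      na.strings = c("."), stringsAsFactors = FALSE)
  terms <- chapter[, c(1, 2)]
  names(terms) <- c("Term", "Definition")
  terms
}

# Temporary file standing in for csv/Chapter1.csv (toy rows).
# The "." in the second row marks a definition that has not been written yet.
demo_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Term,Definition,Chapter first defined",
  "\"analytics\",\"Analytics is the process of using data to make decisions.\",1",
  "\"ratemaking\",.,1"
), demo_path)

terms <- read_glossary_terms(demo_path)

# Render with pander if it is available; otherwise print the data frame.
if (requireNamespace("pander", quietly = TRUE)) {
  pander::pander(terms)
} else {
  print(terms)
}
```

Each chapter section of the glossary could then call the helper with its own file path, which keeps the per-chapter chunks to a line or two.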

5.3 Terms and Chapter First Defined

This section describes how to create a list of terms by the chapter in which they are first defined. Certain terms are defined multiple times throughout the book, so this list can help the reader refer to the chapter in which a term is first used and defined. The terms listed here are sorted in alphabetical order.

We again use the glossary of Loss Data Analytics as an example, this time with Chapter 1 and Chapter 2 of the book. We first display the list of terms by chapter first defined, then show the code used to generate it.

Term Chapter first defined
adjuster 1
aggregate claims 2
allocated loss adjustment expenses 1
analytics 1
automobile insurance 1
Bernoulli distribution 2
Binomial distribution 2
bonus 1
capital allocation NA
case reserve 1
casualty insurance 1
claims adjustment 1
claims leakage 1
claims triage 1
coinsurance 1
Commercial line 1
cream skimming 1
deductible 1
Distribution function F(x) 2
dividend 1
experience rating 1
fraud detection 1
Frequency 2
Gamma Distribution 2
homeowners insurance 1
indemnification 1
insurance claim 1
life Insurance NA
line of business 1
loss adjustment expenses 1
loss amount NA
loss cost 1
loss reserve 1
Maximum Likelihood Estimator 2
merit rating 1
Mixture 2
Moment generating function 2
Negative binomial 2
nonlife insurance 1
personal lines 1
Poisson 2
policy limit 1
Probability generating function 2
Probability mass function f(x) 2
property insurance 1
prospective premiums 1
pure premium 1
ratemaking 1
rating variables 1
reinsurer 1
renters insurance 1
retrospective premiums 1
risk classification 1
Severity 2
Survival function S(x) 2
technical provisions 1
unallocated loss adjustment expenses 1
underwriting 1
valuation date 1
Zero Modified Distribution 2
Zero Truncated Distribution 2

Here is the code used for producing the list of terms by chapter first defined:

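{r}
library(dplyr)

# Chapter 1 (chapter1 was read in Section 5.2)
table2.1 <- cbind(chapter1[, 1], chapter1[, 3])

# Chapter 2 (chapter2 must be read from its csv file first, in the same
# way as chapter1; the path below is assumed by analogy with Chapter 1)
chapter2 <- read.csv("csv/Chapter2.csv", header=TRUE,
                       na.strings=c("."), stringsAsFactors=FALSE)
table2.2 <- cbind(chapter2[, 1], chapter2[, 3])

# Concatenate tables
table2 <- rbind(table2.1, table2.2)

# Sort alphabetically --> do not change
sort.table2 <- table2[order(table2[,1]), ]

# Label columns --> do not change
final.table2 <- as.data.frame(sort.table2)
names(final.table2) <- c("Term", "Chapter first defined")

# Remove duplicates and generate table --> do not change
pander(distinct(final.table2, Term, .keep_all= TRUE))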

Note that some comments in the code say “do not change”. The lines they mark operate on the concatenated table, which includes all the chapters, so they stay the same however many chapters you add; only the per-chapter lines and the rbind call need to be extended. The tables must be concatenated before the terms can be sorted alphabetically and duplicates removed to generate the final table.
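
The concatenate, sort, and de-duplicate steps can also be sketched for any number of chapters at once. The version below is an illustration under stated assumptions, not the code used for Loss Data Analytics: the two data frames are toy stand-ins for the chapter files, base R `duplicated()` is swapped in for `dplyr::distinct()` so the example has no package dependencies, and sorting uses `tolower()` so that capitalized terms are not placed ahead of lowercase ones in locales that order them that way.

```r
# Toy stand-ins for the per-chapter data frames (made-up, shortened rows).
chapter1 <- data.frame(
  Term = c("deductible", "adjuster"),
  Definition = c("A parameter specified in the contract.",
                 "A person who investigates claims."),
  Chapter = c(1, 1),
  stringsAsFactors = FALSE
)
chapter2 <- data.frame(
  Term = c("Frequency", "deductible"),
  Definition = c("The number of claims.",
                 "A parameter specified in the contract."),
  Chapter = c(2, 1),
  stringsAsFactors = FALSE
)

# Add further chapters to this list as the book grows.
chapters <- list(chapter1, chapter2)

# Keep the term (column 1) and chapter first defined (column 3) of each chapter.
pairs <- lapply(chapters, function(ch) {
  data.frame(Term = ch[, 1], Chapter = ch[, 3], stringsAsFactors = FALSE)
})

# Concatenate all chapters in book order.
combined <- do.call(rbind, pairs)

# Sort alphabetically, ignoring case.
sorted <- combined[order(tolower(combined$Term)), ]

# Remove repeated terms, keeping the first row for each term.
first_defined <- sorted[!duplicated(sorted$Term), ]
rownames(first_defined) <- NULL
names(first_defined) <- c("Term", "Chapter first defined")

print(first_defined)
```

With this layout, adding a chapter means adding one element to `chapters`; the sorting and de-duplication lines stay untouched, which mirrors the “do not change” comments in the code above.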

5.4 Glossary on GitHub

This section describes how to set up a glossary repository on GitHub. By doing so, users can store the project, make changes, seek feedback and facilitate collaboration. We include suggestions for these different types of users:

  • someone who wants to produce a book like ours
  • the reviewer/reader who simply wants to suggest altering or adding a definition
  • a contributing author who needs to compile a csv file of definitions and include them in the chapter using tooltips

5.4.1 Repository Creation to Supplement a Book

The following are suggestions on how to set up a repository on GitHub to store your glossary project. We do not cover GitHub features in detail here, but we point to a resource that does.

  1. We suggest referring to Happy Git to get started on setting up GitHub and linking it to RStudio.
  2. Once you have done so, you can store and update your glossary project on GitHub. Happy Git describes how you can make changes locally, commit and push the changes to GitHub.

5.4.2 Feedback from Reviewers/Readers

We assume that the author already has a glossary repository on GitHub, so that the issues feature can be used to receive feedback. The following excerpt from the glossary for Loss Data Analytics shows how readers can make suggestions:

When using the glossary, we encourage the reader to provide feedback regarding the terms and their definitions. For example, if the reader feels that there is a better definition for a particular term, the following instructions outline how the reader can suggest improvements.

  • First, open up the issues tab on our repository on GitHub.
  • Click on “create an issue”.
  • In the title, indicate which chapters you want to change.
  • Specify the terms and definitions you wish to change, add or remove.
  • Click “Submit new issue”.

5.4.3 Collaboration from Authors

5.4.3.1 Definitions Compilation

Aside from readers, collaborators can also contribute to the glossary. For example, professors can get authors to assist in compiling definitions.

Collaborators can set up their own GitHub accounts. They can fork the project, make changes locally and open a pull request. The project owner can then merge these changes to update the project. These processes are outlined on Happy Git as well.

As mentioned previously, we suggest compiling definitions in the Excel file rather than in the csv file. This is because making changes directly to the csv file may result in space distortions in the R output of the glossary.

5.4.3.2 Definitions in-text using Tooltip

In addition, Loss Data Analytics uses tooltips, which let readers hover over a word in the text to see its definition, as in the following example from the introduction chapter:

When introducing data methods, we will focus on losses that arise from obligations in insurance contracts. This could be the amount of damage to one’s apartment under a renter’s insurance agreement, the amount needed to compensate someone that you hurt in a driving accident, and the like. We call these obligations insurance claims (tooltip text shown on hover: An insurance claim is the compensation provided by the insurer for incurred hurt, loss, or damage that is covered by the policy). With this focus, we will be able to introduce generally applicable statistical tools and techniques in real-life situations.

The following is the tooltip code associated with the above output:

<a href="#" class="tooltip" style="color:green">*insurance claims*<span style="font-size:8pt"> An insurance claim is the compensation provided by the insurer for incurred hurt, loss, or damage that is covered by the policy.</span></a>.

Note that our version of tooltip is customized within our style.css file. If you prefer another style, you will have to modify the code or replace the style.css file to suit your needs.
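
For authors who keep their definitions in a csv file, the tooltip markup could be generated from the glossary instead of being typed by hand. The following is a minimal sketch under stated assumptions: `tooltip_html` is a hypothetical helper, the one-row data frame stands in for a chapter's csv file, and the markup simply copies the example shown above, so it relies on the same `tooltip` class in style.css. The helper does not escape HTML special characters in the definition.

```r
# Hypothetical helper: build the tooltip markup for a word and its definition.
tooltip_html <- function(term, definition) {
  sprintf(
    '<a href="#" class="tooltip" style="color:green">*%s*<span style="font-size:8pt"> %s</span></a>',
    term, definition
  )
}

# Toy glossary standing in for a chapter's csv file.
glossary <- data.frame(
  Term = "insurance claim",
  Definition = "An insurance claim is the compensation provided by the insurer for incurred hurt, loss, or damage that is covered by the policy.",
  stringsAsFactors = FALSE
)

# Look up the definition and wrap the displayed text in the tooltip markup.
definition <- glossary$Definition[glossary$Term == "insurance claim"]
claim_tooltip <- tooltip_html("insurance claims", definition)
cat(claim_tooltip, "\n")
```

In an R Markdown chapter, a call such as `tooltip_html("insurance claims", definition)` could be placed in inline R code so that the definition in the text always matches the one in the glossary file.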