My first R package on CRAN
There comes a day when one needs to share programming code that could be useful to the broad scientific community. In the past, I have noted in passing in manuscripts that the statistical code was available by simply emailing me with a request. This was nice in some ways, as I was able to meet people around the world interested in some of my research. However, it isn't practical: email accounts can change, and it limits discovery of the methods. So, we need a code repository to host the code. The current state of code sharing seems to be dominated by GitHub, and I have a GitHub account for this purpose. However, there is still a desire to have an R package be "officially" available on the Comprehensive R Archive Network (CRAN).
So perhaps this could be thought of as a statistical rite of passage. Said differently, it's about time that I took this step to share code on CRAN. Well, it's my pleasure to introduce my first package: mueRelativeRisk. The details of the method have been published elsewhere, so I wanted to give a general summary of the method here.
First, what's the problem?
A problem common to many areas of math is that division by zero is undefined. To quantify relative risk, you take the probability that an outcome will happen if 'exposed' and divide it by the probability in a control condition. In the example from the paper mentioned above, the "outcome" was serious hypoglycemia. The exposure in question was intensive glycemic control for patients with diabetes. This technique tries to mimic the body's natural ability to regulate glucose. A frequently encountered problem is that attempts to keep glucose levels low can push them too low, to unsafe levels (serious hypoglycemia). The control condition in the study was a less intensive management. We wanted to quantify whether the intensively managed patients were at an increased risk of hypoglycemia in the context of the research study. If so, protocol modifications would need to be considered to minimize the risks of participating in the study.
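To make the division problem concrete, here is a minimal sketch of the usual relative risk calculation. It is written in Python purely as an illustration; the function name and the counts are made up for this example and are not part of the package or the study.

```python
def naive_relative_risk(events_exposed, n_exposed, events_control, n_control):
    """Ratio of observed event proportions (exposed / control).

    Returns None when the control proportion is zero, because the
    ratio is then undefined (division by zero).
    """
    risk_exposed = events_exposed / n_exposed
    risk_control = events_control / n_control
    if risk_control == 0:
        return None  # undefined: nothing to divide by
    return risk_exposed / risk_control


# Hypothetical counts, for illustration only.
print(naive_relative_risk(2, 10, 1, 10))  # events in both arms: a ratio exists
print(naive_relative_risk(0, 10, 0, 10))  # no events in either arm: None
```

As soon as the control arm has no events, the usual estimate has no value at all, which is exactly the situation early in a study.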
The statistical problem is that we didn't observe hypoglycemia early in the study. So, our estimates of relative risk were undefined. Do we throw our hands up and say, sorry, I can't quantify the relative risk? Well, in fairness, that was part of our approach. More importantly, though, we worked to come up with a solution that works. Our estimator for relative risk, based on the median unbiased estimator, is now available and simple to use.
In the paper, we reported that at the time of the first interim analysis, we had 7 total participants in the study, 3 of whom were randomized to intensive glycemic control. There were no serious hypoglycemic events at that point in the study, so the common estimate for relative risk would be RR = (0/3) / (0/4), which is not defined. The following code allows one to estimate the relative risk based on the median unbiased estimator.
Once the function is executed, the following results are obtained. This blog post skips over how the confidence interval is generated. We'll save that for a future post as it's pretty cool.
The method suggests that the estimated RR is 1.3 (95% CI: 0.2 to 8.1). Essentially, this result is statistically inconclusive, but nonetheless, it gives valuable information about the statistical uncertainty in the estimate and allows for complete monitoring of the study results.
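For the curious, here is a rough sketch of how a point estimate near 1.3 can arise. This is not the package's code, and it is in Python rather than R. It assumes one common definition of the median unbiased estimate of a binomial proportion: the midpoint of the two probabilities at which the lower and upper binomial tail probabilities equal 0.5. The package may differ in detail. All function names are hypothetical, and the confidence interval is not attempted.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def solve_increasing(f, target, lo=0.0, hi=1.0, iters=200):
    """Bisection for f(p) = target, where f is increasing on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mue_proportion(x, n):
    """Assumed midpoint form of the median unbiased estimate of p."""
    # p_lower solves P(X >= x) = 0.5; taken as 0 when no events are observed.
    p_lower = 0.0 if x == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), 0.5)
    # p_upper solves P(X <= x) = 0.5; taken as 1 when every trial is an event.
    p_upper = 1.0 if x == n else solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 0.5)
    return (p_lower + p_upper) / 2


def mue_relative_risk(x1, n1, x0, n0):
    """Ratio of the two per-arm estimates; the denominator is never zero."""
    return mue_proportion(x1, n1) / mue_proportion(x0, n0)


# Interim analysis from the post: 0 of 3 (intensive) versus 0 of 4 (control).
rr = mue_relative_risk(0, 3, 0, 4)
print(round(rr, 1))  # 1.3
```

Because each arm's estimate stays strictly above zero even with no events, the ratio is always defined. Under the assumption above, it lands on the same 1.3 reported for the interim analysis.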