--- title: "Warning message with perccalc package" author: "Jorge Cimentada" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Warning message with perccalc package} %\VignetteEngine{knitr::rmarkdown} usepackage[UTF-8]{inputenc} --- While the other vignette shows you how to use `perccalc` appropriately, there are instances where there's just too few categories to estimate percentiles properly. Imagine estimating a distribution of `1:100` percentiles with only three ordered categories, it just sounds too far fetched. Let's load our packages. ```{r, warning = FALSE, message = FALSE} library(perccalc) library(dplyr) library(ggplot2) ``` For example, take the `survey` data on smoking habits. ```{r, message = FALSE, warning = FALSE} smoking_data <- MASS::survey %>% # you will need to install the MASS package as_tibble() %>% select(Sex, Smoke, Pulse) %>% rename( gender = Sex, smoke = Smoke, pulse_rate = Pulse ) ``` The final results is this dataset: ```{r, echo = FALSE} smoking_data %>% arrange(pulse_rate) ``` Note that there's only four categories in the `smoke` variable. Let's try to estimate the percentile difference. ```{r} smoking_data <- smoking_data %>% mutate(smoke = factor(smoke, levels = c("Never", "Occas", "Regul", "Heavy"), ordered = TRUE)) perc_diff(smoking_data, smoke, pulse_rate) ``` `perc_diff` returns the estimated coefficient but also warns you that it's difficult for the function to estimate the standard error. This happens similarly for `perc_dist`. ```{r} perc_dist(smoking_data, smoke, pulse_rate) %>% head() ```