Food purchases in households with and without diabetes based on consumer purchase data

Objectives: Dietary recommendations for individuals with diabetes are easy to provide, but adherence is difficult to monitor. The objective of this study was to investigate whether there was a difference in grocery purchases between households with and without diabetes. Study design: Cohort study. Methods: Consumer purchase data in 2019 was collected from 6662 households donating their supermarket receipts via a receipt collecting service. Of these households, 718 included at least one individual with diabetes. The monetary percentages spent on specific food groups were used to characterize households using all purchases in 2019. A probability index model was used to compare households with diabetes to households without diabetes. Results: We included 405,264 shopping trips in 2019 attributed to 6662 households. Both households with and without diabetes spent the highest monetary percentage on sweets (with diabetes: 9.3%, without diabetes: 8.8%), with no statistically significant difference detected. However, compared to households without diabetes, households with diabetes had a significantly higher probability of spending a higher monetary percentage on butter, oil and dressings; non-sugary drinks; processed red meat; and ready meals, as well as a significantly lower probability of spending a higher monetary percentage on accessory compounds; alcoholic beverages; eggs; grains; rice and pasta; and raw vegetables. Conclusions: Households with diabetes spent a relatively higher monetary value on several unhealthy foods and less on several healthy groceries compared to households without diabetes. There is a need for more diabetes self-management education focused on including more healthy dietary choices in their household grocery purchases.

Introduction
Dietary and lifestyle factors are important for the development of type 2 diabetes as well as for managing established disease [1]. A sedentary lifestyle and unhealthy dietary choices that lead to obesity increase the risk of developing diabetes, and once diabetes has been established, dietary advice is a cornerstone of diabetes management, and subsequent adherence to such advice is critical [2]. There were 280,130 prevalent cases of diabetes in the Danish population, corresponding to 4.8% of the population, as of 1st of January 2017 [3]. The Danish Diabetes Association estimates that 93% of individuals with diabetes receive medical treatment [4, p. 21].
Current evidence supports the importance of both overall dietary patterns and specific individual foods in the management of diabetes. Specifically, diets characterized by a higher intake of whole grains, fruits, vegetables, legumes as well as nuts and a lower/moderate intake of alcohol, refined grains, red or processed meats, and sugar-sweetened beverages have been shown to improve glycemic control [5,6]. A cohort study in almost 100,000 individuals, investigating how the frequency of meals prepared at home was associated with risk of diabetes, found that frequent consumption of meals prepared at home was associated with lower risk of developing type 2 diabetes [7].
While current knowledge has established a number of dietary recommendations for patients with diabetes, such recommendations are easy to provide but difficult to monitor. In observational studies of the role of diet in management of diabetes, common approaches to dietary monitoring include dietary recall, food diaries, food frequency questionnaires, and household inventories [8-11]. However, these approaches are all costly, time-intensive, and subject to self-reporting bias [12]. The current status of dietary monitoring has left a gap in knowledge as to how well patients with diabetes generally adhere to advice and actually change prior habits.
A promising possibility for monitoring diet is consumer purchase data. Monitoring the actual purchases has several advantages over traditional methods of dietary assessment, including objectivity, cost-efficiency, longevity, and limited burden on the participant [13]. Households in developed countries buy most of their food from supermarkets and make an average of two visits to a supermarket every week [14]. According to Statistics Denmark, an average Danish household spent 34,380 DKK (approximately 4620 Euro) on edible groceries in 2019 [15]. A cross-sectional survey comparing three months of electronic supermarket sales data with individual dietary intake, estimated from four 24-hour dietary recalls collected in the same period, suggested that consumer purchase data may be a useful surrogate measure of several nutrient intakes of individuals. Specifically, the authors found the highest Spearman rank correlation coefficient for percentage of energy from saturated fat (0.54) [16]. A systematic review from 2007 investigated the feasibility of using consumer purchase data in nutrition monitoring. Based on 18 studies utilizing consumer purchase data, the review supports the use of this data source to monitor dietary patterns in the population [13], pointing to the same advantages as listed above. Based on a large volume of consumer purchase data linked with nationwide administrative registers, this study investigated differences in purchases of food groups in households with and without at least one individual with diabetes.

Sample
The cohort consisted of 9332 Danish volunteers who were users of a smartphone-based receipt collection service covering three of the five largest supermarket chains in Denmark. Together, these widely available chains provide a mixture of discount, intermediate, and high-end supermarkets. The volunteers agreed to share the entirety of their supermarket receipts, collected via the application, as well as their unique government-issued identification number (CPR-number) through a protected project site, enabling protected linkage to the Danish registers in Statistics Denmark where the data are likewise stored and organized.

Exclusions
Only purchases concerning edible groceries were included. To achieve a comparable population not differently affected by seasonality or secular trends, the time period was restricted to include only 2019, thus excluding all application users without purchases during this period (n = 2557). This resulted in an analytical cohort of 6775 application users from 6662 households, including 17,632 individuals.

Diet
The consumer purchase data received contained 336,401 unique product names, reflecting how the various supermarkets name products when they are added to the product range, and included both edible and non-edible goods. The Food Institute at the Danish Technical University maintains a public food composition database (Frida) [17]. Within this database there are 1190 identifying names that aim to include the majority of groceries that are widely available, further detailed with 150 pieces of information, including energy and basic food components. Through regular expressions, the unique names originally received were paired with the Frida identifying names, and non-edible goods were identified and excluded. In collaboration with a dietitian, inspired by the wider categorization in Frida and based on nutritional composition, the 1190 unique foods were a priori categorized into 27 wider groups for further analysis: Accessory compounds; Alcoholic beverages; Bread; Butter, oil, and dressings; Cereals; Cheese; Crackers and cakes; Eggs; Fish and other aquatic animals; Fruit products; Fruit, raw; Grains; Legumes; Milk products; Non-sugary drinks; Nuts and seeds; Poultry; Processed red meat; Ready meals; Red meat; Rice and pasta; Salty snacks; Sugary drinks; Sweet spreads; Sweets; Vegetable products; Vegetables, raw. A full list of specific food names and food groups can be found in the supplementary material.
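The matching step described above can be illustrated with a minimal sketch. The regular expressions, identifying names, and receipt names below are hypothetical examples chosen for illustration; the study's actual patterns and the Frida name list are not reproduced here.

```python
import re

# Hypothetical rules: (pattern, Frida-like identifying name).
# None marks a non-edible item that should be excluded.
NAME_RULES = [
    (re.compile(r"\bbatteri", re.IGNORECASE), None),
    (re.compile(r"\bagurk", re.IGNORECASE), "Cucumber, raw"),
    (re.compile(r"\bolie\b", re.IGNORECASE), "Oil"),
]

# Hypothetical mapping from identifying name to one of the 27 food groups.
FOOD_GROUPS = {
    "Cucumber, raw": "Vegetables, raw",
    "Oil": "Butter, oil, and dressings",
}


def categorize(product_name):
    """Return a food group, 'excluded' for non-edible items, or 'unknown'."""
    for pattern, identifying_name in NAME_RULES:
        if pattern.search(product_name):
            if identifying_name is None:
                return "excluded"
            return FOOD_GROUPS.get(identifying_name, "unknown")
    # No rule matched: keep the item but flag it as unidentified.
    return "unknown"
```

In this sketch the first matching rule wins, so rule order matters, and names that match no rule fall back to "unknown" rather than being dropped.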

Covariates
All covariates were collected as of baseline, 1st of January 2019. Households were identified through a unique family ID obtained through the Income Statistics Register [18,19].
Date of birth and sex were obtained from the Danish Civil Registration System [20]. Information on glucose-lowering medication (ATC code A10) was retrieved from the Danish National Prescription Registry [21]. A household with diabetes was defined by at least one individual from the household claiming a prescription for glucose-lowering medication prior to baseline. Income was collected from the Income Statistics Register [18] and included as the average equivalised income over the last five years leading up to baseline, accounting for redistribution of income within a family [22]. Information on education came from the Population Education Register [23]. The household's municipality, which enabled classification by degree of urbanization following Eurostat's Degree of Urbanization [24], came from the Danish Civil Registration System [20]. The family structure variable was based on the following age division: child, defined by age < 18 years; adult, defined by age 18-65 years; and elderly, defined by age > 65 years.
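The exposure definition and the age division can be sketched as follows. This is a minimal illustration under assumed data shapes: the tuple layout of prescription records and the function names are hypothetical and do not reflect the structure of the Danish registers.

```python
from datetime import date

BASELINE = date(2019, 1, 1)  # baseline stated in the text


def age_category(age_years):
    """Age division used for the family structure variable."""
    if age_years < 18:
        return "child"
    if age_years <= 65:
        return "adult"
    return "elderly"


def household_has_diabetes(member_ids, prescriptions, baseline=BASELINE):
    """True if any member claimed an ATC A10 prescription before baseline.

    prescriptions: iterable of (person_id, atc_code, claim_date) tuples
    (a hypothetical layout chosen for this sketch).
    """
    members = set(member_ids)
    return any(
        person_id in members
        and atc_code.startswith("A10")
        and claim_date < baseline
        for person_id, atc_code, claim_date in prescriptions
    )
```

Matching on the prefix "A10" covers all glucose-lowering subgroups under that ATC code; claims on or after baseline are ignored, in line with "prior to baseline".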

Outcome
Analyses were conducted on a household level, assuming that while groceries are purchased by an individual, they are consumed on a household basis; thus, if two or more individuals from the same household contributed receipts, these were combined. To characterize household shopping behavior, the total amount in DKK spent in 2019 on a specific food group was divided by the total amount spent on all groceries in 2019. For a given food group we refer to this ratio as the monetary percentage.
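As an illustration of the outcome definition, the following minimal sketch computes monetary percentages from purchase rows. The row layout (household ID, food group, amount in DKK) and the function name are assumptions made for this example.

```python
from collections import defaultdict


def monetary_percentages(purchases):
    """Compute the share of spending per food group for each household.

    purchases: iterable of (household_id, food_group, amount_dkk) rows,
    already combined across all members of a household.
    Returns {household_id: {food_group: percentage}}.
    """
    spent = defaultdict(lambda: defaultdict(float))
    for household_id, food_group, amount_dkk in purchases:
        spent[household_id][food_group] += amount_dkk

    result = {}
    for household_id, by_group in spent.items():
        total = sum(by_group.values())
        if total > 0:
            result[household_id] = {
                group: 100.0 * amount / total
                for group, amount in by_group.items()
            }
        else:
            # No positive spending recorded: percentages are undefined.
            result[household_id] = {}
    return result
```

For example, a household spending 30 DKK on sweets and 70 DKK on bread would have monetary percentages of 30% and 70% for those two groups.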

Statistical analyses
The probability index was defined as the probability that a random household with diabetes spent a higher monetary percentage on a specific food group than a random household without diabetes. Thus, the value of the probability index can be interpreted as the probability that a household with diabetes spent a higher monetary percentage on a specific food group than a household without diabetes, with 50% indicating no difference. We modeled the probabilistic index conditional on covariates [25], adjusting for family structure, degree of urbanization, highest educational attainment, and average equivalised income in the household. We fitted these probabilistic index regression models using a Cox regression model, one for each food group. We report results for 27 food groups and applied a Bonferroni-corrected significance threshold of 0.002. As a supplementary analysis, the analyses were repeated in households with only a single member; results are included in the supplementary material.
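To make the definition concrete, the sketch below computes an unadjusted probability index by comparing every household with diabetes to every household without diabetes. It does not reproduce the covariate-adjusted regression used in the study. Counting ties as one half is a common convention and an assumption here, as is the overall significance level of 0.05 behind the Bonferroni threshold.

```python
def probability_index(with_diabetes, without_diabetes):
    """Unadjusted estimate of P(X > Y) + 0.5 * P(X = Y).

    X: monetary percentage of a household with diabetes,
    Y: monetary percentage of a household without diabetes.
    A value of 0.5 indicates no difference between the groups.
    """
    wins = 0.0
    for x in with_diabetes:
        for y in without_diabetes:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5  # ties count as half (assumed convention)
    return wins / (len(with_diabetes) * len(without_diabetes))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests
```

With an assumed overall level of 0.05 and 27 food groups, `bonferroni_threshold(0.05, 27)` gives roughly 0.00185, which agrees with the reported threshold of 0.002 after rounding.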

Sample characteristics
The study included 6662 households, of which 718 had at least one member with diabetes (requiring antidiabetic medication) before 1st of January 2019. There were 28 households with more than one member with diabetes. Among the households with diabetes, a larger proportion consisted of elderly individuals living alone or of two or more elderly individuals living together, compared to the households without diabetes. A smaller proportion of the households with diabetes included at least one child, compared to those without diabetes. Households with diabetes more often lived in rural areas, compared to households without diabetes. Furthermore, households with diabetes had a larger proportion with basic and vocational training as highest educational attainment, and conversely a lower proportion with bachelor's and higher education as highest educational attainment, compared to households without diabetes. Finally, a lower proportion of households with diabetes belonged to the highest income quartile, compared to the households without diabetes, although the difference was not significant. Both types of households had roughly similar patterns of number of shopping trips and total monetary value spent (Table 1).
The distributions of the monetary percentages spent on the food groups by households with and without diabetes, respectively, are included in the supplementary material.

Difference in groceries
Fig. 2 shows the results of our probabilistic index regression analysis. Households with diabetes had significantly higher probabilities of spending a higher monetary percentage on the following food groups: butter, oil, and dressings; non-sugary drinks; processed red meat; and ready meals. Conversely, households with diabetes had significantly lower probabilities of spending a higher monetary percentage on the following food groups: accessory compounds; alcoholic beverages; eggs; grains; rice and pasta; and raw vegetables. The analyses were repeated in single-member households only, and the estimates are presented in Supplementary Fig. 2. Overall, these results did not change the conclusions.

Discussion
A remarkable result is that the single largest share of household spending went to sweets, regardless of whether there were individuals with diabetes in the household. The study shows that whilst similar patterns in the relative amount spent on a range of grocery categories were found, there were some significant differences in grocery purchase patterns between households with and without diabetes when adjusting for family structure, degree of urbanization, highest educational attainment, and average equivalised income in the household. We found that households with diabetes had a significantly higher probability of spending a higher monetary percentage on foods from the following food groups: butter, oil, and dressings; non-sugary drinks; processed red meat; and ready meals, compared to households without diabetes, when adjusting for the above-mentioned covariates. Conversely, households with diabetes had a significantly lower probability of spending a higher monetary percentage on accessory compounds; alcoholic beverages; eggs; grains; raw vegetables; and rice and pasta, compared to households without diabetes, when adjusting for the above-mentioned covariates. Although remarkably similar patterns in grocery purchases were detected, households with diabetes purchased relatively more of some foods that can be characterized as unhealthy, and purchased relatively less of some foods that are often characterized as constituents of a healthy eating pattern.
Ready meals are a notoriously calorie-dense food group, high in both saturated fat and salt [26,27]. The relatively higher purchases of ready meals as well as butter, oil, and dressings by households with diabetes are thus concerning, as studies have shown a positive association between intake of saturated fat and risk of type 2 diabetes [28], suggesting effects on management as well. The higher purchases of ready meals are also at odds with the World Health Organization's recommendation of salt reduction to less than 5 g/day to reduce blood pressure and risk of cardiovascular disease [29]. Adoption of a low-glycemic index or Mediterranean eating pattern, which are both low in refined carbohydrates such as cookies, crackers, biscuits, and cakes and conversely high in vegetables and healthy fats, has been shown to have positive effects on diabetes management, including improving markers of cardiovascular disease risk [30,31]. Furthermore, several observational studies and meta-analyses have reported positive associations between consumption of processed red meat and risk of diabetes [28,32-34], which may indicate an association with diabetes management as well, even though we were not able to find any studies investigating this specifically. Thus, our findings indicate that several of the food groups that households with diabetes purchased relatively more of, compared to households without diabetes, are ill-suited for optimal diabetes management. Regarding the foods that households with diabetes had a lower probability of purchasing a higher percentage of, compared to households without diabetes, evidence suggests that several of these foods are in fact beneficial constituents of a diet that supports successful diabetes management. For instance, there has been much controversy about egg consumption; whilst excessive egg consumption has been shown to increase diabetes risk [35], a review from 2019 showed that several interventional clinical trials indicated a positive association between egg consumption and improved blood lipid profile, insulin sensitivity and glucose response, suggesting better diabetes management [36]. Moreover, it is concerning that the households with diabetes had a significantly lower probability of spending a higher relative amount on raw vegetables, as vegetable consumption is a key feature of the healthy eating patterns recommended for individuals with diabetes [30,31,37].
Fig. 1. Monetary percentages spent on food in descending order for a household with and without diabetes, respectively.
In contrast to our findings, a Danish study investigating adherence to dietary recommendations in patients with diabetes (n = 774) compared to the general population (n = 2899), based on self-reported food frequency questionnaires, found that patients with diabetes consume a healthier diet than the general population, namely a dietary pattern consisting of less sugar and alcohol as well as more vegetables and dietary fibre [11]. Interestingly, the food groups on which households with diabetes spent a higher monetary percentage, compared to households without diabetes, align with foods known to constitute risk factors for developing diabetes and to affect management of established disease, while these households purchased less of foods widely accepted as important constituents of a healthy diet. Together, this suggests that more focus should be put on changing grocery shopping habits in households that include individuals with diabetes. Whilst our study is not directly comparable to the study using food frequency questionnaires, our study does question the validity of using food frequency questionnaires for patients with diabetes.

Implications
With increasing use of payment options that facilitate automatic collection of purchases, this resource should be further investigated for automated tracking and feedback on the health pattern of food purchases. This could facilitate evaluation of the effect of interventional strategies targeting diet. These purchase statistics could also be provided on an individual level direct-to-consumer and be an asset for comparing personal goals with actual purchases.
Fig. 2. Probability that a household with diabetes spends more on a food group than a household without diabetes, adjusted for family structure, degree of urbanization of residence, highest educational attainment and average equivalised income in household.

Strengths and limitations
This epidemiological study is strengthened by a rather large sample size and by the fact that the assessment of diet does not rely on self-reported measures. Furthermore, we have repeated assessments of grocery purchases over a longer period from three out of the five largest retailers in Denmark.
Limitations include the fact that although the receipt data received from the receipt collection service covers purchases made in some of the largest Danish supermarkets, we do not have access to the entirety of the households' grocery purchases. Furthermore, the subjects had varying shopping patterns. Another limitation was that the project experienced a data delivery issue in March, May, and June, resulting in very few receipts from these months. Thus, we have to operate under the assumption that the current data is a representative sample of the true grocery purchase pattern when summed over a long period. Foods were identified and matched using regular expressions, and although all matches were manually checked, errors might have occurred. Furthermore, there were some names on the receipts that we were not able to identify, and those were categorized as "unknown" (n = 719). Grocery shopping is usually done by households and is not necessarily reflected in the individual intake. However, it is a reasonable assumption that purchases in supermarkets are representative of the home food environment, which in turn shapes what is actually ingested by the individual [38,39]. Another limitation is that amount is estimated by monetary expenditure, which does not account for price dispersion among seemingly identical products, nor does it necessarily reflect consumption, as some food groups are more expensive than others. A key limitation of this study is its limited external generalisability, as the users of the receipt collection service may be a selected population. However, the above-mentioned limitations affect the whole study population, and if bias persists, we assume the misclassification of the diet to be non-differential with regard to diabetes status. Finally, some misclassification of households with diabetes is expected, as our definition of diabetes does not include individuals with diabetes managed only by lifestyle and/or diet, who are estimated to account for 7% of individuals with diabetes [4, p. 21].
In conclusion, our findings indicate that consumer purchase data provides a novel and efficient approach to dietary assessment that may add important data to the traditional strategy of food frequency questionnaires. Specifically, we found that households with diabetes purchased more of several food groups that are widely regarded as unhealthy and less of several healthy food groups, compared to households without diabetes. This indicates that greater focus could advantageously be placed on addressing grocery shopping habits that shape the food environment in the household in which the individual with diabetes lives, in order to increase the chances of a healthy eating pattern, which in turn provides the basis for better diabetes management. More needs to be known about the relationship between groceries purchased on a household basis and what is actually ingested by the individual with diabetes.

Ethical approval
Danish registry-based studies that are performed for the sole purpose of statistics and scientific research do not require patient consent or ethical committee approval, as stated in the Danish Data Protection Act [40]. Approval to use the data sources for research purposes was granted by the data responsible institute of the Capital Region of Denmark (approval numbers P-2019-263 and P-2019-191) in accordance with the General Data Protection Regulation (GDPR). Data is accessed on secure servers under Statistics Denmark and cannot be shared, according to Danish legislation.

Table 1. Household characteristics per 1st of January 2019. a: Chi-square tests.