Can Restaurants and Grocery Stores Predict Obesity?

Nick Adamski
5 min read · Sep 25, 2020
Photo by Lily Banse on Unsplash

Obesity is a leading health concern in the United States, and one that has a large number of contributing factors. There is a lot to be said for economic and cultural issues, as well as exercise incentives such as parks, gyms and school programs. Previously I have focused on the prevalence of “Low Income, Low Access” to food across the country. Now it’s time to dive into how different kinds of food access can have an effect on our health.

Thanks to a robust data set on Kaggle, I was able to find a great deal of information on restaurants, stores and farms, as well as health and economic factors, for each county in the United States. It is easy to see the relationship between obesity and diet, so I aimed to tell a different story. Instead of what kind of food we eat, I would focus on where that food comes from. Of course, where we get our food still dictates what we eat, but I believe it can be one of the root causes of the obesity epidemic.

Initially I was going to use a very broad set of contributing factors. Then I thought it would be interesting to see if a specific subset of features could still point towards trends in obesity rates. I kept the “Low Income, Low Access” data, as well as the per capita spending at fast food and full service restaurants in each county. To that I added the number of fast food and full service restaurants per 1000 people, as well as the number of grocery stores, big box club stores, convenience stores, specialty food shops and farmers markets.

Getting into the data

Picking these specific features worked in my favor, as they were all numeric with no missing values. Now for the nitty gritty modelling. For reference, the baseline mean absolute error (MAE) was 2.86. I passed the data through a linear regression model and saw an improvement to a training MAE of 2.18 and a validation MAE of 2.35. Not a huge jump, but still an improvement.
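For readers new to the metric, here is a minimal sketch of how a baseline MAE is usually computed: guess the training mean for every county, then average the absolute misses. The obesity rates below are made-up placeholder values, not rows from the Kaggle data set, and the helper function is my own illustration rather than the exact code behind the 2.86 figure.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between actual and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Hypothetical obesity rates (percent) for a handful of counties.
train_rates = [28.0, 31.5, 33.0, 26.5, 35.0]
val_rates = [30.0, 27.0, 34.5]

# The baseline "model" guesses the training mean for every county.
baseline_guess = mean(train_rates)
baseline_mae = mean_absolute_error(val_rates, [baseline_guess] * len(val_rates))

print(f"baseline guess: {baseline_guess:.2f}")
print(f"baseline MAE:   {baseline_mae:.2f}")
```

Any real model has to beat this number to be worth keeping, which is why it makes a useful reference point.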

After this I decided to try two more methods, first using a Random Forest Regression. An initial pass with the RF Regressor gave a training MAE of 0.63 and a validation MAE of 1.71. That is definitely an improvement, but the gap between those two results shows extreme overfitting. I took this same model and fed it through a grid search in order to fine tune the hyperparameters. The results were a bit more polished, with a training MAE of 1.51 and a validation MAE of 1.79: a reduction in overfitting at the cost of a slightly higher validation MAE.
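A grid search of this kind can be sketched with scikit-learn as follows. This is only an illustration: the data is synthetic, and the parameter grid, fold count and random seeds are assumptions of mine, not the settings used for the results above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data: 300 "counties" with 5 numeric features.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 5))
y = 30 + 2 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=1.0, size=300)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Illustrative grid: shallower trees and larger leaves limit overfitting.
param_grid = {"max_depth": [3, 6], "min_samples_leaf": [1, 10]}

search = GridSearchCV(
    RandomForestRegressor(n_estimators=50, random_state=42),
    param_grid,
    scoring="neg_mean_absolute_error",  # scikit-learn maximizes, so MAE is negated
    cv=3,
)
search.fit(X_train, y_train)

# best_estimator_ is refit on the full training split with the winning settings.
best = search.best_estimator_
train_mae = mean_absolute_error(y_train, best.predict(X_train))
val_mae = mean_absolute_error(y_val, best.predict(X_val))
print(search.best_params_, round(train_mae, 2), round(val_mae, 2))
```

Comparing the training and validation MAE of the refit model is the same check used throughout this section to judge overfitting.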

Next I applied the same approach to the XGBoost Regression model. The initial model with no adjustments gave a training MAE of 0.25 and a validation MAE of 1.85. Here we see significant overfitting in our model. Again I ran this model through a grid search to find the best hyperparameters and found a training MAE of 1.49 and a validation MAE of 1.82. The overfitting was reduced, and the validation MAE ended up close to that of the tuned RFR model.
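To make the overfitting comparison concrete, the short sketch below takes the training and validation MAEs reported so far and computes the gap between them for each model. The numbers come straight from the text; the labels and the idea of using "validation minus training" as a rough overfitting measure are my own framing.

```python
# (training MAE, validation MAE) as reported in the article.
reported = {
    "Linear regression": (2.18, 2.35),
    "Random forest (default)": (0.63, 1.71),
    "Random forest (tuned)": (1.51, 1.79),
    "XGBoost (default)": (0.25, 1.85),
    "XGBoost (tuned)": (1.49, 1.82),
}

# Gap = validation MAE minus training MAE; a larger gap means more overfitting.
gaps = {name: round(val - train, 2) for name, (train, val) in reported.items()}

for name, gap in gaps.items():
    train, val = reported[name]
    print(f"{name:<24} train={train:.2f}  val={val:.2f}  gap={gap:.2f}")
```

Tuning shrinks the gap sharply for both tree models, while the validation MAE itself barely moves.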

I then tested both tuned models to see what results they would produce. Our tuned Random Forest Regression model returned a 1.72 Mean Absolute Error, while the tuned XGBoost regression model returned a 1.75 MAE. Both performed well after tuning, but each still suffers from some overfitting due to the limitations of our data. The models reach only 67% (RFR) and 68% (XGBR) accuracy, so as is they are not great at giving precise predictions.

What does this mean?

The predicted change in obesity rate relative to spending at full service restaurants.

Spending at full service restaurants has the largest contribution to reducing obesity rates. But how does it fare when compared against convenience stores?

Full service restaurant sales compared to the number of convenience stores.

Using our models we can see a very clear interaction between full service, sit down restaurants and the concentration of convenience stores in the average county. Obesity clearly rises with the number of corner shops and 7–11s. We also see that spending at full service restaurants can help bring down obesity rates. Sit down restaurants cover a broad range of cuisine and choosing to spend money there is a surprisingly healthy choice, depending on how deep fried your food is of course. The opposite is true of convenience stores, as their options tend to be limited and rarely healthy.

Fast food restaurant sales compared to the number of convenience stores.

Comparing fast food restaurant sales against convenience stores, we see a different relationship. This time the interaction forms more of a curve, with the peak occurring toward the middle of the sales per capita range. The sharp drop in the last two columns suggests an economic factor that this image can’t properly represent. If a county can spend that much at fast food restaurants, it’s likely that the county is already wealthier as a whole. Although we don’t dive into the relationship between income and obesity here, there have been studies linking the two in a way that might explain this.

Ultimately, looking at restaurants and food stores does provide some insight into how obesity can be affected, although it does not provide the complete picture, since other factors also contribute. I was surprised by what did and did not affect obesity rates by a significant amount. I would have expected grocery stores to have a bigger impact, and I did not expect spending at sit down restaurants to be tied to lower obesity. But that is likely tied to socioeconomic factors that we did not dive into here.

If you are interested in how these changes in restaurants and food stores can affect obesity, please take a look at my Predict Obesity Now app.

Nick Adamski

Data Scientist. Former Restaurant Manager. Future Food Frontiersman.