About: Nutrition-equivalent food breakdown

What is this?

You may sometimes be a little unclear about what you're putting into your body. These charts show the nutritional equivalents of a large collection of foods.


USDA Foodgroups

This data is from the USDA's nutritional information database. This database contains 7412 foods. The food groups are:

  1. Legumes and Legume Products
  2. Beef Products
  3. Soups, Sauces, and Gravies
  4. Fats and Oils
  5. Baked Products
  6. Meals, Entrees, and Sidedishes
  7. Fast Foods
  8. Beverages
  9. Sweets
  10. Spices and Herbs
  11. Vegetables and Vegetable Products
  12. Nut and Seed Products
  13. Pork Products
  14. Finfish and Shellfish Products
  15. Dairy and Egg Products
  16. Breakfast Cereals
  17. Ethnic Foods
  18. Baby Foods
  19. Poultry Products
  20. Cereal Grains and Pasta
  21. Snacks
  22. Lamb, Veal, and Game Products
  23. Sausages and Luncheon Meats
  24. Fruits and Fruit Juices

Latent Foodgroups

What if we want to discover good labels for foods, labels that best explain their nutritional information? Using the approach described in the Methodology section, we discovered two sets of nutritional "themes" in common foods: one with 24 themes and one with 10. The 24 themes are:
  1. Unsaturated vegetable oil (nuts and mostly unsaturated vegetable oils) (monounsaturated fatty acids, polyunsaturated fatty acids, beta-sitosterol, arginine, niacin, saturated fat, thiamine, copper)
  2. Fortified cereal (e.g., multigrain cheerios)
  3. Starch nutrients (Folate, starch, cystine, phenylalanine, serine, glutamic acid, thiamin, tryptophan, proline, pantothenic acid)
  4. Good meat nutrients (thiamin, calcium, iron, riboflavin, niacin, vitamin B6, phosphorus)
  5. Mustard oil (mono) (vitamin D, monounsaturated fat, vitamin A RAE, cholesterol, tocopherol)
  6. Flaxseed oil (mono, poly) (polyunsaturated fat, phytosterols, monounsaturated fat, vitamin K, saturated fat)
  7. Tangerines (beta cryptoxanthin, beta carotene, lutein, vitamin A IU, vitamin A RAE, fiber, vitamin C)
  8. Carbohydrates and thickeners (gelatin, cornstarch) (Carbohydrates, fiber, NaCl, Energy, Ash, phosphorus, total fat, selenium)
  9. Salt (NaCl, NaHCO3) (NaCl, water, maltose, ash, fiber, protein, potassium, calcium, vitamin A, vitamin D, fats)
  10. Fiber and minerals (foods: potatoes, wheat bran, tea, spices) (fiber, manganese, copper, magnesium)
  11. Liver (vitamins A, B12, D, copper)
  12. Industrial Coconut/Palm oil (Saturated fats, total fat, phytosterols, vitamin K, monounsaturated fat, vitamin E)
  13. Decaffeinated soft drinks (with fluoride) (vitamin C, water, fluoride, vitamin K, vitamin A)
  14. Desalted egg whites (lysine, methionine, histidine, glycine, alanine, leucine)
  15. Animal byproduct nutrients (cholesterol, vitamin B12, zinc, selenium)
  16. "Alcoholic / caffeinated drinks" (ethyl alcohol, caffeine, theobromine, lycopene, water)
  17. Greens (e.g., turnip, kale, cress) (lutein, beta carotene, vitamin K, galactose, vitamin A, fiber, folate)
  18. Cheese (cholesterol, retinol, calcium, saturated fat)
  19. Beets (beta tocopherol, vitamin E, betaine, phytosterols, gamma tocopherol)
  20. Cupuassu oil (undifferentiated, monounsaturated and saturated fats, cholesterol)
  21. Trans fats (hydrogenated and soybean oils)
  22. Raw sugars (sugar, lactose, sucrose, fiber)
  23. Carrots (beta carotene, fructose, glucose, vitamin A, maltose, fiber)
  24. Fish oil (cholesterol, vitamin D, unsaturated fat, saturated fat, vitamin B12)

Interestingly, when using 10 themes, the only vegetable food group to appear was the carrots/squash group. Other vegetables, such as broccoli, were combinations of other categories. There was also no dairy group and no fruit group; these appear to have been mixed up by the additives in cereals and sweetened beverages. Instead, the discovered food groups appear to be largely a result of manufactured foods and the manufacturing decisions that go into them (such as hydrogenated soybean oil vs. palm kernel and coconut oils). The 10 themes are:
  1. Lard
  2. Fortified cereal
  3. Lean meat
  4. Walrus liver
  5. Flavored water drinks
  6. Commodity shortening, (hydrogenated) soybean
  7. Carrots or Squash
  8. Industrial shortening, (hydrogenated) palm or coconut
  9. Gin
  10. Dried tea (and spices)


Methodology

We found these breakdowns using topic models. Topic models are a way of finding collections of themes in text documents. Here we treated foods as documents and nutritional information as word counts.

Generally, topic models can discover these themes automatically. To keep a common vocabulary, we chose to define these themes (formally, "topics") based on food group labels provided by the USDA. Discovering themes from the data is useful because the result is not necessarily based on preconceived notions of the major food groups; instead, it defines food groups, or "nutritional themes", that best explain the variance in the dataset. In statistical terms, this is a bit like applying principal component analysis.

To convert nutrients to words, we first normalized each nutrient to have a mean of 100 (unitless) across all foods, and then rounded each food's nutrient weights to the nearest integer.
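
The nutrient-to-word conversion described above can be sketched roughly as follows. This is only an illustration, not the code behind the tool: the function names, the toy food names and the nutrient amounts are all made up, and treating a missing nutrient as zero is an assumption.

```python
import math


def nutrients_to_counts(foods):
    """Turn raw nutrient amounts into integer "word counts".

    foods maps a food name to a dict of {nutrient name: amount}.
    Each nutrient is rescaled so that its mean over all foods is 100,
    then rounded to the nearest integer (halves round up).
    Assumption: a nutrient missing from a food counts as 0.
    """
    nutrient_names = sorted({name for row in foods.values() for name in row})
    counts = {food: {} for food in foods}
    for name in nutrient_names:
        values = [row.get(name, 0.0) for row in foods.values()]
        mean = sum(values) / len(values)
        if mean == 0:
            continue  # nutrient absent everywhere: it contributes no words
        scale = 100.0 / mean
        for food, row in foods.items():
            counts[food][name] = math.floor(row.get(name, 0.0) * scale + 0.5)
    return counts


def to_document(word_counts):
    """Expand one food's counts into a bag of words, one token per count."""
    document = []
    for name in sorted(word_counts):
        document.extend([name] * word_counts[name])
    return document


# Hypothetical toy data, not taken from the USDA database.
toy_foods = {
    "food_a": {"protein_g": 10.0, "sodium_mg": 300.0},
    "food_b": {"protein_g": 30.0, "sodium_mg": 100.0},
}

counts = nutrients_to_counts(toy_foods)
print(counts["food_a"])
print(len(to_document(counts["food_a"])))
```

Because every nutrient is rescaled to the same mean, a nutrient measured in milligrams does not dominate one measured in grams. The resulting counts (or the expanded documents) are what a topic model would take as input.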


Sean Gerrish