

Solving the messiness of food data


Restaurant and Foodservice

New data sources, along with artistic product descriptions and a lack of classification standards, have dumped a mound of valuable but hard-to-interpret data on our doorstep. Now what?

We at Food Genius see food and think data. Whether we’re looking at a menu, a receipt or an elaborate product description from a supplier, we see food terminology and think data. For example, let’s look at three Thai/Asian salads: the Rad Thai Salad from SweetGreen, the Thai Chicken Salad from Panera, and the Premium Asian Salad with Crispy Chicken from McDonald’s. Salads are simple, right? Just greens, vegetables, a protein and dressing. Ah, but we all know life just isn’t that easy. The true insight is in the detail.

Together, SweetGreen, Panera and McDonald’s have more than 30 distinct salads on their core menus. To even begin understanding these items from a data standpoint, we need to cluster (or, as Food Genius calls it, “normalize”) them by type. In our case, the type is Thai/Asian salads.

Now that we have clustered these salads together, we can organize the data. We took a simple approach and focused on ingredient type. We didn’t concern ourselves with preparation methods, health claims or sensory terms (all of which are important but add complexity).
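As a rough illustration of this organizing step, the Python sketch below pivots ingredient lists into one row per ingredient type, with each item's ingredients filed under that row. The item names, their ingredients and the `INGREDIENT_TYPES` lookup are hypothetical placeholders, not the actual menu data or Food Genius's implementation.

```python
from collections import defaultdict

# Hypothetical lookup from ingredient to ingredient type (an assumption).
INGREDIENT_TYPES = {
    "romaine": "greens",
    "kale": "greens",
    "carrot": "vegetable",
    "edamame": "vegetable",
    "chicken": "protein",
    "shrimp": "protein",
    "peanut dressing": "dressing",
    "sesame dressing": "dressing",
}

# Hypothetical menu items; not the real salads discussed in the article.
MENU_ITEMS = {
    "Salad A": ["kale", "carrot", "shrimp", "peanut dressing"],
    "Salad B": ["romaine", "edamame", "chicken", "sesame dressing"],
}


def organize(items, type_lookup):
    """Return {ingredient_type: {item_name: [ingredients]}}."""
    table = defaultdict(dict)
    for name, ingredients in items.items():
        for ingredient in ingredients:
            # Unknown ingredients are kept under "unclassified", not dropped.
            kind = type_lookup.get(ingredient, "unclassified")
            table[kind].setdefault(name, []).append(ingredient)
    return dict(table)


if __name__ == "__main__":
    for kind, row in organize(MENU_ITEMS, INGREDIENT_TYPES).items():
        print(kind, row)
```

Filing unknown ingredients under "unclassified" keeps gaps in the lookup visible, which matters once the item count grows past what a person can check by eye.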

You could keep refining this example, but even at this level of interpretation, insights start to surface. For instance:

  • Key in on a few of the primary attributes of a salad: greens, vegetables, protein. You’ll immediately notice that Panera and McDonald’s are closely aligned, far more closely than SweetGreen and Panera.
  • You’ll notice that nuts and seeds are a standard of identity for this salad type, either as a topping or as a dressing flavor.
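One simple way to put a number on how "closely aligned" two items are is Jaccard similarity: the share of attributes two items have in common. The sketch below is an illustrative assumption rather than the measure Food Genius uses, and the three attribute sets are hypothetical placeholders, not the ingredients of the salads named above.

```python
from itertools import combinations


def jaccard(a, b):
    """Shared attributes divided by all attributes (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty items are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical attribute sets for three clustered items.
SALADS = {
    "Salad A": {"kale", "carrot", "shrimp", "peanut"},
    "Salad B": {"romaine", "edamame", "chicken", "almond"},
    "Salad C": {"romaine", "carrot", "chicken", "almond"},
}


def pairwise_similarity(items):
    """Return {(name_1, name_2): similarity} for every pair of items."""
    return {
        (x, y): jaccard(items[x], items[y])
        for x, y in combinations(items, 2)
    }


if __name__ == "__main__":
    for pair, score in pairwise_similarity(SALADS).items():
        print(pair, round(score, 2))
```

With real data, the highest-scoring pairs would be the items whose greens, vegetables and proteins overlap most.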

These types of analyses are nothing new. For decades, the approach to gathering this data was to send people running around the country collecting menus. They would then hand-enter the menu data based on some predetermined classification methodology. This method, besides being resource-intensive and tedious, also leaves you with little control over the data. What if you (a supplier, distributor or operator):

  • Want to look at not 5,000 menus but 50,000, and to see them not once a year but every month?
  • Need to understand POS or receipt data from multiple sources that have different naming and abbreviation conventions?
  • Have decided to tackle the herculean effort of adopting a new product classification system and need to test and iterate possible structures?

As foodservice has become more complex and data continues to become more available, a dynamic (technology-driven) approach to making sense of food data is necessary.

Food Genius’s approach is to utilize cloud computing and machine learning algorithms to deliver highly scalable and flexible data-driven business tools. As much as our technology, it’s our methodology for cleaning up the messiness of food data that is at our core. We normalize, organize and classify.

(Photos: Food Genius)

  1. Normalize – Normalization is grouping. The best way to group is with a bottom-up approach. Using our salad example, Food Genius begins normalizing by analyzing ingredients and identifying patterns within them. The common ingredients we’re looking at in our case are greens, toppings and a dressing. So now, we’ve programmed our algorithms to understand we’re looking at salads. We see that the toppings themselves and the flavor profile of the dressing are characteristic of an Asian or Thai salad. The saying “Show me, don’t tell me” comes to mind. Don’t tell me the name of an item, show me what’s in it. Then I can tell you what it is.
  2. Organize – As items are normalized, we know we’re looking at all the same ‘types’ of salads, so the organization of the data for these items becomes apparent. In our example, we see two of the three salads have an herb ingredient and all three have a protein. This is fairly easy for the human eye to pick up when looking at three menu items, but keep in mind that in the U.S. alone, we have over one million eating establishments that represent tens of millions of unique menu items.
  3. Classify – Through normalization and organization we have a solid foundation. With classification, we build it up. A classification system, specifically a hierarchical classification system, is what gives structure to once-unstructured data. Referring back to our example, these will be the row headers. A comprehensive classification system is what allows Food Genius to query the data for all salads containing an Asian-flavored dressing or all salads that include nut toppings. Accurate classification goes a long way in taking you from data to insight.
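The three steps above can be sketched in a few lines of Python. This is a deliberately simple rule-based stand-in, not Food Genius's machine learning pipeline. The `TAXONOMY` paths, the menu items and the rule that greens plus a topping plus a dressing means "salad" are all assumptions made for illustration.

```python
# Hypothetical hierarchical classification: each ingredient maps to a
# path that runs from a broad category to a more specific one.
TAXONOMY = {
    "romaine": ("greens", "lettuce"),
    "cabbage": ("greens", "cabbage"),
    "peanut": ("topping", "nut"),
    "sesame seed": ("topping", "seed"),
    "chicken": ("protein", "poultry"),
    "beef patty": ("protein", "beef"),
    "sesame ginger dressing": ("dressing", "asian"),
    "ranch dressing": ("dressing", "creamy"),
    "bun": ("bread", "roll"),
}

# Hypothetical menu items whose names say little about what they are.
MENU = {
    "House Special No. 4": ["romaine", "peanut", "chicken", "sesame ginger dressing"],
    "Garden Bowl": ["cabbage", "sesame seed", "ranch dressing"],
    "Classic Burger": ["bun", "beef patty"],
}


def classify(ingredients):
    """Map each known ingredient to its taxonomy path."""
    return [TAXONOMY[i] for i in ingredients if i in TAXONOMY]


def normalize(ingredients):
    """Group an item by what is in it, ignoring its name (bottom-up)."""
    paths = classify(ingredients)
    top_levels = {path[0] for path in paths}
    if {"greens", "topping", "dressing"} <= top_levels:
        if ("dressing", "asian") in paths:
            return "asian salad"
        return "salad"
    return "other"


def query(menu, prefix):
    """Names of items with an ingredient whose path starts with prefix."""
    n = len(prefix)
    return [
        name
        for name, ingredients in menu.items()
        if any(path[:n] == prefix for path in classify(ingredients))
    ]


if __name__ == "__main__":
    for name, ingredients in MENU.items():
        print(name, "->", normalize(ingredients))
    print(query(MENU, ("topping", "nut")))     # items with a nut topping
    print(query(MENU, ("dressing", "asian")))  # items with an Asian dressing
```

Because the paths are hierarchical, the same `query` function answers broad questions, such as `("dressing",)`, and narrow ones, such as `("dressing", "asian")`, without any change to the data.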

Food Genius cut its teeth working with restaurant menu data, and it’s still a critical data set for the foodservice industry to understand. However, what we see in the not-too-distant future is a tremendous amount of new data becoming available: customer data. With the proliferation of restaurant technologies, from guest analytics to online delivery services, data is now being generated that was previously unobtainable.

Jason Felger is CEO of Food Genius.

