In 2015, the number of digital buyers was 1.46 billion. In 2021, the figure increased to 2.14 billion. This growing e-commerce trend inspires businesses to look for innovative solutions to client pains. They integrate AI/ML tools to offer smarter services.
Product matching in e-commerce has recently become a hot topic among retailers. They use a product matching system to ensure a great buying experience and take full advantage of selling online.
Some retailers provide unique offers to their clients to stand out. Others sell common products and work hard to stay competitive. In this article, we will talk about product matching in eCommerce. We’ll also explore the most important deep learning algorithms for making the right product matches.
What Is Product Matching?
A mobile-first approach to software development has changed the way people shop. Retailers and buyers operate in an advanced eCommerce environment. Online purchases, transactions, and order fulfillment are completed quickly and with little effort. As a part of the digitalization trend, eCommerce businesses use product matching.
Nowadays, people look through different e-commerce platforms to find the needed product at the best price. The most popular eCommerce platforms are Shopify, Magento, Squarespace, WooCommerce, etc. Magento, for example, powers nearly 12% of all eCommerce websites globally. This points to the platform’s convenience for both retailers and consumers.
Between 2017 and 2018, the number of Magento websites almost doubled, and the figure keeps growing today. There are now more than 250,000 active Magento websites worldwide. If you plan to build an eCommerce store and need help with Magento development, contact the Forbytes team, and we will gladly help.
Retailers use different eCommerce spaces to sell their products, and identical products are often offered by different retailers on the same platform. A user searches for a specific product and sees the same offer made by different sellers in one place. The consumer compares product price, attributes, and quality and then chooses the best offer.
The situation is analogous to a buyer visiting multiple physical stores in search of the most advantageous offer. The difference is that an online decision is made effortlessly and quickly, without the need to leave home. How does technology help a buyer make an informed choice based on product comparison? The answer is product matching.
How Sellers and Buyers Benefit from Product Matching in eCommerce
Product matching in e-commerce means using deep learning to present the same products offered by different sellers in one search result. Product matching solutions are important for both retailers and online shoppers.
For consumers: product matching is the opportunity to choose the best offer after comparing all available options. Suppose that a consumer wants to buy a table lamp. The main argument in favor of a specific offer will be the price. They go to a particular marketplace and start comparing the options.
It turns out that two sellers offer the same table lamp for $50. Meanwhile, another seller says “Buy the lamp for $53 and get a table clock for free!” The value for money looks better in the third seller’s offer. If a client needs a table clock, that option will be more valuable for them. To enable comparison, product matching puts the same offers into one search result and ensures flexibility and convenience for a client.
For a retailer: matching products can be used for developing a rational price policy and keeping a business competitive. When businesses compare product prices on different platforms, they learn more about their competitors and make products stand out. The special offer where the table lamp is sold with the table clock is an example of how a seller can differentiate.
Besides, product matching in e-commerce comes in handy when a retailer wants to make the right offers. To help clients find their product, businesses need to name it in a particular way, use images, and add the right product attributes. Product matching also helps to trace tendencies and learn from the competitors’ behavior.
Product Matching Models
Usually, product information on eCommerce platforms consists of a title, attributes, and an image. A product title is a brief text identifying the key information about a product. Product titles consist of a product name and its characteristics, for instance, “plaid shirt male”. Product attributes provide more details on the product and are usually stored as name-value pairs.
To standardize the way sellers present product attributes, they use categorization or structured tables. If we talk about the male plaid shirt, product attributes added on a website may include color and size, information about the fabric, and the manufacturer. Product images illustrate what a product looks like. Sellers tend to borrow the same product image from one another.
Borrowing (or sometimes stealing) an image from competitors makes it easier for a customer to detect identical options. However, this approach also reduces the chances that a client will pay attention to your offer in particular. So how can product matching be made reliable? We’ll discuss a few examples of product matching algorithms below.
The title similarity module is an ML-powered product matching solution. ML compares offers by quantifying the similarity of the titles. The technology can detect the same product even if the compared title strings differ significantly.
Suppose that several retailers make the same offer of an iPhone XS Max. This is what they may put in the title:
- iPhone XS Max;
- iPhone XS Max 6.46-inch;
- Apple iPhone XS Max;
- iPhone XS Max Black;
- iPhone XS Max 256 GB NEW.
As you see, the titles for the same product differ from seller to seller. Identifying identical products depends on the model you choose. In this case, to detect the same offers, the ML system should apply a dedicated title matching algorithm, whose steps are described below. The algorithm helps evaluate the degree of similarity and put identical products in the same search result.
The first step in the algorithm is preprocessing based on pointwise mutual information (PMI). Preprocessing allows ML to treat two tokens that frequently occur together, such as “XS” and “Max”, as a single entity. This, in turn, enables us to further compute the word-level embeddings. In the first layer, word-level embeddings are trained on the title data from the whole catalog. Training on the whole catalog helps the deep learning model correctly handle titles that were not part of the initial training data.
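The PMI-based preprocessing step can be illustrated with a minimal sketch. It assumes whitespace tokenization, a tiny hypothetical catalog, and an arbitrary threshold of 3.0; the function names `bigram_pmi` and `merge_collocations` are made up for this example and are not part of any library. PMI for a token pair (a, b) is log2(p(a, b) / (p(a) * p(b))).

```python
import math
from collections import Counter

def bigram_pmi(titles):
    """Return the PMI of every adjacent token pair seen in the titles."""
    unigrams = Counter()
    bigrams = Counter()
    for title in titles:
        tokens = title.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())
    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / total_bigrams
        p_a = unigrams[a] / total_unigrams
        p_b = unigrams[b] / total_unigrams
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores

def merge_collocations(title, scores, threshold):
    """Greedily join adjacent tokens whose PMI reaches the threshold."""
    tokens = title.lower().split()
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if len(pair) == 2 and scores.get(pair, float("-inf")) >= threshold:
            merged.append("_".join(pair))
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Hypothetical mini-catalog, used only for illustration.
titles = [
    "apple iphone xs max black",
    "iphone xs max 256 gb",
    "apple watch black",
    "samsung galaxy black 256 gb",
]
scores = bigram_pmi(titles)
print(merge_collocations("xs max black", scores, threshold=3.0))
```

Note that raw PMI tends to favor rare pairs, so a real pipeline would probably also require a minimum pair count before merging.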
Next, the concatenated padded titles are used to train a convolutional network and check if the title length is equal in every option. For this purpose, the skip-gram model can be utilized. And what if one title contains more or less word-level information than another? To make the model robust to this, we take a random title and pair it with a copy of the same title that randomly lacks some tokens. Adding such pairs to the training data as matched pairs expands the capability of the title similarity measure.
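The token-dropping idea can be sketched as follows. This is only an illustration under assumptions: the drop probability of 0.3, the label value 1 for "matched", and the helper names `drop_tokens` and `make_matched_pairs` are all hypothetical choices, not details given in this article.

```python
import random

def drop_tokens(title, drop_prob=0.3, rng=None):
    """Return a copy of the title with some tokens randomly removed."""
    rng = rng or random.Random()
    tokens = title.split()
    if not tokens:
        return title
    kept = [t for t in tokens if rng.random() >= drop_prob]
    if not kept:
        # Never return an empty title: keep one random token.
        kept = [rng.choice(tokens)]
    return " ".join(kept)

def make_matched_pairs(titles, n_pairs, seed=0):
    """Build (original, degraded, label) triples; label 1 means 'same product'."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        title = rng.choice(titles)
        pairs.append((title, drop_tokens(title, rng=rng), 1))
    return pairs

# Hypothetical titles, used only for illustration.
catalog = ["Apple iPhone XS Max 256 GB NEW", "iPhone XS Max Black"]
pairs = make_matched_pairs(catalog, n_pairs=5)
for original, degraded, label in pairs:
    print(original, "|", degraded, "|", label)
```

Fixing the seed keeps the generated pairs reproducible between training runs.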
Product similarity can also be identified by comparing prices. For instance, there is a range of the same products with approximately the same pricing. If one offer stands out from the rest, this may indicate that the product is different. The principle also works the other way around: offers with close prices are more likely to be the same product. To detect similar offers, the price distribution is analyzed. In case of similarity, products are displayed in the same range.
There are two data analysis algorithms used to detect price similarities. The first, price outlier detection, compares one price to a group of products with similar prices and tests whether it is noticeably higher or lower than the pricing of this product group. The second data science algorithm, clustering, helps understand the volume of similar products based on their price. The learned data can be applied as a feature in the overarching system of product matching algorithms.
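Both price checks can be sketched in a few lines. The article does not name a specific test or clustering method, so the choices here are assumptions: a modified z-score based on the median absolute deviation for outliers (with the commonly used cutoff of 3.5) and a simple gap-based grouping of sorted prices for clustering. The function names and the sample prices are hypothetical.

```python
from statistics import median

def price_outliers(prices, threshold=3.5):
    """Flag prices whose modified z-score (MAD-based) exceeds the threshold."""
    med = median(prices)
    mad = median(abs(p - med) for p in prices)
    if mad == 0:
        # At least half of the prices equal the median; flag anything that differs.
        return [p != med for p in prices]
    return [abs(0.6745 * (p - med) / mad) > threshold for p in prices]

def cluster_prices(prices, max_gap_ratio=0.2):
    """Group sorted prices, starting a new cluster after a large relative jump."""
    clusters = []
    for price in sorted(prices):
        if clusters and price - clusters[-1][-1] <= max_gap_ratio * clusters[-1][-1]:
            clusters[-1].append(price)
        else:
            clusters.append([price])
    return clusters

# Hypothetical prices for offers that are candidates for the same product.
prices = [49.0, 50.0, 50.0, 51.0, 53.0, 120.0]
print(price_outliers(prices))
print(cluster_prices(prices))
```

In this sample, only the 120.0 offer is flagged and ends up in its own cluster, which suggests it may be a different product or a bundle.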
Under the attribute similarity algorithm, a product match is based on the similarity of product attributes. These can be item size, brand, color, condition, model, etc. The technology launches data analysis algorithms and measures the level of discrepancy between products. In case of low discrepancy, the items are considered the same. A neural network can be used for extracting the product attributes.
Product attributes can fall into two categories: limited range values and endless values. Limited range values come from a fixed set of options. With the help of one-hot encoded vectors, these values can be transformed into a format that ML models can process. Also, we can apply a convolutional neural network, similar to the one in the title similarity algorithm. It will differ in the input (since there will be only one product title) and in the last layer (since it will be changed to SoftMax).
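As a small illustration of these two building blocks, the sketch below one-hot encodes a limited range attribute and applies a softmax to a list of raw class scores. The `CONDITIONS` vocabulary, the score values, and the decision to encode unknown values as an all-zero vector are assumptions made for the example.

```python
import math

# Hypothetical fixed vocabulary for the "condition" attribute.
CONDITIONS = ["new", "used", "refurbished"]

def one_hot(value, vocabulary):
    """Encode a value as a one-hot vector; unknown values map to all zeros."""
    vector = [0] * len(vocabulary)
    if value in vocabulary:
        vector[vocabulary.index(value)] = 1
    return vector

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print(one_hot("used", CONDITIONS))

# Hypothetical raw scores a classifier's last layer might produce for one title.
probs = softmax([2.0, 1.0, 0.1])
predicted = CONDITIONS[probs.index(max(probs))]
print(predicted)
```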
Meanwhile, endless values, such as brand names, do not have a fixed range. As the range grows, the accuracy of ML models takes a hit. To solve this problem, we apply attribute extraction in the form of sequence labeling. Under this scheme, every title is tokenized, and each token gets one of three labels. The labels mark the token as the first token of a brand name, an intermediate token of a brand name, or not a part of a brand name.
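The three-label scheme can be illustrated by generating training labels for a title whose brand is already known. The label names "B", "I", and "O" follow the common BIO convention and are an assumption here, as are the helper name `bio_labels` and the sample title. A trained sequence labeling model would then predict these labels for unseen titles.

```python
def bio_labels(title, brand):
    """Label each title token as B (brand start), I (brand continuation) or O."""
    tokens = title.split()
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    target = [t.lower() for t in brand.split()]
    n = len(target)
    for start in range(len(tokens) - n + 1):
        if n and lowered[start:start + n] == target:
            labels[start] = "B"
            for i in range(start + 1, start + n):
                labels[i] = "I"
            break  # label only the first occurrence of the brand
    return list(zip(tokens, labels))

# Hypothetical title and brand, used only for illustration.
labeled = bio_labels("New Calvin Klein plaid shirt", "Calvin Klein")
print(labeled)
```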
Image similarity is based on the same principle as title similarity. To detect the same products, the similarity of their images is quantified. The main challenge of image similarity occurs when there are not enough labeled data points on the product. Moreover, the images of the same product may differ in perspective, color temperature, color brightness, etc.
To reduce the volume of manual processing of image pairs, you can apply auxiliary taxonomy to training image-based models. This indirect method allows us to view a particular product in a chain of nodes in the taxonomy. For example, a product may belong to the Clothing > Woman > Blazers node.
Under the given product matching algorithm, we use the first layers of the network to extract image features and then compute the cosine similarity between them. It is better to apply several architectures. The more models you use, the more idiosyncrasies you’ll be able to handle in the future.
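The comparison step can be sketched as follows, assuming the feature vectors have already been extracted by some image model. The vectors below are made-up numbers, and averaging the scores of several models in `ensemble_similarity` is one possible way to combine architectures, not a method prescribed by this article.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)

def ensemble_similarity(features_a, features_b):
    """Average cosine similarity over feature vectors from several models."""
    scores = [cosine_similarity(u, v) for u, v in zip(features_a, features_b)]
    if not scores:
        raise ValueError("at least one pair of feature vectors is required")
    return sum(scores) / len(scores)

# Hypothetical features of two product images from two different models.
image_a = [[0.9, 0.1, 0.3], [0.2, 0.8]]
image_b = [[0.8, 0.2, 0.3], [0.25, 0.75]]
similarity = ensemble_similarity(image_a, image_b)
print(round(similarity, 3))
```

A score close to 1 suggests the two images show the same product; the cutoff for declaring a match would have to be tuned on labeled pairs.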
How Else Can Product Matching in eCommerce Be Used?
Apart from helping a client find the right product, ML-powered matching algorithms are useful in the following cases:
Making Relevant Product Recommendations
With product matching, businesses grow their sales by suggesting additional product options for their clients.
For instance, a buyer looks for a remote control for a particular device. They put the chosen item into their shopping cart, and the system suggests buying batteries. The product recommendation is relevant for the client, and they purchase both products. Instead of earning $50, a store earns $55. Multiply this difference by the number of your clients, and you’ll see how much your profit can increase.
Establishing Competitive Pricing
As mentioned above, the price comparison model brings advantages not only for a customer. E-commerce businesses also benefit from using deep learning algorithms in shaping their product pricing. Machine learning enables them to automate pricing analysis and track tendencies of how prices change over time.
With price intelligence, a business can increase profit by at least 9%. Manual solutions are not that effective because they require human-based research and manual extraction of product data. It slows down training data processing and reduces value. Moreover, if done manually, pricing analysis impedes business scalability because the process depends on time and resources.
Improving Product Listing
With deep learning in eCommerce, you can improve the way your products are listed. Technology-powered analytics allows you to learn more about client behavior and understand how to better rank products in the list.
You can take matched stores and analyze their competing listings. Algorithms trained on competing listings provide you with insights into the way products are described and titled. Analyzing different sections of identical products helps to detect gaps in presenting an item. For instance, detecting the right keywords may increase the chances that your offer will be ranked high in the general list. As a result, you can use these insights to increase conversion.
Detecting Copyright Infringement
Ecommerce businesses spend thousands of dollars on building a solid brand identity. They build e-commerce solutions, write unique content, make catchy product descriptions, and develop attractive designs. Surely, they do this to differentiate and stand out from the crowd. Making unique offers is of the utmost importance for retailers. This is why prompt detection of copyright infringement is crucial in e-commerce.
We can apply the image similarity model to detect cases when a company’s design is stolen by another business. To reinforce the results, the title similarity model can also be used. Using titles as input, we facilitate the search process. If you combine the title similarity model with product image data processing, you get the chance to promptly detect copyright infringement, even if the information presented along with the design significantly differs from the original.
ML allows for the creation of the right product match. With this ML capacity, eCommerce development is more consistent.
For a seller, it means a higher profit and a clear evaluation of their competitive power. With the help of machine learning algorithms, they prevent situations when a client is presented with an endless list of duplicated products. Sellers provide users with a chance to opt for the most advantageous offer. This pays them off with increased trust and client loyalty.
Also, ML helps to organize identical product offers reasonably in accordance with criteria. It turns e-commerce marketplaces into powerful, intuitive, and convenient spaces for doing business. Businesses get the chance to satisfy the needs of the most demanding clients.
If you want to integrate ML into your eCommerce solutions, contact Forbytes. Our dedicated ML professionals will gladly help you to reach your business goals and make the most of machine learning power.