Thoughts on why current machine learning alone cannot fully automate comprehensive trademark clearance searching.
A number of people and firms have applied machine learning techniques to the task of automating word mark clearance searching.
And others, including the global head of R&D at one of the leading law firms, have suggested that it’s a valuable R&D path to try now.
I’m not so sure.
We’ve tested search tools described as machine learning-driven and compared their results with those from our own search tool, which uses a human-designed model of similarity (US patent: https://patents.google.com/patent/US20080228485A1/en).
All are competent and useful for short text strings of a couple of words, particularly where both words are distinctive. However, the machine learning search tools are much less effective when one of the search strings runs to three, four or more words and distinctive terms are mixed in with descriptive ones.
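To make the mixed-string problem concrete, here is a minimal Python sketch. It is an illustration only, not how any of the tools mentioned here work: the function names, the example marks, the hand-supplied set of descriptive words and the 0.1 weight are all assumptions. It compares a plain word-overlap score with one that down-weights descriptive words.

```python
def tokens(mark: str) -> set[str]:
    """Lower-case a mark and split it into a set of words."""
    return set(mark.lower().split())


def plain_overlap(mark_a: str, mark_b: str) -> float:
    """Jaccard overlap that treats every word as equally important."""
    a, b = tokens(mark_a), tokens(mark_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_overlap(mark_a: str, mark_b: str, descriptive: set[str],
                     descriptive_weight: float = 0.1) -> float:
    """Jaccard overlap in which descriptive words count for much less.

    `descriptive` and `descriptive_weight` are illustrative assumptions;
    a real system would have to decide both per class of goods.
    """
    a, b = tokens(mark_a), tokens(mark_b)

    def weight(word: str) -> float:
        return descriptive_weight if word in descriptive else 1.0

    union_weight = sum(weight(w) for w in a | b)
    shared_weight = sum(weight(w) for w in a & b)
    return shared_weight / union_weight if union_weight else 0.0


# Hypothetical marks: they share only the descriptive words.
descriptive_words = {"pizza", "restaurant"}
a = "Golden Crust Pizza Restaurant"
b = "Silver Spoon Pizza Restaurant"
print(plain_overlap(a, b))                         # two of six words shared
print(weighted_overlap(a, b, descriptive_words))   # far lower once weighted
```

In this made-up pair the two marks share only descriptive words, so the plain score overstates their similarity. The weighted score drops sharply, but only because the descriptive words were supplied by hand.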
If, as a mental model, one thinks of machine learning as automated statistics, then to be effective it must have enough good-quality training data to find statistical significance even in the less frequent, so-called edge cases: those that appear towards the tails of a bell curve distribution.
And from that data, the ML needs to acquire a model of which words are distinctive and which are descriptive or generic. That, of course, varies with the class of goods and services for which the trademark is being applied. “Pizza” is generic within class 30 (ready-to-cook pizzas) and class 43 (restaurants), but would be distinctive in all other classes.
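The class dependence can be written down as a simple lookup. The Python sketch below is a toy under stated assumptions: the table name and function names are hypothetical, and the only entry is the pizza example from the paragraph above. It is not a description of any real search tool.

```python
from __future__ import annotations

# Hypothetical table: word -> classes in which it is generic or descriptive.
# Only the "pizza" entry comes from the text; a real table would be far larger.
GENERIC_IN_CLASSES: dict[str, set[int]] = {
    "pizza": {30, 43},
}


def is_distinctive(word: str, trademark_class: int) -> bool:
    """True unless the word is listed as generic for this class."""
    return trademark_class not in GENERIC_IN_CLASSES.get(word.lower(), set())


def distinctive_terms(mark: str, trademark_class: int) -> list[str]:
    """Return the words of a mark that are distinctive in the given class."""
    return [w for w in mark.split() if is_distinctive(w, trademark_class)]


# The same hypothetical mark, searched in two different classes.
print(distinctive_terms("Bella Pizza", 43))  # restaurants: "Pizza" drops out
print(distinctive_terms("Bella Pizza", 25))  # another class: both words kept
```

The lookup itself is trivial. The hard part is filling the table correctly for every word and class.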
Does the ML get a statistically significant ‘lock’ on the contexts in which pizza is descriptive or generic, and those in which it is distinctive? Likewise for the tens of thousands of other words that appear on any trademark register of size. I’m not sure it does or can.
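A rough way to see why sparse data makes that ‘lock’ hard is to look at how wide the uncertainty is when a proportion is estimated from only a few observations. The Python sketch below uses the standard Wilson score interval. The counts are invented for illustration, and treating each observation as an independent ‘descriptive or not’ judgement is a simplifying assumption.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Invented counts: the word was judged descriptive in 80% of observed uses,
# once from 5 observations in a class and once from 5000.
sparse = wilson_interval(4, 5)
dense = wilson_interval(4000, 5000)
print(sparse)  # wide interval that still includes 0.5
print(dense)   # narrow interval close to 0.8
```

With five observations the interval still straddles 50%, so the data cannot say whether the word is mostly descriptive in that class. With five thousand it can. Most word and class combinations on a register are far closer to the first case than the second.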
Determining this is in fact relatively straightforward. But it requires a different, non-ML technical approach (which we use in our automated search), and one that is not readily combined with a machine learning model, though I think the excellent Markify.com search tool might combine the two.
Could ML eventually develop so that it can solve this?
Perhaps. I think it’s wrong to bet against human ingenuity and innovation. (Indeed, thanks to progress in batteries and motors, we may even be witnessing the advent of flying cars, that staple of science fiction and Hollywood cartoons for 70 years.)
Solving the pizza problem, and the tens of thousands of other distinctive/descriptive questions, will need advances in the learning algorithms, such that they can tease a statistically significant signal from sparse, noisy, edge-case data.
OpenAI, the not-for-profit research company founded by innovation luminaries, talks of meaningful progress towards, or achievement of, a human level of automated intelligence by 2030.
We shall see. I for one hope we do get to fully automated, lawyer-free clearance searching. There are much better, more interesting and more socially beneficial things for lawyers to be doing than trawling through trademark records. That is why we developed our search tool 15 years ago; it has since been used for more than a million searches. But it does not quite replace the final review by attorneys.