Myrrix is a complete, real-time, scalable clustering and recommender system, evolved from Apache Mahout™. Just as we take for granted easy access to powerful, economical storage and computing today, Myrrix will let you take for granted easy access to large-scale "Big Learning" from data.
What is a recommender engine?
You have already met one. When you buy books online and the e-commerce site suggests other books you may like, or when your movie rental site e-mails you about new movies it thinks you will want to see, or when advertisers choose to send a new promotion to you in particular based on your past ad clicks, you are probably interacting with a recommender engine. If you've heard of "collaborative filtering", that is, for our purposes, what recommender engines do.
Historically, recommender engines have been used only to connect users to new products. Amazon.com deploys a well-known recommender engine that can learn from your past purchases, clicks, comments and more. It helps Amazon.com understand your tastes and preferences, and promote to you the products you are most likely to want to buy. The same technology is used to identify natural groups, or clusters, of items or of users.
Not just for books anymore
But recommenders and clustering are not specific to people and books. This family of so-called unsupervised machine learning techniques excels at inferring new associations between things, given a set of existing associations. Those associations may be from users to books, formed by viewing, reading or buying, but could as easily be associations from people to people, from social profiles to web pages, from tags to documents, from companies to employees, or from advertisers to consumers.
Anywhere you can quantify associations from one type of thing to another, recommender engines can help find new, as-yet-unknown associations, or natural groupings.
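To make the idea concrete, here is a minimal sketch in Python of inferring new associations from existing ones. It uses simple item co-occurrence counting, chosen only for illustration; it is not the algorithm Myrrix or Mahout uses, and the function name `recommend` and the toy user and book data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def recommend(associations, user, top_n=3):
    """Suggest items for `user` that co-occur with items the user already has.

    Illustrative co-occurrence scoring only; not Myrrix's or Mahout's method.
    """
    # Group the known items by user.
    items_by_user = defaultdict(set)
    for u, item in associations:
        items_by_user[u].add(item)

    # Count how many users each pair of items has in common.
    cooccurrence = defaultdict(int)
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            cooccurrence[(a, b)] += 1
            cooccurrence[(b, a)] += 1

    # Score each item the user lacks by its co-occurrence with items they have.
    owned = items_by_user.get(user, set())
    scores = defaultdict(int)
    for (a, b), count in cooccurrence.items():
        if a in owned and b not in owned:
            scores[b] += count

    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


# Hypothetical toy data: (user, book) associations.
associations = [
    ("alice", "dune"), ("alice", "neuromancer"),
    ("bob", "dune"), ("bob", "neuromancer"), ("bob", "foundation"),
    ("carol", "dune"), ("carol", "foundation"),
    ("dave", "neuromancer"), ("dave", "snow crash"),
]

print(recommend(associations, "alice"))  # ['foundation', 'snow crash']
```

Nothing in the sketch depends on the things being users and books. The same pairs could just as well be tags and documents, or advertisers and consumers.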
Do you need learning?
Online retailers were among the first to deploy learning techniques like recommender engines, and these have become a vital part of their business. Good product recommendations can drive more sales. Content sites like YouTube and Netflix drive more viewings by recommending videos intelligently; music sites and blogs have the same need, as more content consumption means more advertising revenue. Social networks and social media companies use recommender engines to recommend new person-to-person links.
But recommenders and clustering are more general tools. Clustering can help segment a customer base according to behavior and tastes, which is an essential function of any 21st-century retailer, online or offline. Advertising networks can use recommender engines to target ads more intelligently by learning from past clicks. In fact, anyone with data that looks like associations between some things and other things -- anyone with lots of this data -- can likely find a need for a recommender engine. That probably includes you, if you're reading this.
Tackling Big Data
Recommender engines and clustering are not new, but until recently, like many aspects of machine learning, they have been accessible only to big companies that employ the rocket scientists who can understand and implement the algorithms. These techniques thrive on data: the more the better. Today, companies are both blessed and cursed with increasingly massive data sets to learn from. This makes recommender engine and clustering technology more valuable than ever. Yet, at the same time, it has become more difficult to implement and deploy these technologies efficiently in the face of so much more input data.
Since 2008, the Apache Mahout open source project has brought together some of those rocket scientists to implement and popularize key machine learning algorithms in a way that can handle "Big Data". Maybe you're using it already. Today Mahout is the de facto tool for large-scale machine learning, built on top of Hadoop and MapReduce. It has produced efficient, parallelized implementations of many difficult algorithms. However, it is still a project suited only to the adventurous developer. It has produced great raw materials -- code -- if not yet a product.
Myrrix is your Big Learning engine
Myrrix packages, completes and extends the clustering and recommender engine components developed within Mahout to be that product for you. Myrrix is from the author of Mahout's recommender infrastructure and its primary committer. Myrrix is an evolution of Mahout, made into a platform, ready to provide recommendations to your applications.
Continue to learn about the design of Myrrix.