A Primer on Open Source AI Platforms

The technological ecosystem needed to enable AI has finally formed. And, as with a perfect storm, AI’s timeline and path are hard to predict, and many business owners don’t know whether to follow it closely and obsess over it or hope it passes them by altogether.

What makes up the ecosystem needed to bend the growth curve sharply upwards, as in the classic hockey stick analogy we all know so well? First, vast and rich data sets are forming rapidly. Second, big data requires hefty processing power and storage, both of which are becoming more cost-effective and accessible every day, even to small companies. Finally, expertise on how to deploy AI is becoming available, although demand for that talent currently appears to exceed supply.

Keep your eye primarily on the open-source AI platform players. Why? Most of the tech behemoths dominating the space today, such as Google, Amazon and Facebook, are already working on AI with open-source tools. Given the amount of brainpower and budget being poured into those tools, the open-source players are very likely to dominate.

If you determine that your business could accelerate growth by investing in AI technologies and talent, we recommend you take a look at the following market dominators.

1) TensorFlow: Born out of Google’s machine learning efforts, TensorFlow is the company’s second-generation machine learning system. Designed for numerical computation, the platform expresses computations as data flow graphs. It is released under the free Apache 2.0 license, so you can go gangbusters with an open-source platform that runs on anything from a desktop GPU to a mobile phone.

2) PredictionIO (now part of Einstein by Salesforce): Because PredictionIO harbored a community of more than 8,000 developers perfecting 400 AI apps, Salesforce couldn’t resist making the acquisition back in February of last year. The well-placed bet paid off: PredictionIO served as the brains behind Einstein, which brings predictive modeling (a marketer’s dream) to Salesforce’s market-dominating CRM.

3) DMTK (Distributed Machine Learning Toolkit) by Microsoft: Designed for big data applications, DMTK aims to make model training across multiple nodes easier and faster. At its core are a C++ SDK (Software Development Kit) for client-server architectures and LightLDA, a scalable algorithm designed for training large topic models. Other included algorithms help determine the relationship of words to one another.

4) H2O.ai: Used by more than 75,000 data scientists and 8,500 global organizations, H2O.ai powers the AI efforts of fintech, insurance and healthcare companies and was named by CB Insights as one of the top 100 AI companies. H2O.ai is reported to be flexible in that it is data agnostic, supports common database and file types, and features a friendly web UI alongside other interfaces. Lastly, the platform can handle massive amounts of data: users can train models on complete data sets, develop them in real time, and score newly modeled data against previous models to check prediction accuracy.

5) Deeplearning4j: As the name suggests, Deeplearning4j is a deep learning library for Java and the Java Virtual Machine (JVM). It is also distributed, and the project is led by a team of data scientists, semi-sentient robots and Java systems engineers. A wide GitHub community can offer the tips you need when you’re training a distributed deep-learning network to run alongside enterprise platforms like Hadoop or Apache Spark. Nothing too sexy here to show, sorry.

6) Caffe: Developed by Berkeley AI Research (BAIR) and a community of contributors, Caffe is able to process 60 million images a day on a single NVIDIA K40 GPU. It was designed to be fast and modular, and community users love it for that. They can also switch between training on a CPU and a GPU seamlessly and deploy to mobile devices for testing. Because the project has been forked by over 1,000 developers, there is a rich pool of models and code that users can tap into.
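
For readers curious what the "data flow graph" idea from the TensorFlow entry looks like in practice, here is a minimal sketch in plain Python. It is not TensorFlow's API; the names Node, constant, add, multiply and evaluate are hypothetical and exist only for this illustration. The point is simply that the calculation is first described as a graph of connected operations and only computed afterwards.

```python
# Illustrative only: a tiny data flow graph in plain Python.
# This is NOT TensorFlow's API. Node, constant, add, multiply and
# evaluate are hypothetical names invented for this sketch.


class Node:
    """One operation in the graph, linked to the nodes that feed it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op                  # "const", "add" or "mul"
        self.inputs = tuple(inputs)   # upstream nodes (edges of the graph)
        self.value = value            # only used by constant nodes


def constant(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", (a, b))


def multiply(a, b):
    return Node("mul", (a, b))


def evaluate(node):
    """Compute a node's result by first evaluating the nodes it depends on."""
    if node.op == "const":
        return node.value
    args = [evaluate(n) for n in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError(f"unknown op: {node.op}")


# Step 1: describe the computation (x * y) + y. Nothing is calculated yet.
x = constant(3.0)
y = constant(4.0)
graph = add(multiply(x, y), y)

# Step 2: run the graph to get a number out.
result = evaluate(graph)
print(result)
```

Separating "describe" from "run" is what lets a real platform decide where and how to execute each step, for example on a GPU rather than a CPU.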

While there are many options to choose from when selecting an AI platform, the solutions highlighted here are among the leaders in the open-source category. Companies looking to use AI to improve business processes, operations or customer service should take a first look at open-source software because of its inherent low cost, flexibility and development community. While open-source software may not make sense across the board at your company, its low cost certainly gives companies small and large an opportunity to put AI to the test.