This recent press release for FICO’s Model Builder for Big Data reminded me of the work I did last year on the Thomson Reuters Text Mining Credit Risk model. It’s nice to see that other organizations are using big data tools and unstructured text in credit analysis. From the release:
Model Builder for Big Data brings proven machine learning and statistical data mining to Big Data for the first time, enabling analysts to find the predictive signals hidden in huge and challenging data sources. Its state-of-the-art text mining capabilities, unique Semantic Scorecard formulation, and embedded Lucene and Tika indexing and extraction libraries provide powerful mining of text from a wide variety of document types and boost the predictive strength of its transparent, easily understood scoring formulas. Model Builder for Big Data also integrates Apache Hadoop, the open-source framework for scalable, reliable, distributed computing and storage, and works with Cloudera’s proven, enterprise-ready Hadoop distribution. Along with new support for the popular R language for statistical computing and graphics, Model Builder brings a breadth and depth of functionality for Big Data that is both scalable and cost-effective.