Thursday, December 29, 2016

Machine Learning and Risk Modeling

Machine Learning offers an interesting possibility to redefine the way risk modeling is currently done in large financial institutions ( banks, insurance companies ). Regulatory and product oriented risk calculations are a crucial part of financial organizations' activities ( probability of default for customers, organizations and countries, bank capitalization requirements, credit scores etc. ). Current practices are a mix of alchemy, crossed fingers and wishing for the best, some math, and a lack of backtesting to check how models are actually performing. Models take a long time to develop and execute, with results everybody pretends to believe in.

Machine Learning has the potential to completely redefine ( for the better ) the way risk modeling is done.


Existing resource and time consuming processes could be replaced with a new Machine Learning paradigm in which programs/models change far less frequently. Each new model iteration doesn't necessarily imply brand new model development ( or model tweaking ) and deployment. Getting a new, updated model would often just require re-running ( retraining ) the existing neural network model with the newly available data, as sketched below. The classic model development lifecycle is replaced with new techniques that must be learned ( developing and testing neural network models, determining hyperparameters, dealing with bias and variance i.e. preventing under/over fitting etc. ).
The regulatory aspect also needs to be taken care of, as regulators need to be on board with the proposed model changes ( for capitalization related risk calculations ).
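To make the "same model, new data" iteration concrete, here is a minimal sketch using Keras as one possible toolkit; the file names, data shapes and training settings are purely illustrative assumptions, not a description of any existing production setup.

```python
# Hypothetical model refresh: load the previously deployed network and continue
# training it on newly available data instead of redeveloping a model from scratch.
import numpy as np
from keras.models import load_model

# Previously approved/deployed probability-of-default model (hypothetical artifact).
model = load_model("pd_model.h5")

# Newly available observations, e.g. the latest quarter of obligor features and
# observed defaults (hypothetical shapes and values).
new_X = np.random.randn(10000, 25)
new_y = (np.random.rand(10000) > 0.97).astype(int)

# Continue training from the existing weights with the fresh data,
# then save the updated model as the next candidate for deployment.
model.fit(new_X, new_y, epochs=5, batch_size=256)
model.save("pd_model_updated.h5")
```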

While the new ( supervised ) Machine Learning based paradigm will not save us from outliers ( Black Swan events ), it will definitely be more accurate and easier to deploy and maintain than the existing one. Thus we think it is high time even for mainstream financial organizations to start establishing a foothold in the self-programming world.

Thursday, December 22, 2016

Forget About AI - Machine Learning Is What Matters

IT, like any other mature industry, shows less and less capacity for true and radical innovation. This is proven yet again by the latest wave of AI noise ( skyrocketing NIPS attendance, mediocre Salesforce Einstein software, crazy prices paid for companies repackaging old Machine Learning concepts ).

What is a typical ( or major ) financial institution to make of, and do about, the latest craze?

AI is a wide discipline with many, typically siloed i.e. disparate problem areas - it offers domain specific solutions to diverse, often unrelated sets of problems ( self-driving cars; recommendation systems; speech recognition ). It is also completely empirical ( result driven ), with no theoretical foundations or explanations of why artificial neural networks work the way they do, for example ( "But it works" - G. Hinton, 51:00 ).

We remember quite well the noises, notions and semi-flops of the past ( CASE, OODBMS, Y2K, Hadoop ). Even the Cloud has made limited direct inroads into the standard enterprise landscape. Cloud didn't become a mainstream replacement for on premise hardware and software, as the majority of mission critical corporate systems still keep data and run in house.

Oracle CEO Larry Ellison Bashes 'Cloud Computing' Hype ( please note Ellison has repented since this 2009 cloud bashing episode ).


Consequently we think that, once the AI smoke clears and the mirrors are gone, all that will be left is good old Machine Learning ( lucratively renamed Deep Learning ), which will hopefully gain some foothold in more forward thinking ( or more adventurous ) financial establishments.



Saturday, August 20, 2016

Financial Markets and Deep Learning Methods

Financial markets modeling is a quite imprecise, non-scientific discipline, resembling alchemy and other futile human endeavors. We know by now that it is next to impossible to fully model and completely predict how markets will behave. There are many factors affecting security prices and financial markets in general, with human reaction being the biggest unknown and one that is impossible to model and thus predict.
What we can strive to do is perhaps model smaller, relatively limited domains - an approach not completely dissimilar to the split between classical and quantum physics domains, for example.
Supervised learning gives us predictive powers, but is limited because we rely on history to predict the future. That all but invalidates it for this purpose, as financial markets history ( and future ) rhymes, but doesn't repeat.
Unsupervised learning, on the other hand, does not have the training set ( history ) problem, but provides a limited set of options regarding what we can actually do with it. Aside from running clustering algorithms ( k-means etc. ) to find groupings in the data, there isn't much in the proactive department that can be done with it.
A combination of supervised and unsupervised ( semi-supervised ) methods is one approach that could potentially be successful ( along the lines of ensemble learning i.e. combining multiple models into one ). There are dozens of algorithms already implemented in many frameworks and languages that can be combined to form a useful model.
We could, for example, run an unsupervised deep learning algorithm on massive volumes of raw data to automatically discover areas of interest ( clusters of data ), then perform further analysis via targeted supervised learning methods to find out how the data is correlated. This approach would give us a tactical advantage i.e. useful, actionable information.
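As an illustration of that cluster-then-supervise idea, here is a minimal sketch using scikit-learn ( just one convenient toolkit choice ); the feature matrix, labels and parameters are hypothetical placeholders rather than real market data.

```python
# Step 1: unsupervised pass to discover groupings; Step 2: supervised pass on top.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical feature matrix: one row per trading day, columns such as returns,
# volatility and volume changes across many instruments.
X = np.random.randn(5000, 40)
# Hypothetical label: e.g. whether some index rose over the following week.
y = (np.random.rand(5000) > 0.5).astype(int)

# Unsupervised: let k-means discover regimes/groupings in the raw data.
clusters = KMeans(n_clusters=5, random_state=0).fit_predict(X)

# Supervised: feed the discovered cluster id back in as an extra feature and
# train a classifier to learn how the groupings correlate with the outcome.
X_augmented = np.column_stack([X, clusters])
model = GradientBoostingClassifier().fit(X_augmented, y)

print(model.score(X_augmented, y))  # in-sample fit only; proper backtesting still needed
```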
The whole process can be automated and performed on massive volumes of data ( sample = ALL ) quite inexpensively. The latest wave of technology makes it possible to store and process EVERYTHING - all indexes, stock prices, derivatives, currencies, commodity prices - as much data as we can get or buy - store it cheaply on a Hadoop cluster, preprocess it using the Spark framework, then model with TensorFlow or Spark MLlib. It can all be done in the cloud, and it is all open source software.
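A minimal PySpark sketch of that store-everything / preprocess / model pipeline might look like the following; the HDFS path, column names and parameters are assumptions for illustration only.

```python
# Read raw market data from Hadoop, preprocess with Spark, model with Spark MLlib.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("market-data-pipeline").getOrCreate()

# Raw prices stored cheaply on a Hadoop cluster (hypothetical path and schema).
prices = spark.read.csv("hdfs:///market/prices", header=True, inferSchema=True)

# Preprocess with Spark: keep a few numeric columns and drop incomplete rows.
features = prices.select("close", "volume", "open_interest").na.drop()

# Assemble features and hand them to Spark MLlib for modeling.
assembled = VectorAssembler(
    inputCols=["close", "volume", "open_interest"],
    outputCol="features").transform(features)

model = KMeans(k=5, featuresCol="features").fit(assembled)
model.transform(assembled).groupBy("prediction").count().show()
```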
Amazon AWS even offers GPU instances to boost processing power ( Google Cloud doesn't offer GPUs yet ).
TensorFlow can take advantage of distributed GPUs, or any combination of CPU/GPUs.
The final result is an automated system that will react and recommend ( and possibly automatically act ) on new insights.
We are aware that HFT and systematic trading do something similar already, but our point of interest is not short term arbitrage ( which seems to be running out of steam anyway ) - it is to exploit deeper, longer lasting knowledge about market direction. This is not a rule based, hardcoded system that acts on predefined insights. It is a live system that automatically learns, gains insights and changes behavior. We could think of it as an attempt to create an AlphaGo or Deep Blue for financial markets.
Renaissance Technologies, Two Sigma and other hedge fund high fliers probably already utilize or are building similar systems. What is new is the commoditization of such an approach - suddenly almost everybody can do it. The open source nature of the latest advances in AI and Deep Learning, together with advances in the affordability and power of parallel processing, levels out the playing field, thus nullifying decades of incumbent advantage. The race is on to transplant the latest relevant Deep Learning advances from Silicon Valley to active segments of the financial industry. This could further stir the already shaken hedge fund industry.

Thursday, July 21, 2016

Deep Learning, TensorFlow and Hedge Funds

Deep Learning is in these days. It really looks like late 19th century electricity gold rush deja vu, with industry, academia and regular businesses all jumping in.
Some hedge funds are already heavyweight users of computer power and models ( systematic funds like Renaissance, Two Sigma etc. ).

The complexity and size of modern, globalized markets mandate the use of more advanced, automated data analysis methods. There are already indications ( visible if we observe how hedge funds fared during the Brexit turmoil, for example ) that systematic funds stand a better fighting chance in today's turbo markets.

The promise of Deep Learning is that a critical mass of data and bigger neural networks ( more layers, more computing power ) mean a qualitative change, and that a new level of modeling accuracy is achievable. General Deep Learning algorithms can be applied to domain specific use cases, relying primarily on massive amounts of data as the fuel for insights ( rather than on particular domain knowledge i.e. coming up with clever domain specific algorithms ).

Google's TensorFlow is a recently open sourced library that runs on heterogeneous ( CPU, GPU, FPGA ), distributed platforms. It can be ( and already is ) used for financial markets modeling. If history is any guide, TensorFlow will ignite a whole new industry and many products will have it as their architectural foundation.

TensorFlow makes it easy to apply complex analysis ( flexibly apply Deep Learning algorithms ) to multi dimensional data ( tensors ) and come up with relatively reliable predictions on where a market will be, based on markets that closed earlier, for example. Naturally many other ideas and hypotheses can be tested ( models can be trained and executed i.e. used for inference ) with great ease - and that is probably one of the most important TensorFlow advantages.
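As a toy illustration of the "predict one market from earlier closed markets" idea, here is a minimal graph-mode ( TensorFlow 1.x style ) sketch fitting a linear model on synthetic data; the data, shapes and hyperparameters are assumptions, not a real trading model.

```python
# Fit a linear model: today's index close predicted from three earlier-closing markets.
import numpy as np
import tensorflow as tf

# Synthetic training data standing in for historical closes (hypothetical).
earlier_closes = np.random.randn(1000, 3).astype(np.float32)
target_close = earlier_closes.dot(np.array([[0.5], [0.3], [0.2]], dtype=np.float32)) \
               + 0.01 * np.random.randn(1000, 1).astype(np.float32)

x = tf.placeholder(tf.float32, [None, 3])   # earlier-closing markets
y = tf.placeholder(tf.float32, [None, 1])   # market we want to predict

w = tf.Variable(tf.zeros([3, 1]))
b = tf.Variable(tf.zeros([1]))
prediction = tf.matmul(x, w) + b

loss = tf.reduce_mean(tf.square(prediction - y))
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(500):
        sess.run(train_step, feed_dict={x: earlier_closes, y: target_close})
    print(sess.run(loss, feed_dict={x: earlier_closes, y: target_close}))
```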

Since hedge funds deal with publicly available data sets, cloud infrastructure ( AWS, Google Cloud ) can be utilized to essentially rent a supercomputer and perform massive calculations on the cheap. TensorFlow can light up such a virtual supercomputer with just a few lines of code.
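For instance, pointing TensorFlow at a handful of rented cloud instances boils down to describing the cluster and placing operations on its nodes; the host names and job layout below are hypothetical, and in practice each instance runs its own copy of the script with its own task index.

```python
# Minimal sketch of distributed TensorFlow (1.x style) across rented machines.
import tensorflow as tf

# Describe the rented machines (hypothetical host names).
cluster = tf.train.ClusterSpec({
    "worker": ["worker0.internal:2222", "worker1.internal:2222"]
})

# Each instance starts a server for its own role; task_index=0 shown here.
server = tf.train.Server(cluster, job_name="worker", task_index=0)

# Pin pieces of the computation to specific machines in the cluster.
with tf.device("/job:worker/task:0"):
    a = tf.random_normal([2000, 2000])
with tf.device("/job:worker/task:1"):
    b = tf.matmul(a, a)

# The session connects to the cluster and runs the distributed graph.
with tf.Session(server.target) as sess:
    print(sess.run(b).shape)
```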







Sunday, April 10, 2016

Corporate Analytics and Heterogeneous CPU/GPU Clusters

A couple of relatively recent developments point to a new computing paradigm potentially making an impact on the corporate IT environment:

  • the rise of GPU capabilities and related frameworks ( Nvidia, AMD chips; CUDA, cuDNN, OpenCL software )
  • well publicised advances in Deep Learning ( A. Ng was an early proponent of GPUs in Deep Learning )
  • the release of Google TensorFlow software ( seamless deployment of Deep Learning algorithms in heterogeneous CPU/GPU/mobile environments )
Some of these developments might eventually trickle down to corporate analytics departments, as they push the boundaries of what is possible in the massive numeric calculation space ( especially in financial modeling/risk/stress testing, instrument pricing etc. ). Financial analytics is well positioned to take advantage of these advances, as many risk calculations, for example, are numerically intensive and often embarrassingly parallel ( matrix operations, scanning large volumes of data ).

A GPU's main attraction is the ability to perform instructions in parallel ( SIMD, SIMT ). Typically a GPU has hundreds of cores on a single chip, as opposed to just a few cores on a standard CPU. GPUs are affordable commodity processors produced in the millions for use in gaming computers.
Some GPU drawbacks: relatively small GPU memory; limited logic ( branching/control ) capabilities; and the need for data transfer between CPU and GPU memory in heterogeneous, mixed workload environments.







Nvidia CUDA is a proprietary GPGPU ( General Purpose GPU ) API. OpenCL is a framework for writing programs that execute in heterogeneous CPU/GPU environments.

Some elements of the strong activity in the Deep Learning area are directly applicable to financial industry computing needs. For example, Deep Learning algorithms often involve executing large matrix operations and iterative numerical optimization ( stochastic gradient descent ) - the same kind of intensive numerical work that is a common occurrence in financial industry calculations.

Deep Learning frameworks like Caffe ( single node ), Theano, as well as Google's TensorFlow are able to take advantage of both CPUs and GPUs ( they use CUDA or OpenCL for low level work ). The single node version of TensorFlow was open sourced in November 2015. The Spark ecosystem is developing frameworks like SparkNet and CaffeOnSpark that make it possible to execute Deep Learning algorithms in heterogeneous CPU/GPU environments.
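As a small illustration of the kind of CPU/GPU placement TensorFlow allows, the sketch below pins a large matrix multiplication ( a typical building block of risk calculations ) to a GPU and lets TensorFlow fall back to the CPU if none is available; the sizes and device strings are illustrative.

```python
# Explicit device placement of an embarrassingly parallel matrix operation.
import tensorflow as tf

with tf.device("/gpu:0"):              # use "/cpu:0" if no GPU is present
    a = tf.random_normal([4000, 4000])
    b = tf.random_normal([4000, 4000])
    c = tf.matmul(a, b)                # large matrix multiply, GPU-friendly

# allow_soft_placement lets TensorFlow fall back to the CPU when a GPU op is unavailable;
# log_device_placement shows which device each op actually ran on.
config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
with tf.Session(config=config) as sess:
    result = sess.run(c)
    print(result.shape)
```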

The core Spark project has announced that it might utilize OpenCL to better take advantage of GPU capabilities.

[Figure: speedup achieved with HeteroSpark on a mixed CPU/GPU cluster]

Last but not least, Facebook released the design for Big Sur - an 8 GPU card server with configurable PCIe paths for intra-node parallelism. Nvidia announced the DGX-1, a 170 teraflop, $130,000 monster - a supercomputer in a box. Such designs might denote a shift to configurations with a smaller number of more powerful servers.