I am grateful that the Editorial Board of the North American Actuarial Journal (NAAJ) has awarded the Annual Best Paper to me and my collaborator Dr. Xu at ISU. The commemorative plaque (see below) is beautiful, and it is a rare piece of good news during such a difficult time. By the way, a new paper with an improved algorithm will be published in the CAS’s Variance journal; please stay tuned if you are interested.
Datasets: Kaggle Datasets; UC Irvine Machine Learning Repository; LOBSTER limit order book samples; Cryptocurrency Futures Trading Data at BitMEX; IEX TOPS and DEEP.
R Cheatsheets: There are quite a few excellent cheatsheets provided by RStudio and other contributors; listed here are the ones most relevant to financial machine learning with R. Please refer to RStudio Cheatsheets for many more.
This question might sound a little silly: how can a sophisticated, well-built statistical or machine learning model be even worse than my guess? Well, it can happen, and it happens quite often, especially with financial data. You must have heard of “garbage in, garbage out”. Financial data is no different. What if you have been trying to find gold in pure sand? Because the signal-to-noise ratio of financial data is almost always very low, chances are that you have tried very hard but ended up with a model that is no better, or even worse, than your guess.
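To make the point concrete, here is a minimal sketch in base R. It is a simulation under stated assumptions, not an analysis of real market data: the target and all 20 predictors are pure noise, the sample sizes and seed are arbitrary, and the "guess" is simply a constant forecast of zero. It fits a linear model on the training set and compares its out-of-sample mean squared error with that of the naive guess.

```r
set.seed(1)  # arbitrary seed, used only for reproducibility

n_train <- 100   # assumed training sample size
n_test  <- 1000  # assumed test sample size
p       <- 20    # assumed number of (useless) predictors

# Pure noise: the target y is independent of every predictor.
x   <- matrix(rnorm((n_train + n_test) * p), ncol = p)
y   <- rnorm(n_train + n_test)
dat <- data.frame(y = y, x)

train <- dat[seq_len(n_train), ]
test  <- dat[-seq_len(n_train), ]

# "Sophisticated" model: linear regression on all predictors.
fit        <- lm(y ~ ., data = train)
pred_model <- predict(fit, newdata = test)

# Naive guess: always predict zero.
pred_guess <- rep(0, n_test)

# Out-of-sample mean squared errors of the two forecasts.
mse_model <- mean((test$y - pred_model)^2)
mse_guess <- mean((test$y - pred_guess)^2)
print(c(model = mse_model, guess = mse_guess))
```

With no signal in the data, the fitted coefficients can only capture noise from the training sample, so the model has no systematic way to beat the constant guess. Comparing the two printed errors is a quick sanity check worth running before trusting any model on real data.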
In this article, you will learn how to set up a research environment for modern machine learning techniques, using R, RStudio, Keras, TensorFlow, and an NVIDIA GPU. AWS EC2 users: This is probably the easiest approach. The following steps set up an RStudio Server on an AWS EC2 instance with a GPU, with TensorFlow and Keras pre-installed. If you use a non-Linux operating system, this might be the best choice for avoiding potential hassles.
Here I maintain a list of R tips and tricks that I find useful. Please comment below if you find something useful that is not listed. R tips and tricks: Avoid for loops in R as much as possible; use apply, sapply, etc., or built-in functions such as cumsum and aggregate instead. Dealing with large datasets: The function fread in the data.
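The tip on avoiding for loops can be sketched as follows. This is a small illustration in base R only; the data frame `returns` and its columns `ticker` and `ret` are hypothetical and made up for the example.

```r
# Hypothetical example data: a few daily returns for two tickers.
returns <- data.frame(
  ticker = c("A", "A", "A", "B", "B", "B"),
  ret    = c(0.01, -0.02, 0.03, 0.00, 0.01, -0.01)
)

# Loop version: running sum of returns, built element by element.
cum_loop <- numeric(nrow(returns))
total <- 0
for (i in seq_len(nrow(returns))) {
  total <- total + returns$ret[i]
  cum_loop[i] <- total
}

# Vectorized version: cumsum gives the same running sum in one call.
cum_vec <- cumsum(returns$ret)

# Group-wise mean without a loop, using aggregate.
mean_by_ticker <- aggregate(ret ~ ticker, data = returns, FUN = mean)

# Group-wise standard deviation without a loop, using split + sapply.
sd_by_ticker <- sapply(split(returns$ret, returns$ticker), sd)

print(all.equal(cum_loop, cum_vec))
print(mean_by_ticker)
print(sd_by_ticker)
```

The vectorized calls are shorter and usually faster on large vectors, because the iteration happens in compiled code rather than in the R interpreter.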
Linux-based operating systems such as Ubuntu 14.04/16.04/18.04, which I have been using, are great for scientific computing and for testing and using numerous open source software packages. I used to be a heavy Windows user, but I now use Windows less and less, and only when I have to. I haven’t studied Linux systematically; my main approach has been to google what I need. The following are some useful Linux tricks that a Linux newbie like me probably most wants to know.