Big data with a big vision: Joren van Severen joins our growing data team
Exciting news once again: following our recent breakthroughs in intelligent mood analysis and prediction, we are proud to welcome Joren van Severen to the Argus Labs team. Joren will be working as a data scientist, tightening the connection between our machine learning algorithms and our big data lambda architecture.
Joren received his Master of Science degree from Ghent University and is an expert in mathematics, statistics and computer science. An excellent programmer, Joren has won several international developer contests, illustrating his expertise in a variety of architectures and programming environments. His experience with probabilistic models for performance analysis in sports, and the development of big data analysis tools at Alcatel-Lucent, pushed him to become a machine learning adept.
Hi Joren, it is great having you at Argus Labs! Are you ready for an exciting new adventure?
JvS: Definitely! I can’t wait to start working on the next step to Artificial General Intelligence.
In the past, you worked on semantic reasoning for embedded devices such as smartphones. How does semantic reasoning complement statistical or probabilistic machine learning?
JvS: In fact, semantic reasoning complements machine learning as much as machine learning complements semantic reasoning. Semantic reasoning can help to enrich data sets with more meaningful features, which are then analyzed by machine learning algorithms. On the other hand, a lot of research also focuses on the use of machine learning methods towards the exploration and management of ontologies, including semantic reasoning. This is especially helpful when reasoning with uncertain or incomplete data.
You will be working as a data scientist at Argus Labs. As such, could you explain to me what a data scientist does?
JvS: A data scientist uses his expertise in machine learning and other related fields to develop algorithms that can extract knowledge from (usually big and unstructured) data. This knowledge is described by mathematical models that can in turn be applied to make predictions or to support decision making.
Data science: a hype or the future?
JvS: Both, actually. Obviously, data science will always exist to some degree and will become more important in the coming years. At some point, however, computers will have become intelligent enough to automate the role of the data scientist. In this sense, the job of a data scientist is both extremely interesting and a little masochistic. In the very near future, powerful frameworks and APIs will bring data science to the masses.
Big data is cool, but the true challenge still resides in learning from small data, where we hit the so-called ‘Curse of Dimensionality’. Is this curse related to black magic?
JvS: If only it were that simple; call every problem ‘black magic’, and avoid solving it. It may be called the ‘Curse of Dimensionality’, but it does not mean the end of the world, and there are definitely ways around it.
Simply put, one would expect the accuracy of machine learning models to keep improving as long as we keep feeding the algorithms extra ‘features’ that describe the problem at hand. For instance, recognizing a person’s face based on the distance between his eyes alone is much more difficult than accomplishing this task based on extra information such as the position of his mouth and nose. However, while this is true to some extent, the ‘Curse of Dimensionality’ states that machine learning models actually start deteriorating once the number of features becomes too large relative to the amount of training data.
The linked article eloquently explains why this is the case for classification, but in general a high dimensional model tends to overfit. This means that the model does wonders with the training data, but produces terrible results on unseen data. In fact, the model has learned the noise that is specific to its training set.
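To make the overfitting effect concrete, here is a minimal sketch in plain Python. It is not taken from the interview or from Argus Labs code: the synthetic data, the 1-nearest-neighbour classifier and the names `make_data`, `predict_1nn` and `experiment` are all assumptions chosen for illustration. Only one feature carries real signal; every other feature is pure noise. The classifier memorizes its training set, so training accuracy stays perfect, while accuracy on unseen data tends to drop as noise features are added.

```python
import random

def make_data(n, n_noise, rng):
    """Two classes; only feature 0 carries signal, the rest is pure noise."""
    rows = []
    for i in range(n):
        label = i % 2
        signal = rng.gauss(1.5 if label else -1.5, 1.0)
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        rows.append(([signal] + noise, label))
    return rows

def predict_1nn(train, x):
    """Label of the closest training point (squared Euclidean distance)."""
    best = min(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))
    return best[1]

def accuracy(train, data):
    """Fraction of points in `data` that the 1-NN model labels correctly."""
    hits = sum(predict_1nn(train, x) == y for x, y in data)
    return hits / len(data)

def experiment(n_noise, seed=0):
    """Return (training accuracy, unseen-data accuracy) for a given noise level."""
    rng = random.Random(seed)
    train = make_data(60, n_noise, rng)
    unseen = make_data(100, n_noise, rng)
    return accuracy(train, train), accuracy(train, unseen)

if __name__ == "__main__":
    for n_noise in (0, 10, 200):
        train_acc, unseen_acc = experiment(n_noise)
        print(f"{n_noise:>3} noise features: train={train_acc:.2f} unseen={unseen_acc:.2f}")
```

The gap between the two numbers is the symptom described above: the model looks flawless on the data it has seen because it has effectively stored that data, noise included.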
In order to avoid this curse, one can simply reduce the number of features. Another option is to generate variations of the training set and use these to decide the parameters of the model, or to combine the models trained on each variation.
One of the cool projects we work on at Argus Labs focuses on the relation between music and mood. If we want you to be in a good mood, what kind of music should we add to our playlist?
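Both remedies can be combined in a small sketch. The interview does not name a specific technique; k-fold cross-validation is used here as one common way of generating variations of the training set in order to decide a model parameter, in this case how many features to keep. The data, the classifier and the names `cv_accuracy` and `choose_feature_count` are hypothetical, and the code assumes the features are already ordered by relevance, which is a simplification.

```python
import random

def make_data(n, n_features, rng):
    """Two classes; only feature 0 is informative, the others are noise."""
    data = []
    for i in range(n):
        label = i % 2
        x = [rng.gauss(1.5 if label else -1.5, 1.0)]
        x += [rng.gauss(0.0, 1.0) for _ in range(n_features - 1)]
        data.append((x, label))
    return data

def predict_1nn(train, x, m):
    """Nearest-neighbour label using only the first m features."""
    best = min(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0][:m], x[:m])),
    )
    return best[1]

def cv_accuracy(data, m, k=5):
    """Mean held-out accuracy over k folds when keeping m features."""
    scores = []
    for fold in range(k):
        held_out = data[fold::k]  # every k-th point forms the validation fold
        train = [row for i, row in enumerate(data) if i % k != fold]
        hits = sum(predict_1nn(train, x, m) == y for x, y in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / k

def choose_feature_count(data, candidates, k=5):
    """Pick the candidate feature count with the best cross-validated accuracy."""
    scores = {m: cv_accuracy(data, m, k) for m in candidates}
    best = max(scores, key=scores.get)
    return best, scores

if __name__ == "__main__":
    data = make_data(60, 100, random.Random(1))
    best, scores = choose_feature_count(data, [1, 10, 100])
    print("cross-validated accuracy per feature count:", scores)
    print("feature count chosen:", best)
```

Because each fold is scored on points the model did not see during training, the score penalizes feature counts that only help the model memorize noise. Combining the models trained on each variation, as mentioned above, would be a separate step that this sketch does not cover.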
JvS: Classic Rock ‘n’ Roll! From AC/DC to Led Zeppelin and the Rolling Stones.
How would you introduce or describe yourself in a non-professional context?
JvS: Enthusiastic and friendly guy with a passion for technology.
As a true ninja, Joren will start his battle against the Curse of Dimensionality at Argus Labs on July 14th. If you would like to learn more about artificial general intelligence and big data processing, or if you are simply looking for an enthusiastic and friendly guy with a passion for technology, feel free to get in touch by email (firstname.lastname@example.org), or start Tweeting (@jorenvs)!