Data Mining

My research interests are broad and span both the theoretical and the practical development of data mining algorithms.

For many years my research focused primarily on theoretical symbolic machine learning for data mining. I enjoyed the richness and elegance of the topic, but did not understand the true limitations of such methods in real-world problems. I have recently departed from my previous, and ultimately wrong, approach of trying to develop a general methodology that can solve any kind of problem, and now concentrate on a subset of exciting, interesting and important real-world applications.

The name "data mining" can be misleading because it suggests an analogy with gold or coal mining. Whereas the purpose of gold and coal mining is to extract gold and coal respectively, the goal of data mining is to extract not data but knowledge. Knowledge is a very important asset, which can make the difference between survival and annihilation, or success and failure, of individuals, organizations, and economic and political systems.

Extracting knowledge from data is very challenging and complex. It usually involves an inductive inference process, in which general hypotheses, or concepts, are generated through the analysis of a limited number of observations. Such generalizations might not be correct, since inductive inference, unlike deductive inference (e.g. the Aristotelian syllogism), does not allow one to assert the truth of its conclusions.
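As a toy illustration of this point, the Python sketch below induces a simple threshold rule from four labeled observations and then checks it against a fifth observation that was not used for learning. The data, the function names and the threshold rule itself are assumptions made up for this example; they are not one of the algorithms or datasets discussed on this page.

```python
# Toy example with hypothetical data: a rule induced from a few
# observations fits all of them, yet fails on an unseen case.

def induce_threshold(observations):
    """Induce the most specific rule of the form 'x >= t means positive'.

    t is the smallest positive example. Returns None if there are no
    positive examples or if the rule would contradict a negative example.
    """
    positives = [x for x, label in observations if label]
    negatives = [x for x, label in observations if not label]
    if not positives:
        return None
    t = min(positives)
    if any(x >= t for x in negatives):
        return None
    return t


def predict(threshold, x):
    """Apply the induced rule to a new value."""
    return x >= threshold


# Limited set of observations: (value, is_positive). Hypothetical values.
training = [(1.0, False), (2.0, False), (6.0, True), (8.0, True)]
threshold = induce_threshold(training)

# The hypothesis agrees with every observation it was built from...
consistent_on_training = all(
    predict(threshold, x) == label for x, label in training
)

# ...but a later observation (also hypothetical) contradicts it.
unseen = (4.5, True)
correct_on_unseen = predict(threshold, unseen[0]) == unseen[1]

print("induced threshold:", threshold)
print("consistent with training data:", consistent_on_training)
print("correct on unseen observation:", correct_on_unseen)
```

The induced rule is consistent with everything the learner has seen, which is all that induction can guarantee. A deductive argument, by contrast, would carry the truth of its premises over to its conclusion.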

Applying data mining techniques to real-world applications is both very rewarding and extremely complex, because the results are not only tables of numbers or equations but concrete concepts that can be used to solve important problems, such as mitigating hazard risks, finding patterns of improvement or worsening in patients' responses to a particular treatment, predicting the behavior of stocks, or understanding threats to homeland security. To achieve this, the learned knowledge must be presented in a form that is easy to interpret and understand.

 

Remote Sensing

Several times a day, satellites operated by different countries gather high-resolution images of the Earth, down to a few centimeters or less, in the infrared, optical and microwave spectral bands.

Very high-resolution remote sensing data are primarily aimed at applications related to national security; they are classified and not available to the average researcher. However, lower-resolution data (on the order of a few meters) can be used to study the Earth and its environment, track changes over time, and predict how such changes may affect future generations.

My interest in remote sensing started in 2004, when I analyzed global spatial and temporal patterns of precipitation. Since then I have analyzed several other parameters in depth, with a primary interest in surface temperature.

With my colleagues, I have analyzed this parameter and several others in relation to earthquakes, hurricanes, and seasonal and inter-annual phenomena such as the monsoon and El Niño. Most notably, I have found statistically significant changes before earthquakes, a result that has received a lot of attention because it could potentially give a better understanding of the related processes.

One of the major problems with remote sensing is that the ever-increasing number of satellites, equipped with ever more precise instruments, generates huge amounts of data on a daily basis. Analyzing this much data is a Herculean task, which benefits enormously from automatic data mining algorithms. No person can possibly go through such an enormous amount of data. It is possible, however, to analyze a synthesis of the data, particularly when it is in a form that is easy and practical to understand.

I am currently in charge of operating the antenna receiving station located on the GMU Fairfax Campus. We currently receive data from the NOAA polar series and the Chinese Fengyun 1 series. Our data are being used for hurricane studies, and are often used by local stations for their weather forecasts. The data can be freely downloaded at http://terra.cos.gmu.edu/antenna



Guido Cervone
Center for Earth Observing and Space Research,
George Mason University, MS 5C3, 4400 University Drive, Fairfax, VA 22044
703.993.1799