Feature Transformation

Feature Transformation: the Spatial Case. In a similar way, when working with spatial data, each observation (i.e., each row of the data matrix) is associated with a multi-dimensional location. In the planar case, for example, each location may be represented by two Cartesian coordinates (for simplicity, these are denoted as a single column S in Definition 2). The spatial equivalent of lagged data in the temporal case is known as flattened data, and consists of creating, whenever necessary and convenient, new variables that contain the values of the original ones at neighboring locations. We interpret this as another form of feature transformation, and we treat it similarly to the temporal case. Not only does this kind of spatial transformation entail the problem of establishing the distance, and the direction(s), at which one should look from each point, but also, different locations may be combined into single variables, as we did for different lags. An example of this application arises when studying the presence of a certain biological entity at each point of the area under study, at a certain moment in time: it may be the case that the variables whose influence we are assessing exercise such an influence from neighboring locations at earlier times. We call this transformation spatial selection with intervals and functions. Again, aiming at higher interpretability, both the area of influence and the ways in which the attributes exert their influence are limited. Not unlike the temporal case, but with higher degrees of variability, spatial transformations may be performed by searching over different shapes of neighboring areas, and combinations may use different functions. Although the principles are the same, in our experiments we limit ourselves to simple square areas and linear combinations.
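The square-area, linear-combination case above can be sketched as follows. This is a minimal illustration, not the original implementation: the variable values are assumed to lie on a regular grid, and the names `grid` and `flatten_square` are hypothetical.

```python
import numpy as np

# Hypothetical example: values of one variable on a regular 5x5 grid,
# one value per spatial location.
grid = np.arange(25, dtype=float).reshape(5, 5)

def flatten_square(grid, radius=1, weights=None):
    """Derive a new 'flattened' variable: for each interior cell,
    linearly combine the values in the square neighborhood of the
    given radius (here, a weighted mean by default)."""
    size = 2 * radius + 1
    if weights is None:
        weights = np.full((size, size), 1.0 / size**2)  # plain mean
    out = np.full(grid.shape, np.nan)  # border cells stay undefined
    for i in range(radius, grid.shape[0] - radius):
        for j in range(radius, grid.shape[1] - radius):
            window = grid[i - radius:i + radius + 1,
                          j - radius:j + radius + 1]
            out[i, j] = float((window * weights).sum())
    return out

neighbor_feature = flatten_square(grid)
# Interior cells now hold the mean of their 3x3 neighborhood,
# e.g. the centre cell (2, 2) of this grid averages to 12.0.
```

Different neighborhood shapes or combination functions would simply replace the square window and the weighted mean.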
Feature Transformation: the Nonlinear Case. Finally, feature transformations allow us to deal, on a limited basis, with nonlinear problems, which arise in systems in which the change of the output is not proportional to the change of the input. Nonlinear problems are of interest to the fields of engineering, biology, and physics, among many others, because most systems are inherently nonlinear in nature. Traditional methods for solving these problems generally fall within the field of nonlinear programming. But, to a certain extent, nonlinear transformations are feature transformations and can be treated as such. We limit ourselves to considering simple nonlinear transformations in which variables are simply raised to a certain power. Nonlinear transformations, however, can be combined with both lag selection and spatial selection, whenever necessary and convenient.
Feature Transformation: the Temporal Case. Traditionally, in machine learning, methods that handle temporal information are the same as those that handle time series. A time series is a series of data (in our case, a row of the matrix) labelled with a timestamp. In the literature, there are two main problems associated with time series: time series explanation and time series forecasting. These problems are usually associated with different contexts and are approached with different tools, which, however, share some common ideas. Among these, lagged models are the best known. In lagged models, data are systematically transformed by adding a delay, so that a traditional, propositional learning algorithm can then be applied; among the available packages for this purpose, we reference ▇▇▇▇’▇ timeseriesForecasting [13]. Lagged models are flexible by nature because they are not tied to a specific learning scheme, and their focus is on time series explanation. While explicit models can be used for forecasting, that is not their focus, considering that their forecasting horizons are limited to the maximum lag in the model. The key point is that lagged variables can be seen as a form of feature transformation, in which the correct lag (i.e., delay) to apply to a certain column is chosen (possibly, within a certain name). Given this, we could argue that feature selection is a particular form of lag selection. This observation can be further generalized by considering a combination of different, but temporally adjacent, lags of the same variable, driven by a linear function, for example. As another example, consider temporal data for an air pollution problem and assume that, among several variables, one denotes the wind speed at a certain location at a certain time. While it is clear that the wind speed influences the pollution concentration at that same location, the temporal aspects of such an influence may vary.
To take this aspect into account, we may consider a data transformation that combines different lags (e.g., the wind speed two hours before the considered moment, one hour before, and at the considered moment), assuming that they all have an influence. By combining such lagged variables into one, we may obtain a more informative, and very interpretable, data set to be used, for instance, in a regression problem. We call this type of transformation lag selection with intervals and functions.
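The wind-speed example above can be sketched as follows. The series values and the lag weights are hypothetical, chosen only to show how lags 0, 1, and 2 hours are combined into a single derived variable by a linear function.

```python
import numpy as np

# Hourly wind speed at one location (illustrative values).
wind = np.array([3.0, 4.0, 6.0, 5.0, 7.0])
# Lag -> coefficient of the linear combination (hypothetical weights).
weights = {0: 0.5, 1: 0.3, 2: 0.2}

def combine_lags(series, weights):
    """Combine lagged copies of a series into one derived variable.
    The first max-lag entries stay undefined (no history available)."""
    max_lag = max(weights)
    out = np.full(series.shape, np.nan)
    for t in range(max_lag, len(series)):
        out[t] = sum(w * series[t - lag] for lag, w in weights.items())
    return out

combined = combine_lags(wind, weights)
# e.g. at t = 2: 0.5*6.0 + 0.3*4.0 + 0.2*3.0 = 4.8
```

The derived column `combined` could then replace the three separate lagged columns in a regression problem, keeping the model compact and interpretable.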