## Abstract

In this study, an experimental system entailing ciprofloxacin hydrochloride (CIP) removal from aqueous solution is modeled by using artificial neural networks (ANNs). For modeling of CIP removal from aqueous solution using bentonite and activated carbon, we utilized the combination of output-dependent data scaling (ODDS) with ANN, and the combination of ODDS with multivariable linear regression model (MVLR). The ANN model normalized via ODDS performs better in comparison with the ANN model scaled via standard normalization. Four distinct hybrid models, ANN with standard normalization, ANN with ODDS, MVLR with standard normalization, and MVLR with ODDS, were also applied. We observed that ANN and MVLR estimations’ consistency, accuracy ratios and model performances increase as a result of pre-processing with ODDS.

- adsorption
- artificial neural network (ANN)
- bentonite and activated carbon
- ciprofloxacin hydrochloride (CIP)
- multivariable linear regression model (MVLR)
- output-dependent data scaling (ODDS)

## ABBREVIATIONS

- CIP: ciprofloxacin hydrochloride
- ANN: artificial neural networks
- ODDS: output-dependent data scaling
- MVLR: multivariable linear regression
- FQs: fluoroquinolones
- *R*: correlation coefficient
- *R*^{2}: coefficient of determination
- MSE: mean square error
- RMSE: root mean square error
- MAE: mean absolute error
- MAPE: mean absolute percentage error
- MEDAE: median absolute error
- AARE: average absolute relative error
- *q*_{e}: CIP amount adsorbed at equilibrium
- *C*_{e}: residual concentration at equilibrium
- *C*_{0}: initial concentration of CIP
- *V*: volume of CIP solution
- M: mass of adsorbent used
- A: experiment dataset comprising a matrix of [m × n]
- m and n: size of experiment dataset ‘A’
- *x*: experiment data
- new input–output parameters for ODDS: the quantity defined by Equation (2)
- *Y*_{mn}: output variable of set A
- *X*_{mj}: input variable of A(m,j)
- *X*_{mn}: input variable of set A
- *j*: input variable number
- *x*_{n}: normalized experiment dataset
- *x*_{min}: minimum value of the variable
- *x*_{max}: maximum value of the variable
- *SS*_{e}: residual sum of squares
- *SS*_{t}: total sum of squares
- mea: measured value (its mean: mean measured value)
- pre: prediction value (its mean: mean prediction value)
- N: experiment number
- P(%): performance increase
- *Y*_{1}: performance achieved via standard normalization
- *Y*_{2}: performance achieved via ODDS
- *Y*: multiple linear regression model
- *β*_{0}: intercept point
- *β*_{1}…*β*_{p}: regression coefficients
- *X*_{1}…*X*_{p}: input variable
- *σ*: residual standard deviation

## INTRODUCTION

Fluoroquinolones (FQs) are among the most important antibacterial agents used in human and veterinary medicine. Because FQs adsorb strongly, sewage sludge is usually rich in them and carries them into the terrestrial environment when the sludge is disposed of on agricultural land. Among the FQs, ciprofloxacin hydrochloride (CIP) was most frequently detected in waste and surface waters, with concentrations reaching several hundred ng/L (Wu *et al.* 2010). CIP is highly soluble in water in both high and low pH environments and exhibits higher levels of stability in soil (Li *et al.* 2011).

The most common method used for CIP removal in activated sludge processes is adsorption. The removal efficiency in this process is 52.8–90.8% (Li & Zhang 2010). Various mechanisms including photodegradation, adsorption onto particles, and biotransformation conditionally influence CIP's fate in the environment. Both adsorption and photodegradation strongly influence CIP's fate in aquatic systems. CIP rapidly photodegraded (t_{1/2} ∼ 1.5 h) with numerous photodegradation products observed while particulate organic carbon levels remain low (Cardoza *et al.* 2005). Biodegradation, or removal of CIP from aqueous solution, is a difficult problem. Among the techniques used for removal of CIP from wastewater, adsorption stands out as an effective process of choice, because of its affordability and ease of implementation. To date, several reports related to the adsorption of CIP to natural materials or components of natural materials have been published, using activated carbon (Carabineiro *et al.* 2012), activated charcoal and talc (Ibezim *et al.* 1999), montmorillonite (Wu *et al.* 2010; Qinfeng *et al.* 2012), soil (Carrasquillo *et al.* 2008; Vasudevan *et al.* 2009; Conkle *et al.* 2010), 2:1 dioctahedral clay minerals (Wang *et al.* 2011), kaolinite (Li *et al.* 2011), modified coal fly ash (Zhang *et al.* 2011), aerobically digested biosolid (Wu *et al.* 2009), and sawdust (Bajpai *et al.* 2012).

The past decade saw a growing interest in applying neural networks to many fields of science and engineering, and neural networks are now applied to a multitude of tasks, such as adsorption (Yurtsever *et al.* 2015; Mandal *et al.* 2015), pattern recognition (Azad *et al.* 2014), function approximation (Zainuddin & Pauline 2008), dynamical modeling, time series forecasting, and data mining (Ye *et al.* 2007). A glance at the existing literature reveals that researchers employed linear and non-linear models to model their experimental work with the help of one or more artificial intelligence techniques, and to enable comparisons. This approach led to more reliable and consistent results (Elkhoudary *et al.* 2014; Ghaedi *et al.* 2014, 2015; Singh *et al.* 2009, 2013).

Sometimes called connectionist models, parallel distributed processing models, and neuromorphic systems (Zhang 2000), an artificial neural network (ANN) is essentially a system of data processing based on the structure of a biological neural system. Numerous successful applications of ANNs were implemented to solve environmental problems, as they offer a reliable and robust way of capturing non-linear relationships between variables (multi-input/output) in complex systems (Turan *et al.* 2011).

In this study, an experimental system entailing CIP removal from aqueous solution is modeled by using ANNs and multivariable linear regression (MVLR). The results of the experiments were folded into estimation functions utilizing MVLR and ANN models. The estimation models allow simple and quick calculation of adsorption levels, without engaging in experiments, or where experiments are not cost-effective.

For modeling of CIP removal from aqueous solution using bentonite and activated carbon, we utilized the combination of output-dependent data scaling (ODDS) (Polat & Durduran 2012) with ANN, and the combination of ODDS with the MVLR model. These models were then applied to bentonite and activated carbon adsorption. This served to reveal the performance impact of the standard normalization process and the ODDS method applied in the estimation models. The models’ performance for each adsorbent was evaluated with statistical analysis methods including the correlation coefficient (*R*), coefficient of determination (*R*^{2}), mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), median absolute error (MEDAE) and average absolute relative error (AARE).

## MATERIALS AND METHODS

### Materials

We received CIP with a purity level higher than 99% from SANOVEL Pharmaceutical Industry in Turkey. The limit of FQ acid is less than 0.2% (w/w). Particle size distribution on 20 American Society for Testing and Materials is 0.24%. The stock CIP solution has a concentration of 50 mg/L. It was then filtered through filters with a mesh size of 0.45 μm. The initial concentration of CIP solutions (20–45 mg/L) was achieved by diluting the respective stock solution with deionized water. Bentonite, in turn, was obtained from Çankırı-Turkey. It has a particle size of less than 0.6 mm. The activated carbon used in the set-up has a particle size of 0.5–1 mm.

### Experimental procedure

The adsorption of CIP on bentonite and activated carbon was measured using batch equilibrium experiments at room temperature (23 °C). For each experiment with initial concentration of CIP, a different amount of adsorbent was mixed with 50 mL CIP solution, at a different pH level. The mixture was then shaken for different agitation periods, at different rates.

A large number of adsorption experiments were performed to account for a wide range of conditions, including different pH levels (2–10), adsorbent dose (0.125–0.7 g/L), contact time (5–30 minutes), initial CIP concentrations (20–45 mg/L) and agitation rate (100–300 rpm).

After mixing, the supernatants were separated through filters with a mesh size of 0.45 μm. The equilibrium concentrations of CIP in the filtered residual solutions were determined using a UV spectrometer (Hach-Lange DR 5000). The wavelength used to analyze the concentration of CIP was 275 nm. The calibration curve was established with 10 standards between 1 and 45 mg/L, with an *R*^{2} higher than 99%.

The adsorption capacities were calculated with reference to a mass balance of CIP in the solutions and were represented in units of milligrams of CIP per gram of adsorbent. The adsorption capacities at equilibrium were calculated according to Equation (1):

$$q_e = \frac{(C_0 - C_e)\,V}{m} \tag{1}$$

where *q*_{e} and *C*_{e} are the CIP amount adsorbed (mg/g) and the residual concentration (mg/L) at equilibrium, respectively; *C*_{0} is the initial concentration of CIP (mg/L); *V* and *m* are the volume of CIP solution (L) and the mass of adsorbent used (g), respectively.
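
As a worked illustration of the mass balance in Equation (1), the short Python sketch below computes the adsorption capacity for a single batch experiment. The function name and the numerical values are assumptions chosen for illustration; only the 50 mL solution volume is taken from the experimental procedure, and the values are not measured data from this study.

```python
def adsorption_capacity(c0_mg_per_l, ce_mg_per_l, volume_l, mass_g):
    """Return q_e (mg/g) from the mass balance q_e = (C_0 - C_e) * V / m."""
    if mass_g <= 0:
        raise ValueError("adsorbent mass must be positive")
    return (c0_mg_per_l - ce_mg_per_l) * volume_l / mass_g


# Hypothetical example: 50 mL of a 20 mg/L CIP solution, 0.025 g of adsorbent
# (a dose of 0.5 g/L) and an assumed residual concentration of 5 mg/L.
q_e = adsorption_capacity(c0_mg_per_l=20.0, ce_mg_per_l=5.0,
                          volume_l=0.050, mass_g=0.025)
print(f"q_e = {q_e:.1f} mg/g")  # q_e = 30.0 mg/g
```

With concentrations in mg/L, volume in L and mass in g, the result is in mg of CIP per gram of adsorbent, the unit used throughout the study.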

### Modeling approach

#### Experiment dataset and statistical analysis

Data obtained with respect to the variables of CIP adsorption on bentonite and activated carbon measured in the experiment are presented in Tables 1 and 2, respectively.

In this study, agitation rate (rpm), contact time (min), adsorbent dose (g/L), pH and initial concentration (mg/L) are fed into the ANN as inputs; *q*_{e} is the output of the ANN.

#### ODDS and normalization

Normalizing the data by pre-processing is quite important for managing calculation load and improving ANN success ratio when estimation modeling is based on ANN. The conventional way of doing this in the literature is by taking the maximum and minimum values of the dataset and normalizing them within the 0.1–0.9 interval.

The working of ODDS method depends on the relationship between the attributes constituting the dataset and the output variable of the dataset. The first step in achieving this is to reduce the calculation costs of classification and prediction algorithms by placing each parameter from the dataset in an interval calculated using the ODDS method. Following the normalization step during which the dataset was pre-processed using ODDS, the ANN and MVLR estimation models were formed.

If ‘A’ is defined as an experiment dataset comprising a matrix of [m × n], *Y*_{mn} will be the output variable of set A, whereas *X*_{mn} will be the input variable of set A. Accordingly, Equation (2) was defined for ODDS; thereafter the ODDS dataset was calculated (Polat & Durduran 2012). Equation (2) yields the new input–output parameters for ODDS; in it, *m* and *n* are the dimensions of the experiment dataset ‘A’, and *j* is the number of the input variable.

#### Data division and pre-processing

Data division and pre-processing for use in the ANN and MVLR models were carried out in three stages on the experimental data, the statistics for which are presented in Tables 1 and 2.

In Stage 1, the experimental data are divided into two groups, so that 75% (243 items) is used for training and 25% (81 items) comprises the test data. In Stage 2, the data divided are normalized in the interval 0.1–0.9 according to Equation (3). In Stage 3, the divided data were first scaled using the ODDS method, and then normalization was carried out according to Equation (3).

$$x_n = 0.1 + 0.8\,\frac{x - x_{min}}{x_{max} - x_{min}} \tag{3}$$

where *x* represents the experimental data; *x*_{min} is the minimum value of the variable; *x*_{max} is the maximum value of the variable; and *x*_{n} is the normalized experimental dataset.

The data were divided into two groups, training and test data, using the three-stage process described in Figure 1, and then underwent pre-processing. These processes were carried out separately for the bentonite and activated carbon experiment data.
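
The division and normalization stages can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the random shuffling, the seed and the function names are assumptions, since the text does not state how rows were assigned to the two groups, and the scaling is the usual min-max mapping onto the 0.1–0.9 interval. The ODDS step of Stage 3 is not included.

```python
import random


def split_train_test(rows, train_fraction=0.75, seed=0):
    """Shuffle the rows and split them into training and test subsets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def min_max_scale(values, low=0.1, high=0.9):
    """Map values linearly onto [low, high] using their minimum and maximum."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("cannot scale a constant variable")
    span = high - low
    return [low + span * (x - x_min) / (x_max - x_min) for x in values]


# Hypothetical dataset of 324 row indices, matching the 243/81 split in the text.
train, test = split_train_test(range(324))
print(len(train), len(test))  # 243 81

# Hypothetical pH values spanning the experimental range 2-10.
print(min_max_scale([2.0, 4.0, 6.0, 8.0, 10.0]))
```

The smallest value of each variable maps to 0.1 and the largest to 0.9, so every input stays inside the responsive range of a sigmoid activation.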

#### Evaluation of prediction performance

Even though the performance of an ANN is generally assessed by measuring its learning ability, the fact that a network has given the correct answers for all examples in the training set does not necessarily mean that its performance will be strong. Strong performance entails an expectation that, when test data not previously presented to the system are entered, the ANN will perform at a comparable level of accuracy (Öztemel 2003).

In this study, the approximate performance of ANN with test data was also evaluated using a multitude of statistical analysis methods, including the coefficient of determination (*R*^{2}), MSE, RMSE, MAE, MAPE, MEDAE and AARE. These parameters were calculated according to Equations (4)–(12), where *SS*_{e} represents the residual sum of squares; *SS*_{t} is the total sum of squares; *mea* is the measured value; *pre* is the prediction value; the means of the measured and prediction values are also used; and *n* is the number of experiments.
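
For orientation, the sketch below computes these statistics in Python using their common textbook definitions. It is an assumption that these match the study's own equations term by term. In particular, AARE is taken here as the fractional form of MAPE, which is at least consistent with the identical AARE and MAPE performance increases reported in the Conclusions. The function name and the sample values are hypothetical.

```python
import math
import statistics


def regression_metrics(measured, predicted):
    """Return common error statistics for paired measured/predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    errors = [m - p for m, p in zip(measured, predicted)]
    abs_errors = [abs(e) for e in errors]
    mean_measured = statistics.fmean(measured)

    ss_e = sum(e * e for e in errors)                        # residual sum of squares
    ss_t = sum((m - mean_measured) ** 2 for m in measured)   # total sum of squares
    mse = ss_e / n
    # Relative error as a fraction (assumed AARE); measured values must be non-zero.
    aare = sum(a / abs(m) for a, m in zip(abs_errors, measured)) / n

    return {
        "R2": 1.0 - ss_e / ss_t,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs_errors) / n,
        "MEDAE": statistics.median(abs_errors),
        "AARE": aare,
        "MAPE": 100.0 * aare,
    }


# Hypothetical q_e values (mg/g), not data from this study.
metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
for name, value in metrics.items():
    print(f"{name}: {value:.5f}")
```

MEDAE is less sensitive to a few large misses than MAE, which is one reason to report both.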

Equation (13) is defined in order to quantify the increase in the performance values calculated using the formulas given above.

$$P(\%) = \frac{Y_1 - Y_2}{Y_2} \times 100 \tag{13}$$

For the *R*^{2} performance value, P (%) is the performance increase, *Y*_{1} represents the performance value obtained via ODDS and *Y*_{2} represents the performance value obtained via standard normalization. For the MSE, RMSE, MAE, MEDAE, AARE and MAPE error performance values, increased performance means a decrease in error; in this context, P (%) is the performance increase, *Y*_{2} represents the performance value obtained via ODDS and *Y*_{1} represents the performance value obtained via standard normalization.
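
The performance-increase calculation can be illustrated with a short Python sketch, assuming Equation (13) takes the form P(%) = (Y1 − Y2) / Y2 × 100. The function name and the MSE values below are hypothetical and are used only to show how the figure behaves for error metrics.

```python
import math


def performance_increase(y1, y2):
    """Return P(%) = (Y1 - Y2) / Y2 * 100 (assumed form of Equation (13))."""
    if y2 == 0:
        raise ValueError("Y2 must be non-zero")
    return (y1 - y2) / y2 * 100.0


# Hypothetical MSE values: standard normalization (Y1) versus ODDS (Y2).
mse_standard, mse_odds = 0.0400, 0.0025
p_mse = performance_increase(mse_standard, mse_odds)
p_rmse = performance_increase(math.sqrt(mse_standard), math.sqrt(mse_odds))
print(f"MSE: {p_mse:.0f}%  RMSE: {p_rmse:.0f}%")
```

Because RMSE is the square root of MSE, a sixteen-fold reduction in MSE (P = 1,500%) is only a four-fold reduction in RMSE (P = 300%). The same relation links the figures reported for the ANN model with activated carbon: a 1,767% increase in MSE implies an error ratio of about 18.67, whose square root, about 4.32, corresponds to the reported 332% for RMSE.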

#### The ANN model

ANN refers to computer systems which process data by simulating the workings of the neural network of the human brain and stating the relationship between these data. In other words, ANN is a class of flexible nonlinear models that can adaptively discover patterns from the data. Theoretically, it was shown that, given a sufficient number of nonlinear processing units, neural networks can learn from experience and can estimate any complex functional relationship (Doğan & Akgüngör 2013).

A more functional definition of ANN describes it as a computer system that can learn by using the examples fed by humans, to delineate what kind of reactions the system should produce through successful emulation of many functional features of the human brain, such as learning, relating, classifying, generalizing, property determination and optimization (Öztemel 2003). The literature is rich in studies where ANN models were applied in the fields of wastewater treatment or removal (Hamed *et al.* 2004; Singh *et al.* 2006; Pai *et al.* 2009; Eyupoglu *et al.* 2010).

An ANN is composed of several components that directly affect the prediction performance, efficiency and accuracy of the system to be produced, and they therefore deserve careful consideration (Aladag 2011). These components can be summarized as follows (Egrioglu *et al.* 2009).

*Architecture structure.* Feed-forward ANNs are widely used for prediction problems because of their ease of use and strong success rates (Aladag 2011). The ANN model used in this study is a four-layer feed-forward network. pH, contact time, agitation rate, initial CIP concentration (*C*_{0}) and adsorbent dose form the input layer. Previous adsorption studies revealed that a network structure with 25 and five neurons, respectively, in the hidden layers performs better (Yurtsever *et al.* 2015).

Another important factor affecting the network performance is the accurate selection of the activation function. For this purpose, non-linear sigmoid function is used as the activation function for the hidden layer and the output layer. The structure of the multilayer feed forward ANN is depicted in Figure 2.

*Learning algorithm and activation function.* There are many learning algorithms for establishing the weights of a network. The most frequently used one is the back propagation learning algorithm (Egrioglu *et al.* 2009). This algorithm compares the outputs generated by the network for a given input with the expected outputs. The difference between them is taken as the error, and the objective is to reduce it. The error is distributed across the weight values of the network with a view to decreasing the error figure in later iterations (Öztemel 2003).

On the other hand, the activation function is yet another crucial component, and serves to determine the output of individual neurons, thereby affecting ANN efficiency and accuracy. This function provides the non-linear relationship between the input and the output (Egrioglu *et al.* 2009). The most frequently used activation functions are sigmoid, hyperbolic-tangent and purelin functions. The sigmoid function is frequently used in ‘Multilayer feed forward ANN’ (Öztemel 2003).

In this study, trainlm, which is the Levenberg-Marquardt backpropagation algorithm of MATLAB, is used as the learning algorithm. A review of the literature comparing trainlm, trainscg, and trainrp training functions with reference to MSE revealed that trainlm function offers the best performance (Sun *et al.* 2008; Sharma & Venugopalan 2014). Accordingly, we used trainlm training function in this study. Furthermore, logsig, a sigmoid transfer function is used as the transfer function. ANN modeling was performed on MATLAB 2012a, using the Neural Network Training tool.
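
To make the architecture concrete, the following Python sketch runs a single forward pass through a feed-forward network with logistic-sigmoid activations. It is an illustration only: the 5-25-5-1 layout is an assumed reading of the four-layer structure described above, the weights are random rather than trained, and the Levenberg-Marquardt training performed in MATLAB is not reproduced. All function and variable names are hypothetical.

```python
import math
import random


def logsig(x):
    """Logistic sigmoid, the function MATLAB calls logsig."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Create random weights and biases for one fully connected layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases


def forward(layers, inputs):
    """Propagate one input vector through every layer with logsig activation."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            logsig(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


rng = random.Random(0)
sizes = [5, 25, 5, 1]  # assumed layout: inputs, two hidden layers, output (q_e)
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

# Hypothetical normalized inputs in [0.1, 0.9]:
# pH, contact time, agitation rate, initial concentration, adsorbent dose.
x = [0.5, 0.3, 0.9, 0.1, 0.7]
print(forward(layers, x))
```

Because the output neuron also uses a sigmoid, the network output lies strictly between 0 and 1, which is why the target *q*_{e} values need to be scaled into a sub-interval such as 0.1–0.9 before training.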

#### Multivariable linear regression

The multiple linear regression model is described in Equation (14):

$$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p \tag{14}$$

The model parameters *β*_{0}, *β*_{1}, …, *β*_{p} and *σ* must be estimated from data (Alexopolos 2010), where *β*_{0} is the intercept point; *β*_{1}…*β*_{p} are the regression coefficients; *X*_{1}…*X*_{p} are the input variables; and *σ* is the residual standard deviation.

In this study, the Statistical Analysis tool in Microsoft Excel (http://office.microsoft.com) is used to carry out MVLR analysis. CIP removal results were estimated statistically via this analysis.
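
For readers who prefer a scripted alternative to a spreadsheet, the sketch below estimates the coefficients of a multiple linear regression by ordinary least squares in plain Python. It is a minimal illustration under stated assumptions: it solves the normal equations directly, the function names are hypothetical, and the data are synthetic values generated from a known linear relation, not results from this study.

```python
def fit_linear_regression(rows, targets):
    """Estimate [b0, b1, ..., bp] by ordinary least squares (normal equations)."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    # Build the augmented system (X^T X | X^T y).
    system = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(design, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(system[r][col]))
        if abs(system[pivot][col]) < 1e-12:
            raise ValueError("inputs are collinear; coefficients are not unique")
        system[col], system[pivot] = system[pivot], system[col]
        for r in range(k):
            if r != col:
                factor = system[r][col] / system[col][col]
                system[r] = [a - factor * b for a, b in zip(system[r], system[col])]
    return [system[i][k] / system[i][i] for i in range(k)]


def predict(coefficients, row):
    """Evaluate b0 + b1*x1 + ... + bp*xp for one input row."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))


# Synthetic data generated from y = 1 + 2*x1 - 3*x2 (no noise).
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
targets = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in rows]
coefficients = fit_linear_regression(rows, targets)
print([round(b, 6) for b in coefficients])
print(predict(coefficients, (3.0, 2.0)))
```

On noise-free data the fit recovers the generating coefficients. With experimental data the residuals are non-zero, and their spread gives the estimate of the residual standard deviation.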

## RESULTS AND DISCUSSION

### Performance measures

The estimation results obtained via the ANN and MVLR models from data pre-processed with standard normalization and with ODDS are evaluated and compared. The test data performance figures of the ANN and MVLR models are presented for bentonite and activated carbon in Tables 3 and 4, respectively.

A glance at the performance data regarding bentonite and activated carbon reveals, with reference to the MSE, RMSE, MAE, MEDAE, AARE and MAPE error performance criteria, that the lowest error figure was achieved in the ANN model normalized via ODDS. Furthermore, the coefficient of determination (*R*^{2}) values of this model were found to be 0.99802 and 0.99193, respectively. Therefore, one can conclude that the ANN model normalized via ODDS has better performance when compared with the ANN model scaled via standard normalization.

### MVLR results

We also calculated CIP removal results achieved with bentonite and activated carbon, using a statistical regression analysis. The following equations were obtained as a result of this analysis.

Estimations of the amount of adsorbed substance during CIP removal with bentonite and activated carbon, obtained using the regression equations and expressed as *q*_{e}, are summarized in Table 5.

### Experiment results and discussion

For the removal of CIP, an antibiotic, we investigated adsorption with different adsorbents, using standard normalization and ODDS for scaling and hybrid ANN and MVLR models for estimation.

The relationships between *q*_{e experimental} and *q*_{e estimated} for each adsorbent were determined using different statistical methods and are presented in Table 6.

The consistency graph of the *q*_{e} values obtained through experiment and the *q*_{e} values obtained through ANN and MVLR estimation models scaled with standard normalization is presented in Figure 3. The consistency graph for the amounts of substance adsorbed, predicted through ANN and MVLR models, is presented in Figure 4.

A review of the performance analyses in Figures 3 and 4 and Table 3 reveals that, for both the ANN and MVLR models, estimates from data scaled via ODDS are much more consistent than those from data scaled via standard normalization. A comparison of the ANN and MVLR models against each other leads to the observation that estimations made using the ANN model are more consistent and accurate. In addition, in the experiments using bentonite, the ratio of the *q*_{e} values obtained via the ANN applied after scaling with ODDS to the experimental *q*_{e} values is very close to 1. Therefore the consistency ratio is almost 1.

All these steps were repeated for activated carbon, and similar results were obtained, with the performance analyses given in Table 4 along with Figures 5 and 6.

We concluded the study with estimation equations for bentonite and activated carbon. We produced a general model to predict the adsorption rate on the basis of these estimation models. The model was then applied to achieve an accurate and reliable prediction of the adsorption rates at parameter levels where experiments are not available.

## CONCLUSIONS

We developed a model for CIP removal from aqueous solution using activated carbon and bentonite as adsorbents. Adsorbed CIP amount per unit of adsorbent is expressed with further reference to process variables. ANN and MVLR estimation models were then used to estimate the accuracy of the *q*_{e} value, expressing the amount adsorbed under certain operating conditions.

Four distinct hybrid models were applied in the study: ANN with standard normalization, ANN with ODDS, MVLR with standard normalization, and MVLR with ODDS. The data used in these models were first pre-processed via the standard normalization and ODDS methods. When the ANN model was applied after pre-processing with ODDS instead of standard normalization, performance increases of 29,500%, 1,477%, 1,796%, 4,190%, 1,422%, and 1,422% were found in the MSE, RMSE, MAE, MEDAE, AARE, and MAPE error performance analyses, respectively, for bentonite; for activated carbon the corresponding figures were 1,767%, 332%, 278%, 253%, 189%, and 189%. When the MVLR model was used with data pre-processed with ODDS instead of standard normalization, performance increases of 955%, 225%, 241%, 252%, 129%, and 129% were observed in the MSE, RMSE, MAE, MEDAE, AARE, and MAPE error performance figures, respectively, for bentonite, and of 286%, 97%, 66%, 47%, 20%, and 20% for activated carbon.

We also found that ANN and MVLR estimation consistency, accuracy ratios and model performances increase as a result of the pre-processing with ODDS.

Judged by the *R*^{2} values in the experiments with bentonite, the performance of the ANN with ODDS estimation model is 15% higher than that of the ANN with standard normalization model, whereas the performance of the MVLR with ODDS estimation model is 16% higher than that of the MVLR with standard normalization model.

Judged by the *R*^{2} values in the analysis with activated carbon, the performance of the ANN with ODDS estimation model is 19% higher than that of the ANN with standard normalization model, while the performance of the MVLR with ODDS estimation model is 17% higher than that of the MVLR with standard normalization model.

Moreover, we found that, among estimation models scaled with ODDS, the performance increase of the ANN estimation model with activated carbon is greater than that using bentonite.

- First received 18 June 2015.
- Accepted in revised form 15 December 2015.

- © 2017 The Authors

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).
