Rapid monitoring of the water extraction process of Radix Paeoniae Alba using near infrared spectroscopy

Near infrared (NIR) spectroscopy has developed into one of the most important process analytical techniques (PAT) in a wide field of applications. The feasibility of NIR spectroscopy with partial least squares regression (PLSR) to monitor the concentrations of paeoniflorin, albiflorin, gallic acid, and benzoyl paeoniflorin during the water extraction process of Radix Paeoniae Alba was demonstrated and verified in this work. NIR spectra were collected in transmission mode and pretreated with smoothing and/or derivative, and quantitative models were then built using PLSR. The interval partial least squares (iPLS) method was used for the selection of spectral variables. Determination coefficients (R²cal and R²pred), root mean square error of prediction (RMSEP), root mean square error of calibration (RMSEC), and residual predictive deviation (RPD) were applied to verify the performance of the models, and the corresponding values were 0.9873, 0.9855, 0.0487 mg/mL, 0.0545 mg/mL and 8.4 for paeoniflorin; 0.9879, 0.9888, 0.0303 mg/mL, 0.0321 mg/mL and 9.1 for albiflorin; 0.9696, 0.9644, 0.0140 mg/mL, 0.0145 mg/mL and 5.1 for gallic acid; and 0.9794, 0.9781, 0.00169 mg/mL, 0.00171 mg/mL and 6.9 for benzoyl paeoniflorin, respectively. The results showed that this approach is very efficient and environmentally friendly for the quantitative monitoring of the water extraction process of Radix Paeoniae Alba.


Introduction
Radix Paeoniae Alba (Baishao), the dried root of Paeonia lactiflora Pall., is one of the most widely used herbal medicines and has been reported to exhibit pharmacological actions such as anti-inflammatory, antiallergic, anti-thrombosis, immunoregulating, and kidney- and liver-protective properties. [1][2][3] Monoterpene glucosides are generally considered to be the main bioactive compounds responsible for most of the biological activities. Among them, paeoniflorin is the major active constituent and has been used as a phytochemical marker for the quality control of Radix Paeoniae Alba in the Chinese Pharmacopoeia. [4] Other bioactive components, such as albiflorin, benzoyl paeoniflorin, and gallic acid, have also been reported as pharmacologically active ingredients of Radix Paeoniae Alba. [5][6][7] Radix Paeoniae Alba is one of the most popular raw herbal materials for the production of Chinese patent medicines, such as Shenzhiling oral solution, Si-Wu oral solution, and Xiaoyaosan decoction, and water extraction is an indispensable key manufacturing unit in the production of most Chinese patent medicines containing it. In large-scale traditional Chinese medicine (TCM) manufacturing, the conventional extraction procedure is still the mainstream approach, and heat reflux extraction is the most widely used technique for the extraction of herbal materials. [8] To improve the process efficiency and guarantee the final product quality, reliable process analytical technology (PAT) should be emphasized in the water extraction process of Radix Paeoniae Alba.
Near infrared (NIR) spectroscopy combined with chemometrics has found useful applications in a broad range of domains during the past decades, such as the agricultural, food, pharmaceutical, and biomedical sectors. [9-15] NIR spectroscopy has also shown great power and gained wide acceptance in the TCM manufacturing industry. [11,16] The most well-known applications of NIR spectroscopy in TCM manufacturing processes include separation monitoring, [17] end-point judgement of the extraction process, [18][19][20] and alcohol precipitation monitoring. [21] NIR spectroscopy is therefore a promising technology for understanding TCM manufacturing processes.
The objective of this study is to investigate the feasibility and application of NIR spectroscopy in the determination of paeoniflorin, albiflorin, gallic acid, and benzoyl paeoniflorin during the water extraction process of Radix Paeoniae Alba. NIR calibration models were built using partial least squares regression (PLSR) with NIR spectra and reference data of samples collected from the water extraction process of Radix Paeoniae Alba. The performance of the NIR calibration models was evaluated by means of determination coefficients (R²cal and R²pred), root mean square error of prediction (RMSEP), root mean square error of calibration (RMSEC), and residual predictive deviation (RPD). To our knowledge, this is the first report to demonstrate the feasibility of NIR spectroscopy with PLSR for rapid monitoring of the water extraction process of Radix Paeoniae Alba.

Chemicals and reagents
Authentic paeoniflorin, albiflorin, gallic acid, and benzoyl paeoniflorin were obtained from the National Institute for the Control of Pharmaceutical and Biological Products (Beijing, China). High performance liquid chromatography (HPLC) grade acetonitrile and methanol were purchased from Merck (Darmstadt, Germany). Radix Paeoniae Alba was provided by Wohua Pharmaceutical Technology Co., Ltd. (Weifang, China). Distilled water was obtained from a Milli-Q water purification system from Millipore (Bedford, MA, USA). Other reagents were obtained from VWR International (South Plainfield, NJ, USA).

Radix Paeoniae Alba water extraction and sampling
Extraction of Radix Paeoniae Alba was carried out at laboratory scale, simulating the actual water extraction procedure. Thirty grams of Radix Paeoniae Alba was soaked in 300 mL of tap water at room temperature (25 °C) for 40 min and then refluxed for 150 min in the first extraction stage. After filtration of the extract, the residue was extracted with another 180 mL of tap water for 100 min in the second extraction stage. Samples of 3 mL each were drawn through the same neck of the 500 mL three-necked round-bottomed flask with a rubber-bulb dropper. Samples were collected at the beginning of boiling in each extraction, and then every 15 min during the first extraction and every 20 min during the second extraction. Four extraction batches were carried out and 68 samples were collected; one sample was spilled, leaving 67 samples.
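The sample count can be checked against the sampling schedule. The following is a minimal sketch, assuming one sample at the start of boiling and one at every interval up to and including the end of each stage; the function name is illustrative and not taken from the original work.

```python
def samples_per_stage(duration_min: int, interval_min: int) -> int:
    """Samples at the start of boiling plus one per elapsed interval.

    Assumption: the last sample coincides with the end of the stage.
    """
    return duration_min // interval_min + 1


first_stage = samples_per_stage(150, 15)   # first extraction, every 15 min
second_stage = samples_per_stage(100, 20)  # second extraction, every 20 min
batches = 4

total_collected = batches * (first_stage + second_stage)
total_usable = total_collected - 1  # one sample was spilled

print(first_stage, second_stage, total_collected, total_usable)
```

Under this assumption each batch yields 11 + 6 samples, which reproduces the 68 collected and 67 usable samples reported above.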

Spectra acquisition
NIR spectra were collected using an Antaris II Fourier transform NIR spectrophotometer (Thermo Fisher, USA). Transmission spectra of samples were collected from 4000 to 10,000 cm⁻¹ at intervals of 8 cm⁻¹. Each spectrum was obtained by averaging 32 scans with a gain value of 4× (B screen). To avoid errors from the external environment, all samples were equilibrated to room temperature (25 °C) prior to NIR spectra collection, and the humidity in the laboratory was kept at ambient level. Figure 1 shows the raw spectra of the extract samples.

Data analysis
All computations, including division of the calibration and validation sets, spectral pretreatment, and variable selection, were carried out using MATLAB version 2010a (MathWorks Inc., Natick, USA) with the PLS Toolbox (version 752) purchased from Eigenvector Research, Inc. (Wenatchee, WA, USA). The NIR spectra and HPLC data were modeled by PLSR to establish the quantitative models.

Reference values analysis
An HPLC-UV method was developed and validated for the determination of the four analytes in the 67 extract samples. A chromatogram of one sample is shown in Fig. 2.
The main methodology parameters and calibration curves of the validated HPLC-UV method are listed in Table 1. The concentrations of the four analytes in the extract samples are listed in Table 2. The concentration ranges of gallic acid, albiflorin, paeoniflorin, and benzoyl paeoniflorin in the extract samples were 0.077-0.331, 0.154-0.902, 0.240-1.386, and 0.011-0.044 mg/mL, respectively.

Division of calibration and validation set
First, the samples with the highest and lowest concentrations of gallic acid, albiflorin, paeoniflorin, and benzoyl paeoniflorin were placed in the calibration set, while the remaining samples were split into a calibration set and a validation set by the Kennard-Stone (KS) algorithm. [10][11][12][13] In the current study, 53 samples (80%) were placed in the calibration set, and the remaining 14 samples (20%) were assigned to the validation set. The concentration ranges of gallic acid, albiflorin, paeoniflorin, and benzoyl paeoniflorin in the calibration and validation sets are listed in Table 3, and the score plot of principal components (PCs) for sample division is shown in Fig. 3. Samples belonging to the validation set are evenly distributed throughout the calibration set.
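The Kennard-Stone selection can be sketched as follows. This is an illustrative pure-Python version, not the MATLAB routine used in the study; it assumes Euclidean distance between rows of `X` (spectra or PC scores), and the `seed_idx` parameter is a hypothetical way to force samples, such as the concentration extremes, into the calibration set first.

```python
import math


def kennard_stone(X, n_select, seed_idx=()):
    """Return indices of n_select rows of X chosen by the Kennard-Stone rule."""
    n = len(X)
    # Full pairwise Euclidean distance matrix.
    dist = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]

    selected = list(dict.fromkeys(seed_idx))  # forced samples, duplicates dropped
    if not selected:
        # Start from the two mutually most distant samples.
        i, j = max(
            ((a, b) for a in range(n) for b in range(a + 1, n)),
            key=lambda pair: dist[pair[0]][pair[1]],
        )
        selected = [i, j]

    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_select and remaining:
        # Add the candidate whose nearest selected neighbour is farthest away.
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


# Toy example with one-dimensional "spectra".
X = [[0.0], [1.0], [2.0], [10.0]]
print(kennard_stone(X, 3))                 # indices for the calibration set
print(kennard_stone(X, 3, seed_idx=(1,)))  # sample 1 forced in first
```

The samples not returned form the validation set. Because each new calibration sample is the one farthest from those already chosen, the validation samples end up lying inside the region spanned by the calibration set.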

Spectral pretreatments
NIR spectral analysis has been widely applied by virtue of the development of chemometrics, in which spectral pretreatment and variable selection methods play an important role in the development of robust NIR models. In the current study, different spectral pretreatment methods, such as first derivative (FD) conversion, the Savitzky-Golay (SG) filter, standard normal variate (SNV), and combinations of them, were used and compared. For the optimization of the spectral pretreatment methods, all of the variables (n = 1577) were included in the PLSR modeling. For each analyte, the optimized spectral preprocessing method was selected by comparing the resulting model statistics. Taking albiflorin as an example (Table 4), the RMSEC, RMSEP, R²cal, and R²pred values of the PLSR model based on the SNV spectra were 0.0415 mg/mL, 0.0530 mg/mL, 0.9772, and 0.9709, respectively, which was the best among all the pretreatment methods tested, with lower RMSEC and RMSEP and higher R²cal and R²pred values. The optimization of the spectral pretreatment methods for the PLSR models of gallic acid, paeoniflorin, and benzoyl paeoniflorin was carried out following the same rule; the details are omitted here.
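Two of the pretreatments compared here can be sketched in a few lines. This is a minimal pure-Python illustration, not the PLS Toolbox implementation used in the study. It assumes SNV uses the sample standard deviation (n - 1), and it approximates the first derivative by a simple point-to-point difference rather than a Savitzky-Golay derivative.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)  # sample SD (n - 1); an assumption
    return [(x - mean) / sd for x in spectrum]


def first_difference(spectrum):
    """Crude first derivative: difference between neighbouring points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


raw = [1.0, 2.0, 3.0]
print(snv(raw))
print(first_difference([1.0, 4.0, 9.0]))
```

SNV is applied to each spectrum separately, so it removes per-sample offset and scaling differences without using information from other samples.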

Variable selection
In this work, the interval partial least squares (iPLS) algorithm was first applied to split the full NIR spectral region (4000-10,000 cm⁻¹) into different intervals; the optimized interval(s) was (were) then identified as the one(s) with the lowest RMSECV. Taking the variable selection for the PLSR model of albiflorin as an example, the subinterval finally selected by iPLS is highlighted in green in Fig. 4(b) and corresponds to 5542-6310 cm⁻¹. There were 200 variables selected for the modeling of albiflorin. Using the same approach, the interval(s) selected for gallic acid, paeoniflorin, and benzoyl paeoniflorin are shown in Figs. 4(a), 4(c) and 4(d), and the corresponding spectral regions are listed in Table 5. The spectral range for the paeoniflorin model contains 5542.4-6309.9 cm⁻¹, related to the first overtones of -CH3 and -CH2, and 8627.9-9395.3 cm⁻¹, related to the second -OH overtone. The spectral range for the albiflorin model is 5542.4-6309.9 cm⁻¹, related to the first overtones of -CH3 and -CH2, and that for the gallic acid model is 7470.9-8624.0 cm⁻¹, related to the second C-H overtone and the stretching and deformation combination. For the benzoyl paeoniflorin model, the range of 7856.6-9395.3 cm⁻¹, which is associated with the second -OH overtone, was selected.
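The interval search can be illustrated with a short skeleton. It is a sketch under stated assumptions: the intervals are taken to be equal-width blocks of 200 variables (consistent with the 200 variables reported for albiflorin), only a single best interval is returned, and `rmsecv_of` is a hypothetical user-supplied callback that fits a PLS model on the given columns and returns its RMSECV.

```python
def split_intervals(n_vars, width):
    """Split variable indices 0..n_vars-1 into consecutive (start, stop) ranges."""
    return [(s, min(s + width, n_vars)) for s in range(0, n_vars, width)]


def best_interval(n_vars, width, rmsecv_of):
    """Return the (start, stop) interval with the lowest cross-validated error.

    rmsecv_of(start, stop) is a hypothetical callback: it should build a PLS
    model on columns start..stop-1 and return the resulting RMSECV.
    """
    return min(split_intervals(n_vars, width), key=lambda iv: rmsecv_of(*iv))


intervals = split_intervals(1577, 200)
print(len(intervals), intervals[-1])


# Toy scoring function standing in for a real cross-validated PLS fit.
def toy_score(start, stop):
    return abs(start - 400)


print(best_interval(1577, 200, toy_score))
```

A full iPLS run would also allow combinations of intervals, as in the two-region paeoniflorin model; that extension is not shown here.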
Each calibration model has an optimal number of latent variables (LVs). Using fewer or more LVs than the optimum may cause 'under-fitting' or 'over-fitting', respectively, both of which reduce the predictive ability. Taking the PLSR model of albiflorin as an example, the optimal number of LVs was determined using a "Venetian blinds" cross-validation protocol with seven data splits, as shown in Fig. 5. The model with 5 LVs gave low values of RMSEP, RMSEC and root mean square error of cross-validation (RMSECV), and a small difference between RMSEP and RMSEC; thus 5 was chosen as the optimal number of LVs for the albiflorin model. The same strategy was used to determine the most suitable number of LVs for the models of the other three analytes, and the results are listed in Table 5.
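The Venetian blinds split can be sketched as follows. This is an illustrative version that assumes one sample per blind, so that sample i is held out in fold i mod 7; the sample ordering and the blind thickness used in the original software are not stated in the text.

```python
def venetian_blinds(n_samples, n_splits):
    """Return (train_idx, test_idx) pairs; sample i is held out in fold i % n_splits."""
    folds = []
    for k in range(n_splits):
        test = list(range(k, n_samples, n_splits))
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        folds.append((train, test))
    return folds


folds = venetian_blinds(53, 7)  # 53 calibration samples, seven data splits
print([len(test) for _, test in folds])
print(folds[0][1][:3])
```

For each candidate number of LVs, a model is fitted on every `train` set and used to predict the matching `test` set; pooling the squared errors over all folds gives the RMSECV.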

Evaluation of the calibration models
The performance of the models was evaluated in terms of RMSEC, RMSEP, R²cal, and R²pred. To further evaluate the predictability of each PLS model, the ratio of the standard deviation of the validation set to the standard error of prediction (RPD) was also calculated. Williams and Sobering [22] recommended a cutoff point of three, and a higher RPD value indicates better predictive capability. Generally, a good model should have higher R²cal, R²pred and RPD, lower RMSEC and RMSEP, and small differences between RMSECV and RMSEP. Table 5 shows the performance indexes of the established models. The results shown in Fig. 6 indicate that the established models give satisfactory fitting results and predictive ability, and the respective models can be applied to monitor the concentrations of the four analytes in the water extract during the water extraction process of Radix Paeoniae Alba.
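The performance indexes can be computed as sketched below. The definitions are assumptions, since the text does not give formulas: R² is taken as 1 - SS_res/SS_tot, and the standard error of prediction in the RPD is taken as the bias-corrected sample standard deviation of the residuals. The toy data are illustrative and not results from the study.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root mean square error (RMSEC or RMSEP, depending on the data set)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Determination coefficient, assumed here to be 1 - SS_res / SS_tot."""
    mean = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rpd(y_true, y_pred):
    """SD of the reference values divided by the standard error of prediction."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    sep = statistics.stdev(residuals)  # bias-corrected SEP; an assumption
    return statistics.stdev(y_true) / sep


# Toy reference and predicted concentrations (mg/mL), for illustration only.
y_ref = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.1, 3.9]
print(rmse(y_ref, y_hat), r_squared(y_ref, y_hat), rpd(y_ref, y_hat))
```

With these definitions, an RPD above three means that the spread of the reference values is more than three times the typical prediction error.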

Conclusion
In this study, a quantitative NIR spectroscopy method was explored and established for the simultaneous determination of paeoniflorin, albiflorin, gallic acid, and benzoyl paeoniflorin in the Radix Paeoniae Alba extract during the water extraction process. By means of PLSR, quantitative NIR models were successfully built between the spectra and the corresponding reference values obtained by HPLC-UV. Compared with HPLC-UV, the NIR spectroscopy method saves considerable manpower and time and is potentially useful for monitoring the water extraction process of Radix Paeoniae Alba.