Lithofacies Classification of Carbonate Reservoirs Using Advanced Machine Learning: A Case Study from a Southern Iraqi Oil Field

2021 ◽  
Author(s):  
Mohammed A. Abbas ◽  
Watheq J. Al-Mudhafar

Abstract Estimating rock facies from petrophysical logs in non-cored wells in complex carbonates represents a crucial task for improving reservoir characterization and field development. Thus, it is essential to identify the lithofacies that discriminate the reservoir intervals based on their flow and storage capacity. In this paper, an innovative procedure is adopted for lithofacies classification using data-driven machine learning in a well from the Mishrif carbonate reservoir in the giant Majnoon oil field, Southern Iraq. The Random Forest method was adopted for lithofacies classification using well logging data in a cored well to predict their distribution in other non-cored wells. Furthermore, three advanced statistical algorithms, Logistic Boosting Regression, Bagging Multivariate Adaptive Regression Spline, and Generalized Boosting Modeling, were implemented and compared to the Random Forest approach to attain the most realistic lithofacies prediction. The dataset includes the measured discrete lithofacies distribution and the original log curves of caliper, gamma ray, neutron porosity, bulk density, sonic, and deep and shallow resistivity, all available over the entire reservoir interval. Prior to applying the four classification algorithms, random subsampling cross-validation was conducted on the dataset to produce training and testing subsets for modeling and prediction, respectively. After predicting the discrete lithofacies distribution, the Confusion Table and the Correct Classification Rate Index (CCI) were employed as further criteria to analyze and compare the effectiveness of the four classification algorithms. The results of this study revealed that Random Forest was more accurate in lithofacies classification than the other techniques, leading to excellent matching between the observed and predicted discrete lithofacies: a CCI of 100% on the training subset and 96.67% on the validation subset.
Further validation of the resulting facies model was conducted by comparing each of the predicted discrete lithofacies with the available ranges of porosity and permeability obtained from the NMR log. We observed that the rudist-dominated lithofacies correlates to rock with higher porosity and permeability, whereas the argillaceous lithofacies correlates to rock with lower porosity and permeability. Additionally, these high and low ranges of permeability were later compared with the oil rate obtained from the PLT log data, and the high and low ranges of permeability were found to correlate well with the high and low oil-rate logs, respectively. In conclusion, high-quality estimation of lithofacies in non-cored intervals and wells is a crucial reservoir characterization task in order to obtain meaningful permeability-porosity relationships and capture realistic reservoir heterogeneity. The application of machine learning techniques drives down costs, saves time, and mitigates uncertainty in lithofacies classification and prediction. The entire workflow was done in R, an open-source statistical computing language, and can easily be applied to other reservoirs to attain a similarly improved overall reservoir characterization.
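The core loop the abstract describes (random subsampling into training and testing subsets, a random forest fit on the log curves, then a confusion table and CCI) can be sketched with scikit-learn. The paper's workflow was implemented in R, so this Python sketch is only illustrative; the log values and facies labels below are synthetic stand-ins for the Mishrif well data, and CCI is computed as plain classification accuracy.

```python
# Sketch of a random-forest lithofacies workflow, scored with a confusion
# table and a Correct Classification Rate Index (CCI = accuracy).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score

rng = np.random.default_rng(0)
n = 300
# Synthetic "logs": caliper, gamma ray, neutron porosity, bulk density,
# sonic, deep and shallow resistivity (7 curves, as in the paper).
X = rng.normal(size=(n, 7))
# Three synthetic facies classes, loosely tied to the first three curves.
y = (X[:, 0] + X[:, 1] > 0).astype(int) + (X[:, 2] > 1).astype(int)

# Random subsampling into training and testing subsets.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Confusion table and CCI on both subsets.
cm = confusion_matrix(y_te, rf.predict(X_te))
cci_train = accuracy_score(y_tr, rf.predict(X_tr))
cci_test = accuracy_score(y_te, rf.predict(X_te))
```

The same train/test protocol would be repeated for the three boosting/bagging competitors to reproduce the comparison the paper reports.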

Author(s):  
Harsha A K ◽  
Thyagaraja Murthy A

The introduction of Transport Layer Security has been one of the most important contributors to the privacy and security of internet communications during the last decade. Malware authors have followed suit, using TLS to hide potentially dangerous network connections. Because of the growing use of encryption and other evasion measures, traditional content-based network traffic categorization is becoming more challenging. In this paper, we provide a malware classification technique that uses packet information and machine learning algorithms to detect malware. We employ classification algorithms such as the support vector machine and random forest. We start by eliminating features that are highly correlated, then use the Random Forest method to choose only the 10 best of the remaining features. Following the feature selection phase, we apply several classification algorithms and evaluate their performance. The random forest algorithm performed exceptionally well in our experiments, resulting in an accuracy score of over 0.99.
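The three-stage pipeline described here (drop highly correlated features, keep the 10 best by random-forest importance, then compare classifiers) can be sketched as follows. The feature matrix is a synthetic placeholder for the packet-metadata features; the correlation threshold of 0.95 is an assumption, not the paper's value.

```python
# Sketch: correlation filter -> RF importance top-10 -> SVM vs. random forest.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = pd.DataFrame(rng.normal(size=(400, 30)),
                 columns=[f"f{i}" for i in range(30)])
X["f1"] = X["f0"] * 0.99 + rng.normal(scale=0.01, size=400)  # near-duplicate
y = (X["f0"] + X["f2"] > 0).astype(int)

# 1) Drop one of each highly correlated pair (|r| > 0.95).
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
drop = [c for c in upper.columns if (upper[c] > 0.95).any()]
X = X.drop(columns=drop)

# 2) Keep the 10 features with the highest RF importance.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
top10 = X.columns[np.argsort(rf.feature_importances_)[-10:]]

# 3) Train and score SVM and random forest on the selected features.
X_tr, X_te, y_tr, y_te = train_test_split(X[top10], y, random_state=0)
svm_acc = SVC().fit(X_tr, y_tr).score(X_te, y_te)
rf_acc = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).score(X_te, y_te)
```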


Energies ◽  
2021 ◽  
Vol 14 (4) ◽  
pp. 1052
Author(s):  
Baozhong Wang ◽  
Jyotsna Sharma ◽  
Jianhua Chen ◽  
Patricia Persaud

Estimation of fluid saturation is an important step in dynamic reservoir characterization. Machine learning techniques have been increasingly used in recent years in reservoir saturation prediction workflows. However, most of these studies require input parameters derived from cores, petrophysical logs, or seismic data, which may not always be readily available. Additionally, very few studies incorporate production data, which is an important reflection of the dynamic reservoir properties and is also typically the most frequently and reliably measured quantity throughout the life of a field. In this research, the random forest ensemble machine learning algorithm is implemented, using the field-wide production and injection data (both measured at the surface) as the only input parameters to predict time-lapse oil saturation profiles at well locations. The algorithm is optimized using feature selection based on feature importance scores and Pearson correlation coefficients, in combination with geophysical domain knowledge. The workflow is demonstrated using actual field data from a structurally complex, heterogeneous, and heavily faulted offshore reservoir. The random forest model captures the trends from three and a half years of historical field production, injection, and simulated saturation data to predict future time-lapse oil saturation profiles at four deviated well locations with over 90% R-squared, less than 6% Root Mean Square Error, and less than 7% Mean Absolute Percentage Error in each case.
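A minimal regression sketch of this setup follows: a random forest mapping surface production/injection rates to oil saturation, scored with the paper's three metrics (R-squared, RMSE, MAPE). The inputs and target below are synthetic placeholders for the field and simulation data.

```python
# Sketch: random-forest regression of oil saturation from surface rates.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import (r2_score, mean_squared_error,
                             mean_absolute_percentage_error)

rng = np.random.default_rng(2)
n = 500
# Synthetic inputs, e.g. field oil/water production and water injection rates.
X = rng.uniform(0.5, 1.5, size=(n, 3))
# Synthetic target: oil saturation as a smooth function of the rates.
y = 0.6 - 0.1 * X[:, 0] + 0.05 * X[:, 2] + rng.normal(scale=0.005, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)

r2 = r2_score(y_te, pred)
rmse = np.sqrt(mean_squared_error(y_te, pred))
mape = mean_absolute_percentage_error(y_te, pred)
```

In the actual workflow, feature selection (importance scores plus Pearson correlation) would be applied before the final fit.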


2017 ◽  
Vol 107 (10) ◽  
pp. 1187-1198 ◽  
Author(s):  
L. Wen ◽  
C. R. Bowen ◽  
G. L. Hartman

Dispersal of urediniospores by wind is the primary means of spread for Phakopsora pachyrhizi, the cause of soybean rust. Our research focused on the short-distance movement of urediniospores from within the soybean canopy and up to 61 m from field-grown rust-infected soybean plants. Environmental variables were used to develop and compare models, including least absolute shrinkage and selection operator regression, zero-inflated Poisson/regular Poisson regression, random forest, and neural network models, to describe deposition of urediniospores collected in passive and active traps. All four models identified distance of trap from source, humidity, temperature, wind direction, and wind speed as the five most important variables influencing short-distance movement of urediniospores. The random forest model provided the best predictions, explaining 76.1 and 86.8% of the total variation in the passive- and active-trap datasets, respectively. The prediction accuracy, based on the correlation coefficient (r) between predicted and true values, was 0.83 (P < 0.0001) for the passive-trap and 0.94 (P < 0.0001) for the active-trap dataset. Overall, multiple machine learning techniques identified the most important variables and made the most accurate predictions of short-distance movement of P. pachyrhizi urediniospores.
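The random-forest part of this comparison, including the variable-importance ranking and the correlation-based accuracy measure, can be sketched as below. The spore counts and drivers are synthetic stand-ins shaped so that distance from source dominates, mirroring the study's finding.

```python
# Sketch: random-forest deposition model with variable importance and
# prediction accuracy measured by the correlation coefficient r.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n = 400
distance = rng.uniform(0, 61, n)        # m from source (study range)
humidity = rng.uniform(40, 100, n)      # %
temperature = rng.uniform(15, 35, n)    # deg C
wind_speed = rng.uniform(0, 10, n)      # m/s
X = np.column_stack([distance, humidity, temperature, wind_speed])
# Synthetic spore counts decaying with distance, boosted by wind speed.
counts = (100 * np.exp(-distance / 15) * (1 + 0.2 * wind_speed)
          + rng.normal(scale=2, size=n))

rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, counts)
pred = rf.predict(X)
r = np.corrcoef(counts, pred)[0, 1]     # correlation-based accuracy
importances = rf.feature_importances_   # distance should dominate here
```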


2021 ◽  
Author(s):  
Said Beshry Mohamed ◽  
Sherif Ali ◽  
Mahmoud Fawzy Fahmy ◽  
Fawaz Al-Saqran

Abstract The Middle Marrat reservoir of Jurassic age is a tight carbonate reservoir with heterogeneous properties both vertically and horizontally. The variation in lithology and in vertical and horizontal facies distribution leads to complicated reservoir characterization, which in turn leads to unexpected production behavior between wells in the same reservoir. Marrat reservoir characterization by conventional logging tools is a challenging task because of its low clay content and high-resistivity responses. The low clay content in Marrat reservoirs gives low gamma ray counts, which makes reservoir layer identification difficult. Additionally, high resistivity responses in the pay zones, coupled with the tight layering, make production sweet-spot identification challenging. To overcome these challenges, integration of data from advanced logging tools such as the Sidewall Magnetic Resonance (SMR), Geochemical Spectroscopy Tool (GST), and Electrical Borehole Image (EBI) supplied a definitive reservoir characterization and fluid typing of this tight Jurassic carbonate (Marrat formation). The SMR tool's multiple wait times enabled T2 polarization analysis to differentiate between moveable water and hydrocarbons. After acquisition, the standard deliverables were porosity, the effective porosity ratio, and the permeability index, used to evaluate rock quality. Porosity was divided into clay-bound water (CBW), bulk-volume irreducible (BVI), and bulk-volume moveable (BVM). Rock quality was interpreted and classified based on effective porosity and permeability index ratios: a steeper gradient in these ratios was interpreted as high-flow zones, a gentle gradient as low-flow zones, and a flat gradient as tight baffle zones. SMR logging proved essential for proper reservoir characterization and for supporting critical decisions on well completion design. SMR supplied the fundamental rock quality and permeability profile.
Oil saturation was identified by applying 2D-NMR methods, T1/T2 vs. T2 and Diffusion vs. T2 maps, in a challenging oil-based-mud environment. The Electrical Borehole Image (EBI) was used to identify fracture types and establish fracture density; it also made it possible to assess the impact of fractures on enhancing porosity and permeability. The Geochemical Spectroscopy Tool (GST) was used for the precise determination of formation chemistry, mineralogy, and lithology, as well as the identification of total organic carbon (TOC). The integration of the EBI, GST, and SMR datasets provided sweet-spot identification and candidate perforation intervals, which the producer used to bring wells onto production.
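The CBW/BVI/BVM partition of NMR porosity mentioned above is conventionally done with T2 cutoffs. The sketch below is illustrative only: the 3 ms and 92 ms cutoffs are generic textbook-style assumptions (not values from this paper), and the T2 amplitude distribution is synthetic.

```python
# Illustrative NMR T2 partitioning into CBW, BVI and BVM by T2 cutoffs.
import numpy as np

t2 = np.logspace(-1, 4, 64)                 # T2 bins, 0.1 ms to 10 s
rng = np.random.default_rng(4)
amplitude = np.abs(rng.normal(size=64))     # synthetic T2 amplitudes
amplitude /= amplitude.sum()                # normalize: total porosity = 1

cbw = amplitude[t2 < 3].sum()                   # clay-bound water
bvi = amplitude[(t2 >= 3) & (t2 < 92)].sum()    # capillary-bound fluid
bvm = amplitude[t2 >= 92].sum()                 # moveable fluid
phi_eff = bvi + bvm                             # effective porosity (excl. CBW)
```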


Webology ◽  
2021 ◽  
Vol 18 (Special Issue 01) ◽  
pp. 183-195
Author(s):  
Thingbaijam Lenin ◽  
N. Chandrasekaran

Students' academic performance is one of the most important parameters for evaluating the standard of any institute. It has become of paramount importance for any institute to identify students at risk of underperforming, failing, or even dropping out of a course. Machine learning techniques may be used to develop a model for predicting a student's performance as early as at the time of admission. The task, however, is challenging, as the educational data available for modelling are usually imbalanced. We explore ensemble machine learning techniques, namely a bagging algorithm, random forest (rf), and boosting algorithms, adaptive boosting (adaboost), stochastic gradient boosting (gbm), and extreme gradient boosting (xgbTree), in an attempt to develop a model for predicting student performance at a private university in Meghalaya using three categories of data, namely demographic, prior academic record, and personality data. The collected data are highly imbalanced and also contain missing values. We employ the k-nearest neighbour (knn) data imputation technique to tackle the missing values. The models are developed on the imputed data with 10-fold cross-validation and are evaluated using precision, specificity, recall, and kappa metrics. As the data are imbalanced, we avoid using accuracy as the metric for evaluating the models and instead use balanced accuracy and F-score. We compare the ensemble techniques with the single classifier C4.5. The best results are provided by random forest and adaboost, with an F-score of 66.67%, balanced accuracy of 75%, and accuracy of 96.94%.
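The imbalanced-data pipeline described here (kNN imputation of missing values, an ensemble model, and scoring by balanced accuracy and F-score rather than accuracy) can be sketched in a few lines. The study used R's caret-style tooling; this scikit-learn version with synthetic student records is only a sketch of the same idea.

```python
# Sketch: kNN imputation + random forest, scored with imbalance-aware metrics.
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score, f1_score

rng = np.random.default_rng(5)
n = 600
X = rng.normal(size=(n, 6))
y = (X[:, 0] > 1.2).astype(int)             # ~11% positives: imbalanced

# Knock out ~5% of entries, then impute from the 5 nearest neighbours.
mask = rng.random(X.shape) < 0.05
X[mask] = np.nan
X = KNNImputer(n_neighbors=5).fit_transform(X)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)

bal_acc = balanced_accuracy_score(y_te, pred)
f_score = f1_score(y_te, pred)
```

Balanced accuracy averages per-class recall, so a model that ignores the minority class scores near 0.5 even when plain accuracy looks high.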


Author(s):  
Ramesh Ponnala ◽  
K. Sai Sowjanya

Prediction of cardiovascular disease is an important task in the area of clinical data analysis. Machine learning has been shown to be effective in supporting decision making and prediction from the huge amount of data produced by the healthcare industry. In this paper, we propose a novel technique that aims at finding significant features by applying machine learning methods, resulting in improved accuracy in the prediction of heart disease. The severity of the heart disease is classified based on various methods such as KNN, decision trees, and so on. The prediction model is introduced with different combinations of features and several known classification techniques. We produce an enhanced performance level with an accuracy of 100% through the prediction model for heart disease with the Hybrid Random Forest with a Linear Model (HRFLM).
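The abstract names a Hybrid Random Forest with a Linear Model (HRFLM) without spelling out the hybridization, so the following is only one plausible sketch of the idea: a soft-voting ensemble of a random forest and a logistic (linear) model. The 13-feature matrix is a synthetic stand-in for clinical attributes.

```python
# Sketch: one possible "random forest + linear model" hybrid via soft voting.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)
X = rng.normal(size=(500, 13))              # 13 synthetic clinical features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
hybrid = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("lm", LogisticRegression(max_iter=1000))],
    voting="soft",                          # average predicted probabilities
).fit(X_tr, y_tr)
accuracy = hybrid.score(X_te, y_te)
```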


Author(s):  
Chaudhari Shraddha

Activity recognition in humans is one of the active challenges that finds application in numerous fields such as medical health care, the military, manufacturing, assistive techniques, and gaming. Due to advancements in technology, the usage of smartphones in human lives has become inevitable. The sensors in smartphones help us to measure essential vital parameters, and these measured parameters enable us to monitor human activities, which we call human activity recognition. We have applied machine learning techniques to a publicly available dataset, using the K-Nearest Neighbors and Random Forest classification algorithms. In this paper, we have designed and implemented an automatic human activity recognition system that independently recognizes human actions. The system is able to recognize activities such as laying, sitting, standing, walking, walking downstairs, and walking upstairs. The results obtained show that the KNN and Random Forest algorithms give 90.22% and 92.70% overall accuracy, respectively, in detecting the activities.
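The KNN-versus-random-forest comparison can be sketched as below. The feature vectors are synthetic stand-ins; public smartphone HAR datasets typically provide hundreds of windowed accelerometer/gyroscope features per sample across the six activities listed above.

```python
# Sketch: KNN vs. random forest on six-class activity feature vectors.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n, n_classes = 600, 6                       # six activities
y = rng.integers(0, n_classes, n)
# Class-dependent means make the synthetic activities separable.
X = rng.normal(size=(n, 20)) + y[:, None] * 0.8

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
knn_acc = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr).score(X_te, y_te)
rf_acc = RandomForestClassifier(n_estimators=200, random_state=0).fit(
    X_tr, y_tr).score(X_te, y_te)
```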


2020 ◽  
Vol 10 (2) ◽  
pp. 95-113
Author(s):  
Wisam I. Al-Rubaye ◽  
Dhiaa S. Ghanem ◽  
Hussein Mohammed Kh ◽  
Hayder Abdulzahra ◽  
Ali M. Saleem ◽  
...  

In the petroleum industry, an accurate description and estimation of the oil-water contact (OWC) is very important in quantifying the resources (i.e., original oil in place (OIIP)) and optimizing production techniques, rates, and the overall management of the reservoir. Thus, accurate OWC estimation is a crucial step for optimum reservoir characterization and exploration. This paper presents a comparison of three different methods (open hole well logging, the MDT test, and capillary pressure drainage data) to determine the oil-water contact of a carbonate reservoir (Main Mishrif) in an Iraqi oil field "BG". A total of three wells from the "BG" oil field were evaluated using the interactive petrophysics software "IP v3.6". The results show that the well logging interpretations predict an OWC depth of -3881 mssl. However, the estimated depth varies between wells (WELL X: -3939, WELL Y: -3844, WELL Z: -3860 mssl), which is considered an acceptable variation range because the OWC level in reality is not constant; its elevation usually changes laterally due to the complicated heterogeneous nature of the reservoirs. Furthermore, the results indicate that the MDT test predicts an OWC depth of -3889 mssl, while the capillary drainage data give an OWC depth of -3879 mssl. Proper MDT data and SCAL data are necessary to reduce the uncertainty in the estimation process. Accordingly, the best approach for estimating the OWC is the combination of MDT and capillary pressure data, because the field data obtained are more reliable than open hole well logs, which carry many measurement uncertainties due to frequent borehole condition effects.
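One common log-based way to pick an OWC, consistent with the well-logging leg of the comparison above, is to scan a water-saturation profile for the shallowest depth where Sw exceeds a cutoff. This sketch is illustrative only: the depths, the synthetic Sw profile, and the 0.9 cutoff are assumptions, not the paper's software workflow.

```python
# Illustrative OWC pick from a synthetic water-saturation profile.
import numpy as np

depth = np.arange(-3800.0, -3950.0, -1.0)   # mssl, shallow to deep
# Synthetic Sw: low in the oil leg, rising sharply near a contact at -3881.
sw = 0.15 + 0.8 / (1.0 + np.exp(-(-3881.0 - depth) / 3.0))

cutoff = 0.9                                 # assumed Sw cutoff for water leg
owc = depth[np.argmax(sw >= cutoff)]         # first depth with Sw >= cutoff
```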


Software engineering is an important area that deals with the development and maintenance of software. After developing a software product, it is always important to track its performance and to verify that the software functions according to customer requirements. To ensure this, faulty and non-faulty modules must be identified; for this purpose, one can make use of a model for binary classification of faults. Different techniques' outputs differ with respect to the fault dataset used, complexity, the classification algorithm implemented, etc. Various machine learning techniques can be used for this purpose, but this paper deals with four of the best classification algorithms available to date: decision tree, random forest, naive Bayes, and logistic regression (tree-based and Bayesian-based techniques). The motive behind developing such a project is to identify the faulty modules within a software system before the actual software testing takes place. As a result, the time consumed by testers, or their workload, can be reduced to an extent. This work is useful to those working in the software industry and also to those carrying out research in software engineering, where the lifecycle of software development is discussed.
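The four-classifier comparison for binary fault prediction can be sketched as follows. The module metrics here are synthetic stand-ins for a real fault dataset (e.g. lines of code and complexity measures), and the label is an artificial "faulty" flag.

```python
# Sketch: decision tree vs. random forest vs. naive Bayes vs. logistic
# regression on a binary faulty/non-faulty module dataset.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(8)
X = rng.normal(size=(500, 8))               # synthetic module metrics
y = (X[:, 0] + X[:, 3] > 1).astype(int)     # synthetic "faulty" label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "naive_bayes": GaussianNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te)
          for name, m in models.items()}
```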

