Text Content Approaches in Web Content Mining

Author(s):  
Víctor Fresno Fernandez ◽  
Luis Magdalena Layos

Since the creation of the Web, the Internet has become the greatest source of information available in the world. The Web is defined as a global information system that connects several sources of information by hyperlinks, providing a simple medium for publishing electronic information that is available to everyone who is connected.

Author(s):  
S. Dhanalakshmi ◽  
T. Prabakaran ◽  
Krishna Kishore

A Content Delivery Network (CDN) is a network of servers hosted by a service provider in multiple locations around the world so that content can be delivered from the server nearest to the consumer requesting it. CDNs evolved to overcome the inherent limitations of the Internet regarding user-perceived Quality of Service (QoS) when accessing Web content. They have been proposed to maximize bandwidth, improve accessibility, and maintain correctness through content replication. The content is distributed to cache servers located close to the users, resulting in fast, reliable applications and web services. In this paper we describe the components and technologies of CDNs and provide a comprehensive taxonomy with broad coverage of their organizational structure, content distribution mechanisms, request redirection techniques, and performance measurement methodologies.
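
The idea of serving a request from the nearest server can be illustrated with a minimal Python sketch. It assumes that "nearest" means smallest great-circle distance; production CDNs typically redirect requests by other means, such as DNS-based redirection or measured latency. The server names and coordinates below are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical edge servers: name -> (latitude, longitude) in degrees.
EDGE_SERVERS = {
    "edge-frankfurt": (50.11, 8.68),
    "edge-singapore": (1.35, 103.82),
    "edge-virginia": (38.95, -77.45),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_server(client_lat, client_lon, servers=EDGE_SERVERS):
    """Return the name of the server geographically closest to the client."""
    return min(
        servers,
        key=lambda name: haversine_km(client_lat, client_lon, *servers[name]),
    )


if __name__ == "__main__":
    # A client located in Paris (approximate coordinates).
    print(nearest_server(48.85, 2.35))
```

Geographic distance is only a stand-in here; a redirection policy could equally rank servers by load or round-trip time by swapping the `key` function.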


2008 ◽  
Vol 90 (3) ◽  
pp. 92-95
Author(s):  
K Kok ◽  
AR Parikh ◽  
A Clarke ◽  
AV Kaisary ◽  
PEM Butler

The world wide web is the fastest-growing health information medium. In 2001, 52 million adults in America accessed the web to obtain such information.1 Cancer has been shown to be among the top three health topics searched for on the internet. A survey performed by American oncologists estimated that approximately 30% of their patients use the internet to obtain information. Other surveys have shown that up to 50% of cancer patients use the net for this purpose. The internet is also seen as an important source of information for family members and caregivers of cancer patients.


Author(s):  
G. Sreedhar

At present, the World Wide Web (WWW) is an important and popular information search tool. It provides convenient access to almost all kinds of information – from education to entertainment. The main objective of the chapter is to retrieve information from websites and then use that information for website quality analysis. In this chapter, website information is retrieved through the web mining process. Web mining integrates three knowledge domains: Web Content Mining, Web Structure Mining and Web Usage Mining. Web content mining is the process of extracting knowledge from the content of web documents. Web structure mining is the process of inferring knowledge from the organization of the World Wide Web and the links between references and referents in the Web. The web content elements are used to derive the functionality and usability of the website. The web component elements are used to assess the performance of the website. The website structural elements are used to assess the complexity and usability of the website. Quality assurance techniques for web applications generally focus on preventing web failures or reducing the chances of such failures. Web failures are defined as the inability to obtain or deliver information, such as documents or computational results, requested by web users. A high-quality website is one that provides relevant, useful content and a good user experience. Thus, in this chapter, all areas of a website are thoroughly studied in order to analyse the quality of website design.
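
As a rough illustration of the distinction drawn above between content and structure, the following Python sketch pulls the visible words (a content element) and the outgoing hyperlinks (a structural element) out of an HTML string using only the standard library. The class name `PageMiner`, the function `mine_page`, and the sample markup are hypothetical; the chapter's actual retrieval procedure and quality metrics are not specified in the abstract.

```python
from html.parser import HTMLParser


class PageMiner(HTMLParser):
    """Collects visible words (content) and hyperlink targets (structure)."""

    def __init__(self):
        super().__init__()
        self.words = []
        self.links = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore script and style bodies; keep everything else as text.
        if not self._skip_depth:
            self.words.extend(data.split())


def mine_page(html):
    """Return a small summary of one page's content and link structure."""
    miner = PageMiner()
    miner.feed(html)
    miner.close()
    return {"word_count": len(miner.words), "links": miner.links}


SAMPLE_HTML = (
    "<html><head><style>p{color:red}</style></head>"
    '<body><p>Hello web mining</p><a href="/about">About us</a></body></html>'
)

if __name__ == "__main__":
    print(mine_page(SAMPLE_HTML))
```

Counts such as these are only raw inputs; how they would be turned into functionality, usability, or complexity scores is left open here.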


2020 ◽  
Vol 25 (2) ◽  
pp. 1-16
Author(s):  
Rasha Hany Salman ◽  
Mahmood Zaki ◽  
Nadia A. Shiltag

The web today has become an archive of information in any form, such as text content, sound, video, designs, and multimedia. As the World Wide Web has grown over time, it has become crowded with diverse data, making the extraction of virtual data a burdensome process. Web mining uses various data mining techniques to mine useful information from page content and web hyperlinks. The fundamental uses of web content mining are to gather, organize, and classify the best data available on the web and to provide it to the user who needs it. Web content mining (WCM) tools are needed to examine HTML documents, text, and images; the outcome is then used by the web engine. This paper presents an overview of web mining categorization and web content mining techniques, together with a critical review and study of web content mining tools from 2011 to 2019, and builds a table comparing these tools on some important criteria.


10.28945/2972 ◽  
2006 ◽  
Author(s):  
Martin Eayrs

The World Wide Web provides a wealth of information - indeed, perhaps more than can comfortably be processed. But how does all that Web content get there? And how can users assess the accuracy and authenticity of what they find? This paper will look at some of the problems of using the Internet as a resource and suggest criteria both for researching and for systematic and critical evaluation of what users find there.


2017 ◽  
Vol 2 (1) ◽  
pp. 141
Author(s):  
Zuzanna Służewska

ROMAN LAW ON THE INTERNET. Summary: In the past ten years the Internet has become a very popular information exchange tool serving people around the world. A summary review of information included on the world wide web indicates that the Internet constitutes a rich and diversified source of information about certain issues, which not only enables the popularisation of such knowledge but also creates an open forum for the discussion of many different issues. There are also many sites on Roman law on the Internet, created not only by universities and other scientific centres but also by private individuals interested in Roman law as a hobby. Of course, such web sites are currently either not very large and devoted to specific problems of Roman law, or very general and thus not of much use to romanists interested in specific issues. The catalogue of sources of Roman law available on the net is also still incomplete, which probably results from the problems connected with the transformation of original source texts into electronic form. In this article I would like to present the results of my „web surfing”, in order to encourage Roman law researchers to use the Internet as a serious source of information, and to show that the Internet may provide enormous possibilities in the future. The various sites devoted to Roman law existing on the Internet may be divided into some general categories depending on the type and purpose of each site. The first category consists of Internet sites created by universities, law schools and other educational centres. The information included therein is mainly of an administrative nature and refers to the programme of studies, exam schedules, academic teachers and tutors, and similar matters not connected with Roman law itself. Some of them also give information about special projects conducted at the particular school and information about local libraries, with lists of available books.
Apart from the private sites created by members of the academic community, one can also find on the web sites created by individual people, who publish on the Internet the results of their research, their opinions on different legal problems connected with Roman law, summaries of books devoted to Roman law, and so forth. The next category of Internet sites that may be used by a person studying Roman law consists of sites that include texts of legal sources. This kind of site, although not including much substantial information on Roman law, may be helpful for the researcher of antiquity and Roman law, as it enables easy access to the texts of selected sources. Last of all, I would like to pay some attention to Internet sites devoted to ancient Rome in general, not necessarily to Roman law. Sites of this kind are more popular science than strictly scientific materials, and they are probably not of much use to historians of antiquity. On the other hand, they include some interesting pieces of information that are neither taught in a standard course of studies nor included in history manuals, and that bring the realities of the ancient world closer to us, such as information on Roman cuisine, Roman coins, or Roman clothes. These sites also include a large variety of pictures and photos, which makes them more attractive for visitors. As we can see, the Internet has become quite a rich source of information about antiquity and Roman law. Taking into consideration all the advantages that this global network offers in the field of transferring and broadcasting information, it is certainly worthy of greater attention on the part of romanists. Since the information included therein is relatively general, the primary use of the Internet by romanists should, in my opinion, be as an educational tool.
Encouraging students to use the Internet while learning Roman law may inspire them to undertake more detailed studies on selected subjects, not limited to information included in popular manuals, and, as the next step, to create their own www sites devoted to particular problems in the field of Roman law. No less important is co-operation among romanists from all countries in order to make the Internet useful also for the researcher of Roman law. That could be achieved by placing texts of scholarly books and articles on the web, creating university homepages devoted to Roman law, initiating collaborative Internet projects, and presenting individuals' achievements in the field of Roman law on the net. As a result, in a few years the Internet could be expected to become a compendium of information on Roman law, widely available and easy to use, as well as a forum for collaboration among academics in the field of Roman law. In the modern world, where historical knowledge is often treated as useless ballast, researchers of antiquity especially should make use of technical innovations in the field of dissemination of information, because this enables their knowledge to survive. Given that virtual reality has become a constant element of everyday life, reserving space on the Internet for Roman law is a way to make Roman law in some sense „immortal”.


2020 ◽  
Vol 4 (3) ◽  
pp. 490-495
Author(s):  
Astika Ayuningtyas ◽  
Yuliani Indrianingsih ◽  
Uyuunul Mauidzoh

The development of information and computer technology has led to what is called the Internet and the World Wide Web (WWW). In addition, the dramatic development of the Internet has given users more choice and control over content, and also provides individuals, businesses, and public and private organizations with the opportunity to generate and disseminate information. The interactive features of the web can be an effective way to build and maintain mutually beneficial relationships if the web is used properly. The presence of the Internet has proven to have a positive impact on the development of a village, sub-district or district by introducing and publicizing the potential of its region. This is evident in several regions of Indonesia which have successfully used Internet facilities to introduce tourist destinations to the world. Therefore, the training on the promotion website is an effort to optimize the introduction of high-quality village products in the district of Patuk, and is also intended to follow up the results of research on the design of a web-based promotion of superior products and tourist objects in Patuk, Gunungkidul district. On the basis of the website promotion feasibility test carried out during the training for representatives of 11 villages in the Patuk sub-district, a score of 87.36% was obtained, so it can be said that the introduction of superior village products via website-based promotional materials was optimal and met the needs of users.


Author(s):  
Ajay Kumar ◽  

Access to the internet is fast becoming a basic right given the plethora of information available on the net these days. In the current scenario, the issue of internet shutdown has become an important concern in India. Internet shutdown affects people socially, psychologically and economically. On one hand, many democratic countries of the world are discussing digital freedom and human rights, while on the other hand, some countries including India are continuously practicing Internet shutdowns in different parts of their territories. India has become the top country in the world in terms of the number of Internet shutdowns. The Internet has become such a prominent source of information for all of us that when Internet connectivity is suspended, many people are affected, as they depend on Internet services for various purposes. Internet shutdown is harmful not only to democracy and governance but also to the economy of the country. Internet shutdowns are direct violations of digital freedom and human rights. The main objective of this paper is to argue that access to the internet is a basic right and to highlight the problem of Internet shutdown in India and its adverse impact on the lives of Indians. In addition, this paper gives a brief history of Internet shutdowns in India. The paper shows how frequent clampdowns on the internet affect the economy, as has been the case in the Union Territory of Jammu & Kashmir, thereby highlighting the case for internet freedom for the survival of the economy, especially in Digital India.


Author(s):  
V.K. Khilchevskyi

The Food and Agriculture Organization of the United Nations (UN FAO) has the most advanced information on water resources in all countries of the world, since the share of the agriculture sector in world water use is 70%. It operates the FAO Global Information System on Water and Agriculture (abbreviated as FAO Aquastat). The data contained in this database comes from the relevant government bodies of the countries of the world (reports, publications, official websites), from information bases of other UN agencies or international organizations (UN WHO – World Health Organization; UN FPA – United Nations Population Fund; ICOLD – International Commission on Large Dams), or is obtained by modeling. The Water Resources section of the FAO global information system contains about 40 indicators. The database is filled with the average values of indicators for the following periods: 1988-1992; 1993-1997; 1998-2002; 2003-2007; 2008-2012; 2013-2017. The assessment of water resources carried out in the article, based on the database of the global information system FAO Aquastat (1988-2017), showed the following results for Ukraine: internal river flow – 50.1 km³; inflow from adjacent territories – 120.2 km³; total river runoff – 170.3 km³; available groundwater reserves – 5 km³; internal renewable water resources – 55.1 km³; total renewable water resources – 175.3 km³. In terms of total renewable water resources per person (3964 m³/person/year) among 50 European countries as of 2017, Ukraine ranked 27th. In terms of internal renewable water resources per person (1246 m³/person/year), Ukraine ranked 37th in Europe. In terms of total renewable water resources (175.3 km³), Ukraine ranked 6th in Europe. In terms of the volume of internal renewable water resources (55.1 km³), Ukraine ranked 14th.
Ukraine has a high coefficient of external dependence of water resources (Кз = 66.8%), which characterizes the share of total renewable water resources formed outside the country in adjacent territories – 9th place in Europe. The data on the components of water resources in Ukraine, which are given in FAO Aquastat, differ from the data published in Ukrainian sources. It is necessary to pay special attention to this methodological problem in the scientific and expert environment, as well as among officials in our country – the State Agency for Water Resources of Ukraine, the Ministry of Environmental Protection and Natural Resources of Ukraine. Indeed, with the course towards European integration, there can be no difference in information for internal and external use.
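
The water-balance figures quoted above can be cross-checked with simple arithmetic. The Python sketch below assumes the components combine additively (river flow plus inflow, and river flow plus groundwater), which is an observation about the reported numbers rather than a statement of FAO Aquastat's methodology. The variable names are illustrative, and the population value is only what the two per-person figures imply; it is not stated in the abstract.

```python
# Values reported in the abstract, in km³ per year.
internal_river_flow = 50.1
inflow_from_adjacent = 120.2
groundwater_reserves = 5.0

# Assumed additive relations; they reproduce the reported totals.
total_river_runoff = round(internal_river_flow + inflow_from_adjacent, 1)
internal_renewable = round(internal_river_flow + groundwater_reserves, 1)
total_renewable = round(total_river_runoff + groundwater_reserves, 1)

# Per-person figures reported in the abstract, in m³ per person per year.
total_per_person_m3 = 3964
internal_per_person_m3 = 1246

M3_PER_KM3 = 1e9

# Population implied by each per-person figure (volume / per-person volume).
implied_pop_from_total = total_renewable * M3_PER_KM3 / total_per_person_m3
implied_pop_from_internal = internal_renewable * M3_PER_KM3 / internal_per_person_m3

if __name__ == "__main__":
    print(total_river_runoff, internal_renewable, total_renewable)
    print(round(implied_pop_from_total / 1e6, 1),
          round(implied_pop_from_internal / 1e6, 1))
```

Both per-person figures imply a population of roughly 44.2 million, so the reported totals and per-person values are mutually consistent.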


Author(s):  
Punam Bedi ◽  
Neha Gupta ◽  
Vinita Jindal

The World Wide Web is a part of the Internet that provides a data dissemination facility to people. The contents of the Web are crawled and indexed by search engines so that they can be retrieved, ranked, and displayed as a result of users' search queries. The contents that can be easily retrieved using Web browsers and search engines comprise the Surface Web. All information that cannot be crawled by search engines' crawlers falls under the Deep Web. Deep Web content never appears in the results displayed by search engines. Though this part of the Web remains hidden, it can be reached using targeted search over normal Web browsers. Unlike the Deep Web, there exists a portion of the World Wide Web that cannot be accessed without special software. This is known as the Dark Web. This chapter describes how the Dark Web differs from the Deep Web and elaborates on the software commonly used to enter the Dark Web. It highlights the illegitimate and legitimate sides of the Dark Web and specifies the role played by cryptocurrencies in the expansion of the Dark Web's user base.

